Reserve the namespace starting with the NULL byte for internal
use cases. Backend services will allow path names to include the NULL
byte in URLs and will validate names in the reserved namespace. Database
services will filter all names starting with the NULL byte from
responses unless the request includes the header:

    X-Backend-Allow-Reserved-Names: true
The proxy server will not allow path names to include the NULL byte in
URLs unless a middleware has set the X-Backend-Allow-Reserved-Names
header. Middlewares can use the reserved namespace to create objects
and containers that cannot be directly manipulated by clients. Any
objects and bytes created in the reserved namespace will be aggregated
into the user's account totals.
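As a minimal, hedged illustration (the exact name layout any given
middleware uses is an assumption here), a hidden name in the reserved
namespace is just a normal name prefixed with the NULL byte:

    # Python sketch: a container name only reachable by middlewares that
    # set X-Backend-Allow-Reserved-Names; clients can never send a NULL
    # byte through the proxy directly.
    RESERVED = '\x00'
    hidden_container = RESERVED + 'versions' + RESERVED + 'user-container'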
When deploying internal proxies, developers and operators may configure
the gatekeeper middleware to translate the X-Allow-Reserved-Names header
to the backend header so they can manipulate the reserved namespace
directly through the normal API.
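A rough sketch of that translation, assuming a plain WSGI filter (this
is illustrative, not the actual gatekeeper code):

    # On an internal proxy, copy the client-facing header into the
    # backend header so the reserved namespace is reachable through
    # the normal API.
    class AllowReservedNames(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, env, start_response):
            if env.get('HTTP_X_ALLOW_RESERVED_NAMES'):
                env['HTTP_X_BACKEND_ALLOW_RESERVED_NAMES'] = 'true'
            return self.app(env, start_response)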
UpgradeImpact: it's not safe to roll back from this change
Change-Id: If912f71d8b0d03369680374e8233da85d8d38f85
Previously, our unit tests with socket servers would let eventlet
capitalize headers on the way out, which
- isn't something we want to have eventlet do, because it
- breaks unicode-in-header-names on py3, so it
- is already disabled in swift.common.wsgi.run_server() for real servers.
Include a test to make sure we don't forget about it in the future.
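For reference, the knob involved is eventlet's
capitalize_response_headers option; a minimal sketch of how a server
opts out, mirroring what run_server() already does for real servers:

    import eventlet
    import eventlet.wsgi

    def app(environ, start_response):
        # a header name that eventlet's title-casing would mangle on py3
        start_response('200 OK', [('X-Object-Meta-\u00e9', 'v')])
        return [b'']

    sock = eventlet.listen(('localhost', 0))
    # pass header names through untouched instead of capitalizing them
    eventlet.wsgi.server(sock, app, capitalize_response_headers=False)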
Change-Id: I0156d0059092ed414b296c65fb70fc18533b074a
I saw an encryption test fail with a 10s node timeout when the py36
unit tests ran in the gate. I believe the outsized potential for a
beneficial increase in test reliability in the gate outweighs the
relatively small chance of any negative side-effect on test failure
responsiveness.
Change-Id: Ia31912828d416d84c39782222e4636a97a8bfe44
You could *try* doing something similar to what we were doing
there over in email.message for py3, but you would end up
breaking pkg_resources (and therefore entrypoints) in the
process.
Drive-by: have mem_diskfile implement more of the diskfile API.
Change-Id: I1ece4b4500ce37408799ee634ed6d7832fb7b721
I can't imagine us *not* having a py3 proxy server at some point, and
that proxy server is going to need a ring.
While we're at it (and since they were so close anyway), port
* cli/ringbuilder.py,
* common/linkat.py, and
* common/daemon.py
Change-Id: Iec8d97e0ce925614a86b516c4c6ed82809d0ba9b
Move the json -> (text, xml) listing translation into a common module and
reference it in the account/container servers so we don't break existing
clients (including out-of-date proxies), but have the proxy controllers
always force a json listing.
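A sketch of the shape of that translation (names here are assumptions,
not the actual helper in swift.common):

    import json

    def listing_to_text(json_body):
        # the backend always responds with a json listing; render the
        # text/plain form an old client expects: one name per line
        listing = json.loads(json_body)
        return ''.join('%s\n' % item['name']
                       for item in listing).encode('utf-8')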
This simplifies operations on listings (such as the ones already happening in
decrypter, or the ones planned for symlink and sharding) by only needing to
consider a single response type.
There is a downside of larger backend responses for text/plain listings,
but it seems like a net win?
Change-Id: Id3ce37aa0402e2d8dd5784ce329d7cb4fbaf700d
Recently our gate started blowing up intermittently with a strange
case of mixed-up ports. Sometimes a functional test tries to
authorize on a port that's clearly an object server port, and
the like. As it turns out, eventlet developers added an unavoidable
SO_REUSEPORT into listen(), which makes listen(("localhost", 0))
reuse ports.
There's an issue about it:
https://github.com/eventlet/eventlet/issues/411
This patch works around the problem while the eventlet people
consider the issue.
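The shape of the workaround, as a hedged sketch (bind the socket
ourselves so SO_REUSEPORT never gets set, rather than calling
eventlet.listen() directly):

    import socket

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(('localhost', 0))
    sock.listen(50)
    # the kernel-assigned port is now exclusively ours
    port = sock.getsockname()[1]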
Change-Id: I67522909f96495a6a30e1acdb79835dce2189549
Add support for a 2+1 EC policy to be optionally used as the default
policy when running in-process functional tests.
The EC policy may be selected by setting the env var:

    SWIFT_TEST_IN_PROCESS_CONF_LOADER=ec

when running .functests, or by using the new tox test env:

    tox -e func-ec
Change-Id: I02e3553a74a024efdab91dcd609ac1cf4e4f3208
This patch enables efficient PUT/GET for a globally distributed cluster[1].
Problem:
Erasure coding has the capability to decrease the amount of actually
stored data compared to the replicated model. For example, the
parameters ec_k=6, ec_m=3 can store data at 1.5x the original size
(since (6+3)/6 = 1.5), which is smaller than 3x replicated.
However, unlike replication, erasure coding requires availability of at
least ec_k fragments of the total ec_k + ec_m fragments to service a
read (e.g. 6 of 9 in the case above). As such, if we stored an
EC object in a swift cluster spanning 2 geographically distributed data
centers which have the same volume of disks, it is likely the fragments
would be stored evenly (about 4 and 5), so we would still need to access
a faraway data center to decode the original object. In addition, if one
of the data centers were lost in a disaster, the stored objects would be
lost forever, and we would have to cry a lot. To ensure highly durable
storage, you might think of making *more* parity fragments (e.g.
ec_k=6, ec_m=10); unfortunately this causes *significant* performance
degradation due to the cost of the mathematical calculation for erasure
coding encode/decode.
How this resolves the problem:
EC Fragment Duplication extends the initial solution by adding *more*
fragments from which to rebuild an object, similar to the solution
described above. The difference is that it makes *copies* of encoded
fragments. Experimental results[1][2] show that employing small ec_k
and ec_m gives enough performance to store/retrieve objects.
On PUT:
- Encode the incoming object with small ec_k and ec_m <- faster!
- Make duplicated copies of the encoded fragments. The # of copies
  is determined by 'ec_duplication_factor' in swift.conf
- Store all fragments in the Swift Global EC Cluster
The duplicated fragments increase pressure on the existing requirements
for decoding objects in service of a read request. All fragments are
stored with their X-Object-Sysmeta-Ec-Frag-Index. With this change, the
X-Object-Sysmeta-Ec-Frag-Index represents the actual fragment index
encoded by PyECLib, so there *will* be duplicates. Any time we must
decode the original object data, we must only consider ec_k fragments
that are unique according to their X-Object-Sysmeta-Ec-Frag-Index: no
duplicate X-Object-Sysmeta-Ec-Frag-Index may be used when decoding an
object, so duplicates should be expected and skipped where possible.
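A hedged sketch of that unique-fragment selection (names here are
illustrative, not the actual proxy code):

    def pick_unique_fragments(fragment_responses, ec_k):
        # keep at most one fragment per X-Object-Sysmeta-Ec-Frag-Index
        # until we have the ec_k unique fragments a decode requires
        by_index = {}
        for frag_index, frag in fragment_responses:
            if frag_index not in by_index:
                by_index[frag_index] = frag
            if len(by_index) >= ec_k:
                break
        return list(by_index.values())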
On GET:
This patch includes the following changes:
- Change the GET path to sort primary nodes into grouped subsets, so
  that each subset includes unique fragments
- Change the Reconstructor to be more aware of possibly duplicated
  fragments
For example, with this change, a policy could be configured in
swift.conf such that:
ec_num_data_fragments = 2
ec_num_parity_fragments = 1
ec_duplication_factor = 2
(object ring must have 6 replicas)
At Object-Server:
node index (from object ring): 0 1 2 3 4 5 <- keep node index for
reconstruct decision
X-Object-Sysmeta-Ec-Frag-Index: 0 1 2 0 1 2 <- each object keeps actual
fragment index for
backend (PyEClib)
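The mapping above is just the node index taken modulo the number of
unique fragments; a tiny sketch of the arithmetic (assumed naming):

    ec_k, ec_m, ec_duplication_factor = 2, 1, 2
    ec_n_unique_fragments = ec_k + ec_m                       # 3
    replicas = ec_n_unique_fragments * ec_duplication_factor  # ring needs 6

    for node_index in range(replicas):
        frag_index = node_index % ec_n_unique_fragments
        print(node_index, '->', frag_index)  # 0->0, 1->1, 2->2, 3->0, 4->1, 5->2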
Additional improvements to Global EC Cluster Support will require
features such as Composite Rings and more efficient fragment
rebalance/reconstruction.
1: http://goo.gl/IYiNPk (Swift Design Spec Repository)
2: http://goo.gl/frgj6w (Slide Share for OpenStack Summit Tokyo)
DocImpact
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Idd155401982a2c48110c30b480966a863f6bd305
Relocates some test infrastructure in preparation for use with
encryption tests. In particular, moves the test server setup code from
test/unit/proxy/test_server.py to a new helpers.py so that it can be
re-used, and adds the ability to specify additional config options for
the test servers (used in encryption tests).
Adds unit test coverage for extract_swift_bytes and functional
test coverage for container listings. Adds a check on the content
and metadata of reconciled objects in probe tests.
Change-Id: I9bfbf4e47cb0eb370e7a74d18c78d67b6b9d6645