* add get_object
* allow extra headers passthrough on HEAD/metadata requests
* expose (account|container|get_object)_ring properties
Pipeline property access to the auto_create_account_prefix also allows us to
bypass the early exit on a container HEAD for auto_create_accounts if the
container-updater hasn't cycled yet.
Allow overriding of storage policy index.
This is something the reconciler will need so that it can GET from one
policy, PUT in another, and then DELETE from the first one again.
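A rough sketch of that reconciler-style flow using the new get_object; the
conf path, account/container names, and the exact override header
(X-Backend-Storage-Policy-Index) are assumptions here, not authoritative:

    from swift.common.internal_client import InternalClient

    client = InternalClient('/etc/swift/internal-client.conf',
                            'policy-reconciler', request_tries=3)

    # Read the object out of policy 1 regardless of what the container
    # currently says; a PUT and DELETE with different override values
    # would complete the move described above.
    status, headers, body_iter = client.get_object(
        'AUTH_test', 'misplaced', 'obj',
        headers={'X-Backend-Storage-Policy-Index': '1'})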
DocImpact
Implements: blueprint storage-policies
Change-Id: I9b287d15f2426022d669d1186c9e22dd8ca13fb9
Objects now have a storage policy index associated with them as well;
this is determined by their filesystem path. Like before, objects in
policy 0 are in /srv/node/$disk/objects; this provides compatibility
on upgrade. (Recall that policy 0 is given to all existing data when a
cluster is upgraded.) Objects in policy 1 are in
/srv/node/$disk/objects-1, objects in policy 2 are in
/srv/node/$disk/objects-2, and so on.
* the 'quarantined' dir already created an 'objects' subdir, so now
  objects-N dirs will also be created at the same level
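For illustration, the mapping from policy index to on-disk directory name
amounts to something like this (the real helper lives in the diskfile
module; the function name below is made up):

    import os

    def objects_dir_for_policy(policy_index):
        # Policy 0 keeps the legacy 'objects' name for upgrade
        # compatibility; every other policy gets an 'objects-N' suffix.
        index = int(policy_index)
        return 'objects' if index == 0 else 'objects-%d' % index

    # e.g. /srv/node/sdb1/objects, /srv/node/sdb1/objects-1, ...
    datadir = os.path.join('/srv/node', 'sdb1', objects_dir_for_policy(1))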
This commit does not address replicators, auditors, or updaters except
where method signatures changed. They'll still work if your cluster
has only one storage policy, though.
DocImpact
Implements: blueprint storage-policies
Change-Id: I459f3ed97df516cb0c9294477c28729c30f48e09
FakeLogger gets better log level handling
Parameterize the logger on some daemons which were previously
unparameterized, and try to use that interface in tests.
FakeRing use more real code
The existing FakeRing mock's implementation bit me with a pretty subtle
character-encoding issue by bypassing the hash_path code that is normally
part of get_part_nodes. This change tries to exercise more of the real
ring code paths when it makes sense and provides a better Fake for use in
testing.
Add write_fake_ring helper to test.unit for when you need a real ring.
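Roughly the shape of that helper, assuming a
RingData(replica2part2dev_id, devs, part_shift) constructor and its save()
method; the device dicts and part power below are illustrative:

    import array
    from swift.common.ring import RingData

    def write_fake_ring(path, *devs):
        devs = list(devs) or [
            {'id': 0, 'zone': 0, 'device': 'sda1',
             'ip': '127.0.0.1', 'port': 6000},
            {'id': 1, 'zone': 1, 'device': 'sdb1',
             'ip': '127.0.0.1', 'port': 6000},
        ]
        # Two replicas over four partitions (part power 2 -> shift 30),
        # alternating between the two devices.
        replica2part2dev_id = [array.array('H', [0, 1, 0, 1]),
                               array.array('H', [1, 0, 1, 0])]
        RingData(replica2part2dev_id, devs, 30).save(path)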
DocImpact
Implements: blueprint storage-policies
Change-Id: Id2e3740b1dd569050f4e083617e7dd6a4249027e
As seen on #1174809, changes use of mutable types as default
arguments and defaults them within the method. Otherwise, those
defaults can be unexpectedly persisted with the function between
invocations and erupt into mass hysteria on the streets.
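The pattern being fixed, in generic form (an illustration, not the actual
SimpleClient code):

    def get(url, token=None, headers={}):   # BAD: one dict for every call
        if token:
            headers['X-Auth-Token'] = token  # leaks into later calls that
        return headers                       # pass no token at all

    def get_fixed(url, token=None, headers=None):  # GOOD
        if headers is None:
            headers = {}                     # fresh dict per call
        if token:
            headers['X-Auth-Token'] = token
        return headers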
There was indeed a test (TestSimpleClient.test_get_with_retries)
that was erroneously relying on this behavior. Since previous tests
had populated their own instantiations with a token, this test only
passed because the modified headers dict from those earlier tests was
being carried over. As expected, with the mutable-defaults fix in
SimpleClient, this test began to fail since it never specified any
token, even though it had always passed before. This change now also
provides the expected token.
Change-Id: If95f11d259008517dab511e88acfe9731e5a99b5
Related-Bug: #1174809
When the current code modifies the pipeline, it prints the entry-point
names instead of the names used to construct the pipeline. This is
inconvenient because a sysadmin cannot copy and paste them from the log.
We already save the pipeline name into contexts in most cases, so
the fix simply reuses that to provide friendly names.
Fixes bug: 1311802
Change-Id: Ic76baf1360cd521f140fa1980029ccbce58f1717
One can argue that it makes sense for the client-facing proxy server
to have certain middlewares like gatekeeper in its pipeline, but that
is not desirable for InternalClient. In particular, it prevents you
from passing in sysmeta headers using InternalClient, and I found
myself wanting to do that earlier today.
Now InternalClient's proxy application gets exactly what's configured;
no more, no less. This will mean that the object expirer can read and
write sysmeta headers, but I think we can trust it to keep our
secrets.
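For example, something along these lines can now set system metadata
through InternalClient's make_request (a hedged sketch; the conf path and
sysmeta header name are illustrative):

    from swift.common.internal_client import InternalClient

    swift = InternalClient('/etc/swift/internal-client.conf', 'expirer', 3)
    swift.make_request(
        'POST', '/v1/AUTH_test/backups',
        headers={'X-Container-Sysmeta-Expirer-State': 'processed'},
        acceptable_statuses=(2,))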
Change-Id: I17b4a89c24e600754701ee1645b40406421fa6f3
This is for the same reason that SLO got pulled into middleware, which
includes stuff like automatic retry of GETs on broken connection and
the multi-ring storage policy stuff.
The proxy will automatically insert the dlo middleware at an
appropriate place in the pipeline the same way it does with the
gatekeeper middleware. Clusters will still support DLOs after upgrade
even with an old config file that doesn't mention dlo at all.
Includes support for reading config values from the proxy server's
config section so that upgraded clusters continue to work as before.
Bonus fix: resolve 'after' vs. 'after_fn' in proxy's required filters
list. Having two was confusing, so I kept the more-general one.
DocImpact
blueprint multi-ring-large-objects
Change-Id: Ib3b3830c246816dd549fc74be98b4bc651e7bace
Middleware or core features may need to store metadata
against accounts or containers. This patch adds a
generic mechanism for system metadata to be persisted
in backend databases, without polluting the user
metadata namespace, by using the reserved header
namespace x-<server_type>-sysmeta-*.
Modifications are firstly that backend servers persist
system metadata headers alongside user metadata and
other system state.
For accounts and containers, system metadata in PUT
and POST requests is treated in a similar way to user
metadata. System metadata is not yet supported for
object requests.
Secondly, changes in the proxy controllers ensure that
headers in the system metadata namespace will pass through
in requests to backend servers.
Thirdly, system metadata returned from backend servers
in GET or HEAD responses is added to the cached info
dict, which middleware can access.
Finally, a gatekeeper middleware module is provided
which filters all system metadata headers from requests
and responses by removing headers with names starting
x-account-sysmeta-, x-container-sysmeta-. The gatekeeper
also removes headers starting x-object-sysmeta- in
anticipation of future support for system metadata being
set for objects. This prevents clients from writing or
reading system metadata.
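The inbound filtering amounts to stripping any header with one of the
reserved prefixes, along these lines (a simplified sketch with made-up
helper names; the real code lives in swift.common.middleware.gatekeeper):

    SYSMETA_PREFIXES = ('x-account-sysmeta-', 'x-container-sysmeta-',
                        'x-object-sysmeta-')

    def remove_sysmeta(headers):
        # Drop any client-supplied system metadata headers.
        return dict((k, v) for k, v in headers.items()
                    if not k.lower().startswith(SYSMETA_PREFIXES))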
The required_filters list in swift/proxy/server.py is
modified to include the gatekeeper middleware so that
if the gatekeeper has not been configured in the
pipeline then it will be automatically inserted close
to the start of the pipeline.
blueprint cluster-federation
Change-Id: I80b8b14243cc59505f8c584920f8f527646b5f45
This commit adds a hook for WSGI applications
(e.g. proxy.server.Application) to modify their WSGI pipelines. This
is currently used by the proxy server to ensure that catch_errors is
present; if it is missing, it is inserted as the first middleware in
the pipeline.
This lets us write new, mandatory middlewares for Swift without
breaking existing deployments on upgrade.
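Conceptually the hook boils down to something like this (names are
illustrative; the real check is driven by the proxy's required filters
list and operates on the parsed pipeline, not a plain list of strings):

    def modify_wsgi_pipeline(pipeline):
        # Ensure catch_errors runs first, inserting it if it is missing.
        if 'catch_errors' not in pipeline:
            pipeline.insert(0, 'catch_errors')
        return pipeline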
Change-Id: Ibed0f2edb6f80c25be182b3d4544e6a67c5050ad
PEP 333 (WSGI) says that if your iterable has a close() method, the
framework must call it.
WSGIContext._app_call pulls the first chunk off the returned iterable
to make sure that it gets status and headers, and then it would
itertools.chain() that first chunk back onto the iterable so the whole
body went out. swob.Response.call_application() does it too.
The problem is that an itertools.chain object doesn't have a close()
method, so your iterable's fancy-pants close() method has no chance of
getting called.
This patch adds a slightly smarter CloseableChain that works like
itertools.chain, but has a close() method that calls the underlying
iterables' close() methods, if any.
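Simplified, the new helper looks roughly like this (details in swob may
differ):

    import itertools

    class CloseableChain(object):
        # Chain iterables like itertools.chain, but propagate close().
        def __init__(self, *iterables):
            self.iterables = iterables

        def __iter__(self):
            return iter(itertools.chain(*self.iterables))

        def close(self):
            for it in self.iterables:
                close_method = getattr(it, 'close', None)
                if close_method:
                    close_method()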
Change-Id: If975c93f53c27dfa0c2f52f4bbf599af25202f70
Mark Seger at HP has been looking at small objects, 1 and 2 KB size,
and with Rick Jones' help noticed that TCP protocol traces showed
effects from the Nagle algorithm client-to-server and
server-to-client.
This patch just addresses our WSGI server responses, but does not
address out-bound connections from the various servers.
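The usual mitigation on the response path, and presumably what this patch
applies to the WSGI server sockets, is to disable Nagle so small writes go
out immediately instead of waiting on ACKs (illustrative sketch):

    import socket

    def disable_nagle(sock):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)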
Change-Id: I11f86df1f56fba1c6ab6084dc1f580c395f072dc
Signed-off-by: Peter Portante <peter.portante@redhat.com>
Red Hat's QA noticed that in case of the infamous "xattr>=0.4"
error, swift-init exits with a zero error code, which confuses
startup scripts (not Systemd though -- that one knows exactly
if processes run or not).
The easiest fix is to return the error code like Guido's blog
post suggested.
Change-Id: I7badd8742375a7cb2aa5606277316477b8083f8d
Fixes: rhbz#1020480
The module used to simply "import swift", and then reference
the classes:
swift.common.middleware.catch_errors.CatchErrorsMiddleware
swift.proxy.server.Application
in order to verify that the WSGI services loaded the proper filters and
apps.
However, those references only happen to work, as the WSGI loading
would properly import the rest of the path so that the namespace
reference would be okay. If the WSGI configuration were to change, or
if the behavior of WSGI broke, instead of seeing the actual failure
condition, a module attribute error would result:
AttributeError: 'module' object has no attribute 'middleware'
The referenced names are now properly imported with this change to
avoid misleading error conditions.
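In other words, the module now imports the names explicitly, so a broken
import fails loudly at import time instead of surfacing later as the
misleading AttributeError above:

    from swift.common.middleware.catch_errors import CatchErrorsMiddleware
    from swift.proxy.server import Application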
Change-Id: Ifff4271bc5be1136bf17e4e5b291b01033d608db
Signed-off-by: Peter Portante <peter.portante@redhat.com>
For this commit, ssync is just a direct replacement for how
we use rsync. Assuming we switch over to ssync completely
someday and drop rsync, we will then be able to improve the
algorithms even further (removing local objects as we
successfully transfer each one rather than waiting for whole
partitions, using an index.db with hash-trees, etc., etc.)
For easier review, this commit can be thought of in distinct
parts:
1) New global_conf_callback functionality for allowing
services to perform setup code before workers, etc. are
launched. (This is then used by ssync in the object
server to create a cross-worker semaphore to restrict
concurrent incoming replication; see the sketch after
this list.)
2) A bit of shifting of items up from object server and
replicator to diskfile or DEFAULT conf sections for
better sharing of the same settings. conn_timeout,
node_timeout, client_timeout, network_chunk_size,
disk_chunk_size.
3) Modifications to the object server and replicator to
optionally use ssync in place of rsync. This is done in
a generic enough way that switching to FutureSync should
be easy someday.
4) The biggest part, and (at least for now) completely
optional part, are the new ssync_sender and
ssync_receiver files. Nice and isolated for easier
testing and visibility into test coverage, etc.
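A hedged sketch of how item 1 is used by the object server (exact names
and signatures are from memory and may differ): a callback run before the
workers fork builds a semaphore that all workers share to bound concurrent
incoming replication.

    import multiprocessing

    def global_conf_callback(preloaded_app_conf, global_conf):
        concurrency = int(
            preloaded_app_conf.get('replication_concurrency') or 4)
        if concurrency:
            # Wrapped in a list so the object survives the paste plumbing.
            global_conf['replication_semaphore'] = [
                multiprocessing.BoundedSemaphore(concurrency)]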
All the usual logging, statsd, recon, etc. instrumentation
is still there when using ssync, just as it is when using
rsync.
Beyond the essential error and exceptional condition
logging, I have not added any additional instrumentation at
this time. Unless there is something someone finds super
pressing to have added to the logging, I think such
additions would be better as separate change reviews.
FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
CLUSTERS. Some of us will be using it in a limited fashion to look
for any subtle issues, tuning, etc., but generally ssync is
an experimental feature. In its current implementation it is
probably going to be a bit slower than rsync, but if all
goes according to plan it will end up much faster.
There are no comparisons yet between ssync and rsync other
than some raw virtual machine testing I've done to show it
should compete well enough once we can put it in use in the
real world.
If you Tweet, Google+, or whatever, be sure to indicate it's
experimental. It'd be best to keep it out of deployment
guides, howtos, etc. until we all figure out if we like it,
find it to be stable, etc.
Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
Address all the "hacking" lines that are flagged, and all the modules
that just have one item flagged.
Change-Id: I372a4bdf9c7748f73e38c4fd55e5954f1afade5b
Signed-off-by: Peter Portante <peter.portante@redhat.com>
Change the default value of wsgi workers from 1 to auto. The new default
value for workers in the proxy, container, account & object wsgi servers will
spawn as many workers per process as you have cpu cores.
This will not be ideal for some configurations, but it's much more likely to
produce a successful out-of-the-box deployment.
Inspect the number of CPU cores using Python's multiprocessing when available.
Multiprocessing was added in Python 2.6, but I know I've compiled Python
without it before by accident. The cpu_count method seems to be pretty system
agnostic, but it says it can raise NotImplementedError or sometimes return 0.
Add a new utility method 'config_auto_int_value' to pull an integer out of the
config which has a dynamic default.
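A sketch of the intended behavior (the helper's exact signature is
paraphrased here, not copied from the code):

    def auto_workers(conf):
        # workers = auto (the new default) maps to the detected cpu count,
        # falling back to 1 when multiprocessing is unusable.
        try:
            import multiprocessing
            cpu_count = multiprocessing.cpu_count() or 1
        except (ImportError, NotImplementedError):
            cpu_count = 1
        value = conf.get('workers', 'auto')
        return cpu_count if value == 'auto' else int(value)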
* drive by s/container/proxy/ in proxy-server.conf.5
* fix misplaced max_clients in *-server.conf-sample
* update doc/development_saio to force workers = 1
DocImpact
Change-Id: Ifa563d22952c902ab8cbe1d339ba385413c54e95
The default value for the path arg to the function is None. However, if
the path arg is not set or is set to None, an exception is generated when
the path value is passed to urllib.unquote(). The function doc states that
if path is not provided then the value in the env is used, so the fix is to
implement this behaviour.
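In other words (the function and variable names below are illustrative,
not the actual code):

    import urllib

    def handle(env, path=None):
        # Per the docstring: if path is not provided, use the env's value.
        if path is None:
            path = env.get('PATH_INFO', '')
        return urllib.unquote(path)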
Change-Id: I085124acb6ef3cecb2375bb97d27996e0c6fd36e
Fixes: bug #1198149
Allows unit tests to run on BSD-based platforms. The commit
da99c339 that solved the original problem didn't include fixes for
the unit tests.
Change-Id: Iff1771c1cbb37c30a888cd1da4fca5636e5797e7
See review https://review.openstack.org/29836.
Change-Id: I8ec80202789707f723abfe93ccc9cf1e677e4dc6
Signed-off-by: Peter Portante <peter.portante@redhat.com>
Allow Swift daemons and servers to optionally accept a directory as the
configuration parameter. Directory based configuration leverages
ConfigParser's native multi-file support. Files ending in '.conf' in the
given directory are parsed in lexicographical order. Filenames starting with
'.' are ignored. A mixture of file and directory configuration paths is not
supported - if the configuration path is a file behavior is unchanged.
* update swift-init to search for conf.d paths when building servers
(e.g. /etc/swift/proxy-server.conf.d/)
* new script swift-config can be used to inspect the cumulative configuration
* pull a little bit of code out of run_wsgi and test separately
* fix example config bug for the proxy servers client_disconnect option
* added section on directory based configuration to deployment guide
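The core of the directory handling is roughly this (a sketch; the real
loader has more plumbing around paste and defaults):

    import os
    from ConfigParser import ConfigParser

    def read_conf_dir(parser, conf_dir):
        conf_files = sorted(
            os.path.join(conf_dir, f) for f in os.listdir(conf_dir)
            if f.endswith('.conf') and not f.startswith('.'))
        return parser.read(conf_files)

    parser = ConfigParser()
    read_conf_dir(parser, '/etc/swift/proxy-server.conf.d')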
DocImpact
Implements: blueprint confd
Change-Id: I89b0f48e538117f28590cf6698401f74ef58003b
differently (if at all)
Adding a swift.source to wsgi pre_auth funcs and all middleware that makes
subrequests to proxy server.
NOTE: This change will result in a change in the number of proxy logs made for
staticweb, formpost, tempurl, and any other middleware that performs
subrequests (including swauth and SOS).
Please see docs for details.
DocImpact
Change-Id: I80cf2806add1c3d34054147e2515944be340455b
There were bugs with SCRIPT_NAME and
swift.common.wsgi.make_pre_authed* functions.
1) SCRIPT_NAME wasn't copied with PATH_INFO, which it should've been.
2) When a new path was given, SCRIPT_NAME wasn't set to an empty
string but just omitted, which is allowed by spec, but really
should be set.
For completeness, if SCRIPT_NAME doesn't exist in the source env, it
will be created in the new env, but as an empty string.
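The corrected copying behavior, roughly (an illustrative helper; the real
code is in swift.common.wsgi's pre-authed env construction):

    def copy_path_bits(src_env, new_env, path=None):
        if path is not None:
            # A new path replaces PATH_INFO, so SCRIPT_NAME is explicitly
            # set to the empty string rather than omitted.
            new_env['PATH_INFO'] = path
            new_env['SCRIPT_NAME'] = ''
        else:
            new_env['PATH_INFO'] = src_env.get('PATH_INFO', '')
            new_env['SCRIPT_NAME'] = src_env.get('SCRIPT_NAME', '')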
Change-Id: Ifbc27ed8ff357038c54df4d37de46cfd5e31372e
This change replaces WebOb with a mostly compatible local library,
swift.common.swob. Subtle changes to WebOb's API over the years have been a
huge headache. Swift doesn't even run on the current version.
There are a few incompatibilities to simplify the implementation/interface:
* It only implements the header properties we use. More can be easily added.
* Casts header values to str on assignment.
* Response classes ("HTTPNotFound") are no longer subclasses, but partials
on Response, so things like isinstance no longer work on them (see the
sketch after this list).
* Unlike newer webob versions, will never return unicode objects.
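To illustrate the third point (a toy model, not swob's actual internals):

    from functools import partial

    class Response(object):
        def __init__(self, status=200, body=''):
            self.status = status
            self.body = body

    HTTPNotFound = partial(Response, status=404)

    resp = HTTPNotFound(body='not here')
    isinstance(resp, Response)   # True: the instance is just a Response
    # isinstance(resp, HTTPNotFound) would raise TypeError, since
    # HTTPNotFound is a partial, not a class.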
Change-Id: I76617a0903ee2286b25a821b3c935c86ff95233f