This change adds two new parameters to enable and control concurrent
GETs in Swift: 'concurrent_gets' and 'concurrency_timeout'.
'concurrent_gets' lets you turn concurrent GETs on or off. When on, it
sets the GET/HEAD concurrency to the replica count, or, in the case of
EC HEADs, to ndata. The proxy then serves only the first valid source
to respond. This applies to all account, container and object GETs
except for EC; for EC, only HEAD requests are affected.
It achieves this by changing the request-sending mechanism to use
GreenAsyncPile and green threads, with a timeout between each request.
'concurrency_timeout' is related to concurrent_gets: it is the amount
of time to wait before firing the next thread. A value of 0 fires all
requests at the same time (fully concurrent); any other value staggers
the firing, giving each node a short window to respond before the next
request fires. This value is a float and should be somewhere between 0
and node_timeout. The default is conn_timeout, meaning that by default
the firing is staggered.
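For illustration, a proxy-server.conf fragment enabling this might
look like the following (values are examples, not recommendations):
    [app:proxy-server]
    use = egg:swift#proxy
    concurrent_gets = on
    concurrency_timeout = 0.5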
DocImpact
Implements: blueprint concurrent-reads
Change-Id: I789d39472ec48b22415ff9d9821b1eefab7da867
There was a function in swift.common.utils that was importing
swob.HeaderKeyDict at call time. It couldn't import it at module
import time, since utils can't import from swob without blowing up
with a circular import error.
This commit just moves HeaderKeyDict into swift.common.header_key_dict
so that we can remove the inline import.
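With this change the import can live at module scope:
    from swift.common.header_key_dict import HeaderKeyDict
instead of a function-local 'from swift.common.swob import
HeaderKeyDict'.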
Change-Id: I656fde8cc2e125327c26c589cf1045cb81ffc7e5
The proxy server currently requires a Content-Length in the response
header when GETting an object, and does not support chunked transfer
encoding ("Transfer-Encoding: chunked"). This doesn't matter in normal
Swift, but it prevents us from adding middlewares that do something
like streaming processing of objects and therefore can't calculate the
length of their response body before they start to send their
response.
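As a minimal sketch of the kind of middleware this unblocks (the
filter and its transformation are hypothetical), consider a WSGI
filter that rewrites the object body and so cannot know its final
length:
    class StreamingFilter(object):
        # hypothetical middleware: transforms the body as it streams,
        # so it cannot provide a Content-Length up front
        def __init__(self, app):
            self.app = app

        def __call__(self, env, start_response):
            def _start_response(status, headers, exc_info=None):
                # drop Content-Length; the server must then fall back
                # to Transfer-Encoding: chunked
                headers = [(h, v) for (h, v) in headers
                           if h.lower() != 'content-length']
                return start_response(status, headers, exc_info)
            for chunk in self.app(env, _start_response):
                yield chunk.replace('a', 'aa')  # length-changing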
Change-Id: I60fc6c86338d734e39b7e5f1e48a2647995045ef
assertDictContainsSubset is being called multiple times with the
same arguments in a loop. Since assertDictContainsSubset has been
deprecated since Python 3.2, replace it with checks on individual
key/value pairs.
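For example (names illustrative):
    # before, deprecated since Python 3.2:
    #   self.assertDictContainsSubset(expected, actual)
    # after:
    for key, value in expected.items():
        self.assertEqual(actual[key], value)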
Change-Id: I7089487710147021f26bd77c36accf5751855d68
a1c32702, 736cf54a, and 38787d0f remove uses of `simplejson` from
various parts of Swift in favor of the standard library `json`
module (introduced in Python 2.6). This commit performs the remaining
`simplejson` to `json` replacements, removes two comments highlighting
quirks of simplejson with respect to Unicode, and removes the references
to it in setup documentation and requirements.txt.
There were a lot of places where we were importing json from
swift.common.utils, which is less intuitive than a direct `import json`,
so that replacement is made as well.
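That is:
    # before:
    from swift.common.utils import json
    # after:
    import json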
(And in two more tiny drive-bys, we add some pretty-indenting to an XML
fragment and use `super` rather than naming a base class explicitly.)
Change-Id: I769e88dda7f76ce15cf7ce930dc1874d24f9498a
This patch adds more unit tests to diminish missing pieces
of the coverage in the proxy_controllers_base unit test.
Change-Id: I85ba1955c681cc9d5b2a70ac31155678d2e5b6fd
In earlier versions of swift when a request was made with an
existing origin, but without any CORS settings in the container,
it was possible to get an unhandled exception due to a method call
on the "None" return of cors.get('allow_origin', '').
Unit tests have been added to assert that this problem cannot go
undetected again.
Change-Id: Ia74896dabe1cf5a307c551b15a43ab1fd789c213
Fixes: bug 1468782
The iteritems() of Python 2 dictionaries has been renamed to items() on
Python 3. According to a discussion on the openstack-dev mailing list,
the overhead of creating a temporary list using dict.items() on Python 2
is very low because most dictionaries are small:
http://lists.openstack.org/pipermail/openstack-dev/2015-June/066391.html
Patch generated by the following command:
    sed -i 's,iteritems,items,g' \
        $(find swift -name "*.py") \
        $(find test -name "*.py")
Change-Id: I6070bb6c684be76e8e77222a7d280ec6edd43496
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.
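That is:
    # Python 2 only:
    item = it.next()
    # works on Python 2 and Python 3:
    item = next(it)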
Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
This commit lets clients receive multipart/byteranges responses (see
RFC 7233, Appendix A) for erasure-coded objects. Clients can already
do this for replicated objects, so this brings EC closer to feature
parity (ha!).
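For reference, a multipart/byteranges exchange looks roughly like this
(boundary and lengths illustrative):
    GET /v1/AUTH_test/c/o HTTP/1.1
    Range: bytes=0-4,10-14

    HTTP/1.1 206 Partial Content
    Content-Type: multipart/byteranges; boundary=b1

    --b1
    Content-Type: application/octet-stream
    Content-Range: bytes 0-4/100

    hello
    --b1
    Content-Type: application/octet-stream
    Content-Range: bytes 10-14/100

    world
    --b1--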
GetOrHeadHandler got a base class extracted from it that treats an
HTTP response as a sequence of byte-range responses. This way, it can
continue to yield whole fragments, not just N-byte pieces of the raw
HTTP response, since an N-byte piece of a multipart/byteranges
response is pretty much useless.
There are a couple of bonus fixes in here, too. For starters, download
resuming now works on multipart/byteranges responses. Before, it only
worked on 200 responses or 206 responses for a single byte
range. Also, BufferedHTTPResponse grew a readline() method.
Also, the MIME response for replicated objects got tightened up a
little. Before, it had some leading and trailing CRLFs which, while
allowed by RFC 7233, provide no benefit. Now, both replicated and EC
multipart/byteranges avoid extraneous bytes. This let me re-use the
Content-Length calculation in swob instead of having to either hack
around it or add extraneous whitespace to match.
Change-Id: I16fc65e0ec4e356706d327bdb02a3741e36330a0
Currently, best_response could return "503 Internal Server Error".
However, 503 means "Service Unavailable" (the status int of Internal
Server Error is 500). This patch fixes the response status to be
"503 Service Unavailable".
Change-Id: I88b8c52c26b19e9e76ba3375f1e16ced555ed54c
This commit makes it possible to PUT an object into Swift and have it
stored using erasure coding instead of replication, and also to GET
the object back from Swift at a later time.
This works by splitting the incoming object into a number of segments,
erasure-coding each segment in turn to get fragments, then
concatenating the fragments into fragment archives. Segments are 1 MiB
in size, except the last, which is between 1 B and 1 MiB.
+================================================================+
| object data                                                    |
+================================================================+
                                 |
          +----------------------+----------------------+
          |                      |                      |
          v                      v                      v
+===================+  +===================+     +==============+
| segment 1         |  | segment 2         | ... | segment N    |
+===================+  +===================+     +==============+
          |                      |
          |                      |
          v                      v
     /=========\            /=========\
     | pyeclib |            | pyeclib |   ...
     \=========/            \=========/
          |                      |
          |                      |
          +--> fragment A-1      +--> fragment A-2
          |                      |
          |                      |
          +--> fragment B-1      +--> fragment B-2
          |                      |
         ...                    ...
Then, object server A gets the concatenation of fragment A-1, A-2,
..., A-N, so its .data file looks like this (called a "fragment archive"):
+=====================================================================+
| fragment A-1 | fragment A-2 | ... | fragment A-N |
+=====================================================================+
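A rough sketch of the segmentation step (the helper name is
hypothetical; the 1 MiB constant comes from the description above):
    SEGMENT_SIZE = 1024 * 1024  # 1 MiB

    def iter_segments(wsgi_input):
        # yield 1 MiB segments; the last one may be shorter
        while True:
            segment = wsgi_input.read(SEGMENT_SIZE)
            if not segment:
                break
            yield segment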
Since this means that the object server never sees the object data as
the client sent it, we have to do a few things to ensure data
integrity.
First, the proxy has to check the Etag if the client provided it; the
object server can't do it since the object server doesn't see the raw
data.
Second, if the client does not provide an Etag, the proxy computes it
and uses the MIME-PUT mechanism to provide it to the object servers
after the object body. Otherwise, the object would not have an Etag at
all.
Third, the proxy computes the MD5 of each fragment archive and sends
it to the object server using the MIME-PUT mechanism. With replicated
objects, the proxy checks that the Etags from all the object servers
match, and if they don't, returns a 500 to the client. This mitigates
the risk of data corruption in one of the proxy --> object connections,
and signals to the client when it happens. With EC objects, we can't
use that same mechanism, so we must send the checksum with each
fragment archive to get comparable protection.
On the GET path, the inverse happens: the proxy connects to a bunch of
object servers (M of them, for an M+K scheme), reads one fragment at a
time from each fragment archive, decodes those fragments into a
segment, and serves the segment to the client.
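A sketch of the encode/decode round trip, assuming pyeclib's ECDriver
interface (scheme parameters illustrative):
    from pyeclib.ec_iface import ECDriver

    driver = ECDriver(k=4, m=2, ec_type='liberasurecode_rs_vand')
    fragments = driver.encode('x' * 1024)  # one per fragment archive
    # any k fragments are enough to rebuild the segment
    assert driver.decode(fragments[:4]) == 'x' * 1024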
When an object server dies partway through a GET response, any
partially-fetched fragment is discarded, the resumption point is wound
back to the nearest fragment boundary, and the GET is retried with the
next object server.
GET requests for a single byterange work; GET requests for multiple
byteranges do not.
There are a number of things _not_ included in this commit. Some of
them are listed here:
* multi-range GET
* deferred cleanup of old .data files
* durability (daemon to reconstruct missing archives)
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I9c13c03616489f8eab7dcd7c5f21237ed4cb6fd2
The cached info object dict did not include
the sysmeta. This patch fixes that, and adds
a unit test.
Change-Id: I092200e76586af322ed4ff7d194a1034b1ca0433
Meaningless cleanup of the day. In my defence, I hope it saves the
next person from having to grep all over the tree (the test is
adjusted accordingly).
Change-Id: I9e10dd977d4ca48db1393519068ce0e286705433
This change adds an optional overrides map to the make_requests method
in the base Controller class:
    def make_requests(self, req, ring, part, method, path, headers,
                      query_string='', overrides=None)
This map is passed on to the best_response method. If it is set and no
quorum is reached, the override map is used to attempt to find quorum.
The overrides map is in the form:
    { <response>: <override response>, ... }
The ObjectController's DELETE method now passes an override map of
    { 404: 204 }
to make_requests in the base Controller class. Statuses/responses that
have been overridden are used in the quorum calculation but are never
returned to the user; they are replaced by
    (STATUS, '', '', '')
and left out of the search for the best response.
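For example, a caller might now do something like (sketch only):
    # treat backend 404s as 204s when computing the DELETE quorum
    resp = self.make_requests(req, obj_ring, partition, 'DELETE',
                              req.path_info, headers,
                              overrides={404: 204})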
Change-Id: Ibf969eac3a09d67668d5275e808ed626152dd7eb
Closes-Bug: 1318375
If upgrading from a non-storage-policy-enabled version of Swift to a
storage-policy-enabled version, it's possible that memcached will have
an info structure that does not contain the 'storage_policy' key,
resulting in an unhandled exception during the lookup. The fix is to
simply make sure we never return the dict without a storage_policy key
defined; if it doesn't exist, it's safe to set it to '0', as this
means you're in the upgrade scenario and there's no other possibility.
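The shape of the fix, roughly (the dict name is illustrative):
    # cached entries from before the upgrade lack the key; policy 0
    # is the only possibility for such data
    if 'storage_policy' not in info:
        info['storage_policy'] = '0'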
Change-Id: If8e8f66d32819c5bfb2d1308e14643f3600ea6e9
* add get_object
* allow extra headers passthrough on HEAD/metadata requests
* expose (account|container|get_object)_ring properties
Pipeline property access to the auto_create_account_prefix also allows
us to bypass the early exit on a container HEAD for
auto_create_accounts if the container-updater hasn't cycled yet.
Allow overriding of storage policy index.
This is something the reconciler will need so that it can GET from one
policy, PUT in another, and then DELETE from the first one again.
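A sketch of what such an override might look like on an internal
request (the X-Backend-Storage-Policy-Index header name is assumed
here):
    headers = {'X-Backend-Storage-Policy-Index': 1}
    # GET with policy 1's index, then PUT/DELETE with another index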
DocImpact
Implements: blueprint storage-policies
Change-Id: I9b287d15f2426022d669d1186c9e22dd8ca13fb9
Objects now have a storage policy index associated with them as well;
this is determined by their filesystem path. Like before, objects in
policy 0 are in /srv/node/$disk/objects; this provides compatibility
on upgrade. (Recall that policy 0 is given to all existing data when a
cluster is upgraded.) Objects in policy 1 are in
/srv/node/$disk/objects-1, objects in policy 2 are in
/srv/node/$disk/objects-2, and so on.
* the 'quarantined' dir already created an 'objects' subdir, so now
  objects-N subdirs will also be created at the same level
This commit does not address replicators, auditors, or updaters except
where method signatures changed. They'll still work if your cluster
has only one storage policy, though.
DocImpact
Implements: blueprint storage-policies
Change-Id: I459f3ed97df516cb0c9294477c28729c30f48e09
Middleware or core features may need to store metadata
against accounts or containers. This patch adds a
generic mechanism for system metadata to be persisted
in backend databases, without polluting the user
metadata namespace, by using the reserved header
namespace x-<server_type>-sysmeta-*.
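For example, headers such as the following (names and values
illustrative) would be persisted as system metadata:
    X-Account-Sysmeta-Project: blue
    X-Container-Sysmeta-Owner: alice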
Firstly, backend servers are modified to persist system metadata
headers alongside user metadata and other system state.
For accounts and containers, system metadata in PUT
and POST requests is treated in a similar way to user
metadata. System metadata is not yet supported for
object requests.
Secondly, changes in the proxy controllers ensure that
headers in the system metadata namespace will pass through
in requests to backend servers.
Thirdly, system metadata returned from backend servers
in GET or HEAD responses is added to the cached info
dict, which middleware can access.
Finally, a gatekeeper middleware module is provided
which filters all system metadata headers from requests
and responses by removing headers with names starting
x-account-sysmeta-, x-container-sysmeta-. The gatekeeper
also removes headers starting x-object-sysmeta- in
anticipation of future support for system metadata being
set for objects. This prevents clients from writing or
reading system metadata.
The required_filters list in swift/proxy/server.py is
modified to include the gatekeeper middleware so that
if the gatekeeper has not been configured in the
pipeline then it will be automatically inserted close
to the start of the pipeline.
blueprint cluster-federation
Change-Id: I80b8b14243cc59505f8c584920f8f527646b5f45
If you create a container with a non-ASCII name, and then make another
container with X-Versions-Location: first-cøntåîner, *and* you're
serializing stuff in memcache as json (the default), when the proxy
tries to make a versioned object, it will crash.
The fix is to make sure that get_container_info() always returns strs,
not unicodes.
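A sketch of the kind of coercion involved (Python 2 semantics, helper
name hypothetical):
    def to_str(value):
        # json decoding hands back unicode; callers expect UTF-8 strs
        if isinstance(value, unicode):
            return value.encode('utf-8')
        return value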
The long-term fix would be to get rid of simplejson entirely, as its
decoder can't make up its mind whether JSON strings should be Python
strs or unicodes, and that makes it really really easy to write bugs
like this.
Change-Id: Ib20ea5fb884484a4246d7a21a9f1e2ffd82eb04f
The proxy server was calling swob.Request.path_info_pop() prior to
instantiating a controller so that req.path_info was just /a/c/o (sans
/v1). The version got moved over into SCRIPT_NAME.
This led to some unfortunate behavior when trying to re-use a request
from middleware. Something like this:
    # Imagine we're a WSGIContext object here.
    #
    # To start, SCRIPT_NAME = '' and PATH_INFO = '/v1/a/c/o'
    resp_iter = self._app_call(env, start_response)
    # Now SCRIPT_NAME = '/v1' and PATH_INFO = '/a/c/o'
    if something_special in self._response_headers:
        env['REQUEST_METHOD'] = 'GET'
        env.pop('HTTP_RANGE', None)
        # 404 SURPRISE! The proxy calls path_info_pop() again,
        # and now SCRIPT_NAME = '/v1/a' and PATH_INFO = '/c/o', so
        # this gets treated as a container request. Yikes.
        resp_iter = self._app_call(env, start_response)
Now we just leave SCRIPT_NAME and PATH_INFO alone. To make life easy
for everyone who does want just /a/c/o, I defined
swob.Request.swift_entity_path, which just strips off the /v1.
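For example, for a request whose PATH_INFO is '/v1/AUTH_test/c/o':
    req.swift_entity_path  # -> '/AUTH_test/c/o'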
Note that there's still one call to path_info_pop() in tempauth, but
that's only for requests going to /auth, so it won't affect Swift API
requests. It might be a good idea to remove that one later, but let's
do one thing at a time.
Change-Id: I87557a11c01f3f3889b610578cda6ba7d3933e7a
If a source times out on read, try another one of them with a modified
Range header. A lot of code had to be moved around to get this
working, but it should all make sense.
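For example (offsets illustrative): if a GET with
    Range: bytes=0-9999
dies after 3000 bytes have been read, the retry against the next node
goes out with
    Range: bytes=3000-9999
so the client sees one uninterrupted body.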
Change-Id: Ieaf045690a8823927a6f38098a95b37a4d4adb70
Allow the proxy to respond to many types of requests as soon as it has a
quorum. This can help speed up responses (without changing the results),
especially when one node is acting up.
I had to fix a few unit tests that no longer match the backend http requests
made by our proxy.
Change-Id: Ieb070dc3019e217e717b96154a7a809409bf40a5
In 1.8.0 (Grizzly), your proxy logs would indicate which middleware
was responsible for an internal request, e.g. TU for tempurl or BD for
bulk delete. At some point, those all turned into GET_INFO, which does
not give you any idea which specific middleware was responsible, only
that it came from a get_account_info/get_container_info call.
This commit puts it back to how it was in 1.8.0. Also, the
new-since-1.8.0 function get_object_info() got swift_source plumbing
added to it, so source tracking for the quota middlewares'
get_object_info() calls will happen now too.
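Roughly (the swift_source value here is illustrative):
    info = get_object_info(req.environ, app, swift_source='CQ')
    # 'CQ' then appears as the source in the proxy access log line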
Note that due to the new-since-1.8.0 in-environment caching of
account/container info, you may not see as many lines in the proxy log
as you would with 1.8.0. This is because there are actually fewer
internal requests being made.
Change-Id: I2b2ff7823c612dc7ed7f268da979c4500bbbe911
The content length of the copied object is checked before allowing the
copy request, according to the account quota set by the reseller.
Fixes: bug #1200271
Change-Id: Ie4700f23466dd149ea5a497e6c72438cf52940fd
Consolidate the different ways in which info of account/container
is gathered, cached, used, updated, etc.
This refactoring increases code reuse and is a basis for later
addition of account ACLs.
Changing the users of get_info is left for the future; this staged
approach ensures the behaviour is unchanged.
Change-Id: I67b58030d3f9e3bc86bcd7ece0f1dc693c4e08c3
Fixes: Bug #1162199
- Fixes bug 1119282.
- Allow middleware to access an account's metadata without having to
  store it separately in a new memcache namespace.
- Add tests for get_container_info, which was previously missed.
- Add a get_account_info method based on get_container_info, a
  function for other middleware to query accounts.
- Rename container_info['count'] to container_info['object_count'].
Change-Id: I43787916c7a812cb08d278edf45370521f12c912
Currently, a container's info can be cached without its CORS data
intact after a container GET.
I made headers_to_container_info a function instead of a method and I crammed
all container metadata into container_info.
This is so e.g. staticweb can eventually re-use the same container info cache.
Fix pep8 in swift/proxy/controllers/container.py
Change-Id: I4bbb042dde79afac48395efc38bd80f0ff240e1f