swift

Author	SHA1	Message	Date
Samuel Merritt	f48350865e	Add tests for metadata on 304 and 412 responses Commit 1f67eb74 added support for If-[None-]Match on DLOs and SLOs. It also made the 304 and 412 responses have the Content-Type and X-Object-Meta-* headers from the object instead of just having the Etag. Someone showed up in IRC today looking for this behavior, and was happy to learn it's in newer Swift versions than the one they were running. If we've got clients depending on this, we should have some unit tests to make sure we don't accidentally take it out again. Change-Id: If06149d13140148463004d426cb7ba4c5601404a	2014-12-12 12:29:31 -08:00
Nicolas Trangez	2a0a8ae00f	Rework `splice` and `tee` This patch reworks the bindings to `splice` and `tee` in order to fix a (potential) bug when using the old `splice` binding and passing offsets (see https://review.openstack.org/#/c/135319/ for a discussion of the issue). The new binding code (based on https://github.com/NicolasT/tee-n-splice) uses more `ctypes` features w.r.t. parameter and return value handling. It also introduces a test suite for both calls. Change-Id: Ib8084ca20fe7a199a00067da9386c2ccf618755c	2014-11-20 02:30:51 +01:00
Jenkins	58e5dc9983	Merge "When a filesystem doesn't support xattr return a 507"	2014-11-11 18:40:44 +00:00
Matthew Oliver	2659888c92	When a filesystem doesn't support xattr return a 507 Currently when the object server tries to write an object's metadata to a filesystem that doesn't support xattr, it errors with a stacktrace and returns a 500 error back to the user with no information. This patch catches the resulting IOError when attempting to read or write the xattr metadata, logs the error nicely and then returns a 507 error back to the user. Seeing as this change is sending back a 507, it also catches and logs the out of disk space errors (ENOSPC and EDQUOT). Change-Id: I31932b57582817a0b3b58dd315a996bd0bcbc99b Closes-Bug: #966671	2014-10-28 18:34:17 +11:00
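The error mapping described in the commit above might look roughly like the sketch below. It is not the patch itself: the function name is hypothetical, and the assumption that an unsupported-xattr failure surfaces as ENOTSUP/EOPNOTSUPP is mine, not stated in the commit message.

```python
import errno

# Assumption: an xattr write on a filesystem without xattr support fails
# with ENOTSUP/EOPNOTSUPP. ENOSPC and EDQUOT are named in the commit message.
STORAGE_ERRNOS = (errno.ENOTSUP, errno.EOPNOTSUPP, errno.ENOSPC, errno.EDQUOT)


def status_for_metadata_error(exc):
    """Map an IOError/OSError from a metadata read or write to an HTTP status.

    Hypothetical helper: 507 (Insufficient Storage) for the storage-related
    errnos above, 500 for anything else.
    """
    if exc.errno in STORAGE_ERRNOS:
        return 507
    return 500
```

A caller would log the exception before returning the status, as the commit describes.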
Jenkins	2ccb09c3a7	Merge "Ssync does not replicate custom object headers"	2014-10-13 21:02:48 +00:00
David Goetz	7c1c8c0b4d	Ssync does not replicate custom object headers Closes-Bug: #1329093 Change-Id: Ie9d80089a38d7a9b3464c66237d4d2d23331ebd5	2014-10-10 11:55:56 -07:00
Prashanth Pai	772eace020	Fix unit tests failing in some cases TestZeroCopy used to fail when 'localhost' resolved to an ipv6 address. https://github.com/eventlet/eventlet/issues/8 Also, "test_container_sync_realms.py:TestUtils.test_os_error" used to fail when unit tests were run as root user. This is because despite os.chmod(), a root user still has permission to access the file and hence OSError is not raised. Change-Id: Ife80b203358557999734515261814ce76c0e00cd Signed-off-by: Prashanth Pai <ppai@redhat.com>	2014-10-09 22:04:53 +05:30
Keshava Bharadwaj	0f93fff46a	Fixes unit tests to clean up temporary directories This patch fixes the unit tests to remove the temporary directories created during a run of the unit tests. Some unit tests did not correctly tear down whatever they had set up for running. Over time this would bloat the tmp directory. As of this date, around 49 tmp directories were left uncleared per round of unit tests. This patch fixes it. Change-Id: If591375ca9cc87d52c7c9c6dc16c9fb4b49e99fc	2014-09-26 22:39:48 +05:30
Jenkins	0d502de2f5	Merge "Zero-copy object-server GET responses with splice()"	2014-09-26 15:48:25 +00:00
Rafael Rivero	c1f6569c00	Fixes several typos (Swift) Corrects spelling errors found in comments. Change-Id: I228a888e3f256569ea32ef1613092dbd63e13c62	2014-09-18 21:18:50 -07:00
Samuel Merritt	7d0e5ebe69	Zero-copy object-server GET responses with splice() This commit lets the object server use splice() and tee() to move data from disk to the network without ever copying it into user space. Requires Linux. Sorry, FreeBSD folks. You still have the old mechanism, as does anyone who doesn't want to use splice. This requires a relatively recent kernel (2.6.38+) to work, which includes the two most recent Ubuntu LTS releases (Precise and Trusty) as well as RHEL 7. However, it excludes Lucid and RHEL 6. On those systems, setting "splice = on" will result in warnings in the logs but no actual use of splice. Note that this only applies to GET responses without Range headers. It can easily be extended to single-range GET requests, but this commit leaves that for future work. Same goes for PUT requests, or at least non-chunked ones. On some real hardware I had laying around (not a VM), this produced a 37% reduction in CPU usage for GETs made directly to the object server. Measurements were done by looking at /proc/<pid>/stat, specifically the utime and stime fields (user and kernel CPU jiffies, respectively). Note: There is a Python module called "splicetee" available on PyPi, but it's licensed under the GPL, so it cannot easily be added to OpenStack's requirements. That's why this patch uses ctypes instead. Also fixed a long-standing annoyance in FakeLogger: >>> fake_logger.warn('stuff') >>> fake_logger.get_lines_for_level('warn') [] >>> This, of course, is because the correct log level is 'warning'. Now you get a KeyError if you call get_lines_for_level with a bogus log level. Change-Id: Ic6d6b833a5b04ca2019be94b1b90d941929d21c8	2014-09-18 16:02:47 -07:00
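The CPU measurement mentioned in the commit above reads the utime and stime fields of /proc/<pid>/stat. A minimal sketch of that reading follows; the function names are hypothetical, and the field positions (14 and 15, counting from 1) follow the Linux proc(5) layout.

```python
def cpu_jiffies(stat_text):
    """Return (utime, stime) parsed from the text of /proc/<pid>/stat."""
    # The second field (comm) is parenthesized and may contain spaces,
    # so cut at the last ')' before splitting on whitespace.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field N lives at rest[N - 3].
    return int(rest[11]), int(rest[12])


def read_cpu_jiffies(pid):
    """Read the counters for a live process (Linux only)."""
    with open("/proc/%d/stat" % pid) as stat_file:
        return cpu_jiffies(stat_file.read())
```

Sampling the pair before and after a batch of GETs and subtracting gives the user and kernel jiffies spent on that batch.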
David Goetz	0abd2cba03	Shard expiring object container All the expiring objects for a given X-Delete-At are funnelled into the same expiring object container- this can act as a bottleneck. Change-Id: I288a177a7ae3e213c727a2a81fa76d4ef9cf7eb3	2014-08-15 08:42:49 -07:00
Jenkins	7036936c45	Merge "Add X-Backend-Timestamp on more object server responses"	2014-08-02 03:01:41 +00:00
anc	4286f36a60	Enable object system metadata on PUTs This patch takes a first step towards support for object system metadata by enabling headers in the x-object-sysmeta- namespace to be persisted when objects are PUT. This should be useful for other pending patches such as on demand migration and server side encryption (https://review.openstack.org/#/c/64430/ and https://review.openstack.org/#/c/76578/1). The x-object-sysmeta- namespace is already reserved/protected by the gatekeeper and passed through the proxy. This patch modifies the object server to persist these headers alongside user metadata when an object is PUT. This patch will preserve existing object system metadata and ignore any new system metadata when handling object POSTs, including POST-as-copy operations. Support for modification of object system metadata with a POST request requires further work as discussed in the blueprint. This patch will preserve existing object system metadata and update it with new system metadata when copying an object. A new probe test is added which makes use of the BrainSplitter class that has been moved from test_container_merge_policy_index.py to a new module brain.py. blueprint object-system-metadata Change-Id: If716bc15730b7322266ebff4ab8dd31e78e4b962	2014-08-01 16:41:33 -07:00
Jenkins	8aefe579b4	Merge "Object services user-agent string uses full name"	2014-07-16 21:09:03 +00:00
John Dickinson	7ab2afe5bd	added process pid to the end of storage node log lines Change-Id: I1c2709d85575fc7d4868fafd9ee757fd01868436	2014-07-09 12:12:33 -07:00
Steven Lang	7573fbd498	Object services user-agent string uses full name It does not appear that, aside from the user-agent string, the strings "obj-server", "obj-updater", or "obj-replicator" (or "obj-<anything>") appear in the swift code base, aside from the directory containing the object services code being named "obj". Furthermore, the container, account, and proxy services construct their user-agent string, as reported in the logs, using their full name. In addition, this full name also shows up as the name of the process via "ps" or "top", etc., which can make it easier for admins to match log entries with other tools. For consistency, we update the object services to use an "object-" prefix rather than "obj-" in its user agent string. obj-etag does appear in a unit test, but not part of the regular code. Change-Id: I914fc189514207df2535731eda10cb4b3d30cc6c	2014-07-02 18:35:49 -07:00
Clay Gerrard	3cad20570c	Add X-Backend-Timestamp on more object server responses It's particularly interesting on writes (PUT, POST, DELETE) where the current on-disk timestamp would prevent the object server from serving the incoming request and returns 409 Conflict. The FakeConn has also been updated to respond in kind for 409's on expect and all responses generally, just because it's good to keep fakes in line with the reals - not that I expected any existing tests to break because of the new headers. Change-Id: Iac6fbd2f872a9521bb2db84a333365b69f54fb6c	2014-07-01 12:06:29 -07:00
Paul Luse	873c52e608	Replace POLICY and POLICY_INDEX with string literals Replaced throughout code base & tox'd. Functional as well as probe tests pass with and without policies defined. POLICY --> 'X-Storage-Policy' POLICY_INDEX --> 'X-Backend-Storage-Policy-Index' Change-Id: Iea3d06de80210e9e504e296d4572583d7ffabeac	2014-06-23 12:52:50 -07:00
Clay Gerrard	c1dc2fa624	Add two vector timestamps The normalized form of the X-Timestamp header looks like a float with a fixed width to ensure stable string sorting - normalized timestamps look like "1402464677.04188" To support overwrites of existing data without modifying the original timestamp but still maintain consistency a second internal offset vector is appended to the normalized timestamp form which compares and sorts greater than the fixed width float format but less than a newer timestamp. The internalized format of timestamps looks like "1402464677.04188_0000000000000000" - the portion after the underscore is the offset and is a formatted hexadecimal integer. The internalized form is not exposed to clients in responses from Swift. Normal client operations will not create a timestamp with an offset. The Timestamp class in common.utils supports internalized and normalized formatting of timestamps and also comparison of timestamp values. When the offset value of a Timestamp is 0 - it's considered insignificant and need not be represented in the string format; to support backwards compatibility during a Swift upgrade the internalized and normalized form of a Timestamp with an insignificant offset are identical. When a timestamp includes an offset it will always be represented in the internalized form, but is still excluded from the normalized form. Timestamps with an equivalent timestamp portion (the float part) will compare and order by their offset. Timestamps with a greater timestamp portion will always compare and order greater than a Timestamp with a lesser timestamp regardless of its offset. String comparison and ordering is guaranteed for the internalized string format, and is backwards compatible for normalized timestamps which do not include an offset. The reconciler currently uses an offset bump to ensure that objects can move to the wrong storage policy and be moved back.
This use-case is valid because the content represented by the user-facing timestamp is not modified in any way. Future consumers of the offset vector of timestamps should be mindful of HTTP semantics of If-Modified and take care to avoid deviation in the response from the object server without an accompanying change to the user facing timestamp. DocImpact Implements: blueprint storage-policies Change-Id: Id85c960b126ec919a481dc62469bf172b7fb8549	2014-06-19 10:18:06 -07:00
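The normalized and internalized forms described in the commit above can be illustrated with a small stand-in class. This is a sketch built only from the formats quoted in the message, not Swift's Timestamp class; the class name and attribute names are hypothetical.

```python
from functools import total_ordering


@total_ordering
class SketchTimestamp:
    """Illustrative stand-in for a timestamp with an offset vector."""

    def __init__(self, seconds, offset=0):
        self.seconds = float(seconds)
        self.offset = int(offset)

    @property
    def normal(self):
        # Fixed-width float form: 10 integer digits, a dot, 5 decimals.
        return "%016.05f" % self.seconds

    @property
    def internal(self):
        # An offset of 0 is insignificant, so both forms are identical.
        if self.offset == 0:
            return self.normal
        # Otherwise append the offset as a 16-digit hexadecimal integer.
        return "%s_%016x" % (self.normal, self.offset)

    def _key(self):
        # Order by the float part first, then by the offset.
        return (self.normal, self.offset)

    def __eq__(self, other):
        return self._key() == other._key()

    def __lt__(self, other):
        return self._key() < other._key()
```

With this layout a plain string sort of the internal forms gives the same order as comparing the objects, which is the property the commit message relies on.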
Clay Gerrard	8d20e0e927	Fix object-expirer for missing objects Currently if the object-expirer goes to delete an object and the primary nodes are unavailable, or the object is on handoffs - the object servers are unable to verify the x-if-delete-at timestamp and return 412, without writing a tombstone or updating the containers. The expirer treats 412 as success and the dark data is not removed from the object servers nor the object removed in the listing. As a side effect of this bug, if the expirer encounters split brain the delete would never get processed in the correct storage policy. It seems it's just not correct to treat the lack of data as success. Now the object server will treat x-if-delete-at against a non-existent object as a 404, and to distinguish from a successful process of an x-if-delete-at request, will return 204. The expirer will treat a 404 response from swift as a failure, and will continue to attempt to expire the object until it is older than its configurable reclaim age. However swift will only return 404 if the majority of nodes are able to return success, or if only even a single node is able to accept the x-if-delete-at request the containers will get updated and replication will settle the tombstone - the subsequent x-if-delete-at request will 412 and be removed from the queue. It's worth noting that if an object with x-delete-at meta is DELETED (by a client request) an async update for the expiring update containers will be processed to remove the queue entry - but if no primary nodes handle the DELETE request replication will never remove the expiring entry and assuming it's scheduled for beyond the tombstones reclaim age - the queue entry will not be processable. In this case the expirer will attempt to DELETE the object (and get 404s) in vain until the queue entry passes the configurable reclaim age. DocImpact Implements: blueprint storage-policies Change-Id: I66260e99fda37e97d6d2470971b6f811ee9e01be	2014-06-18 21:09:54 -07:00
Clay Gerrard	0015019ccd	Put X-Backend-Timestamp in object 404 responses This way the container reconciler can tell (sometimes) that an object was deleted at a certain time. DocImpact Implements: blueprint storage-policies Change-Id: Idaba3255f4109e5150d6c457f913c600fd8923eb	2014-06-18 17:31:38 -07:00
Samuel Merritt	d5ca365965	Add Storage Policy support to Object Updates The object server will now send its storage policy index to the container server synchronously and asynchronously (via async_pending). Each storage policy gets its own async_pending directory under /srv/node/$disk/objects-$N, so there's no need to change the on-disk pickle format; the policy index comes from the async_pending's filename. This avoids any hassle on upgrade. (Recall that policy 0's objects live in /srv/node/$disk/objects, not objects-0.) Per-policy tempdir as well. Also clean up a couple little things in the object updater. Now it won't abort processing when it encounters a file (not directory) named "async_pending-\d+", and it won't process updates in a directory that does not correspond to a storage policy. That is, if you have policies 1, 2, and 3, but there's a directory on your disk named "async_pending-5", the updater will now skip over that entirely. It won't even bother doing directory listings at all. This is a good idea, believe it or not, because there's nothing good that the container server can do with an update from some unknown storage policy. It can't update the listing, it can't move the object if it's misplaced... all it can do is ignore the request, so it's better to just not send it in the first place. Plus, if this is due to a misconfiguration on one storage node, then the updates will get processed once the configuration is fixed. There's also a drive by fix to update some backend http mocks for container update tests that were not fully exercising their request fakes. Because the object server container update code is resilient to all manner of failure from backend requests the general intent of the tests was unaffected but this change cleans up some confusing logging in the debug logger output.
The object-server will send X-Storage-Policy-Index headers with all requests to container servers, including X-Delete containers and all object PUT/DELETE requests. This header value is persisted in the pickle file for the update and sent along with async requests from the object-updater as well. The container server will extract the X-Storage-Policy-Index header from incoming requests and apply it to container broker calls as appropriate defaulting to the legacy storage policy 0 to support seamless migration. DocImpact Implements: blueprint storage-policies Change-Id: I07c730bebaee068f75024fa9c2fa9e11e295d9bd add to object updates Change-Id: Ic97a422238a0d7bc2a411a71a7aba3f8b42fce4d	2014-06-18 17:31:38 -07:00
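The updater's directory selection described in the commit above could be sketched as follows. The function name and the (name, is_dir) input shape are hypothetical, and treating a bare "async_pending" directory as policy 0 is an assumption made by analogy with the "objects" directory; the commit message only names the suffixed form.

```python
import re

# Assumption: "async_pending" with no suffix belongs to policy 0,
# mirroring how "objects" (not "objects-0") holds policy 0's data.
ASYNC_DIR_RE = re.compile(r"^async_pending(?:-(\d+))?$")


def async_dirs_to_process(entries, known_policies):
    """Pick the async_pending directories the updater should walk.

    entries: iterable of (name, is_dir) pairs from a device directory.
    known_policies: set of configured storage policy indexes.
    Returns a list of (name, policy_index) pairs.
    """
    selected = []
    for name, is_dir in entries:
        match = ASYNC_DIR_RE.match(name)
        # Skip unrelated names and plain files that merely look right.
        if not match or not is_dir:
            continue
        index = int(match.group(1) or 0)
        # Skip directories for policies this node does not know about.
        if index in known_policies:
            selected.append((name, index))
    return selected
```

Skipped directories are left on disk untouched, so their updates can still be processed once the configuration is fixed.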
Clay Gerrard	3824ff3df7	Add Storage Policy support to Object Server Objects now have a storage policy index associated with them as well; this is determined by their filesystem path. Like before, objects in policy 0 are in /srv/node/$disk/objects; this provides compatibility on upgrade. (Recall that policy 0 is given to all existing data when a cluster is upgraded.) Objects in policy 1 are in /srv/node/$disk/objects-1, objects in policy 2 are in /srv/node/$disk/objects-2, and so on. * 'quarantined' dir already created 'objects' subdir so now there will also be objects-N created at the same level This commit does not address replicators, auditors, or updaters except where method signatures changed. They'll still work if your cluster has only one storage policy, though. DocImpact Implements: blueprint storage-policies Change-Id: I459f3ed97df516cb0c9294477c28729c30f48e09	2014-06-18 17:31:38 -07:00
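The path layout in the commit above (policy 0 in "objects", policy N in "objects-N") can be written as a short helper. The function names are hypothetical and the sketch only encodes the paths quoted in the message.

```python
import posixpath


def objects_dir_name(policy_index):
    """Directory name holding a policy's objects on a device."""
    # Policy 0 keeps the legacy name so existing data needs no move on upgrade.
    if policy_index == 0:
        return "objects"
    return "objects-%d" % policy_index


def objects_path(devices_root, disk, policy_index):
    """Join the devices root, the disk and the per-policy directory."""
    return posixpath.join(devices_root, disk, objects_dir_name(policy_index))
```

For example, `objects_path("/srv/node", "sdb1", 2)` yields the "objects-2" directory on that disk.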
Jenkins	77b8d42dc8	Merge "Support If-[Un]Modified-Since for object HEAD"	2014-04-14 10:55:15 +00:00
Greg Lange	d32dc8d49c	Unify backend logging Make account, object, and container servers construct log lines using the same utility function so they will produce identically formatted lines. This change reorders the fields logged for the account server. This change also adds the "additional info" field to the two servers that didn't log that field. This makes the log lines identical across all 3 servers. If people don't like that, I can take that out. I think it makes the documentation, parsing of the log lines, and the code a tad cleaner. DocImpact Change-Id: I268dc0df9dd07afa5382592a28ea37b96c6c2f44 Closes-Bug: 1280955	2014-04-07 18:38:04 +00:00
Chuck Thier	0b893825eb	Add "If-None-Match: " support to PUT A common pattern that we see clients do is send a HEAD request before a PUT to see if it exists. This can slow things down quite a bit especially since 404s on HEAD are currently a bit expensive. This change will allow a client to include a "If-None-Match: " header with a PUT request. In combination with "Expect: 100-Continue" this allows the server to return that it already has a copy of the object before any data is sent. I attempted to also include etag support with the If-None-Match header, but that turned out to have too many hairy edge cases, so was left as a future exercise. DocImpact Change-Id: I94e3754923dbe5faba065719c7a9afa9969652dd	2014-04-01 10:42:00 -07:00
Samuel Merritt	62a1e7e059	Support If-[Un]Modified-Since for object HEAD We already supported it for object GET requests, but not for HEAD. This lets clients keep metadata up-to-date without having to either fetch the whole object when it's changed or do their own date parsing. They can just treat Last-Modified as opaque and update their idea of metadata when they get a 200. Change-Id: Iff25d8989a93d651fd2c327e1e58036e79e1bde1	2014-03-14 17:55:43 -07:00
Samuel Merritt	a69789fa06	Allow pre-1970 dates in If-[Un]Modified-Since If I want to fetch an object only if it is newer than the first moon landing, I send a GET request with header: If-Modified-Since: Sun, 20 Jul 1969 20:18:00 UTC Since that date is older than Swift, I expect a 2xx response. However, I get a 412, which isn't even a valid thing to do for If-Modified-Since; it should either be 2xx or 304. This is because of two problems: (a) Swift treats pre-1970 dates as invalid, and (b) Swift returns 412 when a date is invalid instead of ignoring it. This commit makes it so any time between datetime.datetime.min and datetime.datetime.max is an acceptable value for If-Modified-Since and If-Unmodified-Since. Dates outside that date range are treated as invalid headers and thus are ignored, as RFC 2616 section 14.28 requires ("If the specified date is invalid, the header is ignored"). This only works for dates that the Python standard library can parse, which on my machine is 01 Jan 1 to 31 Dec 9999. Eliminating those restrictions would require implementing our own date parsing and comparison, and that's almost certainly not worth it. Change-Id: I4cb4903c4e5e3b6b3c9506c2cabbfbda62e82f35	2014-03-14 17:55:42 -07:00
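The behaviour described in the commit above (accept any date the standard library can parse, ignore an invalid header) can be sketched with the Python 3 standard library. This is not Swift's code; the function names are hypothetical and the status codes cover only the If-Modified-Since case.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_http_date(value):
    """Return an aware UTC-comparable datetime, or None if unparseable."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable or out-of-range dates are treated as invalid.
        return None
    if parsed.tzinfo is None:
        # Assumption: a date without a usable zone is taken as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def if_modified_since_status(header_value, last_modified):
    """Return 200 or 304 for a GET carrying If-Modified-Since."""
    since = parse_http_date(header_value)
    if since is None:
        # An invalid header is ignored, as RFC 2616 section 14.28 requires.
        return 200
    return 200 if last_modified > since else 304
```

With the moon-landing date from the commit message and any object stored in this century, the function returns 200 rather than 412.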
madhuri	fc499f3092	Added a test for empty metadata Here added a test for setting empty object metadata and checking its value in response headers. Change-Id: I460302661d150364a95bcd7f0ebbbc2a1e95507a	2014-03-07 10:31:24 +05:30
Samuel Merritt	09ef06fd99	Convert all old-style classes to new-style This cleanup has been slowly happening for a while; let's finish it. Change-Id: I1561e3540d524834e0cc5bc725ab80936eae1f0e	2014-03-03 17:28:48 -08:00
gholt	e82d40da46	Object server PUTs should respect client_timeout It seems the object server never respected the client_timeout value since the beginning of Swift. This is normally fine since the proxy does and will normally hang up on the backends. But if the proxy has a bug or if there's network issues or whatever, the object server should be smart enough to enforce this timeout as well. Our operations guys noticed this problem when older processes would never exit after a reload. They started investigating and saw that the object server had open tmp files that hadn't been touched in quite some time. Sometimes the tmp files didn't even exist anymore since the reclaimer deletes really old untouched tmp files. Change-Id: Iba0397203a2dccca4a28a8c8cbfc5a60e429837a	2014-02-28 17:10:03 +00:00
Samuel Merritt	1f67eb7403	Support If-[None-]Match for object HEAD, SLO, and DLO I moved the checking of If-Match and If-None-Match out of the object server's GET method and into swob so that everyone can use it. The interface is similar to the Range handling; make a response with conditional_response=True, and you get handling of If-Match and If-None-Match. Since the only users of conditional_response are object GET, object HEAD, SLO, and DLO, this has the effect of adding support for If-Match and If-None-Match to just the latter three places and nowhere else. This makes object GET and HEAD consistent for any kind of object, large or small. This also fixes a bug where various conditional headers (If-) were passed through to the object server on segment requests, which could cause segment requests to fail with a 304 or 412 response. Now only certain headers are copied to the segment requests, and that doesn't include the conditional ones, so they can't goof up the segment retrieval. Note that I moved SegmentedIterable to swift.common.request_helpers because it sprouted a transitive dependency on swob, and leaving it in utils caused a circular import. Bonus fix: unified the handling of DiskFileQuarantined and DiskFileNotFound in object server GET and HEAD. Now in either case, a 412 will be returned if the client said "If-Match: ". If not, the response is a 404, just like before. Closes-Bug: 1279076 Closes-Bug: 1280022 Closes-Bug: 1280028 Change-Id: Id2ee78346244d516b980202e990aa38ce6812de5	2014-02-20 14:54:26 -08:00
Samuel Merritt	6acea29fa6	Move all DLO functionality to middleware This is for the same reason that SLO got pulled into middleware, which includes stuff like automatic retry of GETs on broken connection and the multi-ring storage policy stuff. The proxy will automatically insert the dlo middleware at an appropriate place in the pipeline the same way it does with the gatekeeper middleware. Clusters will still support DLOs after upgrade even with an old config file that doesn't mention dlo at all. Includes support for reading config values from the proxy server's config section so that upgraded clusters continue to work as before. Bonus fix: resolve 'after' vs. 'after_fn' in proxy's required filters list. Having two was confusing, so I kept the more-general one. DocImpact blueprint multi-ring-large-objects Change-Id: Ib3b3830c246816dd549fc74be98b4bc651e7bace	2014-02-03 18:29:48 -08:00
gholt	b45e9a97ee	Skip delete_at_update for replication requests Requests through the object server that are from backend replication should not send x-delete-at updates to the .expiring_objects account. Replication is just moving data around or making new replicas, not creating new data from nothing. Change-Id: I324864face3ff559822c7a50c50e675e8b889b48	2014-01-23 01:05:16 +00:00
gholt	a3f2400cba	Consolidating and standardizing x-delete-at format Change-Id: Idc916da1c7fe1cc43a2c26f7f7ee1d4fcdd52c89	2014-01-14 15:40:35 +00:00
Jenkins	7f456ef35f	Merge "change the last-modified header value with valid one"	2013-12-20 00:57:36 +00:00
Kiyoung Jung	d69e013519	change the last-modified header value with valid one The Last-Modified header in the response did not have a suitable value: it carried only the integer part of the object's timestamp. As a result, an if-[un]modified-since header sent with the value from Last-Modified is always earlier than the timestamp, so the content always appears newer than the value of these conditional headers. The patched code returns math.ceil() of the object's timestamp in the Last-Modified header so that later conditional headers work correctly. Closes-Bug: #1248818 Change-Id: I1ece7d008551bf989da74d23f0ed6307c45c5436	2013-12-19 09:31:17 +00:00
Peter Portante	1bb6563a19	Handle non-integer values for if-delete-at If a client passes us a non-integer value for if-delete-at we'll now properly report a 400 error instead of a 503. Closes-Bug: 1259300 Change-Id: I8bb0bb9aa158d415d4f491b5802048f0cd4d8ef6	2013-12-14 22:28:56 -05:00
Peter Portante	d26e8b25a7	Bring obj server unit tests to > 98% This set of changes attempts to bring the unit test coverage to over 98% for the object server module. Two changes to the object server are made with this patch: 1. The try/except block around diskfile.write_metadata() was removed at the end of the POST method The write_metadata() method of the DiskFile module does not raise either the DiskFileNotExist or DiskFileQuarantined exceptions on that code path. 2. The conditional container_update() call was removed at the end of the PUT method The container_update() calls is performed when a new object is created or when an exist object is updated. Since we already report old timestamps as 409s (Conflict) we always perform the update. We also fix an existing test to clear the hash prefix so that it can actually detect the async pending pickle file creation for a failure mode. Change-Id: I71ec9dcf7c0ac86e56aa0f06993d501fdfa22d5b	2013-12-11 17:18:13 -05:00
Clay Gerrard	74b51c9c06	fix expired object deletion fixes bug #1257330 Change-Id: I49f645abdeba97eafb3ae42ef9e3684c912cfdc6	2013-12-04 23:09:02 -08:00
Alex Gaynor	87cd559847	Account for a platform difference in semaphores On OS X (and probably other Operating Systems) it isn't possible to introspect the value of a semaphore. Account for this by skipping a test about this. Change-Id: I97824f9fc4e36de4f7a62c8ce53865e6977dfdfe	2013-11-27 14:34:06 -06:00
Clay Gerrard	9e80fd45a0	Add a DebugLogger for wsgi server tests Change-Id: Ifd2528be443ba3879bf4921f6c5f4ef31f29044b	2013-11-21 01:35:58 -08:00
gholt	a80c720af5	Object replication ssync (an rsync alternative) For this commit, ssync is just a direct replacement for how we use rsync. Assuming we switch over to ssync completely someday and drop rsync, we will then be able to improve the algorithms even further (removing local objects as we successfully transfer each one rather than waiting for whole partitions, using an index.db with hash-trees, etc., etc.) For easier review, this commit can be thought of in distinct parts: 1) New global_conf_callback functionality for allowing services to perform setup code before workers, etc. are launched. (This is then used by ssync in the object server to create a cross-worker semaphore to restrict concurrent incoming replication.) 2) A bit of shifting of items up from object server and replicator to diskfile or DEFAULT conf sections for better sharing of the same settings. conn_timeout, node_timeout, client_timeout, network_chunk_size, disk_chunk_size. 3) Modifications to the object server and replicator to optionally use ssync in place of rsync. This is done in a generic enough way that switching to FutureSync should be easy someday. 4) The biggest part, and (at least for now) completely optional part, are the new ssync_sender and ssync_receiver files. Nice and isolated for easier testing and visibility into test coverage, etc. All the usual logging, statsd, recon, etc. instrumentation is still there when using ssync, just as it is when using rsync. Beyond the essential error and exceptional condition logging, I have not added any additional instrumentation at this time. Unless there is something someone finds super pressing to have added to the logging, I think such additions would be better as separate change reviews. FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION CLUSTERS. Some of us will be in a limited fashion to look for any subtle issues, tuning, etc. but generally ssync is an experimental feature. 
In its current implementation it is probably going to be a bit slower than rsync, but if all goes according to plan it will end up much faster. There are no comparisons yet between ssync and rsync other than some raw virtual machine testing I've done to show it should compete well enough once we can put it in use in the real world. If you Tweet, Google+, or whatever, be sure to indicate it's experimental. It'd be best to keep it out of deployment guides, howtos, etc. until we all figure out if we like it, find it to be stable, etc. Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6	2013-11-07 16:52:01 +00:00
Peter Portante	5202b0e586	DiskFile API, with reference implementation Refactor on-disk knowledge out of the object server by pushing the async update pickle creation to the new DiskFileManager class (name is not the best, so suggestions welcome), along with the REPLICATOR method logic. We also move the mount checking and thread pool storage to the new ondisk.Devices object, which then also becomes the new home of the audit_location_generator method. For the object server, a new setup() method is now called at the end of the controller's construction, and the _diskfile() method has been renamed to get_diskfile(), to allow implementation specific behavior. We then hide the need for the REST API layer to know how and where quarantining needs to be performed. There are now two places it is checked internally, on open() where we verify the content-length, name, and x-timestamp metadata, and in the reader on close where the etag metadata is checked if the entire file was read. We add a reader class to allow implementations to isolate the WSGI handling code for that specific environment (it is used no-where else in the REST APIs). This simplifies the caller's code to just use a "with" statement once open to avoid multiple points where close needs to be called. 
For a full historical comparison, including the usage patterns see: https://gist.github.com/portante/5488238 (as of master, 2b639f5, Merge "Fix 500 from account-quota This Commit middleware") --------------------------------+------------------------------------ DiskFileManager(conf) Methods: .pickle_async_update() .get_diskfile() .get_hashes() Attributes: .devices .logger .disk_chunk_size .keep_cache_size .bytes_per_sync DiskFile(a,c,o,keep_data_fp=) DiskFile(a,c,o) Methods: Methods: .__iter__() .close(verify_file=) .is_deleted() .is_expired() .quarantine() .get_data_file_size() .open() .read_metadata() .create() .create() .write_metadata() .delete() .delete() Attributes: Attributes: .quarantined_dir .keep_cache .metadata DiskFileReader() Methods: .__iter__() .close() Attributes: +.was_quarantined DiskWriter() DiskFileWriter() Methods: Methods: .write() .write() .put() .put() * Note that the DiskFile class * Note that the DiskReader() object implements all the methods returned by the necessary for a WSGI app DiskFileOpened.reader() method iterator implements all the methods necessary for a WSGI app iterator + Note that if the auditor is refactored to not use the DiskFile class, see https://review.openstack.org/44787 then we don't need the was_quarantined attribute A reference "in-memory" object server implementation of a backend DiskFile class in swift/obj/mem_server.py and swift/obj/mem_diskfile.py. One can also reference https://github.com/portante/gluster-swift/commits/diskfile for the proposed integration with the gluster-swift code based on these changes. Change-Id: I44e153fdb405a5743e9c05349008f94136764916 Signed-off-by: Peter Portante <peter.portante@redhat.com>	2013-10-17 15:03:31 -04:00
Peter Portante	9411a24ba7	Revert "Refactor common/utils methods to common/ondisk" This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32 Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62 Signed-off-by: Peter Portante <peter.portante@redhat.com>	2013-10-07 17:18:09 -04:00
ZhiQiang Fan	f72704fc82	Change OpenStack LLC to Foundation Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58 Closes-bug: #1214176	2013-09-20 01:02:31 +08:00
Peter Portante	7760f41c3c	Refactor common/utils methods to common/ondisk Place all the methods related to on-disk layout and / or configuration into a new common module that can be shared by the various modules using the same on-disk layout. Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9 Signed-off-by: Peter Portante <peter.portante@redhat.com>	2013-09-17 17:32:04 -04:00
Clay Gerrard	d51e873423	Remove keep_data_fp argument from DiskFile constructor All access to the data_file fp for DiskFile is moved after the new "open" method. This prepares to move some additional smarts into DiskFile and reduce the surface area of the abstraction and the exposure of the underlying implementation in the object-server. Future work: * Consolidate put_metadata to DiskWriter * Add public "update_metdata" method to DiskFile * Create DiskReader class to gather all access of methods under "open" Change-Id: I4de2f265bf099a810c5f1c14b5278d89bd0b382d	2013-09-10 17:51:56 -07:00
Peter Portante	9d98070f7b	Remove reference to 'file' built-in Change-Id: Ie79e8ede393e92824fd906df1ff1933193c00943 Signed-off-by: Peter Portante <peter.portante@redhat.com>	2013-09-06 13:44:09 -04:00


265 Commits