If the devices path configured in container-server.conf contains a file,
then an uncaught exception is seen in the logs. For example, if a file foo
exists as /srv/1/node/foo, then when the container-auditor runs, an exception
that foo/containers is not a directory is seen in the logs.
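As a rough illustration (not the exact patch), the fix amounts to a guard
like this when walking the devices directory:

    import os

    def device_dirs(devices_path):
        # Skip stray files such as /srv/1/node/foo; only real device
        # directories should be audited.
        for entry in os.listdir(devices_path):
            path = os.path.join(devices_path, entry)
            if os.path.isdir(path):
                yield path
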
This patch is essentially clayg's and can be found in the bug report.
I tested it and wanted to get a feel for the OpenStack workflow, so I am
going through the commit process.
I have added a unit test as well as cleaned up and improved the unit test
coverage for this module.
- unit test for above fix is added
- unit test to verify exceptions that are raised in the module
- unit test to verify the logger's behavior
- unit test to verify mount_check behavior
Change-Id: I903b2b1e11646404cfb0551ee582a514d008c844
Closes-Bug: #1317257
The backend HTTP servers emit StatsD metrics of the form
<server>.<method>.timing and <server>.<method>.errors.timing
(e.g. object-server.GET.timing). Whether something counts as an error
or not is based on the response's HTTP status code.
Prior to this commit, "success" was 2xx, 3xx, or 404, while everything
else was considered "error". This adds 412 and 416 to the "success"
set. Like 404, these status codes indicate that we got the request and
processed it without error, but the response was "no". They shouldn't
be in the same category as statuses like 507 that indicate something
stopped the request from being processed.
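Roughly, the classification now amounts to this (a sketch, not the actual
decorator Swift uses):

    def counts_as_success(status):
        # 2xx/3xx, plus statuses that mean "handled fine, but the answer is no"
        return 200 <= status < 400 or status in (404, 412, 416)

    def timing_metric(server, method, status):
        if counts_as_success(status):
            return '%s.%s.timing' % (server, method)
        return '%s.%s.errors.timing' % (server, method)

    timing_metric('object-server', 'GET', 416)   # 'object-server.GET.timing'
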
Change-Id: I5582a51cf6f64aa22c890da01aaaa077f3a54202
The normalized form of the X-Timestamp header looks like a float with a fixed
width to ensure stable string sorting - normalized timestamps look like
"1402464677.04188"
To support overwrites of existing data without modifying the original
timestamp, while still maintaining consistency, a second internal offset
vector is appended to the normalized timestamp form; it compares and sorts
greater than the fixed-width float format but less than a newer timestamp.
The internalized format of timestamps looks like
"1402464677.04188_0000000000000000" - the portion after the underscore is
the offset, formatted as a hexadecimal integer.
The internalized form is not exposed to clients in responses from Swift.
Normal client operations will not create a timestamp with an offset.
The Timestamp class in common.utils supports internalized and normalized
formatting of timestamps and also comparison of timestamp values. When the
offset value of a Timestamp is 0 - it's considered insignificant and need not
be represented in the string format; to support backwards compatibility during
a Swift upgrade the internalized and normalized forms of a Timestamp with an
insignificant offset are identical. When a timestamp includes an offset it
will always be represented in the internalized form, but is still excluded
from the normalized form. Timestamps with an equivalent timestamp portion
(the float part) will compare and order by their offset. Timestamps with a
greater timestamp portion will always compare and order greater than a
Timestamp with a lesser timestamp regardless of its offset. String
comparison and ordering is guaranteed for the internalized string format, and
is backwards compatible for normalized timestamps which do not include an
offset.
The reconciler currently uses an offset bump to ensure that objects which
moved to the wrong storage policy can be moved back. This use-case is valid
because the content represented by the user-facing timestamp is not modified
in any way.
Future consumers of the offset vector of timestamps should be mindful of HTTP
semantics of If-Modified-Since and take care to avoid deviation in the response from
the object server without an accompanying change to the user facing timestamp.
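A quick illustration of the formatting and comparison rules described above
(the constructor and attribute names follow this description but are written
from memory, so treat them as a sketch):

    from swift.common.utils import Timestamp

    t = Timestamp(1402464677.04188)
    t.normal                       # '1402464677.04188'
    t.internal                     # same as t.normal: offset 0 is insignificant

    t_off = Timestamp(1402464677.04188, offset=1)
    t_off.internal                 # '1402464677.04188_0000000000000001'
    t_off > t                      # True: equal float part, greater offset
    Timestamp(1402464678) > t_off  # True: a newer float part always wins
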
DocImpact
Implements: blueprint storage-policies
Change-Id: Id85c960b126ec919a481dc62469bf172b7fb8549
This decorator will memoize a function using a fixed-size cache that evicts
the oldest entries. It also supports a maxtime parameter to configure a
"time-to-live" for entries in the cache.
The reconciler code uses this to cache computations of the correct storage
policy index for a container for 30 seconds.
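A minimal sketch of that kind of decorator (not the actual implementation),
assuming a maxsize bound on entries alongside the maxtime TTL:

    import time
    from collections import OrderedDict
    from functools import wraps

    def memoize(maxsize=1000, maxtime=30):
        def decorator(func):
            cache = OrderedDict()  # maps args -> (insert_time, value)

            @wraps(func)
            def wrapper(*args):
                now = time.time()
                if args in cache:
                    ts, value = cache[args]
                    if now - ts < maxtime:
                        return value    # still fresh
                    del cache[args]     # expired; recompute below
                value = func(*args)
                cache[args] = (now, value)
                while len(cache) > maxsize:
                    cache.popitem(last=False)  # evict the oldest entry
                return value
            return wrapper
        return decorator

The reconciler would then decorate its policy-index lookup with something
like @memoize(maxtime=30).
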
DocImpact
Implements: blueprint storage-policies
Change-Id: I0f220869e33c461a4100b21c6324ad725da864fa
This daemon will take objects that are in the wrong storage policy and
move them to the right ones, or delete requests that went to the wrong
storage policy and apply them to the right ones. It operates on a
queue similar to the object-expirer's queue.
Discovering that the object is in the wrong policy will be done in
subsequent commits by the container replicator; this is the daemon
that handles them once they happen.
Like the object expirer, you only need to run one of these per cluster;
see etc/container-reconciler.conf.
DocImpact
Implements: blueprint storage-policies
Change-Id: I5ea62eb77ddcbc7cfebf903429f2ee4c098771c9
Log lines can get quite large, as we previously noticed with rsync error
log lines. We added a setting to cap those, but it really looks like we
should have just done this overall limit. We noticed the issue when we
switched to UDP syslogging and it would occasionally blow past the 16436
lo MTU! This causes Python's logging code to get an error and hilarity
ensues.
Change-Id: I44bdbe68babd58da58c14360379e8fef8a6b75f7
Container sync had a bug where it'd send out the trailing
"; swift_bytes=xxx" part of the content-type header. That trailing part
is just for internal cluster usage by SLO. Since that needed to be
stripped in two places now, I separated it out to a function that both
spots call.
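The shared helper boils down to something like this (the name is
illustrative, not necessarily the one used in the patch):

    def strip_swift_bytes(content_type):
        # 'application/foo; swift_bytes=1048576' -> 'application/foo'
        if content_type and '; swift_bytes=' in content_type:
            content_type = content_type.rsplit('; swift_bytes=', 1)[0]
        return content_type
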
Change-Id: Ibd6035d7a6b78205344bcc9d98bc1b7a9d463427
This allows an easier and more explicit way to tell swift-init to run on
specific servers. For example with an SAIO, this allows you to do
something like:
swift-init object-server.1 reload
to reload just the 1st object server. A more real world example is when
you are running separate servers for replication. In this example you
might have an object-server/public.conf and
object-server/replication.conf. With this change you can do something
like:
swift-init object-server.replication reload
to just reload the replication server.
DocImpact
Change-Id: I5c6046b5ee28e17dadfc5fc53d1d872d9bb8fe48
As seen on #1174809, changes use of mutable types as default
arguments and defaults them within the method. Otherwise, those
defaults can be unexpectedly persisted with the function between
invocations and erupt into mass hysteria on the streets.
There was indeed a test (TestSimpleClient.test_get_with_retries)
that was erroneously relying on this behavior. Since previous tests
had populated their own instantiations with a token, this test only
passed because the modified headers dict from previous tests was
being carried over. As expected, with the mutable defaults fix in
SimpleClient, this test began to fail since it never specified any
token, yet it had always passed before. This change also now provides
the expected token.
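For reference, the general shape of the bug and its fix (illustrative code,
not the SimpleClient internals):

    # Buggy: the default dict is created once, at definition time, and is
    # shared by every call that relies on it.
    def add_token(value, headers={}):
        headers['X-Auth-Token'] = value
        return headers

    # Fixed: default to None and build a fresh dict per call.
    def add_token_fixed(value, headers=None):
        if headers is None:
            headers = {}
        headers['X-Auth-Token'] = value
        return headers

    add_token('a') is add_token('b')              # True  - same leaked dict
    add_token_fixed('a') is add_token_fixed('b')  # False - independent dicts
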
Change-Id: If95f11d259008517dab511e88acfe9731e5a99b5
Related-Bug: #1174809
The behavior of common.utils.cache_from_env
was changed by https://review.openstack.org/#/c/89488/.
This patch adds a unit test for that function.
Change-Id: If757e12990c971325f7705731ef529a7e2a9eee7
Make account, object, and container servers construct log lines using the
same utility function so they will produce identically formatted lines.
This change reorders the fields logged for the account server.
This change also adds the "additional info" field to the two servers that
didn't log that field. This makes the log lines identical across all 3
servers. If people don't like that, I can take that out. I think it makes
the documentation, parsing of the log lines, and the code a tad cleaner.
DocImpact
Change-Id: I268dc0df9dd07afa5382592a28ea37b96c6c2f44
Closes-Bug: 1280955
We mock out time.time(), time.sleep() and eventlet.sleep() so that we
avoid test problems caused by exceedingly long delays during the
execution of the test.
We also make sure to convert the units used in the tests to
milliseconds for a bit more clarity.
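For reference, the patching pattern looks roughly like this (a sketch of the
technique, not the actual test code); eventlet.sleep() gets the same
treatment:

    import time
    import unittest
    from unittest import mock   # "import mock" on older Pythons


    class TestWithFakeClock(unittest.TestCase):
        def test_no_real_sleeping(self):
            fake_now = [1000.0]

            def fake_sleep(secs):
                fake_now[0] += secs   # advance the fake clock instantly

            def slow_operation():
                time.sleep(5)         # stand-in for code that would block
                return time.time()

            with mock.patch('time.time', side_effect=lambda: fake_now[0]), \
                    mock.patch('time.sleep', side_effect=fake_sleep):
                self.assertEqual(slow_operation(), 1005.0)
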
Closes bug: 1298154
Change-Id: I803d06cbf205a02a4f7bb1e0c467d276632cd6a3
Some simple code movement to move the utils.ratelimit_sleep() unit
tests together so that they can be viewed all at once.
We also add some comments to document the behavior of
utils.ratelimit_sleep(); small modification to max_rate parameter
checking to match intended use.
Change-Id: I3b11acfb6634d16a4b3594dba8dbc7a2d3ee8d1a
In object audit "once" mode we are allowing the user to specify
a sub-set of devices to audit using the "--devices" command-line
option. The sub-set is specified as a comma-separated list. This
patch is taken from a larger patch to enable parallel processing
in the object auditor.
We've had to modify recon so that it will work properly with this
change to "once" mode. We've modified dump_recon_cache()
so that it will store nested dictionaries; in other words, it will
store a recon cache entry such as {'key1': {'key2': {...}}}. When
the object auditor is run in "once" mode with "--devices" set the
object_auditor_stats_ALL and ZBF entries look like:
{'object_auditor_stats_ALL': {'disk1disk2..diskn': {...}}}. When
swift-recon is run, it hunts through the nested dicts to find the
appropriate entries. The object auditor recon cache entries are set
to {} at the beginning of each audit cycle, and individual disk
entries are cleared from cache at the end of each disk's audit cycle.
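Conceptually, the hunt swift-recon now does resembles this (the stat key
names are illustrative):

    def collect_auditor_stats(recon_entry):
        # e.g. {'object_auditor_stats_ALL': {'disk1disk2': {'passes': 5}}}
        # Walk the nested dicts and collect the leaf stats dicts, whatever
        # the per-device-set keys happen to be.
        stat_keys = ('audit_time', 'passes', 'errors', 'quarantined')
        found = []
        if isinstance(recon_entry, dict):
            if any(k in recon_entry for k in stat_keys):
                found.append(recon_entry)
            else:
                for value in recon_entry.values():
                    found.extend(collect_auditor_stats(value))
        return found
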
DocImpact
Change-Id: Icc53dac0a8136f1b2f61d5e08baf7b4fd87c8123
setgid provides the primary group, setgroups sets the secondary
groups. Prior to this patch, we would do a setgroups with an empty
list, effectively wiping secondary groups. We now verify which
secondary groups the user is a member of and escalate the privileges
accordingly.
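A sketch of the intended behavior (not the exact code in the patch):

    import grp
    import os
    import pwd

    def drop_privileges(username):
        user = pwd.getpwnam(username)
        # Keep the secondary groups the user actually belongs to instead of
        # wiping them with an empty setgroups() call.
        groups = [g.gr_gid for g in grp.getgrall() if username in g.gr_mem]
        os.setgroups(groups)
        os.setgid(user.pw_gid)
        os.setuid(user.pw_uid)
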
Change-Id: I33a10edd448b3ac5aa758a8d1d70e582cf421c7d
Closes-Bug: 1269473
The changes from using os.path.ismount to using
swift.common.utils.ismount have caused problems since the new one
raises exceptions in cases where the old one did not. Daemons have
been encountering this and exiting; servers have been 500ing instead
of 507ing in this case, changing handoff behaviors, etc.
Since the new one was specifically written and tested for this new
behavior, I left that original function as ismount_raw and made
ismount do what it did before.
If there really isn't some reason for this new behavior, I'll be glad
to get rid of ismount_raw and just keep ismount. I couldn't see any
reason for the new behavior myself.
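Roughly, the compatibility shim looks like this (a sketch of the idea):

    import os

    def ismount_raw(path):
        # Strict variant: lets OSError (missing path, permission problems,
        # and so on) propagate to the caller.
        if os.path.islink(path):
            return False
        s1 = os.lstat(path)
        s2 = os.lstat(os.path.join(path, '..'))
        if s1.st_dev != s2.st_dev:
            return True   # parent is on a different device: a mount point
        if s1.st_ino == s2.st_ino:
            return True   # same inode as parent: root of a filesystem
        return False

    def ismount(path):
        # Old, forgiving behavior: any error just means "not mounted".
        try:
            return ismount_raw(path)
        except OSError:
            return False
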
Change-Id: I2b5b17f9ed9656cd8804a5ed568170697d0b183d
This way, with zero additional effort, SLO will support enhancements
to object storage and retrieval, such as:
* automatic resume of GETs on broken connection (today)
* storage policies (in the near future)
* erasure-coded object segments (in the far future)
This also lets SLOs work with other sorts of hypothetical third-party
middleware, for example object compression or encryption.
Getting COPY to work here is sort of a hack; the proxy's object
controller now checks for "swift.copy_response_hook" in the request's
environment and feeds the GET response (the source of the new object's
data) through it. This lets a COPY of a SLO manifest actually combine
the segments instead of merely copying the manifest document.
Updated ObjectController to expect a response's app_iter to be an
iterable, not just an iterator. (PEP 333 says "When called by the
server, the application object must return an iterable yielding zero
or more strings." ObjectController was just being too strict.) This
way, SLO can re-use the same response-generation logic for GET and
COPY requests.
Added a (sort of hokey) mechanism to allow middlewares to close
incompletely-consumed app iterators without triggering a warning. SLO
does this when it realizes it's performed a ranged GET on a manifest;
it closes the iterable, removes the range, and retries the
request. Without this change, the proxy logs would get 'Client
disconnected on read' in them.
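Sketched very loosely, the hand-off in the object controller amounts to this
(the hook signature here is an assumption, not the exact contract):

    def apply_copy_hook(env, source_req, source_resp):
        # If a middleware such as SLO registered a hook, let it transform
        # the GET response before it is used as the COPY source; otherwise
        # pass the response through unchanged.
        hook = env.get('swift.copy_response_hook', lambda req, resp: resp)
        return hook(source_req, source_resp)
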
DocImpact
blueprint multi-ring-large-objects
Change-Id: Ic11662eb5c7176fbf422a6fc87a569928d6f85a1
Summary of the new configuration option:
The cluster operators add the container_sync middleware to their
proxy pipeline, create a container-sync-realms.conf for their
cluster, and copy it out to all their proxy and container servers.
This file specifies the available container sync "realms".
A container sync realm is a group of clusters with a shared key that
have agreed to provide container syncing to one another.
The end user can then set the X-Container-Sync-To value on a
container to //realm/cluster/account/container instead of the
previously required URL.
The allowed hosts list is not used with this configuration and
instead every container sync request sent is signed using the realm
key and user key.
This offers better security, as source hosts can be faked much more
easily than per-request signatures can. Replaying signed requests,
assuming it could easily be done, shouldn't be an issue as the
X-Timestamp is part of the signature and so would just short-circuit
as already current or as superseded.
This also makes configuration easier for the end user, especially
with difficult networking situations where a different host might
need to be used for the container sync daemon since it's connecting
from within a cluster. With this new configuration option, the end
user just specifies the realm and cluster names and that is resolved
to the proper endpoint configured by the operator. If the operator
changes their configuration (key or endpoint), the end user does not
need to change theirs.
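An illustrative container-sync-realms.conf (realm, cluster and key names are
placeholders):

    [realm1]
    key = realm1key
    key2 = realm1key2
    cluster_clustera = https://host1/v1/
    cluster_clusterb = https://host2/v1/

An end user on clustera could then set X-Container-Sync-To to
//realm1/clusterb/AUTH_account/container.
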
DocImpact
Change-Id: Ie1704990b66d0434e4991e26ed1da8b08cb05a37
Calling get_logger({}) instantiates a logging.handlers.SyslogHandler,
which opens and keeps a socket around (either /dev/log or UDP or
whatever; not important).
Under Python 2.6, all logging handlers instantiated anywhere at all
will live for the entire lifetime of the program; they get stored in
logging._handlerList and logging._handlers. Python 2.7 is very
similar, but uses weakrefs instead of strong references in those
module-level variables, so logging handlers can actually get cleaned
up prior to program exit.
The net effect is that any program that calls get_logger() more than a
fixed number of times will leak file descriptors under Python 2.6.
This commit throws encapsulation out the window and, under 2.6 only,
replaces strong references with weakrefs in logging._handlerList and
logging._handlers, thus avoiding the leak.
Change-Id: I5dc0d1619c5a4500f892b898afd9e3668ec0ee7c
Now the traceback goes all the way down to where the exception came
from, not just down to run_in_thread. Better for debugging.
Change-Id: Iac6acb843a6ecf51ea2672a563d80fa43d731f23
The early quorum change has maybe added a little bit too much
eventual to the consistency of requests in Swift, and users can
sometimes get unexpected results.
This change gives us a knob to turn in finding the right balance,
by adding a timeout where pending requests can finish after quorum
is achieved.
Change-Id: Ife91aaa8653e75b01313bbcf19072181739e932c
Swift can now optionally be configured to allow requests to '/info',
providing information about the swift cluster. Additionally, HMAC-signed
requests to '/info?swiftinfo_sig=<sign>&swiftinfo_expires=<expires>' can be
configured, allowing privileged access to more sensitive information not
meant to be public.
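For illustration, a client might compute the signature along these lines
(the exact message layout Swift expects is an assumption here; see the /info
documentation for the authoritative format):

    import hmac
    from hashlib import sha1
    from time import time

    key = 'secret-admin-key'        # operator-configured key (placeholder)
    method, path = 'GET', '/info'
    expires = int(time() + 60)
    # Assumed layout: method, expiry and path joined by newlines.
    msg = '%s\n%s\n%s' % (method, expires, path)
    sig = hmac.new(key.encode(), msg.encode(), sha1).hexdigest()
    url = '/info?swiftinfo_sig=%s&swiftinfo_expires=%s' % (sig, expires)
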
DocImpact
Change-Id: I2379360fbfe3d9e9e8b25f1dc34517d199574495
Implements: blueprint capabilities
Closes-Bug: #1245694
New replication_one_per_device option (True by default)
that restricts incoming REPLICATION requests to
one per device, replication_concurrency allowing.
Also has replication_lock_timeout (15 by default)
to control how long a request will wait to obtain
a replication device lock before giving up.
This should be very useful in that you can be
assured any concurrent REPLICATION requests are
each writing to distinct devices. If you have 100
devices on a server, you can set
replication_concurrency to 100 and be confident
that, even if 100 replication requests were
executing concurrently, they'd each be writing to
separate devices. Before, all 100 could end up
writing to the same device, bringing it to a
horrible crawl.
NOTE: This is only for ssync replication. The
current default rsync replication still has the
potentially horrible behavior.
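For example, the new options might look like this in object-server.conf (the
section placement shown is an assumption):

    [object-server]
    replication_concurrency = 100
    replication_one_per_device = true
    replication_lock_timeout = 15
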
Change-Id: I36e99a3d7e100699c76db6d3a4846514537ff685
Allow the proxy to respond to many types of requests as soon as it has a
quorum. This can help speed up responses (without changing the results),
especially when one node is acting up.
I had to fix a few unit tests that no longer match the backend http requests
made by our proxy.
Change-Id: Ieb070dc3019e217e717b96154a7a809409bf40a5
This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32
Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62
Signed-off-by: Peter Portante <peter.portante@redhat.com>
assertTrue accepts a parameter msg which will be printed when
assertion fails; usually msg is a str. This patch fixes unsuitable
usages of assertTrue which set msg to the bool value True.
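For illustration:

    import unittest

    class ExampleTest(unittest.TestCase):
        def test_status(self):
            status = 200
            # Unsuitable: the second argument is the failure *message*, so
            # passing True tells you nothing when the assertion fails.
            self.assertTrue(status == 200, True)
            # Better: pass a descriptive message (or none at all).
            self.assertTrue(status == 200, 'unexpected status %s' % status)
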
Change-Id: I731f8ea553c935eba0e112ffded16f41a5ea86c0
Fixes-Bug: #1226374
Place all the methods related to on-disk layout and / or configuration
into a new common module that can be shared by the various modules
using the same on-disk layout.
Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
"except x,y:" was deprecated and is removed in Python 3.x.
Use "except x as y:" instead, which works in any Python
version >= 2.6.
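For illustration:

    try:
        int('not a number')
    # Old spelling, removed in Python 3:
    #     except ValueError, err:
    except ValueError as err:   # works on Python >= 2.6 and on Python 3
        print(err)
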
Change-Id: I7008c74b807340f3457d3a0c8bd0b83f23169d14