87 Commits

Author SHA1 Message Date
Samuel Merritt
a69789fa06 Allow pre-1970 dates in If-[Un]Modified-Since
If I want to fetch an object only if it is newer than the first moon
landing, I send a GET request with header:

    If-Modified-Since: Sun, 20 Jul 1969 20:18:00 UTC

Since that date is older than Swift, I expect a 2xx response. However,
I get a 412, which isn't even a valid thing to do for
If-Modified-Since; it should either be 2xx or 304. This is because of
two problems:

(a) Swift treats pre-1970 dates as invalid, and

(b) Swift returns 412 when a date is invalid instead of ignoring it.

This commit makes it so any time between datetime.datetime.min and
datetime.datetime.max is an acceptable value for If-Modified-Since and
If-Unmodified-Since. Dates outside that date range are treated as
invalid headers and thus are ignored, as RFC 2616 section 14.28
requires ("If the specified date is invalid, the header is ignored").

This only works for dates that the Python standard library can parse,
which on my machine is 01 Jan 1 to 31 Dec 9999. Eliminating those
restrictions would require implementing our own date parsing and
comparison, and that's almost certainly not worth it.
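
The "ignore an invalid date" rule described above can be sketched as
follows. This is only an illustration, not the code from this commit:
parse_http_date and if_modified_since_status are hypothetical names, and
the parsing relies on email.utils.parsedate_to_datetime from the Python 3
standard library.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_http_date(value):
    """Return an aware datetime, or None if the header cannot be parsed."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError, OverflowError):
        # Unparseable or out-of-range dates count as "invalid header".
        return None
    if parsed.tzinfo is None:
        # Assumption: treat a date without a usable zone as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def if_modified_since_status(header_value, last_modified):
    """Return 200 or 304; an invalid header is ignored, never a 412."""
    since = parse_http_date(header_value)
    if since is None:
        return 200
    return 200 if last_modified > since else 304
```

Here last_modified is assumed to be a timezone-aware datetime, so that it
can be compared with the parsed header value.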

Change-Id: I4cb4903c4e5e3b6b3c9506c2cabbfbda62e82f35
2014-03-14 17:55:42 -07:00
madhuri
fc499f3092 Added a test for empty metadata
Adds a test that sets empty object metadata and checks its value
in the response headers.

Change-Id: I460302661d150364a95bcd7f0ebbbc2a1e95507a
2014-03-07 10:31:24 +05:30
Samuel Merritt
09ef06fd99 Convert all old-style classes to new-style
This cleanup has been slowly happening for a while; let's finish it.

Change-Id: I1561e3540d524834e0cc5bc725ab80936eae1f0e
2014-03-03 17:28:48 -08:00
gholt
e82d40da46 Object server PUTs should respect client_timeout
It seems the object server never respected the client_timeout value
since the beginning of Swift. This is normally fine since the proxy does
and will normally hang up on the backends. But if the proxy has a bug or
if there's network issues or whatever, the object server should be smart
enough to enforce this timeout as well.

Our operations guys noticed this problem when older processes would
never exit after a reload. They started investigating and saw that the
object server had open tmp files that hadn't been touched in quite some
time. Sometimes the tmp files didn't even exist anymore since the
reclaimer deletes really old untouched tmp files.

Change-Id: Iba0397203a2dccca4a28a8c8cbfc5a60e429837a
2014-02-28 17:10:03 +00:00
Samuel Merritt
1f67eb7403 Support If-[None-]Match for object HEAD, SLO, and DLO
I moved the checking of If-Match and If-None-Match out of the object
server's GET method and into swob so that everyone can use it. The
interface is similar to the Range handling; make a response with
conditional_response=True, and you get handling of If-Match and
If-None-Match.

Since the only users of conditional_response are object GET, object
HEAD, SLO, and DLO, this has the effect of adding support for If-Match
and If-None-Match to just the latter three places and nowhere
else. This makes object GET and HEAD consistent for any kind of
object, large or small.

This also fixes a bug where various conditional headers (If-*) were
passed through to the object server on segment requests, which could
cause segment requests to fail with a 304 or 412 response. Now only
certain headers are copied to the segment requests, and that doesn't
include the conditional ones, so they can't goof up the segment
retrieval.

Note that I moved SegmentedIterable to swift.common.request_helpers
because it sprouted a transitive dependency on swob, and leaving it in
utils caused a circular import.

Bonus fix: unified the handling of DiskFileQuarantined and
DiskFileNotFound in object server GET and HEAD. Now in either case, a
412 will be returned if the client said "If-Match: *". If not, the
response is a 404, just like before.
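
The status codes described above can be sketched as a small decision
function. This is a simplified illustration and not swob's actual
implementation; conditional_status and _parse_etag_list are hypothetical
names, and an etag of None stands for "object not found or quarantined".

```python
def _parse_etag_list(value):
    """Split a comma-separated If-Match / If-None-Match value into a set."""
    return {item.strip().strip('"') for item in value.split(',')
            if item.strip()}


def conditional_status(etag, if_match=None, if_none_match=None):
    """Return the status a GET or HEAD would get for the given headers."""
    if if_match is not None:
        wanted = _parse_etag_list(if_match)
        # "If-Match: *" requires that the object exists at all.
        if etag is None or ('*' not in wanted and etag not in wanted):
            return 412
    if etag is None:
        return 404
    if if_none_match is not None:
        unwanted = _parse_etag_list(if_none_match)
        if '*' in unwanted or etag in unwanted:
            return 304
    return 200
```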

Closes-Bug: 1279076
Closes-Bug: 1280022
Closes-Bug: 1280028

Change-Id: Id2ee78346244d516b980202e990aa38ce6812de5
2014-02-20 14:54:26 -08:00
Samuel Merritt
6acea29fa6 Move all DLO functionality to middleware
This is for the same reason that SLO got pulled into middleware, which
includes stuff like automatic retry of GETs on broken connection and
the multi-ring storage policy stuff.

The proxy will automatically insert the dlo middleware at an
appropriate place in the pipeline the same way it does with the
gatekeeper middleware. Clusters will still support DLOs after upgrade
even with an old config file that doesn't mention dlo at all.

Includes support for reading config values from the proxy server's
config section so that upgraded clusters continue to work as before.

Bonus fix: resolve 'after' vs. 'after_fn' in proxy's required filters
list. Having two was confusing, so I kept the more-general one.

DocImpact

blueprint multi-ring-large-objects

Change-Id: Ib3b3830c246816dd549fc74be98b4bc651e7bace
2014-02-03 18:29:48 -08:00
gholt
b45e9a97ee Skip delete_at_update for replication requests
Requests through the object server that are from backend replication
should not send x-delete-at updates to the .expiring_objects account.
Replication is just moving data around or making new replicas, not
creating new data from nothing.

Change-Id: I324864face3ff559822c7a50c50e675e8b889b48
2014-01-23 01:05:16 +00:00
gholt
a3f2400cba Consolidating and standardizing x-delete-at format
Change-Id: Idc916da1c7fe1cc43a2c26f7f7ee1d4fcdd52c89
2014-01-14 15:40:35 +00:00
Jenkins
7f456ef35f Merge "change the last-modified header value with valid one" 2013-12-20 00:57:36 +00:00
Kiyoung Jung
d69e013519 change the last-modified header value with valid one
The Last-Modified header in the response did not have a suitable
value: it was the integer part of the object's timestamp.
As a result, an If-[Un]Modified-Since header carrying the value
taken from Last-Modified was always earlier than the timestamp,
so the content was always considered newer than the value of these
conditional headers.
The patched code returns math.ceil() of the object's timestamp
in the Last-Modified header so that later conditional requests work
correctly.
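
The effect of truncating versus rounding up can be seen in a few lines.
The timestamp value and the is_modified_since helper are made up for
illustration; the helper only mirrors the comparison described above.

```python
import math


def is_modified_since(object_timestamp, since_seconds):
    """The object counts as modified if it is newer than the header date."""
    return object_timestamp > since_seconds


timestamp = 1387445477.58231           # hypothetical object timestamp

truncated = int(timestamp)             # old Last-Modified value
rounded_up = math.ceil(timestamp)      # new Last-Modified value

# Echoing the old value back always looks "modified"; the new one does not.
old_behavior = is_modified_since(timestamp, truncated)
new_behavior = is_modified_since(timestamp, rounded_up)
```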

Closes-Bug: #1248818
Change-Id: I1ece7d008551bf989da74d23f0ed6307c45c5436
2013-12-19 09:31:17 +00:00
Peter Portante
1bb6563a19 Handle non-integer values for if-delete-at
If a client passes us a non-integer value for if-delete-at we'll now
properly report a 400 error instead of a 503.

Closes-Bug: 1259300
Change-Id: I8bb0bb9aa158d415d4f491b5802048f0cd4d8ef6
2013-12-14 22:28:56 -05:00
Peter Portante
d26e8b25a7 Bring obj server unit tests to > 98%
This set of changes attempts to bring the unit test coverage to over
98% for the object server module.

Two changes to the object server are made with this patch:

1. The try/except block around diskfile.write_metadata() was removed
   at the end of the POST method

The write_metadata() method of the DiskFile module does not raise
either the DiskFileNotExist or DiskFileQuarantined exceptions on that
code path.

2. The conditional container_update() call was removed at the end of
   the PUT method

The container_update() call is performed when a new object is created
or when an existing object is updated. Since we already report old
timestamps as 409s (Conflict), we always perform the update.

We also fix an existing test to clear the hash prefix so that it can
actually detect the async pending pickle file creation for a failure
mode.

Change-Id: I71ec9dcf7c0ac86e56aa0f06993d501fdfa22d5b
2013-12-11 17:18:13 -05:00
Clay Gerrard
74b51c9c06 fix expired object deletion
fixes bug #1257330

Change-Id: I49f645abdeba97eafb3ae42ef9e3684c912cfdc6
2013-12-04 23:09:02 -08:00
Alex Gaynor
87cd559847 Account for a platform difference in semaphores
On OS X (and probably other Operating Systems) it isn't possible to
introspect the value of a semaphore. Account for this by skipping a
test about this.

Change-Id: I97824f9fc4e36de4f7a62c8ce53865e6977dfdfe
2013-11-27 14:34:06 -06:00
Clay Gerrard
9e80fd45a0 Add a DebugLogger for wsgi server tests
Change-Id: Ifd2528be443ba3879bf4921f6c5f4ef31f29044b
2013-11-21 01:35:58 -08:00
gholt
a80c720af5 Object replication ssync (an rsync alternative)
For this commit, ssync is just a direct replacement for how
we use rsync. Assuming we switch over to ssync completely
someday and drop rsync, we will then be able to improve the
algorithms even further (removing local objects as we
successfully transfer each one rather than waiting for whole
partitions, using an index.db with hash-trees, etc., etc.)

For easier review, this commit can be thought of in distinct
parts:

1)  New global_conf_callback functionality for allowing
    services to perform setup code before workers, etc. are
    launched. (This is then used by ssync in the object
    server to create a cross-worker semaphore to restrict
    concurrent incoming replication.)

2)  A bit of shifting of items up from object server and
    replicator to diskfile or DEFAULT conf sections for
    better sharing of the same settings. conn_timeout,
    node_timeout, client_timeout, network_chunk_size,
    disk_chunk_size.

3)  Modifications to the object server and replicator to
    optionally use ssync in place of rsync. This is done in
    a generic enough way that switching to FutureSync should
    be easy someday.

4)  The biggest part, and (at least for now) completely
    optional part, are the new ssync_sender and
    ssync_receiver files. Nice and isolated for easier
    testing and visibility into test coverage, etc.
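
The idea in part 1 of limiting concurrent incoming replication with a
semaphore can be sketched like this. The commit describes a cross-worker
semaphore; this sketch uses threading.BoundedSemaphore inside one process
instead, and ReplicationLimiter is a hypothetical name.

```python
import threading
from contextlib import contextmanager


class ReplicationLimiter:
    """Hand out at most `concurrency` replication slots at a time."""

    def __init__(self, concurrency):
        self._sem = threading.BoundedSemaphore(concurrency)

    @contextmanager
    def slot(self):
        # Never block: a busy server should refuse rather than queue.
        acquired = self._sem.acquire(blocking=False)
        try:
            yield acquired
        finally:
            if acquired:
                self._sem.release()
```

A caller would check the yielded flag and refuse the request (for example
with a 503) when no slot is available.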

All the usual logging, statsd, recon, etc. instrumentation
is still there when using ssync, just as it is when using
rsync.

Beyond the essential error and exceptional condition
logging, I have not added any additional instrumentation at
this time. Unless there is something someone finds super
pressing to have added to the logging, I think such
additions would be better as separate change reviews.

FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
CLUSTERS. Some of us will be using it in a limited fashion to look
for any subtle issues, tuning, etc. but generally ssync is
an experimental feature. In its current implementation it is
probably going to be a bit slower than rsync, but if all
goes according to plan it will end up much faster.

There are no comparisons yet between ssync and rsync other
than some raw virtual machine testing I've done to show it
should compete well enough once we can put it in use in the
real world.

If you Tweet, Google+, or whatever, be sure to indicate it's
experimental. It'd be best to keep it out of deployment
guides, howtos, etc. until we all figure out if we like it,
find it to be stable, etc.

Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
2013-11-07 16:52:01 +00:00
Peter Portante
5202b0e586 DiskFile API, with reference implementation
Refactor on-disk knowledge out of the object server by pushing the
async update pickle creation to the new DiskFileManager class (name is
not the best, so suggestions welcome), along with the REPLICATOR
method logic. We also move the mount checking and thread pool storage
to the new ondisk.Devices object, which then also becomes the new home
of the audit_location_generator method.

For the object server, a new setup() method is now called at the end
of the controller's construction, and the _diskfile() method has been
renamed to get_diskfile(), to allow implementation specific behavior.

We then hide the need for the REST API layer to know how and where
quarantining needs to be performed. There are now two places it is
checked internally, on open() where we verify the content-length,
name, and x-timestamp metadata, and in the reader on close where the
etag metadata is checked if the entire file was read.

We add a reader class to allow implementations to isolate the WSGI
handling code for that specific environment (it is used no-where else
in the REST APIs). This simplifies the caller's code to just use a
"with" statement once open to avoid multiple points where close needs
to be called.
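
The "with" usage pattern described above can be sketched with a toy
in-memory class. ToyDiskFile and ToyReader are hypothetical stand-ins for
illustration only; they are not the DiskFile classes from this commit and
do not touch a disk.

```python
class ToyReader:
    """Iterates over the data in chunks and closes itself when done."""

    def __init__(self, data, chunk_size):
        self._data = data
        self._chunk_size = chunk_size
        self.closed = False

    def __iter__(self):
        try:
            for start in range(0, len(self._data), self._chunk_size):
                yield self._data[start:start + self._chunk_size]
        finally:
            self.close()

    def close(self):
        self.closed = True


class ToyDiskFile:
    """Metadata and data are only reachable between open() and exit."""

    def __init__(self, data, metadata):
        self._data = data
        self._metadata = metadata
        self._opened = False

    def open(self):
        self._opened = True
        return self

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self._opened = False
        return False

    def get_metadata(self):
        if not self._opened:
            raise RuntimeError('DiskFile is not open')
        return dict(self._metadata)

    def reader(self, chunk_size=4):
        if not self._opened:
            raise RuntimeError('DiskFile is not open')
        return ToyReader(self._data, chunk_size)
```

The caller then needs a single "with df.open():" block instead of several
explicit close() calls.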

For a full historical comparison, including the usage patterns see:
https://gist.github.com/portante/5488238

(as of master, 2b639f5, Merge
 "Fix 500 from account-quota     This Commit
 middleware")
--------------------------------+------------------------------------
                                 DiskFileManager(conf)

                                   Methods:
                                     .pickle_async_update()
                                     .get_diskfile()
                                     .get_hashes()

                                   Attributes:
                                     .devices
                                     .logger
                                     .disk_chunk_size
                                     .keep_cache_size
                                     .bytes_per_sync

DiskFile(a,c,o,keep_data_fp=)    DiskFile(a,c,o)

  Methods:                         Methods:
   *.__iter__()
    .close(verify_file=)
    .is_deleted()
    .is_expired()
    .quarantine()
    .get_data_file_size()
                                     .open()
                                     .read_metadata()
    .create()                        .create()
                                     .write_metadata()
    .delete()                        .delete()

  Attributes:                      Attributes:
    .quarantined_dir
    .keep_cache
    .metadata
                                *DiskFileReader()

                                   Methods:
                                     .__iter__()
                                     .close()

                                   Attributes:
                                    +.was_quarantined

DiskWriter()                     DiskFileWriter()

  Methods:                         Methods:
    .write()                         .write()
    .put()                           .put()

* Note that the DiskFile class   * Note that the DiskReader() object
  implements all the methods       returned by the
  necessary for a WSGI app         DiskFileOpened.reader() method
  iterator                         implements all the methods
                                   necessary for a WSGI app iterator

                                 + Note that if the auditor is
                                   refactored to not use the DiskFile
                                   class, see
                                   https://review.openstack.org/44787
                                   then we don't need the
                                   was_quarantined attribute

A reference "in-memory" object server implementation of a backend
DiskFile class is provided in swift/obj/mem_server.py and
swift/obj/mem_diskfile.py.

One can also reference
https://github.com/portante/gluster-swift/commits/diskfile for the
proposed integration with the gluster-swift code based on these
changes.

Change-Id: I44e153fdb405a5743e9c05349008f94136764916
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-10-17 15:03:31 -04:00
Peter Portante
9411a24ba7 Revert "Refactor common/utils methods to common/ondisk"
This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32

Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-10-07 17:18:09 -04:00
ZhiQiang Fan
f72704fc82 Change OpenStack LLC to Foundation
Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58
Closes-bug: #1214176
2013-09-20 01:02:31 +08:00
Peter Portante
7760f41c3c Refactor common/utils methods to common/ondisk
Place all the methods related to on-disk layout and / or configuration
into a new common module that can be shared by the various modules
using the same on-disk layout.

Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-17 17:32:04 -04:00
Clay Gerrard
d51e873423 Remove keep_data_fp argument from DiskFile constructor
All access to the data_file fp for DiskFile is moved after the new "open"
method.  This prepares to move some additional smarts into DiskFile and reduce
the surface area of the abstraction and the exposure of the underlying
implementation in the object-server.

Future work:

 * Consolidate put_metadata to DiskWriter
 * Add public "update_metadata" method to DiskFile
 * Create DiskReader class to gather all access of methods under "open"

Change-Id: I4de2f265bf099a810c5f1c14b5278d89bd0b382d
2013-09-10 17:51:56 -07:00
Peter Portante
9d98070f7b Remove reference to 'file' built-in
Change-Id: Ie79e8ede393e92824fd906df1ff1933193c00943
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-06 13:44:09 -04:00
Peter Portante
22451b22cb Pep8 final two unit test modules and enforce (12 of 12)
We also fix up any other pep8 failures that snuck in from merges along
the way.

Change-Id: I4ea984780ac2eac458c98fe181684eef4e04beaf
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-04 23:35:46 -04:00
Peter Portante
c067abd21e Pep8 unit test modules for hacking and one liners (4 of 12)
Address all the "hacking" lines that are flagged, and all the modules
that just have one item flagged.

Change-Id: I372a4bdf9c7748f73e38c4fd55e5954f1afade5b
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-01 15:12:39 -04:00
Peter Portante
35b991aab1 Some how DELETE not using _parse_path()
It seems this conversion was missed, as a git blame says the first
few lines of DELETE date all the way back to commit 001407b9 "(creiht)
2010-07-12 Initial commit of Swift code".

While we were in here, we moved _parse_path() to a module method as
there appears to be no need to keep it as an object controller method.

We also fixed up as many of the tests that directly invoked the object
controller methods to use get_response(), addressing a few
inconsistencies along the way.

Change-Id: If491c7129d61d6fc7d81401fbc3650c29ed80465
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-08-16 17:20:36 -04:00
Peter Portante
6b9806e0e8 Fix handling of DELETE obj reqs with old timestamp
The DELETE object REST API was creating tombstone files with old
timestamps, potentially filling up the disk, as well as sending
container updates.

Here we now make DELETEs with a request timestamp return a 409 (HTTP
Conflict) if a data file exists with a newer timestamp, only creating
tombstones if they have a newer timestamp.

The key fix is to actually read the timestamp metadata from an
existing tombstone file (thanks to Pete Zaitcev for catching this),
and then only create tombstone files with newer timestamps.

We also prevent PUT and POST operations using old timestamps as well.
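
The timestamp rule can be sketched as a single comparison. delete_status
is a hypothetical helper, not the object server's code, and treating an
equal timestamp as a conflict is an assumption made for this sketch.

```python
def delete_status(request_timestamp, existing_timestamp):
    """Return (status, write_tombstone) for a DELETE request.

    existing_timestamp is the timestamp read from the newest data file
    or tombstone on disk, or None if nothing exists yet.
    """
    if (existing_timestamp is not None and
            existing_timestamp >= request_timestamp):
        # Stale request: report a conflict and leave the disk untouched.
        return 409, False
    # Newer request: a tombstone with the request timestamp is written.
    return 204, True
```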

Change-Id: I631957029d17c6578bca5779367df5144ba01fc9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-08-06 12:44:42 -04:00
Jenkins
3e82858a21 Merge "Catch swob responses that are raised." 2013-07-25 17:51:43 +00:00
Samuel Merritt
df7fc9658b Catch swob responses that are raised.
This lets us get rid of some really repetitive exception conversion
code, like everybody that called common.utils.get_param() had to catch
a UnicodeDecodeError and turn that into returning HTTPBadRequest. Now
get_param() just raises HTTPBadRequest directly, and the __call__
methods in the account/container/object servers catch and return
it. All that "except UnicodeDecodeError" stuff goes away.
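
The pattern of raising a response and catching it at the top level can be
sketched as follows. HTTPError, get_param and handle here are simplified,
hypothetical stand-ins and not the swob classes or the real get_param().

```python
class HTTPError(Exception):
    """A response that can be raised like an exception."""

    def __init__(self, status, body=''):
        super().__init__(body)
        self.status = status
        self.body = body


def get_param(params, name, default=None):
    """Return a text parameter, raising a 400 response on bad UTF-8."""
    value = params.get(name, default)
    if isinstance(value, bytes):
        try:
            value = value.decode('utf-8')
        except UnicodeDecodeError:
            raise HTTPError(400, 'Invalid UTF-8 in parameter %r' % name)
    return value


def handle(params):
    """Top-level entry point: the only place raised responses are caught."""
    try:
        marker = get_param(params, 'marker', '')
        return 200, marker
    except HTTPError as err:
        return err.status, err.body
```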

Refactored the path splitting and device validation in the object
server too.

There are other things that can benefit from this as well, but this
patch is big enough.

Change-Id: I2be96ef757d04bfd6af180cd9c92393c841db21f
2013-07-24 16:59:45 -07:00
Alex Gaynor
ff5a6d0111 Corrected many style violations in the tests.
I focused primarily on F-category violations; they are all but fixed with
this patch.

Change-Id: I343f6945c97984ed1093bc347b6def6994297041
2013-07-24 10:18:47 -07:00
Clay Gerrard
c9de9f2b8d Forklift the DiskFile interface into it's own module
* new module swift.obj.diskfile

I parameterized two constants from obj.server into the DiskFile's __init__

 * DATADIR -> obj_dir
 * DISALLOWED_HEADERS -> disallowed_metadata_keys

I'm not sure if this is the right long term abstraction but for now it avoids
circular imports.

Change-Id: I3962202c07c4b2fbfc26f9776c8a5c96292ae199
2013-07-18 08:00:14 -07:00
Jenkins
f805abb1fc Merge "Move replication allow method to decorators" 2013-07-16 19:27:59 +00:00
Peter Portante
bc99f58c76 Fix unit tests to properly marked deleted files
The unit tests were playing fast and loose with the tombstone marker,
where the test framework was setting up a DiskFile object which had
its data written to the .ts file, not the .data file. This behavior
did not reflect how the interfaces to DiskFile were supposed to
work.

Change-Id: Idd6e8882e062ba2e13489f14189223ab4158677c
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-07-15 21:07:08 +00:00
Vladimir Vechkanov
bc08215f83 Move replication allow method to decorators
Remove logic of allowed methods list from object, container and account
servers. Instead of it add replicator decorator to utils and use new
decorator for REPLICATE methods in object/account/container servers.
This decorator marks a method as special, for use only by
replication.

If the option replication_server is not used, then this mechanism is not
enabled. If the replication_server option is set (not None) then the
respective server is a replicator (option value is True) and should use
ONLY the methods marked for replication server using the decorator, or
it is a normal server type and should NOT use methods marked for the
replication server.
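
The rules above can be sketched with a marker decorator and a small check.
The names replication and method_allowed, and the toy controller, are
hypothetical; the real decorator and dispatch code may differ.

```python
def replication(func):
    """Mark a method as usable only by a replication server."""
    func.replication = True
    return func


def method_allowed(method, replication_server):
    """Apply the replication_server option to one controller method."""
    if replication_server is None:
        # Option not set: the mechanism is disabled, everything is allowed.
        return True
    is_replication_method = getattr(method, 'replication', False)
    # A replication server serves only marked methods; a normal server
    # serves only unmarked ones.
    return is_replication_method == bool(replication_server)


class ToyController:
    def GET(self):
        return 'object data'

    @replication
    def REPLICATE(self):
        return 'hashes'
```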

Change-Id: I1041b31413cd0c39000317cc57a8c27816e1dfe8
2013-07-12 11:38:17 +04:00
Jenkins
b545fbe2af Merge "Add UT for checking special chars in object name" 2013-06-13 06:22:44 +00:00
Jenkins
b63b5d590a Merge "Use threadpools in the object server for performance." 2013-06-11 22:47:07 +00:00
gholt
fef2afd927 Fixed Bug 1187200
See Bug 1187200 for a full description of the problem.

Part 1:

X-Delete-At-Container added to X-Delete-At-* info

This fixes the bug by passing the expiring-objects-account's
container name onward to the backend object servers. This is in case
the object servers' expiring_objects_container_divisor happens to be
different than the proxy server's, we want to make sure the host,
partition, and device match up with the container name. Different
container names would be fine, but not with mismatched host,
partition, and device info.

Part 2:

The db_replicator now double checks the disk path's partition against
the partition the ring gives back. If they don't match, it logs the
problem but continues to replicate the database to where it should be
and, on success to all proper nodes, removes the local out of place
database.

Bug 1187200

Change-Id: Id0873a3f2198ce285fe0b0c777738eff38bc2438
2013-06-08 20:00:32 +00:00
Kun Huang
eec98c15cd Add UT for checking special chars in object name
Adding special characters helps check for quote/unquote and encode/decode
problems, which are among the most common mistakes.

Change-Id: Ife1c0b481f08c1666d62b4fb51b7fdcdfdbf2ba6
2013-06-09 01:16:49 +08:00
Samuel Merritt
b491549ac2 Use threadpools in the object server for performance.
Without a (per-disk) threadpool, requests to a slow disk would affect
all clients by blocking the entire eventlet reactor on
read/write/etc. The slower the disk, the worse the performance. On an
object server, you frequently have at least one slow disk due to
auditing and replication activity sucking up all the available IO. By
kicking those blocking calls out to a separate OS thread, we let the
eventlet reactor make progress in other greenthreads, and by having a
per-disk pool, we ensure that one slow disk can't suck up all the
resources of an entire object server.

There were a few blocking calls that were done with eventlet.tpool,
but that's a fixed-size global threadpool, so I moved them to the
per-disk threadpools. If the object server is configured not to use
per-disk threadpools, (i.e. threads_per_disk = 0, which is the
default), those call sites will still ultimately end up using
eventlet.tpool.execute. You won't end up blocking a whole object
server while waiting for a huge fsync.

If you decide not to use threadpools, the only extra overhead should
be a few extra Python function calls here and there. This is
accomplished by setting threads_per_disk = 0 in the config.
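
The per-disk pool idea can be sketched with the standard library. The
commit works with eventlet; this sketch uses
concurrent.futures.ThreadPoolExecutor instead, and PerDiskPools is a
hypothetical name. With threads_per_disk = 0 the call simply runs inline
here, which is a simplification of the fallback described above.

```python
from concurrent.futures import ThreadPoolExecutor


class PerDiskPools:
    """Run blocking calls in a thread pool dedicated to one device."""

    def __init__(self, threads_per_disk):
        self.threads_per_disk = threads_per_disk
        self._pools = {}

    def run(self, device, func, *args):
        if self.threads_per_disk <= 0:
            # Pools disabled: only the cost of an extra function call.
            return func(*args)
        pool = self._pools.get(device)
        if pool is None:
            # One pool per device, so a slow disk only stalls itself.
            pool = ThreadPoolExecutor(max_workers=self.threads_per_disk)
            self._pools[device] = pool
        return pool.submit(func, *args).result()

    def shutdown(self):
        for pool in self._pools.values():
            pool.shutdown()
        self._pools.clear()
```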

blueprint concurrent-disk-io
Change-Id: I490f8753d926fdcee3a0c65c5aaf715bc2b7c290
2013-06-07 13:06:04 -07:00
Peter Portante
5174b7f85d Rework to support RFC 2616 Sec 4.4 Message Length
RFC 2616 Sec 4.4 Message Length describes how the content-length and
transfer-encoding headers interact. Basically, if chunked transfer
encoding is used, the content-length header value is ignored and if
the content-length header is present, and the request is not using
chunked transfer-encoding, then the content-length must match the body
length.

The only Transfer-Coding value we support in the Transfer-Encoding
header (to date) is "chunked". RFC 2616 Sec 14.41 specifies that if
"multiple encodings have been applied to an entity, the
transfer-codings MUST be listed in the order in which they were
applied." Since we only support "chunked", if the Transfer-Encoding
header value has multiple transfer-codings, we return a 501 (Not
Implemented) (see RFC 2616 Sec 3.6) without checking if chunked is the
last one specified. Finally, if transfer-encoding is anything but
"chunked", we return a 400 (Bad Request) to the client.

This patch adds a new method, message_length, to the swob request
object which will apply an algorithm based on RFC 2616 Sec 4.4
leveraging the existing content_length property.
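
The rules above can be sketched as a plain function over a header dict.
This is not the swob method itself: message_length here takes a dict with
lower-case keys, HTTPStatusError is a hypothetical exception, and answering
an unparseable Content-Length with a 400 is an assumption of this sketch.

```python
class HTTPStatusError(Exception):
    def __init__(self, status):
        super().__init__(status)
        self.status = status


def message_length(headers):
    """Return the body length in bytes, or None if it is not known."""
    transfer_encoding = headers.get('transfer-encoding')
    if transfer_encoding:
        codings = [c.strip().lower() for c in transfer_encoding.split(',')]
        if len(codings) > 1:
            raise HTTPStatusError(501)   # multiple codings: Not Implemented
        if codings[0] != 'chunked':
            raise HTTPStatusError(400)   # anything but chunked: Bad Request
        return None                      # chunked: Content-Length is ignored
    content_length = headers.get('content-length')
    if content_length is None:
        return None
    try:
        return int(content_length)
    except ValueError:
        raise HTTPStatusError(400)
```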

In addition to these changes, the proxy server will now notice when
the message length specified by the content-length header is greater
than the configured object maximum size and fail the request with a
413, "Request Entity Too Large", before reading the entire body.

This work flows from https://review.openstack.org/27152.

Change-Id: I5d2a30b89092680dee9d946e1aafd017eaaef8c0
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-05-25 21:05:56 -04:00
Peter Portante
d0a27f477b Hide the file descriptor and disk write methodology for PUTs
Towards moving the DiskFile class into place as the API definition for
pluggable DiskFile backends, we hide the file descriptor and the
method of writing data to disks. The mkstemp() method has been renamed
to writer(), and no longer returns an fd, but a new object that
encapsulates the state tracked for writes. This new object is then
used directly to perform the remainder of the write operations and
application of required semantics.
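
A writer object that hides the file descriptor might look roughly like
this. ToyWriter and the module-level writer() function are hypothetical
and much simpler than the DiskFile code; metadata handling is left out.

```python
import os
import tempfile
from contextlib import contextmanager


class ToyWriter:
    """Tracks the state of one in-progress write; the fd stays private."""

    def __init__(self, fd, tmppath, final_path):
        self._fd = fd
        self._tmppath = tmppath
        self._final_path = final_path
        self.committed = False

    def write(self, chunk):
        view = memoryview(chunk)
        while len(view) > 0:
            written = os.write(self._fd, view)
            view = view[written:]

    def put(self):
        # Flush to disk, then move the temp file into its final place.
        os.fsync(self._fd)
        os.rename(self._tmppath, self._final_path)
        self.committed = True


@contextmanager
def writer(final_path):
    directory = os.path.dirname(final_path) or '.'
    fd, tmppath = tempfile.mkstemp(dir=directory)
    toy_writer = ToyWriter(fd, tmppath, final_path)
    try:
        yield toy_writer
    finally:
        os.close(fd)
        if not toy_writer.committed:
            os.unlink(tmppath)    # abandoned write: clean up the temp file
```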

Change-Id: Ib37ed37b34a2ce6b442d69f83ca011c918114434
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-05-22 16:01:22 +00:00
Sergey Kraynev
ea7858176b Implementation of replication servers
Support separate replication ip address:
- Added new function in utils. This function provides ability
  to select separate IP address for replication service.
- Db_replicator and object replicators were changed.
  Replication process uses new function now.

Replication network parameters:
- Replication network fields (replication_ip, replication_port)
  support was added to device dictionary in swift-ring-builder script.
- Changes were made to support new fields in search, show and set_info
  functions.

Implementation of replication servers:
- Separate replication servers use the same code as normal replication
  servers, but with replication_server parameter = True.  When using a
  separate replication network, the non-replication servers set
  replication_server = False.  When there is no separate replication
  network (the default case), replication_server is not included in the config.

DocImpact
Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0
Implements: blueprint dedicated-replication-network
2013-04-21 18:14:42 -04:00
Peter Portante
8825c9c74a Enhance log msg to report referer and user-agent
Enhance internally logged messages to report referer and user-agent.

Pass the referring URL and METHOD between internal servers (when
known), and set the user-agent to be the server type (obj-server,
container-server, proxy-server, obj-updater, obj-replicator,
container-updater, direct-client, etc.) with the process PID. In
conjunction with the transaction ID, it helps to track down which PID
from a given system was responsible for initiating the request and
what that server was working on to make this request.

This has been helpful in tracking down interactions between object,
container and account servers.

We also take things a bit further, performing a bit of refactoring to
consolidate calls to transfer_headers() now that we have a helper
method for constructing them.

Finally we performed further changes to avoid header key duplication
due to string literal header key values and the various objects
representing headers for requests and responses. See below for more
details.

====

Header Keys

There seems to be a bit of a problem with the case of the various
string literals used for header keys and the interchangeable way
standard Python dictionaries, HeaderKeyDict() and HeaderEnvironProxy()
objects are used.

If one is not careful, a header object of some sort (one that does not
normalize its keys, and that is not necessarily a dictionary) can be
constructed containing header keys which differ only by the case of
their string literals. E.g.:

   { 'x-trans-id': '1234', 'X-Trans-Id': '5678' }

Such an object, when passed to http_connect() will result in an
on-the-wire header where the key values are merged together, comma
separated, that looks something like:

   HTTP_X_TRANS_ID: 1234,5678

For some headers in some contexts, this behavior is desirable. For
example, one can also use a list of tuples which enumerate the multiple
values a single header should have.

However, in almost all of the contexts used in the code base, this is
not desirable.

This behavior arises from a combination of factors:

   1. Header strings are not constants and different lower-case and
      title-case header string values are used interchangeably in the
      code at times

      It might be worth the effort to make a pass through the code to
      stop using string literals and use constants instead, but there
      are plusses and minuses to doing that, so this was not attempted
      in this effort

   2. HeaderEnvironProxy() objects report their keys in ".title()"
      case, but normalize all other key references to the form
      expected by the Request class's environ field

      swob.Request.headers fields are HeaderEnvironProxy() objects.

   3. HeaderKeyDict() objects report their keys in ".lower()" case,
      and normalize all other key references to ".lower()" case

      swob.Response.headers fields are HeaderKeyDict() objects.

Depending on which object is used and how it is used, one can end up
with such a mismatch.

This commit takes the following steps as a (PROPOSED) solution:

   1. Change HeaderKeyDict() to normalize using ".title()" case to
      match HeaderEnvironProxy()

   2. Replace standard python dictionary objects with HeaderKeyDict()
      objects where possible

      This gives us an object that normalizes key references to avoid
      fixing the code to normalize the string literals.

   3. Fix up a few places to use title case string literals to match
      the new defaults
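
The key normalization in step 1 can be sketched with a small dict
subclass. TitleCaseHeaders is a hypothetical illustration and not the
HeaderKeyDict implementation.

```python
class TitleCaseHeaders(dict):
    """A dict that stores and looks up header keys in .title() case."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)

    def update(self, *args, **kwargs):
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def __setitem__(self, key, value):
        super().__setitem__(key.title(), value)

    def __getitem__(self, key):
        return super().__getitem__(key.title())

    def __contains__(self, key):
        return super().__contains__(key.title())

    def get(self, key, default=None):
        return super().get(key.title(), default)
```

With this, the two spellings from the example above collapse into a single
key, and the later value replaces the earlier one.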

Change-Id: Ied56a1df83ffac793ee85e796424d7d20f18f469
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-05-13 17:39:02 +00:00
Jenkins
b9a6bcb431 Merge "Add an explicit unit test for handling content-length: 0" 2013-05-03 11:47:19 +00:00
Peter Portante
d62a2a832e Push fallocate() down into mkstemp(); use known size
Towards defining the DiskFile class, or something like it, as an API
for the low level disk accesses, we push the fallocate() system call
down into the DiskFile.mkstemp() method. This allows another
implementation of DiskFile to decide to use or not use fallocate().

Change-Id: Ib4d2ee1f971e4e20e53ca4b41892c5e44ecc88d5
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-04-23 16:09:13 -04:00
Peter Portante
960f01b4ba Add an explicit unit test for handling content-length: 0
Change-Id: I3568d4dc1900e6ddb4860589ca6a7b7039cc8c2d
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-04-22 12:23:23 -04:00
David Hadas
caa01cd81e objects md5-collisions
This patch identifies md5 collisions on objects and sends a 403
from the object server.

Credits for originating this fix are to Michael Factor.

Change-Id: I4f1b32183e2be6bbea56eaff86b9a4c7f440804a
Fix: Bug #1157454
2013-04-09 23:20:33 +03:00
Jenkins
ab355e349a Merge "Fix reading xattrs in object-server's unittests." 2013-04-03 20:55:06 +00:00
Greg Lange
44f00a23c1 fixed some minor things in tests that pyflakes complained about
Change-Id: Ifeab56a964630bcf941e932fcbe39e6572e62975
2013-03-26 20:42:26 +00:00
David Hadas
a979c8007b Add support for Hash Prefix
A new configuration parameter is added to /etc/swift/swift.conf
[swift-hash]
swift_hash_path_prefix = 'random unique string'

New installations are advised to set this parameter to a random secret,
which would not be disclosed outside the organization.
The same secret needs to be used by all swift servers of the same cluster.

Existing installations should set this parameter to an empty string
(the default)
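
How a secret prefix can change the on-disk hash of a path is sketched
below. This is an assumption-laden illustration: the hash_path signature,
the exact way prefix and suffix are concatenated, and the example secrets
are not taken from this commit.

```python
import hashlib


def hash_path(account, container=None, obj=None, prefix='', suffix=''):
    """Hash an /account/container/object path with cluster-wide secrets."""
    parts = [p for p in (account, container, obj) if p is not None]
    raw = prefix + '/' + '/'.join(parts) + suffix
    return hashlib.md5(raw.encode('utf-8')).hexdigest()


# An empty prefix (the default) leaves existing hashes unchanged, which is
# why existing installations are told to keep it empty.
without_prefix = hash_path('a', 'c', 'o', suffix='endcap')
with_prefix = hash_path('a', 'c', 'o', prefix='startcap', suffix='endcap')
```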

DocImpact

Fixes: Bug #1157454

Change-Id: I63b10d0b7d6dd3f74e0f10bb41b5f240fa03578a
2013-03-22 19:41:55 +02:00
Vladimir Vechkanov
9e3d2f6ea8 Fix reading xattrs in object-server's unittests.
Use the object server's function for reading metadata in the unit tests.

Change-Id: I2bfeb76fdd775442a0e614fef740b0987fba4a22
Fixes: bug #1079131
2013-03-22 17:02:13 +04:00