55 Commits

Author SHA1 Message Date
Samuel Merritt
9430f4c9f5 Move HeaderKeyDict to avoid an inline import
There was a function in swift.common.utils that was importing
swob.HeaderKeyDict at call time. It couldn't import it at compilation
time since utils can't import from swob or else it blows up with a
circular import error.

This commit just moves HeaderKeyDict into swift.common.header_key_dict
so that we can remove the inline import.

Change-Id: I656fde8cc2e125327c26c589cf1045cb81ffc7e5
2016-03-07 12:26:48 -08:00
Takashi Kajinami
8e4347afd5 Fix proxy-server's support for chunked transferring in GET object
Proxy-server now requires Content-Length in the response header
when getting object and does not support chunked transferring with
"Transfer-Encoding: chunked"

This doesn't matter in normal Swift, but it prevents us from adding
any middleware that does something like streaming processing of
objects, which can't calculate the length of its response body
before it starts to send the response.

Change-Id: I60fc6c86338d734e39b7e5f1e48a2647995045ef
2016-03-02 22:56:13 +09:00
Jenkins
d7529726b4 Merge "Fix missing txn_id logs in GreenAsyncPile's spawned functions" 2016-02-17 00:20:03 +00:00
Janonymous
4906b4c431 Fix missing txn_id logs in GreenAsyncPile's spawned functions
This commit ensures that the logger thread_locals
value is passed to and set in _get_conn_response methods
executed in a green thread.

Added a partial bug tag because the bug description suggests a more
relevant fix that would resolve the bug completely, but for now it
makes sense to add this commit for logging.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I13bbf174fdca89318d69bb0674ed23dc9ec25b9a
Partial-Bug: #1409302
2016-02-15 08:49:48 +05:30
Kota Tsuyuzaki
b173995666 Fix missing Accept-Ranges
Since commit 4f2ed8bcd0468f3b69d5fded274d8d6b02ac3d10, the response
header for GET EC object doesn't include the Accept-Ranges header.

This patch fixes it and also adds a few unittests to prevent regression.

Closes-Bug: #1542168

Change-Id: Ibafe56ac87b14bc0028953e620a653cd68dd3f84
2016-02-04 21:53:58 -08:00
Samuel Merritt
4dc4cbfc1c Fix 503 on zero-byte replicated PUT with incorrect Etag
Closes-Bug: 1516579
Change-Id: Iac91ed61254d3ca232521191fec25c19acb66413
2015-11-16 14:28:42 -08:00
Kota Tsuyuzaki
fd30df6e65 Small cleanup for unit/proxy/controllers/test_obj
Follow up for https://review.openstack.org/#/c/236007

This fixes following minor items:

- Fix a 'raise Exception class' syntax to 'raise Exception instance'
- Use original eventlet.Timeout instead of swift.exceptions.Timeout
  imported from eventlet.Timeout
- Change Timeout to initiate w/o args (1st arguments should be timeout
  second and we don't have to set None if we don't want to set the sec)
- Add a message argument to some Exception instances

Change-Id: Iab608cd8a1f4d3f5b4963c26b94ab0501837ffe1
2015-11-11 09:03:54 -08:00
Bill Huber
66dc1eebb1 ObjectControllers return application errors as 499 on bad read
In the _transfer_data method, we translate all (Exception, Timeout)
into a 499 whereas we should consider translating them to 500 on
particular returning error scenarios.

This affects both ReplicatedObjectController and ECObjectController.

Change-Id: I571bbc5b1451243907b094a5718c8735fd824268
Closes-Bug: 1504299
2015-11-10 13:31:39 -06:00
Jenkins
c49f71585b Merge "Close ECAppIter's sub-generators before propagating GeneratorExit" 2015-10-19 11:47:50 +00:00
Jenkins
cc46ab0b8f Merge "py3: Replace gen.next() with next(gen)" 2015-10-12 19:45:29 +00:00
janonymous
f5f9d791b0 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3, replacing it.

Change-Id: Ida206abbb13c320095bb9e3b25a2b66cc31bfba8
Co-Authored-By: Ondřej Nový <ondrej.novy@firma.seznam.cz>
2015-10-11 12:57:25 +02:00
Victor Stinner
8f85427939 py3: Replace gen.next() with next(gen)
The next() method of Python 2 generators was renamed to __next__().
Call the builtin next() function instead which works on Python 2 and
Python 3.
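
A minimal illustration of the rename described above. The generator
count_up is a hypothetical example, not code from Swift:

```python
def count_up(limit):
    """Yield 0, 1, ..., limit - 1 (hypothetical example generator)."""
    for i in range(limit):
        yield i


gen = count_up(3)
first = next(gen)    # builtin next() works on Python 2.6+ and Python 3
second = next(gen)
# gen.next() only exists on Python 2; on Python 3 the method is gen.__next__()
```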

The patch was generated by the next operation of the sixer tool.

Change-Id: Id12bc16cba7d9b8a283af0d392188a185abe439d
2015-10-08 15:40:06 +02:00
Clay Gerrard
752ceb266b Close ECAppIter's sub-generators before propagating GeneratorExit
... which ensures no Timeouts remain pending after the parent generator
is closed when a client disconnects before being able to read the entire
body.

Also tighten up a few tests that may have left some open ECAppIter
generators lying about after the tests themselves had finished.  This
has the side effect of preventing the extraneous printing of the Timeout
errors being raised by the eventlet hub in the background while our
unittests are running.

Change-Id: I156d873c72c19623bcfbf39bf120c98800b3cada
2015-10-05 13:31:09 -07:00
Clay Gerrard
a31ee07bda Make sure we have enough .durable's for GETs
Increase the number of nodes from which we require a final successful
HTTP response before we return success to the client on a write - to
the same number of nodes we'll require successful responses from to
service a client request for a read.

Change-Id: Ifd36790faa0a5d00ec79c23d1f96a332a0ca0f0b
Related-Bug: #1469094
2015-10-02 13:53:27 -07:00
Kota Tsuyuzaki
3f943cfcf2 Fix missing container update
On an object PUT request, the proxy server creates backend headers (e.g.
X-Container-Partition) which help object-servers determine
the container-server they should update. One set of backend headers
is created per container replica.
(i.e. with 3 replicas in the container ring, 3 backend headers are created)

In the EC case, Swift fans out fragment archives to backend object-servers.
The number of fragment archives is usually larger than the container
replica count, and the proxy-server treats a request as successful when a
quorum of object-servers has stored its fragment archive. That can leave an
orphaned object which is stored but has no container update.

For example, assume k=10, m=4 and 3 container replicas:

The proxy-server attempts to make 14 backend streams, but
unfortunately the first 3 nodes return 507 (disk failure) and
Swift doesn't have any other disks.

In that case, the proxy keeps 11 backend streams to store, and current Swift
treats that as sufficient because it is greater than or equal to the quorum
(right now k+1 is sufficient, i.e. 11 backend streams are enough to store).
However, those 11 streams don't have the container update header, so
the request will succeed but the container will never be updated.

This patch extends container update headers to more nodes, up to object
quorum_size + 1, to ensure the updates happen. This approach increases the
container update cost a bit because there will be duplicated updates, but
quorum_size + 1 seems a reasonable price to pay (even in the replicated
case) compared with having every object request include the update headers.

Now Swift works as follows:

For example:
k=10, m=4, quorum_size=11 (k+1), 3 replicas for container.
CU: container update
CA: commit ack

That results in:
 CU   CU   CU   CU   CU   CU   CU   CU   CU   CU   CU   CU
[507, 507, 507, 201, 201, 201, 201, 201, 201, 201, 201, 201, 201, 201]
                                              CA   CA   CA   CA   CA

In this case, at least 3 container updates are saved.

For another example:
7 replicated objects, quorum_size=4 (7//2+1), 3 replicas for container.
CU: container update
CA: commit ack (201s for successful PUT on replicated)

 CU   CU   CU   CU   CU
[507, 507, 507, 201, 201, 201, 201]
                 CA   CA   CA   CA

In this replicated case, at least 2 container updates are saved.
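
The arithmetic behind the two examples can be sketched as follows. This
is only an illustration: the function names are hypothetical, and the
number of commit acks (5 and 4) is read off the diagrams above rather
than taken from Swift's code:

```python
def container_update_nodes(n_nodes, quorum_size):
    """Number of backend requests that carry container update headers."""
    return min(n_nodes, quorum_size + 1)


def worst_case_updates(n_nodes, quorum_size, n_acks):
    """Fewest acknowledging nodes that also carry container update headers.

    Worst case: every node without the headers is among the nodes
    that acknowledged the commit.
    """
    without_headers = n_nodes - container_update_nodes(n_nodes, quorum_size)
    return max(0, n_acks - without_headers)


# EC example: k=10, m=4, quorum_size=11, 5 commit acks in the diagram
ec_saved = worst_case_updates(n_nodes=14, quorum_size=11, n_acks=5)
# Replicated example: 7 replicas, quorum_size=4, 4 commit acks in the diagram
replicated_saved = worst_case_updates(n_nodes=7, quorum_size=4, n_acks=4)
```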

Cleaned up some unit tests so that modifying policies doesn't leak
between tests.

Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Sam Merritt <sam@swiftstack.com>

Closes-Bug: #1460920
Change-Id: I04132858f44b42ee7ecf3b7994cb22a19d001d70
2015-09-25 15:23:24 -07:00
Jenkins
e6bd65682f Merge "Fix EC range GET/COPY handling" 2015-09-02 12:27:42 +00:00
paul luse
893f30c61d EC GET path: require fragments to be of same set
And if they are not, exhaust the node iter to go get more.  The
problem without this implementation is a simple overwrite where
a GET follows before the handoff has put the newer obj back on
the 'alive again' node such that the proxy gets n-1 fragments
of the newest set and 1 of the older.

This patch bucketizes the fragments by etag and if it doesn't
have enough continues to exhaust the node iterator until it
has a large enough matching set.
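
A minimal sketch of the bucketing idea, assuming each backend response
can be reduced to an (etag, fragment) pair. The function name and the
tuple shape are hypothetical and do not mirror the proxy's real code:

```python
from collections import defaultdict


def pick_fragment_set(responses, needed):
    """Consume responses until one etag has `needed` fragments.

    `responses` is any iterable of (etag, fragment) pairs, e.g. a node
    iterator. Returns (etag, fragments), or None if the iterator is
    exhausted before any bucket is large enough.
    """
    buckets = defaultdict(list)
    for etag, fragment in responses:
        buckets[etag].append(fragment)
        if len(buckets[etag]) >= needed:
            return etag, buckets[etag]
    return None


# One stale fragment ("old") is mixed in; the iterator is read further
# until three fragments of the same set are available.
result = pick_fragment_set(
    [("new", "a"), ("old", "x"), ("new", "b"), ("new", "c")], needed=3)
```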

Change-Id: Ib710a133ce1be278365067fd0d6610d80f1f7372
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Closes-Bug: 1457691
2015-08-27 21:09:41 -07:00
Bill Huber
b75d2a4e37 Quorum on durable response is too low
Increase the .durable quorum from 2 to "parity + 1" to guarantee
that we will never fail to rebuild an object.  Otherwise, with
low durable responses back (< parity + 1), the putter objects
return with failed attribute set to true, thereby failing the
rebuild of fragments for an object.

Change-Id: I80d666f61273e589d0990baa78fd657b3470785d
Closes-Bug: 1484565
2015-08-17 16:05:56 -05:00
Kota Tsuyuzaki
5b246e875f Fix EC range GET/COPY handling
When a range GET (or COPY) is requested for an EC object, if the requested
range starts beyond the last segment alignment (i.e.
ceil(object_size/segment_size) * segment_size), the proxy server returns
the original content length without a body, though Swift should return an
error message as the body and the length of that message as the content
length. The current behavior causes some clients (e.g. curl) to hang.

This patch makes the proxy return a correct response even when such
an out-of-range request is made.
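
The alignment formula quoted above can be checked with a small sketch.
The function names are hypothetical; only the formula comes from the
message:

```python
def last_segment_alignment(object_size, segment_size):
    """ceil(object_size / segment_size) * segment_size, in integer math."""
    n_segments = -(-object_size // segment_size)  # integer ceiling division
    return n_segments * segment_size


def range_start_is_past_alignment(start, object_size, segment_size):
    """True for the problematic case: the range starts past the alignment."""
    return start > last_segment_alignment(object_size, segment_size)


MIB = 1024 * 1024
alignment = last_segment_alignment(2 * MIB + 1, MIB)
```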

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I21f81c842f563ac4dddc69011ed759b744bb20bd
Closes-Bug: #1475499
2015-08-12 07:41:36 -07:00
Bill Huber
c35cc13b8a pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3 in the following dir:
test/unit/proxy/*

Change-Id: Ie2c7e73e1096233a10ee7fbf6f88386fa4d469d6
2015-08-06 13:36:26 -05:00
Clay Gerrard
7071762d36 Fix TypeError if backend response doesn't have expected headers
There was some debug logging mixed in with some error handling on PUTs
that relied on the assumption that a very specific edge case would only
encounter a set of backend responses that included the expected set of
headers to diagnose the failure.

But the backend responses may not always have the expected headers.

The proxy debug logging should be more robust to missing headers.

It's a little hard to follow, but if you look at `_connect_put_node` in
swift.proxy.controller.obj - you'll see that only a few connections can
make their way out of the initial put connection handling with a "resp"
attribute that is not None.  In the happy path (e.g. 100-Continue) it's
explicitly set to None, and in most errors (Timeout, 503, 413, etc) a new
connection will be established to the next node in the node iter.

Some status code will however allow a conn to be returned for validation
in `_check_failure_put_connections`, i.e.

  * 2XX (e.g. 0-byte PUT would not send Expect 100-Continue)
  * 409 - Conflict with another timestamp
  * 412 - If-None-Match that encounters another object

... so I added tests for those - fixing a TypeError along the way.

Change-Id: Ibdad5a90fa14ce62d081e6aaf40aacfca31b94d2
2015-08-04 23:45:40 -07:00
Jenkins
6511100a56 Merge "Modify zip usage for python3 where necessary." 2015-07-28 12:52:52 +00:00
Kota Tsuyuzaki
99d052772a Fix 499 client disconnected on COPY EC object
Currently, a COPY request for an EC object might fail with 499 Client
disconnected because of the difference between the destination request
content length and the bytes actually transferred.

That is because the conditional response status and content length for
an EC object range GET are handled when the response instance is called on
the proxy server. Calling the response instance (resp()) therefore changes
the conditional status from 200 (HTTP_OK) to 206 (PartialContent) and
changes the content length for the range GET.

In the EC case, Swift sometimes needs the whole stored contents to decode a
segment. The object-server then returns a 200 HTTP OK response, and the
proxy-server unfortunately sets the whole content length as the destination
content length, which causes bug 1467677.

This patch introduces a new method "fix_conditional_response" for
swift.common.swob.Response that calls _response_iter() and caches the
iter in the Response instance. By calling it, Swift can set the correct
conditional response any time after setting the whole content_length on the
response instance, as in the EC case.

Change-Id: If85826243f955d2f03c6ad395215c73daab509b1
Closes-Bug: #1467677
2015-07-22 02:01:32 -07:00
janonymous
dde34c4b44 Modify zip usage for python3 where necessary.
py2 zip() is eager but py3 zip() and six.moves.zip() are lazy,
changing ones that require eager evaluation.
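
A small example of a call site that needs eager evaluation. The
variable names are made up for illustration:

```python
names = ["a", "b"]
sizes = [1, 2]

# zip() alone is a one-shot iterator on Python 3: it has no len() and
# can only be consumed once. Wrapping it in list() restores the eager
# Python 2 behavior on both versions.
pairs = list(zip(names, sizes))
count = len(pairs)
first_pass = [name for name, _ in pairs]
second_pass = [size for _, size in pairs]  # a second pass still sees data
```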

Change-Id: Ic9f6bccd7f57772158581905794f8d23b05f4223
2015-07-17 23:47:23 +05:30
Victor Stinner
e5c962a28c Replace xrange() with six.moves.range()
Patch generated by the xrange operation of the sixer tool:
https://pypi.python.org/pypi/sixer

Manual changes:

* Fix indentation for pep8 checks
* Fix TestGreenthreadSafeIterator.test_access_is_serialized of
  test.unit.common.test_utils:
  replace range(1, 11) with list(range(1, 11))
* Fix UnsafeXrange docstring, revert change

Change-Id: Icb7e26135c5e57b5302b8bfe066b33cafe69fe4d
2015-06-23 07:29:15 +00:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
Samuel Merritt
4f2ed8bcd0 EC: support multiple ranges for GET requests
This commit lets clients receive multipart/byteranges responses (see
RFC 7233, Appendix A) for erasure-coded objects. Clients can already
do this for replicated objects, so this brings EC closer to feature
parity (ha!).

GetOrHeadHandler got a base class extracted from it that treats an
HTTP response as a sequence of byte-range responses. This way, it can
continue to yield whole fragments, not just N-byte pieces of the raw
HTTP response, since an N-byte piece of a multipart/byteranges
response is pretty much useless.

There are a couple of bonus fixes in here, too. For starters, download
resuming now works on multipart/byteranges responses. Before, it only
worked on 200 responses or 206 responses for a single byte
range. Also, BufferedHTTPResponse grew a readline() method.

Also, the MIME response for replicated objects got tightened up a
little. Before, it had some leading and trailing CRLFs which, while
allowed by RFC 7233, provide no benefit. Now, both replicated and EC
multipart/byteranges avoid extraneous bytes. This let me re-use the
Content-Length calculation in swob instead of having to either hack
around it or add extraneous whitespace to match.

Change-Id: I16fc65e0ec4e356706d327bdb02a3741e36330a0
2015-06-03 11:42:00 -07:00
Jenkins
f66e9797be Merge "Remove confusable query string on post as copy" 2015-05-28 15:20:20 +00:00
Samuel Merritt
666bf06c26 EC: don't 503 on marginally-successful PUT
On EC PUT in an M+K scheme, we require M+1 fragment archives to
durably land on disk. If we get that, then we go ahead and ask the
object servers to "commit" the object by writing out .durable
files. We only require 2 of those.

When we got exactly M+1 fragment archives on disk, and then one
connection timed out while writing .durable files, we should still be
okay (provided M is at least 3). However, we'd take our M > 2
remaining successful responses and pass that off to best_response()
with a quorum size of M+1, thus getting a 503 even though everything
worked well enough.

Now we pass 2 to best_response() to avoid that false negative.

There was also a spot where we were getting the quorum size wrong. If
we wrote out 3 fragment archives for a 2+1 policy, we were only
requiring 2 successful backend PUTs. That's wrong; the right number is
3, which is what the policy's .quorum() method says. There was a spot
where the right number wasn't getting plumbed through, but it is now.
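
The two thresholds described above can be sketched like this. The
helper names are hypothetical; the numbers (M+1 fragment archives,
2 .durable responses) are the ones stated in this message:

```python
DURABLE_QUORUM = 2  # .durable responses required, per the message above


def ec_put_quorum(n_data):
    """Fragment archives that must land on disk for an M+K policy (M = n_data)."""
    return n_data + 1


def put_succeeded(n_archives_stored, n_durable_acks, n_data):
    """Apply each threshold to its own phase instead of reusing M+1 for both."""
    return (n_archives_stored >= ec_put_quorum(n_data)
            and n_durable_acks >= DURABLE_QUORUM)


# M=10: exactly M+1 archives stored, then one connection times out while
# writing .durable files, leaving M acks. This should still be a success.
marginal_ok = put_succeeded(n_archives_stored=11, n_durable_acks=10, n_data=10)
```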

Change-Id: Ic658a199e952558db329268f4d7b4009f47c6d03
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Closes-Bug: 1452468
2015-05-26 11:38:57 -07:00
Kota Tsuyuzaki
025c4c4339 Remove confusable query string on post as copy
The current post-as-copy routine (i.e. POST object with the post_as_copy
option turned on) in the Object Controller uses a "multipart-manifest" query
string which is fed to env['copy_hook'] to decide which data (the manifest
or the object pointed to by the manifest) should be copied.

However, using the query string this way will confuse operators looking at
the logging system (or analyzing the log) because all POST object requests
have 'multipart-manifest=get', as in:

POST /v1/AUTH_test/d4c816b24d38489082f5118599a67920/manifest-abcde%3Fmultipart-manifest%3Dget

We cannot know whether the query string was added by hand
(from user) or not. In addition, the query isn't needed by the
backend conversation between proxy-server and object-server.
(Just needed by "copy_hook" on the proxy controller!)

To remove the confusing query string and keep the log clean,
this patch introduces a new environment variable "swift.post_as_copy"
and changes the proxy controller and the copy_hook to use the new env.

This item was originally discussed at
https://review.openstack.org/#/c/177132/

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>

Change-Id: I0cd37520eea1825a10ebd27ccdc7e9162647233e
2015-05-13 13:09:07 -07:00
Samuel Merritt
decbcd24d4 Foundational support for PUT and GET of erasure-coded objects
This commit makes it possible to PUT an object into Swift and have it
stored using erasure coding instead of replication, and also to GET
the object back from Swift at a later time.

This works by splitting the incoming object into a number of segments,
erasure-coding each segment in turn to get fragments, then
concatenating the fragments into fragment archives. Segments are 1 MiB
in size, except the last, which is between 1 B and 1 MiB.

+====================================================================+
|                             object data                            |
+====================================================================+

                                   |
          +------------------------+----------------------+
          |                        |                      |
          v                        v                      v

+===================+    +===================+         +==============+
|     segment 1     |    |     segment 2     |   ...   |   segment N  |
+===================+    +===================+         +==============+

          |                       |
          |                       |
          v                       v

     /=========\             /=========\
     | pyeclib |             | pyeclib |         ...
     \=========/             \=========/

          |                       |
          |                       |
          +--> fragment A-1       +--> fragment A-2
          |                       |
          |                       |
          |                       |
          |                       |
          |                       |
          +--> fragment B-1       +--> fragment B-2
          |                       |
          |                       |
         ...                     ...

Then, object server A gets the concatenation of fragment A-1, A-2,
..., A-N, so its .data file looks like this (called a "fragment archive"):

+=====================================================================+
|     fragment A-1     |     fragment A-2     |  ...  |  fragment A-N |
+=====================================================================+
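
The layout in the diagrams can be sketched as below. This is not real
erasure coding: fake_encode is a hypothetical stand-in for pyeclib that
merely slices a segment into pieces (no parity), so that the
segment -> fragment -> fragment archive bookkeeping can be shown:

```python
SEGMENT_SIZE = 1024 * 1024  # 1 MiB, as stated above


def split_segments(data, segment_size=SEGMENT_SIZE):
    """Cut the object into segments; only the last one may be shorter."""
    return [data[i:i + segment_size]
            for i in range(0, len(data), segment_size)]


def fake_encode(segment, n_fragments):
    """Placeholder for pyeclib: slice the segment, producing no parity."""
    size = -(-len(segment) // n_fragments)  # integer ceiling division
    return [segment[i * size:(i + 1) * size] for i in range(n_fragments)]


def build_fragment_archives(data, n_fragments, segment_size=SEGMENT_SIZE):
    """Archive i is the concatenation of fragment i of every segment."""
    archives = [b""] * n_fragments
    for segment in split_segments(data, segment_size):
        for index, fragment in enumerate(fake_encode(segment, n_fragments)):
            archives[index] += fragment
    return archives


# Tiny sizes so the result is easy to read: 3 segments, 2 "object servers".
archives = build_fragment_archives(b"abcdefghij", n_fragments=2, segment_size=4)
```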

Since this means that the object server never sees the object data as
the client sent it, we have to do a few things to ensure data
integrity.

First, the proxy has to check the Etag if the client provided it; the
object server can't do it since the object server doesn't see the raw
data.

Second, if the client does not provide an Etag, the proxy computes it
and uses the MIME-PUT mechanism to provide it to the object servers
after the object body. Otherwise, the object would not have an Etag at
all.

Third, the proxy computes the MD5 of each fragment archive and sends
it to the object server using the MIME-PUT mechanism. With replicated
objects, the proxy checks that the Etags from all the object servers
match, and if they don't, returns a 500 to the client. This mitigates
the risk of data corruption in one of the proxy --> object connections,
and signals to the client when it happens. With EC objects, we can't
use that same mechanism, so we must send the checksum with each
fragment archive to get comparable protection.

On the GET path, the inverse happens: the proxy connects to a bunch of
object servers (M of them, for an M+K scheme), reads one fragment at a
time from each fragment archive, decodes those fragments into a
segment, and serves the segment to the client.

When an object server dies partway through a GET response, any
partially-fetched fragment is discarded, the resumption point is wound
back to the nearest fragment boundary, and the GET is retried with the
next object server.

GET requests for a single byterange work; GET requests for multiple
byteranges do not.

There are a number of things _not_ included in this commit. Some of
them are listed here:

 * multi-range GET

 * deferred cleanup of old .data files

 * durability (daemon to reconstruct missing archives)

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I9c13c03616489f8eab7dcd7c5f21237ed4cb6fd2
2015-04-14 00:52:17 -07:00
Clay Gerrard
3782bf1b56 Remove the X-Newest pre-flight request on X-Timestamp
There is a standard LBYL race that can better be addressed by making the
EAFP case safer.

Capture 409 Conflict during expect on PUT

Similarly to how the proxy handles 412 on PUT, we will gather 409
responses in the proxy during _connect_put_node.  Rather than skipping
backend servers that already have a synced copy of an object we will
accept their response and return 202 immediately.

This is particularly useful to internal clients who are using
X-Timestamp to sync transfers (e.g. container-sync and
container-reconciler).

No observable change in client-facing behavior except that Swift is
faster to respond Accepted when it already has the data the client is
proposing to send.

Change-Id: Ie400d5bfd9ab28b290abce2e790889d78726095f
2015-01-07 18:16:47 -08:00
Jenkins
997f4d4b32 Merge "Consistently apply node error limiting rules in proxy" 2015-01-05 12:51:55 +00:00
Clay Gerrard
99d501831e Consistently apply node error limiting rules in proxy
All GET or HEAD requests consistently error limit nodes that return 507
and increment errors for nodes responding with any other 5XX.

There were two places in the object PUT path where the proxy was error
limiting nodes and their behavior was inconsistent.  During expect-100
connect we would only error_limit nodes on 507, and during response we
would increment errors for all 5XX series responses.  This was pretty
hard to reason about and the divergence in behavior of questionable
value.

An audit of base controller highlighted where make_requests would apply
error_limit's on 507 but not increment errors on other 5XX responses.

Now anywhere we track errors on nodes we use error_limit on 507 and
error_occurred on any other 5XX series request.  Additionally a Timeout
or Exception that is logged through exception_occurred will bump errors -
which is consistent with the approach in "Add Error Limiting to slow
nodes" [1].

1. https://review.openstack.org/#/c/112424/
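
The rule stated above can be summarized in a short sketch. The function
is hypothetical and only returns the name of the action the message
describes; it does not call Swift's actual error-limiting code:

```python
def error_action_for_status(status):
    """Map a backend status code to the action described above.

    Returns "error_limit" for 507, "error_occurred" for any other 5XX,
    and None for everything else.
    """
    if status == 507:
        return "error_limit"
    if 500 <= status <= 599:
        return "error_occurred"
    return None
```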

Change-Id: I67e489d18afd6bdfc730bfdba76f85a2e3ca74f0
2014-11-17 20:49:43 -08:00
Clay Gerrard
b98fe3b77b Prefer X-Backend-Timestamp for X-Newest
When an X-Backend-Timestamp is available it is generally preferred
over a less specific value and sorts correctly against any X-Timestamp
values anyway.

Change-Id: I08b7eb37ab8bd6eb3afbb7dee44ed07a8c69b57e
2014-11-12 12:18:45 -08:00
Jenkins
962209c001 Merge "Treat 404s as 204 on object delete in proxy" 2014-09-16 14:29:35 +00:00
Matthew Oliver
f4d3facdf4 Treat 404s as 204 on object delete in proxy
This change adds an optional overrides map to the make_requests method
in the base Controller class.

  def make_requests(self, req, ring, part, method, path, headers,
                    query_string='', overrides=None)

This will be passed on to the best_response method. If set and
no quorum is reached, the override map is used to attempt to find
quorum.

The overrides map is in the form:

    { <response>: <override response>, .. }

The ObjectController, in the DELETE method now passes an override map
to make_requests method in the base Controller class in the form of:

    { 404: 204 }

Statuses/responses that have been overridden are used in calculation
of the quorum but never returned to the user. They are replaced by:

    (STATUS, '', '', '')

And left out of the search for best response.
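
A simplified sketch of how an overrides map can help reach quorum. It
is an assumption-laden illustration, not the body of best_response:
the function name is hypothetical and quorum is counted per status
class (2XX, 4XX, ...) only:

```python
def quorum_status_class(statuses, quorum, overrides=None):
    """Return the status class (2, 3, 4 or 5) that reaches quorum, else None.

    The overrides map is consulted only when the raw statuses fail to
    reach quorum, as described above.
    """
    def find(codes):
        for hundred in (2, 3, 4, 5):
            if sum(1 for code in codes if code // 100 == hundred) >= quorum:
                return hundred
        return None

    result = find(statuses)
    if result is None and overrides:
        result = find([overrides.get(code, code) for code in statuses])
    return result


# DELETE with one 204, one 404 and one 503: no quorum of 2 on its own,
# but counting the 404 as a 204 yields a 2XX quorum.
without_map = quorum_status_class([204, 404, 503], quorum=2)
with_map = quorum_status_class([204, 404, 503], quorum=2, overrides={404: 204})
```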

Change-Id: Ibf969eac3a09d67668d5275e808ed626152dd7eb
Closes-Bug: 1318375
2014-09-10 11:52:06 +10:00
Thiago da Silva
9dcf15f8b5 moving object validation checks to top of PUT method
This adds a sanity check on x-delete headers as
part of check_object_creation method

Change-Id: If5069469e433189235b1178ea203b5c8a926f553
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2014-09-08 10:15:21 +01:00
Clay Gerrard
21adf82cf1 code shuffle post expired headers refactor
Change-Id: I62248d7d3d7e0a3696a30e3d567ac6c2bea3c8eb
2014-08-21 10:45:22 -04:00
Paul Luse
0800668557 Fix potential missing key error in container_info
If upgrading from a non-storage policy enabled version of
swift to a storage policy enabled version it's possible that
memcached will have an info structure that does not contain
the 'storage_policy' key, resulting in an unhandled exception
during the lookup.  The fix is to simply make sure we never
return the dict without a storage_policy key defined; if it
doesn't exist it's safe to make it '0' as this means you're
in the upgrade scenario and there's no other possibility.
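
The defaulting described above amounts to something like the following.
The helper name is hypothetical; only the key name and the '0' default
come from this message:

```python
def ensure_storage_policy(info):
    """Guarantee a 'storage_policy' key on a cached container info dict."""
    # Entries cached before the upgrade lack the key; policy 0 is the
    # only possibility for them.
    info.setdefault('storage_policy', '0')
    return info


old_cached = ensure_storage_policy({'status': 200})
new_cached = ensure_storage_policy({'status': 200, 'storage_policy': '1'})
```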

Change-Id: If8e8f66d32819c5bfb2d1308e14643f3600ea6e9
2014-07-02 15:58:22 -07:00
Clay Gerrard
0f0c0e5553 Fix KeyError on Storage Policy Container Sync
In the proxy, container_info can return a 'storage_policy' of None.  When
you set a header value on a swob.Request to None that effectively just
deletes the key.  One path through the proxy during container sync was
counting on the 'X-Backend-Storage-Policy-Index' being set, which isn't
the case if the cached container_info is for a pre-policies container.

Also clean up some test cruft, tighten up the interface on FakeConn, and add
some object controller tests to exercise more interesting failure and handoff
code paths.

Change-Id: Ic379fa62634c226cc8a5a4c049b154dad70696b3
2014-06-30 13:28:24 -07:00
Clay Gerrard
3824ff3df7 Add Storage Policy support to Object Server
Objects now have a storage policy index associated with them as well;
this is determined by their filesystem path. Like before, objects in
policy 0 are in /srv/node/$disk/objects; this provides compatibility
on upgrade. (Recall that policy 0 is given to all existing data when a
cluster is upgraded.) Objects in policy 1 are in
/srv/node/$disk/objects-1, objects in policy 2 are in
/srv/node/$disk/objects-2, and so on.
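
The path scheme can be expressed as a tiny helper. The function name
and signature are hypothetical; the directory layout is the one listed
above:

```python
def objects_dir(disk, policy_index):
    """Return the objects directory for a disk and storage policy index."""
    name = "objects" if policy_index == 0 else "objects-%d" % policy_index
    return "/srv/node/%s/%s" % (disk, name)
```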

 * 'quarantined' dir already created 'objects' subdir so now there
   will also be objects-N created at the same level

This commit does not address replicators, auditors, or updaters except
where method signatures changed. They'll still work if your cluster
has only one storage policy, though.

DocImpact
Implements: blueprint storage-policies
Change-Id: I459f3ed97df516cb0c9294477c28729c30f48e09
2014-06-18 17:31:38 -07:00
Clay Gerrard
7624b198cf Update FakeRing and FakeLogger
FakeLogger gets better log level handling

Parameterize logger on some daemons which were previously
unparameterized and try and use the interface in tests.

FakeRing use more real code

The existing FakeRing mock's implementation bit me on some pretty subtle
character encoding issue by-passing the hash_path code that is normally
part of get_part_nodes.  This change tries to exercise more of the real
ring code paths when it makes sense and provide a better Fake for use in
testing.

Add write_fake_ring helper to test.unit for when you need a real ring.

DocImpact
Implements: blueprint storage-policies
Change-Id: Id2e3740b1dd569050f4e083617e7dd6a4249027e
2014-06-18 17:31:37 -07:00
Peter Portante
9d0067a0f5 Attempt to ensure connect always times out
It seems that the test_connect_put_timeout() test does not always fail
when it is expected. Sometimes, not very often, the attempt to connect
succeeds, resulting in a failed test.

This might be because the fake-connection infrastructure uses a
sleep(0.1) and the test uses a connect timeout of 0.1. There might be a
case where the two values result in the exact time where the entries
happen to be added in the wrong order such that the sleep() completes
first before the connect timeout fires, where the connect completes
successfully.

Closes bug 1302781

Change-Id: Ie23e40cf294170eccdf0713e313f9a31a92f9071
2014-04-04 15:26:32 -04:00
Chuck Thier
0b893825eb Add "If-None-Match: *" support to PUT
A common pattern that we see clients do is send a HEAD request before a
PUT to see if it exists.  This can slow things down quite a bit
especially since 404s on HEAD are currently a bit expensive.

This change will allow a client to include a "If-None-Match: *" header
with a PUT request.  In combination with "Expect: 100-Continue" this
allows the server to return that it already has a copy of the object
before any data is sent.

I attempted to also include etag support with the If-None-Match header,
but that turned out to have too many hairy edge cases, so was left as a
future exercise.

DocImpact

Change-Id: I94e3754923dbe5faba065719c7a9afa9969652dd
2014-04-01 10:42:00 -07:00
Jenkins
57603b7c58 Merge "Sanify handoff search depth with non-integer replica counts" 2014-02-04 20:25:33 +00:00
Fabien Boucher
8e1e67c02d Fix container quota MW for handling a bad source path
The copy source must be container/object.
This patch prevents the server from returning
an internal server error when the user provides
a path without a container.

Fixes: bug #1255049
Change-Id: I1a85c98d9b3a78bad40b8ceba9088cf323042412
2014-01-13 13:25:02 +01:00
Samuel Merritt
cfd9d055a6 Sanify handoff search depth with non-integer replica counts
On GET, the proxy will go search the primary nodes plus some number of
handoffs for the account/container/object before giving up and
returning a 404. That number is, by default, twice the ring's replica
count. This was fine if your ring had an integral number of replicas,
but could lead to some slightly-odd behavior if you have fractional
replicas.

For example, imagine that you have 3.49 replicas in your object ring;
perhaps you're migrating a cluster from 3 replicas to 4, and you're
being smart and doing it a bit at a time.

On an object GET that all the primary nodes 404ed, the proxy would
then compute 2 * 3.49 = 6.98, round it up to 7, and go look at 7
handoff nodes. This is sort of weird; the intent was to look at 6
handoffs for objects with 3 replicas, and 8 handoffs for objects with
4, but the effect is 7 for everybody.

You also get little latency cliffs as you scale up replica counts. If,
instead of 3.49, you had 3.51 replicas, then the proxy would look at 8
handoff nodes in every case [ceil(2 * 3.51) = 8], so there'd be a
small-but-noticeable jump in the time it takes to produce a 404.

The fix is to compute the number of handoffs based on the number of
primary nodes for the partition, not the ring's replica count. This
gets rid of the little latency cliffs and makes the behavior more like
what you get with integral replica counts.

If your ring has an integral number of replicas, there's no behavior
change here.
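
The numbers in the examples above can be reproduced with a short
sketch. Both function names are hypothetical and the node lists are
placeholders:

```python
import math


def handoffs_from_replica_count(replica_count):
    """Old behavior: twice the ring's replica count, rounded up."""
    return int(math.ceil(2 * replica_count))


def handoffs_from_primaries(primary_nodes):
    """New behavior: twice the number of primary nodes for the partition."""
    return 2 * len(primary_nodes)


old_349 = handoffs_from_replica_count(3.49)
old_351 = handoffs_from_replica_count(3.51)
new_three = handoffs_from_primaries(["n1", "n2", "n3"])
new_four = handoffs_from_primaries(["n1", "n2", "n3", "n4"])
```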

Change-Id: I50538941e571135299fd6b86ecd9dc780cf649f5
2013-12-19 19:06:21 -08:00
Jenkins
0b594bc3af Merge "Change OpenStack LLC to Foundation" 2013-10-07 16:09:37 +00:00
gholt
4a5c2fa0c6 Log x-copy-from when it could be useful
Change-Id: Ia28a9b47213f848ab5ea59572e14ac710ed881e3
2013-09-19 21:05:46 +00:00