381 Commits

Author SHA1 Message Date
Jenkins
0b594bc3af Merge "Change OpenStack LLC to Foundation" 2013-10-07 16:09:37 +00:00
Samuel Merritt
def37fb56a Stop reading from object server when client disconnects.
If a client were in the middle of an object GET request and then
disconnected, the proxy would wait a while (default 60s) and then time
out the connection. As part of the teardown for this, the proxy would
attempt to close the connection to the object server, then drain any
associated buffers. However, this didn't work particularly well,
resulting in the proxy reading the entire remainder of the object for
no gain.

Now, the proxy closes the connection hard, by calling .close() on the
underlying socket._socket object. This is different from calling
.close() on a socket._socketobject object, which is what you get back
from socket.socket() and similar methods. Calling .close() on a
socket._socketobject simply decrements a reference counter on the
socket._socket, which has been observed in the past to result in
socket leaks when something holds onto a reference. However, calling
.close() on a socket._socket actually closes the socket regardless of
who else has a reference to it.

I had to delete a test assertion that said the object server never got
SIGPIPE after a GET w/X-Newest. Well, you get a SIGPIPE when you write
to a closed socket, and now the proxy is actually closing the sockets
early, so now you *do* get a SIGPIPE.

closes-bug: 1174660

Note that this will cause a regression on bug 1037337; unfortunately,
the cure is worse than the disease, so out it goes.

Change-Id: I9c7a2e7fdb8b4232e53ea96f86b50e8d34c27221
2013-10-02 15:57:24 -07:00
anc
fa308d60bd Fix utf-8 handling in object versions.
Fixes object versioning when object name and/or version
container name contain multibyte utf-8 characters.

When object names containing non-ASCII characters
are PUT multiple times into a container with an
x-versions-location set, subsequent DELETE of the
object results in a 500 response status.

When the versions container name contains
non-ASCII characters the first delete of an object
succeeds but fails to restore previous version of
object, so second delete incorrectly returns 404.

Fixes bug 1229142

Change-Id: I425440f76b8328f8e119d390bfa4c7022181e89e
2013-09-24 17:18:58 +01:00
Peter Portante
4e8b2ffc2e Use created container in unit test
Change-Id: I2573be1ac14f65b8008611edf940363b31c8d86e
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-24 02:29:54 -04:00
David Goetz
01f58d6826 SLOs broken for range requests
Change-Id: I21175a4be0cda9a8a98c425bff11c80895cd6d3e
2013-09-20 08:51:21 -07:00
ZhiQiang Fan
f72704fc82 Change OpenStack LLC to Foundation
Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58
Closes-bug: #1214176
2013-09-20 01:02:31 +08:00
Peter Portante
7760f41c3c Refactor common/utils methods to common/ondisk
Place all the methods related to on-disk layout and / or configuration
into a new common module that can be shared by the various modules
using the same on-disk layout.

Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-17 17:32:04 -04:00
Peter Portante
6ae4e17af1 Pep8 account and proxy server unit tests (10 of 12)
Change-Id: Ib83d164997b0d98be921c8b4857caa2429344aa4
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-01 16:12:51 -04:00
Clay Gerrard
27c7abfb69 Add constructor args to swob.Request.blank
I knew webob.Request.blank could take most of the attributes on the class as
kwargs to blank, so I went and looked how.  It seems to work ok and is pretty
nice.

Change-Id: I72fae7c28f81c97768ee98b8ebcd69789a4c2e84
2013-08-29 07:28:32 -07:00
Samuel Merritt
e19cf5c047 Clean up some of the proxy unit tests.
Since we've got machinery to catch HTTPException errors that happens
after the proxy server's __call__ method but before the individual
GET()/POST()/etc. methods, we shouldn't just call
GET()/POST()/etc. Instead, we should use swob.Request's get_response()
method, or otherwise actually call the proxy application. That way we
can have the code raise HTTPBadRequest, but the test just gets a
plain old response with a 400 status code, just like you'd expect.

Earlier changes fixed some of the tests; this commit fixes some
more. In the interest of not having a gigantic diff, this only gets
about half the proxy's unit tests, and the others will be done in a
subsequent commit.

Change-Id: Iaca32e74a8f4499440e655c207a1012e14486c74
2013-08-28 17:40:37 -07:00
Jenkins
396d2c91ed Merge "Refactor how we pick listings' content type." 2013-08-27 05:50:23 +00:00
Jenkins
bc75061740 Merge "Run a more GC iterations to make sure weakrefs are collected" 2013-08-23 12:59:41 +00:00
Jenkins
1f8632a2cb Merge "Fix range GET w/If-None-Match." 2013-08-22 22:41:02 +00:00
Alex Gaynor
85475b2fbd Run a more GC iterations to make sure weakrefs are collected
This is needed to make sure the weakref callback is fired under PyPy.

Change-Id: I5d1b83186780ee6130463fe42fac58e411ad9f79
2013-08-19 15:50:03 -07:00
Samuel Merritt
a4f371439b Refactor how we pick listings' content type.
There were a few different places where we had some repeated code to
figure out what format an account or container listing response should
be in (text, JSON, or XML). Now that's been pulled into a single
function.

As part of this, you can now raise HTTPException subclasses in proxy
controllers instead of laboriously plumbing error responses all the
way back up to swift.proxy.server.Application.handle_request(). This
lets us avoid certain ugly patterns, like the one where a method
returns a tuple of (x, y, z, error) and the caller has to see if it
got (value, value, value, None) or (None, None, None, errorvalue). Now
we can just raise the error.

Change-Id: I316873df289160d526487ad116f6fbb9a575e3de
2013-08-16 15:45:45 -07:00
Alex Gaynor
72d6f6d5f7 Removed monkeypatching of __del__ in tests
On PyPy adding __del__ to a class after it's constructed does not
work, instead implement the leakchecker in tests by using weakrefs,
which works everywhere.

Change-Id: I873efb3e2f85f731d5836bf9bf06a21e939e8542
2013-08-12 22:31:32 -04:00
Samuel Merritt
3db1d9f804 Fix range GET w/If-None-Match.
An object GET request with both If-None-Match: <Etag> and
Range: bytes=N-M should result in a 304 Not Modified if the Etag
matches. However, it was resulting in a 416 Requested Range Not
Satisfiable. This commit fixes that.

The fix was to remove conditional_response=True from a request in the
proxy. As far as I can see, the only effect that
conditional_response=True has in swob is to enable range
checking. Since the object server handles all the range stuff, the
proxy shouldn't do it. Otherwise, you wind up with double-application
of the Range header, and that doesn't work very well.

This is probably left over from the WebOb days.

Fixes bug #1196805.

Change-Id: I207f3839731b7503baa3fdbd7cd1cfe3e1b46f62
2013-08-07 15:58:23 -07:00
Jenkins
948d34f3db Merge "Tell swift to figure out content type" 2013-07-31 20:26:29 +00:00
Jenkins
efe8d0105c Merge "add utf-8 charset to multipart-manifest=get resp" 2013-07-31 02:03:49 +00:00
David Goetz
3035a93ed2 Tell swift to figure out content type
Be able to tell swift to figure out the content-type even if it is
sent because old client code / curl has trouble sending blank
content-type headers.

Change-Id: Ie65ddf8993a19ea74e0b85a2ae56da84a617c19d
2013-07-30 15:17:38 -07:00
David Goetz
750684f754 add utf-8 charset to multipart-manifest=get resp
Change-Id: Ic06e8b07a4db4ccde633f3b56a49f198cdd467cb
2013-07-26 10:00:33 -07:00
Alex Gaynor
ff5a6d0111 Corrected many style violations in the tests.
I focussed primarily on F-category violations, they are all but all fixed with
this patch.

Change-Id: I343f6945c97984ed1093bc347b6def6994297041
2013-07-24 10:18:47 -07:00
Michael Barton
13bbe4b7c3 fix unit tests in 2.6 by using closing(GzipFile)
Python 2.6 doesn't support using GzipFile as a context manager.

Change-Id: Ic12b5a916bc89ae8d4480879723201c1715285af
2013-07-22 21:12:22 -07:00
Alex Gaynor
710e4a007f Ensure that files are always closed in the tests.
A failure to close files in a timely fassion means
that data is not necessarily written immediately
on Pythons which do not use reference counting
(e.g. PyPy).

Change-Id: I5d363249676032a025a22a67275c2eed3151b264
2013-07-18 17:29:14 -07:00
Samuel Merritt
151313ba8c Fix flaky test.
test_DELETE_x_container_headers_with_more_container_replicas() would
sometimes fail, and it's because a test helper method was defaulting
to sorting the captured headers by X-Container-Partition. Well, in
tests we use FakeRing, and the partition is always 1, so that sorting
didn't buy us much. Now the sorting is done by X-Container-Device,
which actually differs meaningfully.

Change-Id: Ibe5b2fcd3f23280ed2caaa703111a98861331866
2013-07-03 17:45:20 -07:00
David Goetz
9f942b1256 Allow SLOs to be made up of other SLOs
We've gone back and forth about this. In the initial commit, it couldn't
possibly work because you wouldn't be able to get the Etags to match. Then it
was expressly disallowed with a custom error message, and now its allowed. The
reason we're allowing it is that 1,000 segments isn't enough for some use cases
and we decided its better than just upping the number of allowed segments. The
code to make it work isn't all that complicated and it allows for virtually
unlimited SLO object size. There is also a new configurable limit on the
maximum connection time for both SLOs and DLOs defaulting to 1 day. This will
hopefully alleviate worries about infinite requests. Think I'll leave the
python-swift client support for nested SLOs to somebody else though :).

DocImpact

Change-Id: Id16187481b37e716d2bd09bdbab8cc87537e3ddd
2013-06-26 09:44:33 -07:00
David Hadas
8226761889 Deleted account respond as non existing accounts
Currently clients can not distinguish between non existing accounts
(which can be created) and accounts marked for deletion, which has
not yet been reaped and therefore cannot be re-created until reaped.

Following this patch, if an account is marked as deleted but hasn't
been reaped and is still on disk, responses will include a status
header:
    'X-Account-Status' = 'Deleted'

Fixes:Bug #1188609
Change-Id: Ibd39965ae3f5d45fd78f130e0e31f5a0141a8633
2013-06-26 08:33:23 +03:00
Jenkins
edf4068c8b Merge "Local write affinity for object PUT requests." 2013-06-26 04:39:53 +00:00
Samuel Merritt
d9f2a76973 Local write affinity for object PUT requests.
The proxy can now be configured to prefer local object servers for PUT
requests, where "local" is governed by the "write_affinity". The
"write_affinity_node_count" setting controls how many local object
servers to try before giving up and going on to remote ones.

I chose to simply re-order the object servers instead of filtering out
nonlocal ones so that, if all of the local ones are down, clients can
still get successful responses (just slower).

The goal is to trade availability for throughput. By writing to local
object servers across fast LAN links, clients get better throughput
than if the object servers were far away over slow WAN links. The
downside, of course, is that data availability (not durability) may
suffer when drives fail.

The default configuration has no write affinity in it, so the default
behavior is unchanged.

Added some words about these settings to the admin guide.

DocImpact

Change-Id: I09a0bd00524544ff627a3bccdcdc48f40720a86e
2013-06-23 22:04:56 -07:00
Jenkins
054ffbe4a6 Merge "Stop getting useless bytes on manifest Range requests." 2013-06-23 17:48:49 +00:00
Jenkins
abef11f89c Merge "get_info - removes duplicate code (Take 3)" 2013-06-12 13:15:06 +00:00
Jenkins
4aee4267ac Merge "Local read affinity for GET/HEAD requests." 2013-06-11 19:36:39 +00:00
Jenkins
5e6ce5d5b5 Merge "Fixed Bug 1187200" 2013-06-11 19:36:15 +00:00
Samuel Merritt
84d77a06b7 Stop getting useless bytes on manifest Range requests.
If you make a range request for a manifest object, then the proxy can
retrieve more data than it actually needs from the object server. It
sends the right bytes to the client, and then spends a bunch of time
reading and discarding bytes it doesn't need.

Example:

   file start  	       	       	       	       	       file end
       |	range start	  range end		  |
       v	     v         	       v       	       	  v
       [--------------------------------------------------]

The proxy would send a request to the object server with "Range:
bytes=${range start}-", and so the object server responds with all the
bytes from range start to file end. This needlessly transfers all the
bytes from range end to file end, as none of those are sent to the
client.

This commit fixes the sent Range header to only request the bytes
needed.

Change-Id: Ie8d4ee2c5e67c08795ac2d39d1df7d11856621a9
2013-06-10 17:31:46 -07:00
Samuel Merritt
f559c50acb Local read affinity for GET/HEAD requests.
Now you can configure the proxy server to read from "local" primary
nodes first, where "local" is governed by the newly-introduced
"read_affinity" setting in the proxy config. This is desirable when
the network links between regions/zones are of varying capacities; in
such a case, it's a good idea to prefer fetching data from closer
backends.

The new setting looks like rN[zM]=P, where N is the region number, M
is the optional zone number, and P is the priority. Multiple values
can be specified by separating them with commas. The priority for
nodes that don't match anything is a very large number, so they'll
sort last.

This only affects the ordering of the primary nodes; it doesn't affect
handoffs at all. Further, while the primary nodes are reordered for
all requests, it only matters for GET/HEAD requests since handling the
other verbs ends up making concurrent requests to *all* the primary
nodes, so ordering is irrelevant.

Note that the default proxy config does not have this setting turned
on, so the default configuration's behavior is unaffected.

blueprint multi-region

Change-Id: Iea4cd367ed37fe5ee69b63234541d358d29963a4
2013-06-10 16:51:47 -07:00
gholt
fef2afd927 Fixed Bug 1187200
See Bug 1187200 for a full description of the problem.

Part 1:

X-Delete-At-Container added to X-Delete-At-* info

This fixes the bug by passing the expiring-objects-account's
container name onward to the backend object servers. This is in case
the object servers' expiring_objects_container_divisor happens to be
different than the proxy server's, we want to make sure the host,
partition, and device match up with the container name. Different
container names would be fine, but not with mismatched host,
partition, and device info.

Part 2:

The db_replicator now double checks the disk path's partition against
the partition the ring gives back. If they don't match, it logs the
problem but continues to replicate the database to where it should be
and, on success to all proper nodes, removes the local out of place
database.

Bug 1187200

Change-Id: Id0873a3f2198ce285fe0b0c777738eff38bc2438
2013-06-08 20:00:32 +00:00
David Hadas
58f4e0f2e0 get_info - removes duplicate code (Take 3)
Consolidate the different ways in which info of account/container
is gathered, cached, used, updated, etc.

This refactoring increases code reuse and is a basis for later
addition of account ACLs.
Changing the get_info users is left for future.
This staged approach ensures the behaviour is unchanged.

Change-Id: I67b58030d3f9e3bc86bcd7ece0f1dc693c4e08c3
Fixes: Bug #1162199
2013-06-08 14:20:37 +03:00
Jenkins
d63f77cd48 Merge "Fix faked-out account GET for JSON and XML." 2013-06-04 13:58:37 +00:00
Jenkins
6377f6e075 Merge "Mock SysLogHandler for proxy/test_server.py" 2013-05-31 19:22:39 +00:00
Samuel Merritt
15c2ca55f0 Fix faked-out account GET for JSON and XML.
If account autocreation is on and the proxy receives a GET request for
a nonexistent account, it'll fake up a response that makes it look as
if the account exists, but without reifying that account into sqlite
DB files.

That faked-out response was just fine as long as you wanted a
text/plain response, but it didn't handle the case of format=json or
format=xml; in those cases, the response would still be
text/plain. This can break clients, and certainly causes crashes in
swift3. Now, those responses match as closely as possible.

The code for generating an account-listing response has been pulled
into (the new) swift.account.utils module, and both the fake response
and the real response use it, thus ensuring that they can't
accidentally diverge. There's also a new probe test for that
non-divergence.

Also, cleaned up a redundant matching of the Accept header in the code
for generating the account listing.

Note that some of the added tests here pass with or without this code
change; they were added because the code I was changing (parts of the
real account GET) wasn't covered by tests.

Bug 1183169

Change-Id: I2a3b8e5d9053e4d0280a320f31efa7c90c94bb06
2013-05-30 17:43:03 -07:00
gholt
d2dba73afd Mock SysLogHandler for proxy/test_server.py
Our tests get rather grabby with file descriptors, mostly due to the
SysLogHandler. In recent work I'm doing, my additional tests put
things just over the 1024 limit and tests would start to fail
somewhat randomly. This alleviates the problem by mocking out one of
the bigger users of SysLogHandler, proxy/test_server.py. If you want
to verify this has an impact, you can put the following in your
proxy/test_server.py's overall teardown method and try it with and
without the mocking:

    import subprocess
    print >>sys.stderr, '%s OPEN FILES' % len(subprocess.Popen(
        ['lsof'], stdout=subprocess.PIPE).communicate()[0].split('\n'))

Change-Id: I1bd3efe46ee69d09a32c5a964f04e36e46506446
2013-05-31 00:29:12 +00:00
Peter Portante
5174b7f85d Rework to support RFC 2616 Sec 4.4 Message Length
RFC 2616 Sec 4.4 Message Length describes how the content-length and
transfer-encoding headers interact. Basically, if chunked transfer
encoding is used, the content-length header value is ignored and if
the content-length header is present, and the request is not using
chunked transfer-encoding, then the content-length must match the body
length.

The only Transfer-Coding value we support in the Transfer-Encoding
header (to date) is "chunked". RFC 2616 Sec 14.41 specifies that if
"multiple encodings have been applied to an entity, the
transfer-codings MUST be listed in the order in which they were
applied." Since we only supported "chunked". If the Transfer-Encoding
header value has multiple transfer-codings, we return a 501 (Not
Implemented) (see RFC 2616 Sec 3.6) without checking if chunked is the
last one specified. Finally, if transfer-encoding is anything but
"chunked", we return a 400 (Bad Request) to the client.

This patch adds a new method, message_length, to the swob request
object which will apply an algorithm based on RFC 2616 Sec 4.4
leveraging the existing content_length property.

In addition to these changes, the proxy server will now notice when
the message length specified by the content-length header is greater
than the configured object maximum size and fail the request with a
413, "Request Entity Too Large", before reading the entire body.

This work flows from https://review.openstack.org/27152.

Change-Id: I5d2a30b89092680dee9d946e1aafd017eaaef8c0
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-05-25 21:05:56 -04:00
David Hadas
fd3acd2e59 Autocreate cleanups
Autocreate was messy - now cleaned.
Auto-create now occurs at account POST, and container PUT only.
A method for autocreation was added
Autocreation was removed from account_info and container_info.

Fake-it as if the account exists on account HEAD and account GET.

Return 404 on everything else when the account does not exist.

Fix: Bug #1172223
Fix: Bug #1179140
Change-Id: Iac54c1438eb09883fbc29a1ad2ac2245b95efc92
2013-05-16 20:54:05 +03:00
Jenkins
50157243dd Merge "Refactor Bulk middleware to handle long running requests" 2013-05-15 23:14:15 +00:00
Peter Portante
8825c9c74a Enhance log msg to report referer and user-agent
Enhance internally logged messages to report referer and user-agent.

Pass the referering URL and METHOD between internal servers (when
known), and set the user-agent to be the server type (obj-server,
container-server, proxy-server, obj-updater, obj-replicator,
container-updater, direct-client, etc.) with the process PID. In
conjunction with the transaction ID, it helps to track down which PID
from a given system was responsible for initiating the request and
what that server was working on to make this request.

This has been helpful in tracking down interactions between object,
container and account servers.

We also take things a bit further performaing a bit of refactoring to
consolidate calls to transfer_headers() now that we have a helper
method for constructing them.

Finally we performed further changes to avoid header key duplication
due to string literal header key values and the various objects
representing headers for requests and responses. See below for more
details.

====

Header Keys

There seems to be a bit of a problem with the case of the various
string literals used for header keys and the interchangable way
standard Python dictionaries, HeaderKeyDict() and HeaderEnvironProxy()
objects are used.

If one is not careful, a header object of some sort (one that does not
normalize its keys, and that is not necessarily a dictionary) can be
constructed containing header keys which differ only by the case of
their string literals. E.g.:

   { 'x-trans-id': '1234', 'X-Trans-Id': '5678' }

Such an object, when passed to http_connect() will result in an
on-the-wire header where the key values are merged together, comma
separated, that looks something like:

   HTTP_X_TRANS_ID: 1234,5678

For some headers in some contexts, this is behavior is desirable. For
example, one can also use a list of tuples which enumerate the multiple
values a single header should have.

However, in almost all of the contexts used in the code base, this is
not desirable.

This behavior arises from a combination of factors:

   1. Header strings are not constants and different lower-case and
      title-case header strings values are used interchangably in the
      code at times

      It might be worth the effort to make a pass through the code to
      stop using string literals and use constants instead, but there
      are plusses and minuses to doing that, so this was not attempted
      in this effort

   2. HeaderEnvironProxy() objects report their keys in ".title()"
      case, but normalize all other key references to the form
      expected by the Request class's environ field

      swob.Request.headers fields are HeaderEnvironProxy() objects.

   3. HeaderKeyDict() objects report their keys in ".lower()" case,
      and normalize all other key references to ".lower()" case

      swob.Response.headers fields are HeaderKeyDict() objects.

Depending on which object is used and how it is used, one can end up
with such a mismatch.

This commit takes the following steps as a (PROPOSED) solution:

   1. Change HeaderKeyDict() to normalize using ".title()" case to
      match HeaderEnvironProxy()

   2. Replace standard python dictionary objects with HeaderKeyDict()
      objects where possible

      This gives us an object that normalizes key references to avoid
      fixing the code to normalize the string literals.

   3. Fix up a few places to use title case string literals to match
      the new defaults

Change-Id: Ied56a1df83ffac793ee85e796424d7d20f18f469
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-05-13 17:39:02 +00:00
David Goetz
af2607c457 Refactor Bulk middleware to handle long running requests
Change-Id: I8ea0ff86518d453597faae44ec3918298e2d5147
2013-05-08 10:00:21 -07:00
Jenkins
e7242fc523 Merge "Improved autocreate testing" 2013-04-29 16:21:15 +00:00
David Hadas
0a75a9d509 Improved autocreate testing
Increase testing coverage and make some of the test more accurate
(Some of the tests included too many responses from the account
servers that were not used by the test)

Change-Id: Ide2e0fcb89d5905e70d3111f7ac57b1cff23a99c
Fixes: Bug #1172152
2013-04-25 00:41:34 +03:00
Kun Huang
fef0f491ff copy X-Delete-At unless X-Fresh-Metadata: true is supplied on an object copy
Current codes will copy metadata headers when x-fresh-metadata:false, we
still need copy "x-delete-at" header and ensure expiring work at the same
time.

Change-Id: Ie31326b5f7b565e51e5aa249279bc1786f7bc847
Fixes: bug #1067528
2013-04-24 10:40:41 -04:00
gholt
f63dc07b9d Extra safety on account-level DELETE
I just noticed tonight when adding a bunch of stuff to Swiftly that
the Bulk Delete middleware uses an account-level DELETE request,
albeit with a query parameter of bulk-delete. But, one typo and,
assuming the cluster supports it and you have access, whoops, you
just marked the account for deletion!

I put a bit of extra safety on the account deletion by requiring it
to have an empty query string.

Change-Id: Ib5df11193b04eff69d14185bd9d0607169131e7f
2013-04-19 09:43:31 +00:00