190 lines
6.6 KiB
Python
Raw Normal View History

# Copyright (c) 2013 OpenStack Foundation
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
# implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# This stuff can't live in test/unit/__init__.py due to its swob dependency.
Get better at closing WSGI iterables. PEP 333 (WSGI) says: "If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request[.]" There's a bunch of places where we weren't doing that; some of them matter more than others. Calling .close() can prevent a connection leak in some cases. In others, it just provides a certain pedantic smugness. Either way, we should do what WSGI requires. Noteworthy goofs include: * If a client is downloading a large object and disconnects halfway through, a proxy -> obj connection may be leaked. In this case, the WSGI iterable is a SegmentedIterable, which lacked a close() method. Thus, when the WSGI server noticed the client disconnect, it had no way of telling the SegmentedIterable about it, and so the underlying iterable for the segment's data didn't get closed. Here, it seems likely (though unproven) that the object server would time out and kill the connection, or that a ChunkWriteTimeout would fire down in the proxy server, so the leaked connection would eventually go away. However, a flurry of client disconnects could leave a big pile of useless connections. * If a conditional request receives a 304 or 412, the underlying app_iter is not closed. This mostly affects conditional requests for large objects. The leaked connections were noticed by this patch's co-author, who made the changes to SegmentedIterable. Those changes helped, but did not completely fix, the issue. The rest of the patch is an attempt to plug the rest of the holes. Co-Authored-By: Romain LE DISEZ <romain.ledisez@ovh.net> Change-Id: I168e147aae7c1728e7e3fdabb7fba6f2d747d937 Closes-Bug: #1466549
2015-06-18 12:58:03 -07:00
from collections import defaultdict
from hashlib import md5
from swift.common import swob
from swift.common.header_key_dict import HeaderKeyDict
Refactor server side copy as middleware Rewrite server side copy and 'object post as copy' feature as middleware to simplify the PUT method in the object controller code. COPY is no longer a verb implemented as public method in Proxy application. The server side copy middleware is inserted to the left of dlo, slo and versioned_writes middlewares in the proxy server pipeline. As a result, dlo and slo copy_hooks are no longer required. SLO manifests are now validated when copied so when copying a manifest to another account the referenced segments must be readable in that account for the manifest copy to succeed (previously this validation was not made, meaning the manifest was copied but could be unusable if the segments were not readable). With this change, there should be no change in functionality or existing behavior. This is asserted with (almost) no changes required to existing functional tests. Some notes (for operators): * Middleware required to be auto-inserted before slo and dlo and versioned_writes * Turning off server side copy is not configurable. * object_post_as_copy is no longer a configurable option of proxy server but of this middleware. However, for smooth upgrade, config option set in proxy server app is also read. DocImpact: Introducing server side copy as middleware Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Change-Id: Ic96a92e938589a2f6add35a40741fd062f1c29eb Signed-off-by: Prashanth Pai <ppai@redhat.com> Signed-off-by: Thiago da Silva <thiago@redhat.com>
2015-02-18 11:59:31 +05:30
from swift.common.swob import HTTPNotImplemented
from swift.common.utils import split_path
from test.unit import FakeLogger, FakeRing
Get better at closing WSGI iterables. PEP 333 (WSGI) says: "If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request[.]" There's a bunch of places where we weren't doing that; some of them matter more than others. Calling .close() can prevent a connection leak in some cases. In others, it just provides a certain pedantic smugness. Either way, we should do what WSGI requires. Noteworthy goofs include: * If a client is downloading a large object and disconnects halfway through, a proxy -> obj connection may be leaked. In this case, the WSGI iterable is a SegmentedIterable, which lacked a close() method. Thus, when the WSGI server noticed the client disconnect, it had no way of telling the SegmentedIterable about it, and so the underlying iterable for the segment's data didn't get closed. Here, it seems likely (though unproven) that the object server would time out and kill the connection, or that a ChunkWriteTimeout would fire down in the proxy server, so the leaked connection would eventually go away. However, a flurry of client disconnects could leave a big pile of useless connections. * If a conditional request receives a 304 or 412, the underlying app_iter is not closed. This mostly affects conditional requests for large objects. The leaked connections were noticed by this patch's co-author, who made the changes to SegmentedIterable. Those changes helped, but did not completely fix, the issue. The rest of the patch is an attempt to plug the rest of the holes. Co-Authored-By: Romain LE DISEZ <romain.ledisez@ovh.net> Change-Id: I168e147aae7c1728e7e3fdabb7fba6f2d747d937 Closes-Bug: #1466549
2015-06-18 12:58:03 -07:00
class LeakTrackingIter(object):
def __init__(self, inner_iter, fake_swift, path):
self.inner_iter = inner_iter
self.fake_swift = fake_swift
self.path = path
def __iter__(self):
for x in self.inner_iter:
yield x
def close(self):
self.fake_swift.mark_closed(self.path)
class FakeSwift(object):
"""
A good-enough fake Swift proxy server to use in testing middleware.
"""
Refactor server side copy as middleware Rewrite server side copy and 'object post as copy' feature as middleware to simplify the PUT method in the object controller code. COPY is no longer a verb implemented as public method in Proxy application. The server side copy middleware is inserted to the left of dlo, slo and versioned_writes middlewares in the proxy server pipeline. As a result, dlo and slo copy_hooks are no longer required. SLO manifests are now validated when copied so when copying a manifest to another account the referenced segments must be readable in that account for the manifest copy to succeed (previously this validation was not made, meaning the manifest was copied but could be unusable if the segments were not readable). With this change, there should be no change in functionality or existing behavior. This is asserted with (almost) no changes required to existing functional tests. Some notes (for operators): * Middleware required to be auto-inserted before slo and dlo and versioned_writes * Turning off server side copy is not configurable. * object_post_as_copy is no longer a configurable option of proxy server but of this middleware. However, for smooth upgrade, config option set in proxy server app is also read. DocImpact: Introducing server side copy as middleware Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Change-Id: Ic96a92e938589a2f6add35a40741fd062f1c29eb Signed-off-by: Prashanth Pai <ppai@redhat.com> Signed-off-by: Thiago da Silva <thiago@redhat.com>
2015-02-18 11:59:31 +05:30
ALLOWED_METHODS = [
'PUT', 'POST', 'DELETE', 'GET', 'HEAD', 'OPTIONS', 'REPLICATE']
def __init__(self):
self._calls = []
Get better at closing WSGI iterables. PEP 333 (WSGI) says: "If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request[.]" There's a bunch of places where we weren't doing that; some of them matter more than others. Calling .close() can prevent a connection leak in some cases. In others, it just provides a certain pedantic smugness. Either way, we should do what WSGI requires. Noteworthy goofs include: * If a client is downloading a large object and disconnects halfway through, a proxy -> obj connection may be leaked. In this case, the WSGI iterable is a SegmentedIterable, which lacked a close() method. Thus, when the WSGI server noticed the client disconnect, it had no way of telling the SegmentedIterable about it, and so the underlying iterable for the segment's data didn't get closed. Here, it seems likely (though unproven) that the object server would time out and kill the connection, or that a ChunkWriteTimeout would fire down in the proxy server, so the leaked connection would eventually go away. However, a flurry of client disconnects could leave a big pile of useless connections. * If a conditional request receives a 304 or 412, the underlying app_iter is not closed. This mostly affects conditional requests for large objects. The leaked connections were noticed by this patch's co-author, who made the changes to SegmentedIterable. Those changes helped, but did not completely fix, the issue. The rest of the patch is an attempt to plug the rest of the holes. Co-Authored-By: Romain LE DISEZ <romain.ledisez@ovh.net> Change-Id: I168e147aae7c1728e7e3fdabb7fba6f2d747d937 Closes-Bug: #1466549
2015-06-18 12:58:03 -07:00
self._unclosed_req_paths = defaultdict(int)
self.req_method_paths = []
self.swift_sources = []
self.txn_ids = []
self.uploaded = {}
# mapping of (method, path) --> (response class, headers, body)
self._responses = {}
self.logger = FakeLogger('fake-swift')
self.account_ring = FakeRing()
self.container_ring = FakeRing()
self.get_object_ring = lambda policy_index: FakeRing()
Allow smaller segments in static large objects The addition of range support for SLO segments (commit 25d5e68) required the range size to be at least the SLO minimum segment size (default 1 MiB). However, if you're doing something like assembling a video of short clips out of a larger one, then you might not need a full 1 MiB. The reason for the 1 MiB restriction was to protect Swift from resource overconsumption. It takes CPU, RAM, and internal bandwidth to connect to an object server, so it's much cheaper to serve a 10 GiB SLO if it has 10 MiB segments than if it has 10 B segments. Instead of a strict limit, now we apply ratelimiting to small segments. The threshold for "small" is configurable and defaults to 1 MiB. SLO segments may now be as small as 1 byte. If a client makes SLOs as before, it'll still be able to download the objects as fast as Swift can serve them. However, a SLO with a lot of small ranges or segments will be slowed down to avoid resource overconsumption. This is similar to how DLOs work, except that DLOs ratelimit *every* segment, not just small ones. UpgradeImpact For operators: if your cluster has enabled ratelimiting for SLO, you will want to set rate_limit_under_size to a large number prior to upgrade. This will preserve your existing behavior of ratelimiting all SLO segments. 5368709123 is a good value, as that's 1 greater than the default max object size. Alternately, hold down the 9 key until you get bored. If your cluster has not enabled ratelimiting for SLO (the default), no action is needed. Change-Id: Id1ff7742308ed816038a5c44ec548afa26612b95
2015-11-30 18:06:09 -08:00
def _find_response(self, method, path):
resp = self._responses[(method, path)]
if isinstance(resp, list):
try:
resp = resp.pop(0)
except IndexError:
raise IndexError("Didn't find any more %r "
"in allowed responses" % (
(method, path),))
return resp
def __call__(self, env, start_response):
method = env['REQUEST_METHOD']
Refactor server side copy as middleware Rewrite server side copy and 'object post as copy' feature as middleware to simplify the PUT method in the object controller code. COPY is no longer a verb implemented as public method in Proxy application. The server side copy middleware is inserted to the left of dlo, slo and versioned_writes middlewares in the proxy server pipeline. As a result, dlo and slo copy_hooks are no longer required. SLO manifests are now validated when copied so when copying a manifest to another account the referenced segments must be readable in that account for the manifest copy to succeed (previously this validation was not made, meaning the manifest was copied but could be unusable if the segments were not readable). With this change, there should be no change in functionality or existing behavior. This is asserted with (almost) no changes required to existing functional tests. Some notes (for operators): * Middleware required to be auto-inserted before slo and dlo and versioned_writes * Turning off server side copy is not configurable. * object_post_as_copy is no longer a configurable option of proxy server but of this middleware. However, for smooth upgrade, config option set in proxy server app is also read. DocImpact: Introducing server side copy as middleware Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Change-Id: Ic96a92e938589a2f6add35a40741fd062f1c29eb Signed-off-by: Prashanth Pai <ppai@redhat.com> Signed-off-by: Thiago da Silva <thiago@redhat.com>
2015-02-18 11:59:31 +05:30
if method not in self.ALLOWED_METHODS:
raise HTTPNotImplemented()
path = env['PATH_INFO']
_, acc, cont, obj = split_path(env['PATH_INFO'], 0, 4,
rest_with_last=True)
if env.get('QUERY_STRING'):
path += '?' + env['QUERY_STRING']
if 'swift.authorize' in env:
resp = env['swift.authorize'](swob.Request(env))
if resp:
return resp(env, start_response)
req_headers = swob.Request(env).headers
self.swift_sources.append(env.get('swift.source'))
self.txn_ids.append(env.get('swift.trans_id'))
try:
Allow smaller segments in static large objects The addition of range support for SLO segments (commit 25d5e68) required the range size to be at least the SLO minimum segment size (default 1 MiB). However, if you're doing something like assembling a video of short clips out of a larger one, then you might not need a full 1 MiB. The reason for the 1 MiB restriction was to protect Swift from resource overconsumption. It takes CPU, RAM, and internal bandwidth to connect to an object server, so it's much cheaper to serve a 10 GiB SLO if it has 10 MiB segments than if it has 10 B segments. Instead of a strict limit, now we apply ratelimiting to small segments. The threshold for "small" is configurable and defaults to 1 MiB. SLO segments may now be as small as 1 byte. If a client makes SLOs as before, it'll still be able to download the objects as fast as Swift can serve them. However, a SLO with a lot of small ranges or segments will be slowed down to avoid resource overconsumption. This is similar to how DLOs work, except that DLOs ratelimit *every* segment, not just small ones. UpgradeImpact For operators: if your cluster has enabled ratelimiting for SLO, you will want to set rate_limit_under_size to a large number prior to upgrade. This will preserve your existing behavior of ratelimiting all SLO segments. 5368709123 is a good value, as that's 1 greater than the default max object size. Alternately, hold down the 9 key until you get bored. If your cluster has not enabled ratelimiting for SLO (the default), no action is needed. Change-Id: Id1ff7742308ed816038a5c44ec548afa26612b95
2015-11-30 18:06:09 -08:00
resp_class, raw_headers, body = self._find_response(method, path)
headers = HeaderKeyDict(raw_headers)
except KeyError:
if (env.get('QUERY_STRING')
and (method, env['PATH_INFO']) in self._responses):
Allow smaller segments in static large objects The addition of range support for SLO segments (commit 25d5e68) required the range size to be at least the SLO minimum segment size (default 1 MiB). However, if you're doing something like assembling a video of short clips out of a larger one, then you might not need a full 1 MiB. The reason for the 1 MiB restriction was to protect Swift from resource overconsumption. It takes CPU, RAM, and internal bandwidth to connect to an object server, so it's much cheaper to serve a 10 GiB SLO if it has 10 MiB segments than if it has 10 B segments. Instead of a strict limit, now we apply ratelimiting to small segments. The threshold for "small" is configurable and defaults to 1 MiB. SLO segments may now be as small as 1 byte. If a client makes SLOs as before, it'll still be able to download the objects as fast as Swift can serve them. However, a SLO with a lot of small ranges or segments will be slowed down to avoid resource overconsumption. This is similar to how DLOs work, except that DLOs ratelimit *every* segment, not just small ones. UpgradeImpact For operators: if your cluster has enabled ratelimiting for SLO, you will want to set rate_limit_under_size to a large number prior to upgrade. This will preserve your existing behavior of ratelimiting all SLO segments. 5368709123 is a good value, as that's 1 greater than the default max object size. Alternately, hold down the 9 key until you get bored. If your cluster has not enabled ratelimiting for SLO (the default), no action is needed. Change-Id: Id1ff7742308ed816038a5c44ec548afa26612b95
2015-11-30 18:06:09 -08:00
resp_class, raw_headers, body = self._find_response(
method, env['PATH_INFO'])
headers = HeaderKeyDict(raw_headers)
elif method == 'HEAD' and ('GET', path) in self._responses:
Allow smaller segments in static large objects The addition of range support for SLO segments (commit 25d5e68) required the range size to be at least the SLO minimum segment size (default 1 MiB). However, if you're doing something like assembling a video of short clips out of a larger one, then you might not need a full 1 MiB. The reason for the 1 MiB restriction was to protect Swift from resource overconsumption. It takes CPU, RAM, and internal bandwidth to connect to an object server, so it's much cheaper to serve a 10 GiB SLO if it has 10 MiB segments than if it has 10 B segments. Instead of a strict limit, now we apply ratelimiting to small segments. The threshold for "small" is configurable and defaults to 1 MiB. SLO segments may now be as small as 1 byte. If a client makes SLOs as before, it'll still be able to download the objects as fast as Swift can serve them. However, a SLO with a lot of small ranges or segments will be slowed down to avoid resource overconsumption. This is similar to how DLOs work, except that DLOs ratelimit *every* segment, not just small ones. UpgradeImpact For operators: if your cluster has enabled ratelimiting for SLO, you will want to set rate_limit_under_size to a large number prior to upgrade. This will preserve your existing behavior of ratelimiting all SLO segments. 5368709123 is a good value, as that's 1 greater than the default max object size. Alternately, hold down the 9 key until you get bored. If your cluster has not enabled ratelimiting for SLO (the default), no action is needed. Change-Id: Id1ff7742308ed816038a5c44ec548afa26612b95
2015-11-30 18:06:09 -08:00
resp_class, raw_headers, body = self._find_response(
'GET', path)
body = None
headers = HeaderKeyDict(raw_headers)
elif method == 'GET' and obj and path in self.uploaded:
resp_class = swob.HTTPOk
headers, body = self.uploaded[path]
else:
raise KeyError("Didn't find %r in allowed responses" % (
(method, path),))
# simulate object PUT
if method == 'PUT' and obj:
Support for http footers - Replication and EC Before this patch, the proxy ObjectController supported sending metadata from the proxy server to object servers in "footers" that trail the body of HTTP PUT requests, but this support was for EC policies only. The encryption feature requires that footers are sent with both EC and replicated policy requests in order to persist encryption specific sysmeta, and to override container update headers with an encrypted Etag value. This patch: - Moves most of the functionality of ECPutter into a generic Putter class that is used for replicated object PUTs without footers. - Creates a MIMEPutter subclass to support multipart and multiphase behaviour required for any replicated object PUT with footers and all EC PUTs. - Modifies ReplicatedObjectController to use Putter objects in place of raw connection objects. - Refactors the _get_put_connections method and _put_connect_node methods so that more code is in the BaseObjectController class and therefore shared by [EC|Replicated]ObjectController classes. - Adds support to call a callback that middleware may have placed in the environ, so the callback can set footers. The x-object-sysmeta-ec- namespace is reserved and any footer values set by middleware in that namespace will not be forwarded to object servers. In addition this patch enables more than one value to be added to the X-Backend-Etag-Is-At header. This header is used to point to an (optional) alternative sysmeta header whose value should be used when evaluating conditional requests with If-[None-]Match headers. This is already used with EC policies when the ECObjectController has calculated the actual body Etag and sent it using a footer (X-Object-Sysmeta-EC-Etag). X-Backend-Etag-Is-At is in that case set to X-Object-Sysmeta-Ec-Etag so as to point to the actual body Etag value rather than the EC fragment Etag. Encryption will also need to add a pointer to an encrypted Etag value. However, the referenced sysmeta may not exist, for example if the object was created before encryption was enabled. The X-Backend-Etag-Is-At value is therefore changed to support a list of possible locations for alternate Etag values. Encryption will place its expected alternative Etag location on this list, as will the ECObjectController, and the object server will look for the first object metadata to match an entry on the list when matching conditional requests. That way, if the object was not encrypted then the object server will fall through to using the EC Etag value, or in the case of a replicated policy will fall through to using the normal Etag metadata. If your proxy has a third-party middleware that uses X-Backend-Etag-Is-At and it upgrades before an object server it's talking to then conditional requests may be broken. UpgradeImpact Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Co-Authored-By: Samuel Merritt <sam@swiftstack.com> Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp> Closes-Bug: #1594739 Change-Id: I12a6e41150f90de746ce03623032b83ed1987ee1
2016-06-06 17:19:48 +01:00
input = ''.join(iter(env['wsgi.input'].read, ''))
if 'swift.callback.update_footers' in env:
footers = HeaderKeyDict()
env['swift.callback.update_footers'](footers)
req_headers.update(footers)
etag = md5(input).hexdigest()
headers.setdefault('Etag', etag)
headers.setdefault('Content-Length', len(input))
# keep it for subsequent GET requests later
Support for http footers - Replication and EC Before this patch, the proxy ObjectController supported sending metadata from the proxy server to object servers in "footers" that trail the body of HTTP PUT requests, but this support was for EC policies only. The encryption feature requires that footers are sent with both EC and replicated policy requests in order to persist encryption specific sysmeta, and to override container update headers with an encrypted Etag value. This patch: - Moves most of the functionality of ECPutter into a generic Putter class that is used for replicated object PUTs without footers. - Creates a MIMEPutter subclass to support multipart and multiphase behaviour required for any replicated object PUT with footers and all EC PUTs. - Modifies ReplicatedObjectController to use Putter objects in place of raw connection objects. - Refactors the _get_put_connections method and _put_connect_node methods so that more code is in the BaseObjectController class and therefore shared by [EC|Replicated]ObjectController classes. - Adds support to call a callback that middleware may have placed in the environ, so the callback can set footers. The x-object-sysmeta-ec- namespace is reserved and any footer values set by middleware in that namespace will not be forwarded to object servers. In addition this patch enables more than one value to be added to the X-Backend-Etag-Is-At header. This header is used to point to an (optional) alternative sysmeta header whose value should be used when evaluating conditional requests with If-[None-]Match headers. This is already used with EC policies when the ECObjectController has calculated the actual body Etag and sent it using a footer (X-Object-Sysmeta-EC-Etag). X-Backend-Etag-Is-At is in that case set to X-Object-Sysmeta-Ec-Etag so as to point to the actual body Etag value rather than the EC fragment Etag. Encryption will also need to add a pointer to an encrypted Etag value. However, the referenced sysmeta may not exist, for example if the object was created before encryption was enabled. The X-Backend-Etag-Is-At value is therefore changed to support a list of possible locations for alternate Etag values. Encryption will place its expected alternative Etag location on this list, as will the ECObjectController, and the object server will look for the first object metadata to match an entry on the list when matching conditional requests. That way, if the object was not encrypted then the object server will fall through to using the EC Etag value, or in the case of a replicated policy will fall through to using the normal Etag metadata. If your proxy has a third-party middleware that uses X-Backend-Etag-Is-At and it upgrades before an object server it's talking to then conditional requests may be broken. UpgradeImpact Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Co-Authored-By: Samuel Merritt <sam@swiftstack.com> Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp> Closes-Bug: #1594739 Change-Id: I12a6e41150f90de746ce03623032b83ed1987ee1
2016-06-06 17:19:48 +01:00
self.uploaded[path] = (dict(req_headers), input)
if "CONTENT_TYPE" in env:
self.uploaded[path][0]['Content-Type'] = env["CONTENT_TYPE"]
Support for http footers - Replication and EC Before this patch, the proxy ObjectController supported sending metadata from the proxy server to object servers in "footers" that trail the body of HTTP PUT requests, but this support was for EC policies only. The encryption feature requires that footers are sent with both EC and replicated policy requests in order to persist encryption specific sysmeta, and to override container update headers with an encrypted Etag value. This patch: - Moves most of the functionality of ECPutter into a generic Putter class that is used for replicated object PUTs without footers. - Creates a MIMEPutter subclass to support multipart and multiphase behaviour required for any replicated object PUT with footers and all EC PUTs. - Modifies ReplicatedObjectController to use Putter objects in place of raw connection objects. - Refactors the _get_put_connections method and _put_connect_node methods so that more code is in the BaseObjectController class and therefore shared by [EC|Replicated]ObjectController classes. - Adds support to call a callback that middleware may have placed in the environ, so the callback can set footers. The x-object-sysmeta-ec- namespace is reserved and any footer values set by middleware in that namespace will not be forwarded to object servers. In addition this patch enables more than one value to be added to the X-Backend-Etag-Is-At header. This header is used to point to an (optional) alternative sysmeta header whose value should be used when evaluating conditional requests with If-[None-]Match headers. This is already used with EC policies when the ECObjectController has calculated the actual body Etag and sent it using a footer (X-Object-Sysmeta-EC-Etag). X-Backend-Etag-Is-At is in that case set to X-Object-Sysmeta-Ec-Etag so as to point to the actual body Etag value rather than the EC fragment Etag. Encryption will also need to add a pointer to an encrypted Etag value. However, the referenced sysmeta may not exist, for example if the object was created before encryption was enabled. The X-Backend-Etag-Is-At value is therefore changed to support a list of possible locations for alternate Etag values. Encryption will place its expected alternative Etag location on this list, as will the ECObjectController, and the object server will look for the first object metadata to match an entry on the list when matching conditional requests. That way, if the object was not encrypted then the object server will fall through to using the EC Etag value, or in the case of a replicated policy will fall through to using the normal Etag metadata. If your proxy has a third-party middleware that uses X-Backend-Etag-Is-At and it upgrades before an object server it's talking to then conditional requests may be broken. UpgradeImpact Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Co-Authored-By: Samuel Merritt <sam@swiftstack.com> Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp> Closes-Bug: #1594739 Change-Id: I12a6e41150f90de746ce03623032b83ed1987ee1
2016-06-06 17:19:48 +01:00
self._calls.append((method, path, HeaderKeyDict(req_headers)))
# range requests ought to work, hence conditional_response=True
req = swob.Request(env)
Support for http footers - Replication and EC Before this patch, the proxy ObjectController supported sending metadata from the proxy server to object servers in "footers" that trail the body of HTTP PUT requests, but this support was for EC policies only. The encryption feature requires that footers are sent with both EC and replicated policy requests in order to persist encryption specific sysmeta, and to override container update headers with an encrypted Etag value. This patch: - Moves most of the functionality of ECPutter into a generic Putter class that is used for replicated object PUTs without footers. - Creates a MIMEPutter subclass to support multipart and multiphase behaviour required for any replicated object PUT with footers and all EC PUTs. - Modifies ReplicatedObjectController to use Putter objects in place of raw connection objects. - Refactors the _get_put_connections method and _put_connect_node methods so that more code is in the BaseObjectController class and therefore shared by [EC|Replicated]ObjectController classes. - Adds support to call a callback that middleware may have placed in the environ, so the callback can set footers. The x-object-sysmeta-ec- namespace is reserved and any footer values set by middleware in that namespace will not be forwarded to object servers. In addition this patch enables more than one value to be added to the X-Backend-Etag-Is-At header. This header is used to point to an (optional) alternative sysmeta header whose value should be used when evaluating conditional requests with If-[None-]Match headers. This is already used with EC policies when the ECObjectController has calculated the actual body Etag and sent it using a footer (X-Object-Sysmeta-EC-Etag). X-Backend-Etag-Is-At is in that case set to X-Object-Sysmeta-Ec-Etag so as to point to the actual body Etag value rather than the EC fragment Etag. Encryption will also need to add a pointer to an encrypted Etag value. However, the referenced sysmeta may not exist, for example if the object was created before encryption was enabled. The X-Backend-Etag-Is-At value is therefore changed to support a list of possible locations for alternate Etag values. Encryption will place its expected alternative Etag location on this list, as will the ECObjectController, and the object server will look for the first object metadata to match an entry on the list when matching conditional requests. That way, if the object was not encrypted then the object server will fall through to using the EC Etag value, or in the case of a replicated policy will fall through to using the normal Etag metadata. If your proxy has a third-party middleware that uses X-Backend-Etag-Is-At and it upgrades before an object server it's talking to then conditional requests may be broken. UpgradeImpact Co-Authored-By: Alistair Coles <alistair.coles@hpe.com> Co-Authored-By: Thiago da Silva <thiago@redhat.com> Co-Authored-By: Samuel Merritt <sam@swiftstack.com> Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp> Closes-Bug: #1594739 Change-Id: I12a6e41150f90de746ce03623032b83ed1987ee1
2016-06-06 17:19:48 +01:00
if isinstance(body, list):
resp = resp_class(
req=req, headers=headers, app_iter=body,
conditional_response=req.method in ('GET', 'HEAD'))
else:
resp = resp_class(
req=req, headers=headers, body=body,
conditional_response=req.method in ('GET', 'HEAD'))
Get better at closing WSGI iterables. PEP 333 (WSGI) says: "If the iterable returned by the application has a close() method, the server or gateway must call that method upon completion of the current request[.]" There's a bunch of places where we weren't doing that; some of them matter more than others. Calling .close() can prevent a connection leak in some cases. In others, it just provides a certain pedantic smugness. Either way, we should do what WSGI requires. Noteworthy goofs include: * If a client is downloading a large object and disconnects halfway through, a proxy -> obj connection may be leaked. In this case, the WSGI iterable is a SegmentedIterable, which lacked a close() method. Thus, when the WSGI server noticed the client disconnect, it had no way of telling the SegmentedIterable about it, and so the underlying iterable for the segment's data didn't get closed. Here, it seems likely (though unproven) that the object server would time out and kill the connection, or that a ChunkWriteTimeout would fire down in the proxy server, so the leaked connection would eventually go away. However, a flurry of client disconnects could leave a big pile of useless connections. * If a conditional request receives a 304 or 412, the underlying app_iter is not closed. This mostly affects conditional requests for large objects. The leaked connections were noticed by this patch's co-author, who made the changes to SegmentedIterable. Those changes helped, but did not completely fix, the issue. The rest of the patch is an attempt to plug the rest of the holes. Co-Authored-By: Romain LE DISEZ <romain.ledisez@ovh.net> Change-Id: I168e147aae7c1728e7e3fdabb7fba6f2d747d937 Closes-Bug: #1466549
2015-06-18 12:58:03 -07:00
wsgi_iter = resp(env, start_response)
self.mark_opened(path)
return LeakTrackingIter(wsgi_iter, self, path)
def mark_opened(self, path):
self._unclosed_req_paths[path] += 1
def mark_closed(self, path):
self._unclosed_req_paths[path] -= 1
@property
def unclosed_requests(self):
return {path: count
for path, count in self._unclosed_req_paths.items()
if count > 0}
@property
def calls(self):
return [(method, path) for method, path, headers in self._calls]
@property
def headers(self):
return [headers for method, path, headers in self._calls]
@property
def calls_with_headers(self):
return self._calls
@property
def call_count(self):
return len(self._calls)
def register(self, method, path, response_class, headers, body=''):
self._responses[(method, path)] = (response_class, headers, body)
def register_responses(self, method, path, responses):
self._responses[(method, path)] = list(responses)
class FakeAppThatExcepts(object):
MESSAGE = "We take exception to that!"
def __init__(self, exception_class=Exception):
self.exception_class = exception_class
def __call__(self, env, start_response):
raise self.exception_class(self.MESSAGE)