This reverts the fallocate- and punch_hole-related parts of commit
c78a5962b5f6c9e75f154cac924a226815236e98.
Closes-Bug: #2031035
Related-Change: I3e26f8d4e5de0835212ebc2314cac713950c85d7
Change-Id: I8050296d6982f70bb64a63765b25d287a144cb8d
The 'log_route' argument of utils.get_logger() determines which global
Logger instance is wrapped by the returned LogAdapter. Most middlewares
(s3api being the exception) explicitly set 'log_route' to equal the
middleware 'brief' name e.g. 'bulk', 'tempauth' etc. However, the
s3api middleware sets 'log_route' to be the config 'log_name', if that
key is found in config.
When a proxy pipeline is instantiated via wsgi.run_wsgi(), all
middlewares and the proxy app are passed a default conf with
'"log_name": "proxy-server"'. As a result, the s3api middleware calls
get_logger() with log_route='proxy-server' and its LogAdapter
therefore shares the same Logger instance used by proxy-server app
(and any other middleware that similarly fails to explicitly
differentiate 'log_route').
Each Logger instance has a StatsdClient instance bound to it by
get_logger(). The Related-Change added statsd metrics to the s3api
middleware and sets 's3api' as the 'statsd_tail_prefix' when calling
get_logger(). This had the unintended effect of replacing the shared
Logger instance's StatsdClient with one that has prefix 's3api', such
that stats emitted by the proxy app (e.g. memcache shard range
hit/miss stats) would be erroneously prefixed with 's3api'.
This patch modifies the s3api middleware logger instantiation to
explicitly set log_route='s3api', so that the s3api middleware
LogAdapter now wraps a unique global Logger instance, with a unique
StatsdClient instance bound to it.
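
The sharing problem can be reproduced with the standard library alone.
The following is a minimal sketch, not Swift's get_logger():
get_logger_sketch() and the statsd_prefix attribute are hypothetical
stand-ins for utils.get_logger() and the StatsdClient bound to a Logger.

```python
import logging


def get_logger_sketch(log_route, statsd_prefix):
    """Hypothetical stand-in for utils.get_logger().

    logging.getLogger() returns the same Logger object for the same
    name, so anything bound to that object is shared by every caller
    that uses the same log_route.
    """
    logger = logging.getLogger(log_route)
    # Stand-in for binding a StatsdClient to the Logger instance.
    logger.statsd_prefix = statsd_prefix
    return logger


proxy = get_logger_sketch('proxy-server', 'proxy-server')
# Before the fix: s3api reuses the proxy's route and replaces its prefix.
s3api_shared = get_logger_sketch('proxy-server', 's3api')
shared_prefix = proxy.statsd_prefix

# After the fix: a distinct route gives a distinct Logger instance.
proxy = get_logger_sketch('proxy-server', 'proxy-server')
s3api_unique = get_logger_sketch('s3api', 's3api')
```

With a shared route the second call overwrites what the first bound;
with log_route='s3api' each caller keeps its own state.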
The 'server' attribute of the middleware's LogAdapter, which may be
included in log lines by the "%(server)s" format element, is not
affected by this change. Its value is derived from the config
'log_name' or the 'name' argument passed to get_logger().
Change-Id: Ia89485bae8f92f4f3d9f5375cab8ff08f70a11a7
Related-Change: I4976b3ee24e4ec498c66359f391813261d42c495
Currently, SLO manifest files are evicted from the page cache after
being read, which makes hard drives very busy when a user requests
many parallel byte-range GETs for a particular SLO object.
This patch adds a new config option, 'keep_cache_slo_manifest', and
tries to keep manifest files in the page cache by not evicting them
after reading, if the config settings allow it.
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I557bd01643375d7ad68c3031430899b85908a54f
RemoteDisconnected inherits from both BadStatusLine and ConnectionResetError
(which in turn eventually inherits from OSError). We want to make
sure it gets handled as a BadStatusLine, as it doesn't get its errno
set and would otherwise get the default traceback handling.
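
The inheritance and the resulting need to check BadStatusLine first can
be seen with http.client directly. This is a sketch only: classify() is
a hypothetical helper, not the error handler touched by this patch.

```python
import http.client


def classify(exc):
    """Return a label for how a connection error would be handled."""
    # Order matters: RemoteDisconnected is also an OSError, so the
    # BadStatusLine check has to come first.
    if isinstance(exc, http.client.BadStatusLine):
        return 'bad-status-line'
    if isinstance(exc, OSError):
        return 'os-error errno=%s' % exc.errno
    return 'unexpected'


err = http.client.RemoteDisconnected('Remote end closed connection')
is_bad_status = isinstance(err, http.client.BadStatusLine)
is_os_error = isinstance(err, OSError)
```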
Change-Id: I0fb1f764722d73db6d3b79acc128f37f51499d35
When a client explicitly closes the connection before finishing a
download, the result is a 499, but the shutdown logging for the proxy
in py3 needed to be fixed. We do this by killing all running
coroutines in the ContextPool.
Change-Id: Ic372ea9866bb7f2659e02f8796cdee01406e2079
Previously it was possible for an entire object PUT data transfer to
execute without the greenthread sleeping and allowing other
greenthreads to run. This was more likely with an EC PUT because the
computation of EC fragments might be slower than the rate at which
they are drained out of IO send buffers, so IO never blocks. In
extreme cases this could cause timeouts in other greenthreads to pop.
This patch adds a periodic zero-time sleep in the object PUT data
transfer loop. An existing pattern in the GET path is re-used, and
extracted to a new CooperativeIterator helper class.
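
A rough sketch of the idea follows. It is not the CooperativeIterator
added by this patch: the class name, the 'period' parameter and the
injectable 'sleep' callable are assumptions made so the example runs
without eventlet (in a greenthread-based server the sleep would be
eventlet's).

```python
import time


class CooperativeIteratorSketch:
    """Wrap an iterable and yield control every `period` items.

    Hypothetical illustration. time.sleep is the default so the sketch
    runs without eventlet.
    """

    def __init__(self, iterable, period=5, sleep=time.sleep):
        self.wrapped = iter(iterable)
        self.period = period
        self.sleep = sleep
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.period:
            # Zero-time sleep: gives other greenthreads a chance to run.
            self.sleep(0)
            self.count = 0
        self.count += 1
        return next(self.wrapped)


# Record the sleep calls instead of sleeping, to show when they happen.
sleeps = []
chunks = list(
    CooperativeIteratorSketch(range(12), period=5, sleep=sleeps.append))
```

The wrapped data is passed through unchanged; the only effect is a
zero-time sleep before every sixth, eleventh, ... item.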
Change-Id: Idd6b767f1a746c72c106199f5d1fada3615b1e97
Closes-Bug: #2019955
Related-Change: Iae27109f5a3d109ad21ec9a972e39f22150f6dbb
Previously swift.common.utils monkey patched logging.thread,
logging.threading, and logging._lock upon import with eventlet
threading modules, but that is no longer reasonable or necessary.
With py3, the existing logging._lock is not patched by eventlet,
unless the logging module is reloaded. The existing lock is not
tracked by the gc so would not be found by eventlet's
green_existing_locks().
Instead we group all monkey patching into a utils function and apply
patching consistently across daemons and WSGI servers.
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Closes-Bug: #1380815
Change-Id: I6f35ad41414898fb7dc5da422f524eb52ff2940f
... and clean up WatchDog start a little.
If this pattern proves useful we could consider extending it.
Change-Id: Ia85f9321b69bc4114a60c32a7ad082cae7da72b3
The updating shard range cache has been restructured and upgraded to
v2, which only persists the essential attributes in memcache (see
Related-Change). This is the follow-up patch to restructure the
listing shard ranges cache for object listing in the same way.
UpgradeImpact
=============
The cache key for listing shard ranges in memcached is renamed
from 'shard-listing/<account>/<container>' to
'shard-listing-v2/<account>/<container>', and cache data is
changed to be a list of [lower bound, name]. As a result, this
will invalidate all existing listing shard ranges stored in the
memcache cluster.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Change-Id: I54a32fd16e3d02b00c18b769c6f675bae3ba8e01
Also:
- move some tests to test_utils.TestNamespace.
- move ShardName class in file (no change to class)
- move end_marker method from ShardRange to Namespace
Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Change-Id: Ibd5614d378ec5e9ba47055ba8b67a42ab7f7453c
Restructure the shard ranges that are stored in memcache for
object updating to only persist the essential attributes of
shard ranges in memcache (lower bounds and names), so the
aggregate of memcache values is much smaller and retrieval
will be much faster too.
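
To illustrate why lower bounds and names are enough for object
updating, here is a sketch of a lookup over such a list. The cached
values, shard names and find_shard_name() are hypothetical; the sketch
assumes that each range's lower bound is exclusive and that its upper
bound is the next entry's lower bound.

```python
import bisect

# Hypothetical cached value: a sorted list of [lower bound, name] pairs.
cached = [
    ['', '.shards_a/c-0'],
    ['f', '.shards_a/c-1'],
    ['p', '.shards_a/c-2'],
]


def find_shard_name(cached_ranges, obj_name):
    """Return the name of the range that would contain obj_name."""
    lowers = [lower for lower, _name in cached_ranges]
    # Rightmost entry whose lower bound is strictly less than obj_name.
    index = bisect.bisect_left(lowers, obj_name) - 1
    return cached_ranges[max(index, 0)][1]


target = find_shard_name(cached, 'grape')
```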
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
UpgradeImpact
=============
The cache key for updating shard ranges in memcached is renamed
from 'shard-updating/<account>/<container>' to
'shard-updating-v2/<account>/<container>', and cache data is
changed to be a list of [lower bound, name]. As a result, this
will invalidate all existing updating shard ranges stored in the
memcache cluster.
Change-Id: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Add a "use_replication" field to the node dict, and a helper function
that sets the use_replication value on a copy of a node by looking up
the value of the x-backend-use-replication-network header.
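
A minimal sketch of such a helper, for illustration only: the function
name, the set of accepted true values and the plain-dict headers are
assumptions, not the code added by this patch.

```python
# Assumed set of strings treated as true.
TRUE_VALUES = {'true', '1', 'yes', 'on', 't', 'y'}
HEADER = 'x-backend-use-replication-network'


def node_copy_with_use_replication(node, headers):
    """Return a copy of node with 'use_replication' set from the header.

    Hypothetical helper: the original node dict is left untouched.
    """
    value = ''
    for key, val in headers.items():
        # Header names are case-insensitive.
        if key.lower() == HEADER:
            value = str(val)
    node_copy = dict(node)
    node_copy['use_replication'] = value.strip().lower() in TRUE_VALUES
    return node_copy


node = {'ip': '10.0.0.1', 'port': 6200, 'device': 'd1'}
updated = node_copy_with_use_replication(
    node, {'X-Backend-Use-Replication-Network': 'true'})
```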
Change-Id: Ie05af464765dc10cf585be851f462033fc6bdec7
pytest still complains about some 20k warnings, but the vast majority
are actually because of eventlet, and a lot of those will get cleaned up
when upper-constraints picks up v0.33.2.
Change-Id: If48cda4ae206266bb41a4065cd90c17cbac84b7f
We've seen shards become stuck while sharding because they had
incomplete or stale deleted shard ranges. The root container had more
complete and useful shard ranges into which objects could have been
cleaved, but the shard never merged the root's shard ranges.
While the sharder is auditing shard container DBs it would previously
only merge shard ranges fetched from root into the shard DB if the
shard was shrinking or the shard ranges were known to be children of
the shard. With this patch the sharder will now merge other shard
ranges from root during sharding as well as shrinking.
Shard ranges from root are only merged if they would not result in
overlaps or gaps in the set of shard ranges in the shard DB. Shard
ranges that are known to be ancestors of the shard are never merged,
except the root shard range which may be merged into a shrinking
shard. These checks were not previously applied when merging
shard ranges into a shrinking shard.
The two substantive changes with this patch are therefore:
- shard ranges from root are now merged during sharding,
subject to checks.
- shard ranges from root are still merged during shrinking,
but are now subjected to checks.
Change-Id: I066cfbd9062c43cd9638710882ae9bd85a5b4c37
Lines like `Invalid response 500 from ::1` aren't terribly useful in an
all-in-one, while lines like
Error syncing with node: {'device': 'd5', 'id': 3, 'ip': '::1',
'meta': '', 'port': 6200, 'region': 1, 'replication_ip': '::1',
'replication_port': 6200, 'weight': 8000.0, 'zone': 1, 'index': 0}:
Timeout (60s)
are needlessly verbose.
While we're at it, introduce a node_to_string() helper, and use it in a
bunch of places.
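
As an illustration of the kind of compact form such a helper can
produce, consider the sketch below. The output format and the
'replication' flag are assumptions; the real node_to_string() may
differ.

```python
def node_to_string_sketch(node, replication=False):
    """Format a node dict as 'ip:port/device' (hypothetical format)."""
    ip = node['replication_ip'] if replication else node['ip']
    port = node['replication_port'] if replication else node['port']
    if ':' in ip:
        # Bracket IPv6 addresses so the port separator is unambiguous.
        ip = '[%s]' % ip
    return '%s:%s/%s' % (ip, port, node['device'])


node = {'device': 'd5', 'id': 3, 'ip': '::1', 'meta': '', 'port': 6200,
        'region': 1, 'replication_ip': '::1', 'replication_port': 6200,
        'weight': 8000.0, 'zone': 1, 'index': 0}
summary = node_to_string_sketch(node)
```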
Change-Id: I62b12f69e9ac44ce27ffaed320c0a3563673a018
Adds an is_child_of method that infers the parent-child relationship
of two shard ranges from their names. This new method is limited to
use only under the same account.
Co-Authored-By: Jianjian Huo <jhuo@nvidia.com>
Change-Id: Iac3a8ec5d8947989b64aa27f40caa3d8d1423a7c
The setDaemon method of the threading.Thread was deprecated
in Python 3.10 (*).
Replace the setDaemon method with the daemon property.
*: https://docs.python.org/3.10/library/threading.html#threading.Thread.setDaemon
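
For reference, the replacement looks like this in isolation (worker()
is a placeholder target, not code from this patch):

```python
import threading


def worker():
    """Placeholder thread target."""


# Deprecated since Python 3.10:
#     t.setDaemon(True)
# Preferred: set the property before start() ...
t = threading.Thread(target=worker)
t.daemon = True

# ... or pass daemon=True to the constructor.
t2 = threading.Thread(target=worker, daemon=True)
```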
Change-Id: Ic854dc3c393d382a8acd20d89f56bff198a2ec5e
Signed-off-by: Takashi Natsume <takanattie@gmail.com>
We've known this would eventually be necessary for a while [1], and
way back in 2017 we started seeing SHA-1 collisions [2].
This patch follows the approach of soft deprecation of SHA1 in tempurl.
It's still a default digest, but we'll start by warning when the
middleware is loaded and by exposing any deprecated digests
(if they're still allowed) in /info.
Further, because there is much shared code between formpost and tempurl, this
patch also goes and refactors shared code out into swift.common.digest.
Now that we have a digest module, we also move digest-related code:
- get_hmac
- extract_digest_and_algorithm
[1] https://www.schneier.com/blog/archives/2012/10/when_will_we_se.html
[2] https://security.googleblog.com/2017/02/announcing-first-sha1-collision.html
Change-Id: I581cadd6bc79e623f1dae071025e4d375254c1d9
SHA1 has been known to be deprecated for a while, so allow the formpost
middleware to use SHA256 and SHA512. Follow the tempurl model and
accept signatures of the form:
<hex-encoded signature>
or
sha1:<base64-encoded signature>
sha256:<base64-encoded signature>
sha512:<base64-encoded signature>
where the base64-encoding can be either standard or URL-safe, and the
trailing '=' chars may be stripped off.
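
The accepted forms can be parsed along these lines. This is a sketch
under assumptions: parse_signature() is a hypothetical name, the hex
form is classified purely by its length, and no check is made that a
decoded base64 value has the right length for the named digest.

```python
import base64
import binascii
import hashlib
import hmac

# Hex digest length -> algorithm name.
HEX_LENGTHS = {40: 'sha1', 64: 'sha256', 128: 'sha512'}


def parse_signature(value):
    """Return (algorithm, hex digest) for either accepted form."""
    if ':' in value:
        algorithm, encoded = value.split(':', 1)
        if algorithm not in ('sha1', 'sha256', 'sha512'):
            raise ValueError('unsupported digest: %s' % algorithm)
        # Accept the URL-safe or standard alphabet, with or without
        # the trailing '=' padding.
        encoded = encoded.replace('-', '+').replace('_', '/')
        encoded += '=' * (-len(encoded) % 4)
        raw = base64.b64decode(encoded, validate=True)
        return algorithm, binascii.hexlify(raw).decode('ascii')
    try:
        return HEX_LENGTHS[len(value)], value
    except KeyError:
        raise ValueError('unrecognised signature length')


mac = hmac.new(b'secret', b'payload', hashlib.sha256)
hex_form = mac.hexdigest()
stripped = base64.urlsafe_b64encode(
    mac.digest()).decode('ascii').rstrip('=')
parsed = parse_signature('sha256:' + stripped)
```

Normalising every form to (algorithm, hex digest) lets a single
comparison against the expected HMAC cover all of them.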
As part of this, pull the signature-parsing out to a new function, and
add detection for hex-encoded sha512 signatures to tempurl.
Change-Id: Iaba3725551bd47d75067a634a7571485b9afa2de
Related-Change: Ia9dd1a91cc3c9c946f5f029cdefc9e66bcf01046
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Closes-Bug: #1794601
Previously, we always needed to retry test_statsd_set_prefix_deprecation.
This was because the warning would be triggered in
test_get_logger_statsd_client_non_defaults and recorded in the
module-level warnings registry. Now, explicitly clear the warning
registry. From the docs [0]:
> One thing to be aware of is that if a warning has already been raised
> because of a once/default rule, then no matter what filters are set
> the warning will not be seen again unless the warnings registry
> related to the warning has been cleared.
[0] https://docs.python.org/3/library/warnings.html#testing-warnings
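
The pattern of clearing the registry before capturing a warning can be
sketched as follows. The function names are hypothetical, and the
deprecated call is a stand-in for the statsd set_prefix() call
exercised by the test.

```python
import warnings


def set_prefix_sketch():
    """Hypothetical deprecated call that emits a DeprecationWarning."""
    warnings.warn('set_prefix() is deprecated', DeprecationWarning,
                  stacklevel=2)


def clear_warning_registry(module_globals):
    """Forget which warnings were already raised from this module."""
    module_globals.get('__warningregistry__', {}).clear()


def capture_deprecation():
    # Clear first, so an earlier test that triggered the same warning
    # cannot suppress it here.
    clear_warning_registry(globals())
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter('always')
        set_prefix_sketch()
    return caught


first = capture_deprecation()
second = capture_deprecation()
```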
Change-Id: Icf4b381dcc04d04b5401e5ed3f43df049c1dd2b4
We had a test grab the system logger via logging.getLogger(), but when
running under pytest the logger isn't returned at the DEBUG level as it
is under nosetests.
This patch updates the test to set the level to DEBUG explicitly and
allows the only unit test that fails `pytest test/unit` to pass.
Change-Id: I1c93136cd13e927a2deb380a95fb9f96ec79fa30
This is a fairly blunt tool: ratelimiting is per device and
applied independently in each worker, but this at least provides
some limit to disk IO on backend servers.
GET, HEAD, PUT, POST, DELETE, UPDATE and REPLICATE methods may be
rate-limited.
Only requests with a path starting '<device>/<partition>', where
<partition> can be cast to an integer, will be rate-limited. Other
requests, including, for example, recon requests with paths such as
'recon/version', are unconditionally forwarded to the next app in the
pipeline.
OPTIONS and SSYNC methods are not rate-limited. Note that
SSYNC sub-requests are passed directly to the object server app
and will not pass through this middleware.
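
The selection rules above can be sketched as a small function. The name
device_to_ratelimit() and its return convention are assumptions made
for illustration; the middleware itself may structure this differently.

```python
RATE_LIMITED_METHODS = {
    'GET', 'HEAD', 'PUT', 'POST', 'DELETE', 'UPDATE', 'REPLICATE'}


def device_to_ratelimit(method, path):
    """Return the device to rate-limit on, or None to pass through.

    Hypothetical helper following the rules described above.
    """
    if method not in RATE_LIMITED_METHODS:
        return None
    parts = path.lstrip('/').split('/')
    if len(parts) < 2 or not parts[0]:
        return None
    try:
        # The second path element must look like a partition number.
        int(parts[1])
    except ValueError:
        return None
    return parts[0]


limited = device_to_ratelimit('GET', '/sda1/123/a/c/o')
skipped = device_to_ratelimit('GET', '/recon/version')
```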
Change-Id: I78b59a081698a6bff0d74cbac7525e28f7b5d7c1
s3api bucket listing elements currently have LastModified values with
millisecond precision. This is inconsistent with the value of the
Last-Modified header returned with an object GET or HEAD response
which has second precision. This patch reduces the precision to
seconds in bucket listings and upload part listings. This is also
consistent with observation of an aws listing response.
The last modified values in the swift native listing are rounded *up*
to the nearest second to be consistent with the seconds-precision
Last-Modified time header that is returned with an object GET or HEAD.
However, we continue to include millisecond digits set to 0 in the
last-modified string, e.g.: '2014-06-10T22:47:32.000Z'.
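
A small sketch of the rounding and formatting described here, using
only the standard library. last_modified_string() is a hypothetical
helper operating on a float epoch time, not s3api code.

```python
import math
from datetime import datetime, timezone


def last_modified_string(timestamp):
    """Round a float timestamp up to whole seconds and format it.

    The millisecond digits are kept in the string but are always 000.
    """
    whole_seconds = math.ceil(timestamp)
    moment = datetime.fromtimestamp(whole_seconds, tz=timezone.utc)
    return moment.strftime('%Y-%m-%dT%H:%M:%S') + '.000Z'


example = last_modified_string(1402440451.5)
```

Here 1402440451.5 rounds up to 1402440452, which formats as
'2014-06-10T22:47:32.000Z'.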
Also, fix the last modified time returned in an object copy response
to be consistent with the last modified time of the object that was
created. Previously it was rounded down, but it should be rounded up.
Change-Id: I8c98791a920eeedfc79e8a9d83e5032c07ae86d3
There are a few places where a last-modified value is calculated by
rounding a timestamp *up* to the nearest second. This patch refactors
to use a new Timestamp.ceil() method to do this rounding, along with a
clarifying docstring.
Change-Id: I9ef73e5183bdf21b22f5f19b8440ffef6988aec7
The AbstractRateLimiter currently loses any accumulated rate_buffer
allowance (i.e. the capacity to burst) after it has been idle for more
than rate_buffer seconds. This patch adds an option 'burst_after_idle'
which causes any rate_buffer allowance to be preserved during idle
periods so that there is capacity for a burst immediately after the
idle period.
Note that a burst on start-up can be avoided by initialising an
AbstractRateLimiter with running_time=now.
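
The effect of the option can be sketched with a simplified limiter and
a fake clock. RateLimiterSketch and FakeClock are hypothetical and work
in seconds with an injectable clock; they illustrate the idea rather
than reproduce AbstractRateLimiter.

```python
import time


class RateLimiterSketch:
    def __init__(self, max_rate, rate_buffer=5, burst_after_idle=False,
                 clock=time.monotonic, sleep=time.sleep):
        self.interval = 1.0 / max_rate
        self.rate_buffer = rate_buffer
        self.burst_after_idle = burst_after_idle
        self.clock = clock
        self.sleep = sleep
        # Earliest time at which the next request is allowed.
        self.running_time = clock()

    def wait(self):
        now = self.clock()
        if now - self.running_time > self.rate_buffer:
            # Idle for longer than rate_buffer seconds: reset, and
            # optionally keep the full buffer available as a burst.
            self.running_time = now
            if self.burst_after_idle:
                self.running_time -= self.rate_buffer
        delay = self.running_time - now
        if delay > 0:
            self.sleep(delay)
        self.running_time += self.interval


class FakeClock:
    """Clock whose sleep() advances time instantly and records calls."""

    def __init__(self):
        self.now = 0.0
        self.sleeps = []

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.sleeps.append(seconds)
        self.now += seconds


def sleeps_after_idle(burst_after_idle, requests):
    fake = FakeClock()
    limiter = RateLimiterSketch(
        1, rate_buffer=5, burst_after_idle=burst_after_idle,
        clock=fake.time, sleep=fake.sleep)
    limiter.wait()
    fake.now = 100.0  # a long idle period
    for _ in range(requests):
        limiter.wait()
    return fake.sleeps


without_burst = sleeps_after_idle(False, 3)
with_burst = sleeps_after_idle(True, 6)
```

In this sketch, without the option the second request after the idle
period already has to sleep; with it, six requests pass before any
sleep is needed.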
Change-Id: I280ce2aa3efa28c92b806436e7e87ad77429b7a4
Replaces the ratelimit_sleep helper function with an
EventletRateLimiter class that encapsulates the rate-limiting state
that previously needed to be maintained by the caller of the function.
The ratelimit_sleep function is retained but deprecated, and now
forwards to the EventletRateLimiter class.
The object updater's BucketizedUpdateSkippingLimiter is refactored to
take advantage of the new EventletRateLimiter class.
The rate limiting algorithm is corrected to make the allowed request
rate more uniform: previously, pairs of requests would be allowed in
rapid succession before the rate limiter would then sleep for the time
allowance consumed by those two requests; now the rate limiter will
sleep as required after each allowed request. For example, a max_rate
of 1 per second might previously have resulted in 2 requests being
allowed followed by a 2 second sleep. That is corrected to be a sleep
of 1 second after each request.
Change-Id: Ibcf4dbeb4332dee7e9e233473d4ceaf75a5a85c7