509 Commits

Author SHA1 Message Date
Clay Gerrard
b40ae84f85 make statsd_client more explicit
Even if stdlib socket is probably patched by the time StatsdClient
creates a socket, we want to import the green socket module explicitly
for better testing.

Move test_statsd.py to test_statsd_client.py so it matches the naming
convention of the rest of our test files.

Fix some patching of utils in test_statsd_client to patch
statsd_client.

Rename some vars in test_statsd_client that shadowed the statsd_client
module name.

Move some utils tests out of test_statsd_client and back into
test_utils.

Related-Change: I4b5b12a3b0288b696a39903264741bc862a94ad7
Change-Id: I3de22b7f15dd386fa9c873587782f0dfc4c42a27
2024-05-16 16:49:54 +00:00
Zuul
bf206ed2fe Merge "backend ratelimit: support reloadable config file" 2024-05-11 20:26:40 +00:00
Zuul
927e75aa4c Merge "Use ClosingMapper to ensure prompt client disconnect logging" 2024-05-09 20:17:42 +00:00
Tim Burke
9ec83c44fd Use ClosingMapper to ensure prompt client disconnect logging
Adds ClosingMapper class which is like map() but closes the
iterable.
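
As an illustration only, a wrapper of this kind could look like the sketch below. It is not the code from this commit: the name ClosingMapperSketch and all of its internals are assumptions made for the example.

```python
class ClosingMapperSketch:
    """Hypothetical sketch: behaves like map(), but closes the source."""

    def __init__(self, func, iterable):
        self.func = func
        self.iterable = iterable
        self.iterator = iter(iterable)
        self.closed = False

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return self.func(next(self.iterator))
        except StopIteration:
            # The source is exhausted, so release it straight away.
            self.close()
            raise

    def close(self):
        # Close the wrapped iterable at most once, if it supports close().
        if not self.closed:
            self.closed = True
            close_method = getattr(self.iterable, 'close', None)
            if close_method is not None:
                close_method()


events = []


def source():
    try:
        yield 1
        yield 2
        yield 3
    finally:
        events.append('closed')


mapper = ClosingMapperSketch(lambda x: x * 2, source())
first = next(mapper)
mapper.close()  # an explicit close does not wait for garbage collection
```

Closing explicitly means the generator's cleanup runs at a known point, rather than whenever the interpreter happens to collect it.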

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Idd0ac21b365a138b065f01d05a257af62ea88177
2024-05-07 09:19:33 -07:00
Shreeya Deshpande
9da22bb5fe Move statsd testing to its own module
Change-Id: I4b5b12a3b0288b696a39903264741bc862a94ad7
2024-05-03 15:08:23 -07:00
Tim Burke
761d919677 tests: Use mock.patch more
Change-Id: I68974338f8e0284ed77960048a83f72855b93348
2024-05-01 17:33:11 -07:00
Tim Burke
b4dddb7406 tests: Use @with_tempdir more
Change-Id: I33e71f6c201bb4f2cf3481afd40cf489eb1fcd1f
2024-05-01 17:30:52 -07:00
Shreeya Deshpande
bc3a59bdd3 Refactor utils
- Move statsd client into its own module
- Move all logging functions into their own module
- Move all config functions into their own module
- Move all helper functions into their own module

Partial-Bug: #2015274
Change-Id: Ic4b5005e3efffa8dba17d91a41e46d5c68533f9a
2024-04-30 20:27:47 +00:00
Alistair Coles
e9abfd76ee backend ratelimit: support reloadable config file
Add support for a backend_ratelimit_conf_path option in the
[filter:backend_ratelimit] config. If specified then the middleware
will give precedence to config options from that file over config
options from the [filter:backend_ratelimit] section.

The path defaults to /etc/swift/backend-ratelimit.conf.

The config file is periodically reloaded and any changed options are
applied. The middleware will log a warning the first time it fails to
load a config file that had previously been successfully loaded. The
middleware also logs at info level when it first successfully loads a
config file that had previously failed to be loaded. Otherwise, the
middleware will log when a config file is loaded that results in the
config being changed.

Change-Id: I6554e37c6ab5b0a260f99b54169cb90ab5718f81
2024-03-11 18:10:24 +00:00
Jianjian Huo
d5877179a5 Object-server: add periodic greenthread yielding during file read.
Currently, when the object-server serves a GET request and the
DiskFile reader iterates over disk file chunks, there is no explicit
eventlet sleep call. When the network outpaces slow disk IO,
one large and slow GET request can prevent the eventlet hub
from scheduling any other green threads for a long period of
time. To improve this, this patch adds a configurable sleep
parameter to the DiskFile reader, 'cooperative_period', with a
default value of 0 (disabled).

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I80b04bad0601b6cd6caef35498f89d4ba70a4fd4
2024-02-27 11:24:41 +11:00
Zuul
439dc93cc4 Merge "Add ClosingIterator class; be more explicit about closes" 2024-02-21 18:35:42 +00:00
Tim Burke
c522f5676e Add ClosingIterator class; be more explicit about closes
... in document_iters_to_http_response_body.

We seemed to be relying a little too heavily upon prompt garbage
collection to log client disconnects, leading to failures in
test_base.py::TestGetOrHeadHandler::test_disconnected_logging
under python 3.12.

Closes-Bug: #2046352
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I4479d2690f708312270eb92759789ddce7f7f930
2024-02-12 11:16:09 +00:00
Tim Burke
ce9e56a6d1 lint: Consistently use assertIsInstance
This has been available since py32 and was backported to py27; there
is no point in us continuing to carry the old idiom forward.

Change-Id: I21f64b8b2970e2dd5f56836f7f513e7895a5dc88
2024-02-07 15:48:39 -08:00
Tim Burke
76ca11773e lint: Up-rev hacking
Last time we did this was nearly 4 years ago; drag ourselves into
something approaching the present. Address a few new pycodestyle issues
that seem reasonable to enforce:

   E275 missing whitespace after keyword
   E231 missing whitespace after ','
   E721 do not compare types, for exact checks use `is` / `is not`,
        for instance checks use `isinstance()`

Main motivator is that the old hacking kept us on an old version
of flake8 et al., which no longer work with newer Pythons.

Change-Id: I54b46349fabb9776dcadc6def1cfb961c123aaa0
2024-02-07 15:48:39 -08:00
Alistair Coles
252f0d36b7 proxy: only use listing shards cache for 'auto' listings
The proxy should NOT read or write to memcache when handling a
container GET that explicitly requests 'shard' or 'object' record
type. A request for 'shard' record type may specify 'namespace'
format, but this request is unrelated to container listings or object
updates and passes directly to the backend.

This patch also removes unnecessary JSON serialisation and
de-serialisation of namespaces within the proxy GET path when a
sharded object listing is being built. The final response body will
contain a list of objects so there is no need to write intermediate
response bodies with a list of namespaces.

Requests that explicitly specify record type of 'shard' will of
course still have the response body with serialised shard dicts that
is returned from the backend.

Change-Id: Id79c156432350c11c52a4004d69b85e9eb904ca6
2024-01-31 11:02:54 +00:00
Jianjian Huo
c073933387 Container-server: add container namespaces GET
The proxy-server makes GET requests to the container server to fetch
full lists of shard ranges when handling object PUT/POST/DELETE and
container GETs, then it only stores the Namespace attributes (lower
and name) of the shard ranges into Memcache and reconstructs the list
of Namespaces based on those attributes. Thus, a namespaces GET
interface can be added into the backend container-server to only
return a list of those Namespace attributes.

On a container server setup which serves a container with ~12000
shard ranges, benchmarking results show that the request rate of the
HTTP GET all namespaces (states=updating) is ~12 op/s, while the
HTTP GET all shard ranges (states=updating) is ~3.2 op/s.

The new namespace GET interface supports most of the headers and
parameters supported by the shard range GET interface, for example
marker, end_marker, include and reverse. Two
exceptions are: 'x-backend-include-deleted' cannot be supported
because there is no way for a Namespace to indicate the deleted state;
the 'auditing' state query parameter is not supported because it is
specific to the sharder which only requests full shard ranges.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: If152942c168d127de13e11e8da00a5760de5ae0d
2024-01-11 10:46:53 +00:00
Clay Gerrard
bcb8810886 tests: consolidate Namespace/ShardRange _check_name
In the related change, the Namespace class grew account/container
properties to make it easier to use in the proxy; these are subject to
similar consistency requirements as the ShardRange's properties.

There are no new assertions added in this change, it merely
consolidates the py2/py3 validating helper which was duplicated
between the Namespace and ShardRange TestCases.

Related-Change-Id: Iebb09d6eff2165c25f80abca360210242cf3e6b7
Change-Id: Ide7f1dd3d9c664fb57c47dcd50edb44ae90ff5f9
2023-12-12 21:10:54 +00:00
Alistair Coles
93f812a518 Add account and container properties to Namespace
ShardRange.name is required to have the form <account/container>. We'd
like to be able to replace ShardRange instances with the Namespace
superclass but still have the convenience of the account and container
accessors.

The name is stored as a single attribute and split when accessing via
the account and container getters, rather than splitting into two
attributes in the name setter, to minimise the overhead of
constructing Namespace instances. Where performance can be critical
(e.g. fetching the entire set of namespaces from a container server)
the number of Namespace instances constructed can be much greater than
the number whose account and container properties are used. The author
found that splitting in the account and container getters became more
efficient than splitting in the name setter when the rate of
constructing instances was ~2x greater than the rate of calling the
account and container getters.
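
The trade-off described above can be sketched as follows. This is a simplified illustration, not the actual Namespace class: the name NamespaceSketch and its constructor arguments are assumptions made for the example.

```python
class NamespaceSketch:
    """Hypothetical sketch: store the name once, split only on access."""

    def __init__(self, name, lower, upper):
        # Construction stays cheap: no splitting happens here.
        self._name = name
        self.lower = lower
        self.upper = upper

    @property
    def name(self):
        return self._name

    @property
    def account(self):
        # Split lazily; the cost is paid only by callers that need it.
        return self._name.split('/', 1)[0]

    @property
    def container(self):
        return self._name.split('/', 1)[1]


ns = NamespaceSketch('.shards_a/c-1', 'a', 'm')
```

With this layout, constructing many instances costs one attribute assignment for the name, and the string split is deferred to the comparatively rare getter calls.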

The account and container property setters are removed from the
ShardRange class. The name setter is removed from the Namespace class.
These setters were never used.

Change-Id: Iebb09d6eff2165c25f80abca360210242cf3e6b7
2023-12-11 12:49:58 +00:00
Clay Gerrard
4a37a2976b slo: refactor GET/HEAD response handling
This patch reorganizes the SLO read response handling.  The main goal
was to push the response header replacement for both GET/HEAD SLO and
multipart-manifest=get paths all into a common return path.  A new
RespAttrs primitive is used to carry around some metadata details from
requests made in SLO.  The authors hope these changes make the code more
easily readable and easier to modify.

Drive-By: add new "friendly_close" function in common.utils so we can
drain empty/error responses more confidently (and use it in swob and
request_helpers).

Drive-By: the tests added in the Related-Change discovered a 500 on
If-[Un]Modified-Since conditional GET requests - it probably wasn't
important, but this refactor fixed it by accident as a side effect.

Closes-Bug: #2040178
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Ashwin Nair <nairashwin952013@gmail.com>
Related-Change-Id: I54094f3d2098f56b755ec19cc9315d06a6ca8b15
Change-Id: Idc84e70539fc7480b6ecb86e2f0da904baf2c727
2023-11-10 15:26:28 -06:00
Zuul
c8b19f4fd1 Merge "Utils: fix Namespace and ShardRange attribute encoding in py2." 2023-11-08 02:16:07 +00:00
Jianjian Huo
d0d5533940 Utils: fix Namespace and ShardRange attribute encoding in py2.
Ensure name/account/container are always consistent and always encode
utf8 in py2.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Change-Id: Ia5374f55adf80fef92a92d916b3f89297463c673
2023-11-03 09:38:32 +00:00
Tim Burke
55f7833d86 systemd: Send STOPPING/RELOADING notifications
See https://www.freedesktop.org/software/systemd/man/sd_notify.html#Description
for more information.

Note that this requires that we keep the NOTIFY_SOCKET env var
around for more than just the first READY message, so we want to be
careful about when we're sending the default "READY=1".
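
For readers unfamiliar with the protocol, a minimal sketch of sending such notifications is shown below. It is not the Swift implementation: the function name notify and its env parameter are assumptions made for the example, while the NOTIFY_SOCKET variable and the READY=1, STOPPING=1 and RELOADING=1 messages come from the sd_notify documentation.

```python
import os
import socket


def notify(message, env=os.environ):
    """Hypothetical helper: send one sd_notify-style datagram.

    Returns False when no notification socket is configured.
    """
    path = env.get('NOTIFY_SOCKET')
    if not path:
        return False
    if path.startswith('@'):
        # A leading '@' denotes a Linux abstract-namespace socket.
        path = '\0' + path[1:]
    with socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, path)
    return True


# With no NOTIFY_SOCKET set, the call is a harmless no-op.
sent = notify(b'STOPPING=1', env={})
```

The sketch never unsets NOTIFY_SOCKET, so later calls such as notify(b'RELOADING=1') can still reach the service manager.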

UpgradeImpact
=============
Since prior versions of Swift would unset the NOTIFY_SOCKET env var,
services must be fully restarted (rather than seamlessly reloaded) to
emit the new messages.

Related-Change: Ice224fc2a6ba0150be180955037c13fc90365479
Change-Id: I201734ae0d6232ecb1923e67864dd928f90b6586
2023-10-16 15:44:06 -07:00
Tim Burke
20ff642154 stats: Round timings at 4 decimal places
It seems unreasonable to expect timings to be accurate to sub-100ns
resolution.

Why 4 places? We already had some tests for proxy-logging that would
assertAlmostEqual to that many places.
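
As a rough illustration, assuming timings are reported in milliseconds (so four decimal places corresponds to 100ns), the rounding could be sketched like this. The helper name timing_ms is hypothetical.

```python
def timing_ms(start, end):
    """Hypothetical helper: elapsed time in ms, rounded to 4 places."""
    return round((end - start) * 1000, 4)


elapsed = timing_ms(0.0, 0.123456789)
```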

Change-Id: Ic7a0c4a416a46eb5198d7cce103358d677ec94ab
2023-10-13 17:27:16 +00:00
Tim Burke
d31a54a65c object: Block POSTs and chunked PUTs when already past reserve
Previously, clients could bypass fallocate_reserve checks by uploading
with `Transfer-Encoding: chunked` rather than sending a `Content-Length`.

Now, a chunked transfer may still push a disk past the reserve threshold,
but once over the threshold, further PUTs and POSTs will 507. DELETEs
will still be allowed.

Closes-Bug: #2031049
Change-Id: I69ec7193509cd3ed0aa98aca15190468368069a5
2023-10-02 11:47:43 -07:00
Zuul
d99ad8fb31 Merge "proxy: Get rid of MetricsPrefixLoggerAdapter" 2023-09-11 05:10:00 +00:00
Tim Burke
9f385c07f3 proxy: Get rid of MetricsPrefixLoggerAdapter
It adds another layer of indirection and state for the sake of labeling;
longer term it'll be easier to be explicit at the point of emission.

Related-Change: I0522b1953722ca96021a0002cf93432b973ce626
Change-Id: Ieebafb19c3fa60334aff2914ab1ae70b8f140342
2023-08-21 13:50:24 -07:00
Zuul
1d742eee39 Merge "tests: Pollute stderr less" 2023-08-21 15:30:49 +00:00
Tim Burke
287fbadc1f tests: Pollute stderr less
Change-Id: I193874659536844d431f0c9fa9881e29392ae2b2
2023-08-18 11:39:56 -07:00
Tim Burke
1edf7df755 Partially revert "Pull libc-related functions out to a separate module"
This reverts the fallocate- and punch_hole-related parts of commit
c78a5962b5f6c9e75f154cac924a226815236e98.

Closes-Bug: #2031035
Related-Change: I3e26f8d4e5de0835212ebc2314cac713950c85d7
Change-Id: I8050296d6982f70bb64a63765b25d287a144cb8d
2023-08-18 17:50:31 +00:00
Alistair Coles
bdbe8ce9f8 s3api: fix statsd prefix mutation
The 'log_route' argument of utils.get_logger() determines which global
Logger instance is wrapped by the returned LogAdapter. Most middlewares
(s3api being the exception) explicitly set 'log_route' to equal the
middleware 'brief' name e.g. 'bulk', 'tempauth' etc. However, the
s3api middleware sets 'log_route' to be the config 'log_name', if that
key is found in config.

When a proxy pipeline is instantiated via wsgi.run_wsgi(), all
middlewares and the proxy app are passed a default conf with
'"log_name": "proxy-server"'. As a result, the s3api middleware calls
get_logger() with log_route='proxy-server' and its LogAdapter
therefore shares the same Logger instance used by proxy-server app
(and any other middleware that similarly fails to explicitly
differentiate 'log_route').

Each Logger instance has a StatsdClient instance bound to it by
get_logger(). The Related-Change added statsd metrics to the s3api
middleware and sets 's3api' as the 'statsd_tail_prefix' when calling
get_logger(). This had the unintended effect of replacing the shared
Logger instance's StatsdClient with one that has prefix 's3api', such
that stats emitted by the proxy app (e.g. memcache shard range
hit/miss stats) would be erroneously prefixed with 's3api'.

This patch modifies the s3api middleware logger instantiation to
explicitly set log_route='s3api', so that the s3api middleware
LogAdapter now wraps a unique global Logger instance, with a unique
StatsdClient instance bound to it.

The 'server' attribute of the middleware's LogAdapter, which may be
included in log lines by the "%(server)s" format element, is not
affected by this change. Its value is derived from the config
'log_name' or the 'name' argument passed to get_logger().

Change-Id: Ia89485bae8f92f4f3d9f5375cab8ff08f70a11a7
Related-Change: I4976b3ee24e4ec498c66359f391813261d42c495
2023-07-20 09:56:09 +01:00
Jianjian Huo
cb1e584e64 Object-server: keep SLO manifest files in page cache.
Currently, SLO manifest files are evicted from page cache
after being read, which keeps hard drives very busy when a user
requests a lot of parallel byte range GETs for a particular
SLO object.

This patch adds a new config option, 'keep_cache_slo_manifest', and
tries to keep the manifest files in page cache by not evicting them
after reading, if config settings allow.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I557bd01643375d7ad68c3031430899b85908a54f
2023-07-07 12:48:24 -07:00
Zuul
7b1f7ee857 Merge "py3: Quiet RemoteDisconnected tracebacks" 2023-06-26 11:23:39 +00:00
Tim Burke
483e17d5b4 py3: Quiet RemoteDisconnected tracebacks
RemoteDisconnected inherits from both BadStatusLine and ConnectionResetError
(which in turn eventually inherits from OSError). We want to make
sure it gets handled as a BadStatusLine, as it doesn't get its errno
set and would otherwise get the default traceback handling.

Change-Id: I0fb1f764722d73db6d3b79acc128f37f51499d35
2023-06-23 09:40:39 -07:00
Tim Burke
0235db3d31 tests: Stop trying to mutate instantiated EntryPoints
py311 made EntryPoints immutable. Subclass instead.

Closes-Bug: #2009228
Change-Id: I90198b1bb6b18b752e0fdc4d4d920914ea449413
2023-06-22 17:28:12 -07:00
indianwhocodes
be0c481e44 Fix proxy traceback for GeneratorExit in py3
When a client explicitly closes the connection before finishing the
download, this leads to a 499, but the shutdown logging for the proxy
in py3 needs to be fixed. We have done so by killing all running
coroutines in the ContextPool.

Change-Id: Ic372ea9866bb7f2659e02f8796cdee01406e2079
2023-06-13 13:57:52 -07:00
Tim Burke
e29e2c3ae5 Move IP-address-related functions out to new module
Partial-Bug: #2015274
Change-Id: I7ffa3a8e95d4ec456860b0484caf1dd08ff0849a
2023-05-22 10:51:21 -07:00
Alistair Coles
aa96cb3dc6 proxy: add periodic zero-time sleep during object PUT
Previously it was possible for an entire object PUT data transfer to
execute without the greenthread sleeping and allowing other
greenthreads to run. This was more likely with an EC PUT because the
computation of EC fragments might be slower than the rate at which
they are drained out of IO send buffers, so IO never blocks. In
extreme cases this could cause timeouts in other greenthreads to pop.

This patch adds a periodic zero-time sleep in the object PUT data
transfer loop. An existing pattern in the GET path is re-used, and
extracted to a new CooperativeIterator helper class.
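
The pattern can be sketched as below. This is an illustration, not the CooperativeIterator from this patch: the class name, the period parameter and the injectable sleep callable are assumptions, and time.sleep stands in for eventlet's sleep so the example runs without eventlet installed.

```python
import time


class CooperativeIteratorSketch:
    """Hypothetical sketch: yield control after every `period` items."""

    def __init__(self, iterable, period=5, sleep=time.sleep):
        self.iterator = iter(iterable)
        self.period = period
        self.sleep = sleep
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.count >= self.period:
            # A zero-time sleep lets other (green) threads run.
            self.count = 0
            self.sleep(0)
        self.count += 1
        return next(self.iterator)


sleeps = []
chunks = list(CooperativeIteratorSketch(range(12), period=5,
                                        sleep=sleeps.append))
```

Wrapping the data source this way keeps the yielding policy out of the transfer loop itself, so the same helper can serve both the GET and PUT paths.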

Change-Id: Idd6b767f1a746c72c106199f5d1fada3615b1e97
Closes-Bug: #2019955
Related-Change: Iae27109f5a3d109ad21ec9a972e39f22150f6dbb
2023-05-18 12:30:58 -07:00
Zuul
4c7b2e3bb5 Merge "Add cap_length helper" 2023-05-17 23:20:30 +00:00
Zuul
f99a6e5762 Merge "Log (Watchdog's) Timeouts with duration" 2023-05-01 06:27:27 +00:00
Zuul
b1dc6237c1 Merge "Don't monkey patch logging on import" 2023-04-28 22:44:29 +00:00
Chetan Mishra
84b995f275 Don't monkey patch logging on import
Previously swift.common.utils monkey patched logging.thread,
logging.threading, and logging._lock upon import with eventlet
threading modules, but that is no longer reasonable or necessary.

With py3, the existing logging._lock is not patched by eventlet,
unless the logging module is reloaded. The existing lock is not
tracked by the gc so would not be found by eventlet's
green_existing_locks().

Instead we group all monkey patching into a utils function and apply
patching consistently across daemons and WSGI servers.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Closes-Bug: #1380815
Change-Id: I6f35ad41414898fb7dc5da422f524eb52ff2940f
2023-04-28 08:57:35 -07:00
Clay Gerrard
8d23dd8ac6 Log (Watchdog's) Timeouts with duration
... and clean up WatchDog start a little.

If this pattern proves useful we could consider extending it.

Change-Id: Ia85f9321b69bc4114a60c32a7ad082cae7da72b3
2023-04-28 10:14:01 -05:00
Jianjian Huo
71d507f8e1 Proxy: restructure cached listing shard ranges
The updating shard range cache has been restructured and upgraded to v2,
which only persists the essential attributes in memcache (see
Related-Change). This is the follow-up patch to restructure the
listing shard ranges cache for object listing in the same way.

UpgradeImpact
=============
The cache key for listing shard ranges in memcached is renamed
from 'shard-listing/<account>/<container>' to
'shard-listing-v2/<account>/<container>', and cache data is
changed to be a list of [lower bound, name]. As a result, this
will invalidate all existing listing shard ranges stored in the
memcache cluster.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Change-Id: I54a32fd16e3d02b00c18b769c6f675bae3ba8e01
2023-04-17 09:49:26 -07:00
Tim Burke
c78a5962b5 Pull libc-related functions out to a separate module
Partial-Bug: #2015274
Change-Id: I3e26f8d4e5de0835212ebc2314cac713950c85d7
2023-04-12 13:17:10 -07:00
Zuul
984cca9263 Merge "Pull timestamp-related functions out to a separate module" 2023-04-12 18:46:41 +00:00
Tim Burke
85d68ee491 Test nested (Metrics)PrefixLoggerAdapters
Change-Id: I71ad4de0ab3af8e7e865cb924f96e5c415935654
2023-04-05 20:57:26 -07:00
Tim Burke
0a4e41701d Add cap_length helper
Change-Id: Ib864c7dc6c8c7bb849f4f97a1239eb5cc04c424c
2023-04-05 20:54:39 -07:00
Tim Burke
c21256d870 Pull timestamp-related functions out to a separate module
Partial-Bug: #2015274
Change-Id: I5b7ab3b2c150ec1513b3e6ebc4b27808d5df042c
2023-04-05 14:45:57 -07:00
Alistair Coles
acf31a61db Rename ShardRange*Bound to Namespace*Bound
Also:

  - move some tests to test_utils.TestNamespace.
  - move ShardName class in file (no change to class)
  - move end_marker method from ShardRange to Namespace

Related-Change: If98af569f99aa1ac79b9485ce9028fdd8d22576b
Change-Id: Ibd5614d378ec5e9ba47055ba8b67a42ab7f7453c
2023-03-15 15:59:14 +00:00
Jianjian Huo
6ff90ea73e Proxy: restructure cached updating shard ranges
Restructure the shard ranges that are stored in memcache for
object updating to only persist the essential attributes of
shard ranges in memcache (lower bounds and names), so the
aggregate of memcache values is much smaller and retrieval
will be much faster too.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>

UpgradeImpact
=============
The cache key for updating shard ranges in memcached is renamed
from 'shard-updating/<account>/<container>' to
'shard-updating-v2/<account>/<container>', and cache data is
changed to be a list of [lower bound, name]. As a result, this
will invalidate all existing updating shard ranges stored in the
memcache cluster.
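
To show why a list of [lower bound, name] pairs is sufficient, here is a hedged sketch of a lookup over such a list. The function name, the sample data and the use of bisect are assumptions made for the example, not the proxy's actual code. It also assumes the list is sorted by lower bound and that each range excludes its lower bound.

```python
import bisect


def find_namespace_name(cached, obj_name):
    """Hypothetical lookup over a sorted list of [lower, name] pairs."""
    lowers = [lower for lower, _name in cached]
    # The owning range is the last one whose lower bound is < obj_name.
    index = bisect.bisect_left(lowers, obj_name) - 1
    return cached[max(index, 0)][1]


# Sample cache value; the shard names are made up for illustration.
cached = [
    ['', '.shards_a/c-0'],
    ['g', '.shards_a/c-1'],
    ['p', '.shards_a/c-2'],
]
target = find_namespace_name(cached, 'grape')
```

Because the upper bound of each range equals the lower bound of the next, storing only lower bounds and names loses no information needed for this lookup.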

Change-Id: If98af569f99aa1ac79b9485ce9028fdd8d22576b
2023-03-06 22:20:02 -08:00