14 Commits

Author SHA1 Message Date
Tim Burke
cedec8c5ef Latch shard-stat reporting
The idea is, if none of

  - timestamp,
  - object_count,
  - bytes_used,
  - state, or
  - epoch

has changed, we shouldn't need to send an update back to the root
container.

This is more-or-less comparable to what the container-updater does to
avoid unnecessary writes to the account.
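
A minimal sketch of the latching idea, not the actual sharder code: the
`ShardStats` tuple and `StatReporter` class are hypothetical names that
only illustrate "skip the update when none of the five fields changed".

```python
from collections import namedtuple

# Hypothetical stand-in for the stats a shard reports to its root.
ShardStats = namedtuple(
    "ShardStats", "timestamp object_count bytes_used state epoch")


class StatReporter:
    """Remembers the last stats that were sent and skips unchanged ones."""

    def __init__(self, send):
        self._send = send            # callable that performs the real update
        self._last_reported = None   # nothing has been reported yet

    def report(self, stats):
        # None of the five fields changed since the last report: skip it.
        if stats == self._last_reported:
            return False
        self._send(stats)
        # Only latch after the send returned without raising.
        self._last_reported = stats
        return True
```

Latching only after a successful send means a failed update is retried on
the next pass rather than being silently dropped.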

Closes-Bug: #1834097
Change-Id: I1ee7ba5eae3c508064714c4deb4f7c6bbbfa32af
2020-05-29 22:33:10 -07:00
Andreas Jaeger
96b56519bf Update hacking for Python3
The repo uses both Python 2 and 3 now, so update hacking to
version 2.0, which supports Python 2 and 3. Note that the latest hacking
release, 3.0, only supports Python 3.

Fix problems found.

Remove hacking and friends from lower-constraints, they are not needed
for installation.

Change-Id: I9bd913ee1b32ba1566c420973723296766d1812f
2020-04-03 21:21:07 +02:00
Zuul
e32689a96d Merge "Deprecate per-service auto_create_account_prefix"
2020-01-07 01:30:20 +00:00
Clay Gerrard
4601548dab Deprecate per-service auto_create_account_prefix
If we move it to constraints it's more globally accessible in our code,
but more importantly it's more obvious to ops that everything breaks if
you try to mis-configure different values per-service.

Change-Id: Ib8f7d08bc48da12be5671abe91a17ae2b49ecfee
2020-01-05 09:53:30 -06:00
Tim Burke
3f88907012 sharding: Better-handle newlines in container names
Previously, if you were on Python 2.7.10+ [0], such a newline would cause the
sharder to fail, complaining about invalid header values when trying to create
the shard containers. On older versions of Python, it would most likely cause a
parsing error in the container-server that was trying to handle the PUT.

Now, quote all places that we pass around container paths. This includes:

  * The X-Container-Sysmeta-Shard-(Quoted-)Root sent when creating the (empty)
    remote shards
  * The X-Container-Sysmeta-Shard-(Quoted-)Root included when initializing the
    local handoff for cleaving
  * The X-Backend-(Quoted-)Container-Path the proxy sends to the object-server
    for container updates
  * The Location header the container-server sends to the object-updater

Note that a new header was required in requests so that servers would
know whether the value should be unquoted or not. We can get away with
reusing Location in responses by having clients opt-in to quoting with
a new X-Backend-Accept-Quoted-Location header.
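
As a rough illustration of why a separate header is needed, the sketch
below percent-encodes a container path for the request and lets the
receiver decide whether to unquote based on which header arrived. The
helper names are hypothetical; only the two header names come from the
list above.

```python
from urllib.parse import quote, unquote

QUOTED_HEADER = "X-Backend-Quoted-Container-Path"
LEGACY_HEADER = "X-Backend-Container-Path"


def encode_container_path(path):
    """Return (header_name, value) with the path percent-encoded.

    quote() leaves "/" alone by default but encodes newlines, so the
    value is always a legal header value.
    """
    return QUOTED_HEADER, quote(path)


def decode_container_path(headers):
    """Prefer the quoted header; fall back to the legacy unquoted one."""
    quoted = headers.get(QUOTED_HEADER)
    if quoted is not None:
        return unquote(quoted)
    # Legacy senders did not quote, so the value is used as-is.
    return headers.get(LEGACY_HEADER)
```

Without the distinct header name, a receiver could not tell a quoted
value from a legacy name that happens to contain a literal "%".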

During a rolling upgrade,

  * old object-servers servicing requests from new proxy-servers will
    not know about the container path override and so will try to update
    the root container,
  * in general, object updates are more likely to land in the root
    container; the sharder will deal with them as misplaced objects, and
  * shard containers created by new code on servers running old code
    will think they are root containers until the server is running new
    code, too; during this time they'll fail the sharder audit and report
    stats to their account, but both of these should get cleared up upon
    upgrade.

Drive-by: fix a "conainer_name" typo that prevented us from testing that
we can shard a container with unicode in its name. Also, add more UTF8
probe tests.

[0] See https://bugs.python.org/issue22928

Change-Id: Ie08f36e31a448a547468dd85911c3a3bc30e89f1
Closes-Bug: 1856894
2020-01-03 16:04:57 -08:00
Zuul
2d87ad6333 Merge "sharder: Keep cleaving on empty shard ranges"
2019-09-27 01:42:09 +00:00
Matthew Oliver
e9cd9f74a5 sharder: Keep cleaving on empty shard ranges
When a container is being cleaved there is a possibility that we're
dealing with an empty or near-empty container created on a handoff node.
These containers may have a valid list of shard ranges, so would need
to cleave to the new shards.
Currently, when using a `cleave_batch_size` that is smaller than the
number of shard ranges on the cleaving container, these containers will
have to take a few shard passes to shard, even though there may be
nothing in them.

This is worse if a really large container is sharding and, because it is
slow, a node gets error-limited, causing a new container to be created on
a handoff location. This empty container would have a large number of
shard ranges and could take a _very_ long time to shard away, slowing the
process down.

This patch eliminates the issue by detecting when no objects are
returned for a shard range. The `_cleave_shard_range` method now
returns 3 possible results:

  - CLEAVE_SUCCESS
  - CLEAVE_FAILED
  - CLEAVE_EMPTY

They are all pretty self-explanatory. When `CLEAVE_EMPTY` is returned
the code will:

  - Log
  - Not replicate the empty temp shard container sitting in a
    handoff location
  - Not count the shard range in the `cleave_batch_size` count
  - Update the cleaving context so sharding can move forward

If a shard range DB already exists on a handoff node, then the sharder
won't skip it even if there are no objects; it'll replicate it and treat
it as normal, including using a `cleave_batch_size` slot.
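
The batch accounting described above can be sketched roughly as follows.
This is an illustration under assumptions, not the sharder's code:
`cleave_batch` and the `cleave_one` callback are hypothetical, and only
the three result names come from this patch.

```python
CLEAVE_SUCCESS = "success"
CLEAVE_FAILED = "failed"
CLEAVE_EMPTY = "empty"


def cleave_batch(shard_ranges, cleave_one, cleave_batch_size):
    """Cleave ranges in order and return how many the cursor moved past.

    cleave_one(shard_range) must return one of the CLEAVE_* results.
    Empty ranges advance the cursor without using a batch slot.
    """
    done = 0   # ranges the cleaving context may advance past
    used = 0   # batch slots consumed by real cleaving work
    for shard_range in shard_ranges:
        if used >= cleave_batch_size:
            break
        result = cleave_one(shard_range)
        if result == CLEAVE_FAILED:
            break          # retry this range on a later pass
        if result == CLEAVE_SUCCESS:
            used += 1      # only non-empty ranges count toward the batch
        done += 1
    return done
```

With this accounting, a container whose ranges are all empty gets through
every range in a single pass regardless of `cleave_batch_size`.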

Change-Id: Id338f6c3187f93454bcdf025a32a073284a4a159
Closes-Bug: #1839355
2019-09-26 14:46:43 -07:00
Matthew Oliver
370ac4cd70 Sharding: Use the metadata timestamp as last_modified
This is a follow-up to the patch that cleans up cleave contexts
(patch 681970). Instead of tracking a last_modified timestamp and storing
it in the context metadata, use the timestamp we use when storing any
metadata.

Reducing duplication is nice, but there's a more significant reason to
do this: affected container DBs can start getting cleaned up as soon as
they're running the new code rather than needing to wait for an
additional reclaim_age.

Change-Id: I2cdbe11f06ffb5574e573c4a60ba4e5d41a00c50
2019-09-23 13:43:09 -07:00
Matthew Oliver
81a41da542 Sharding: Clean up old CleaveContexts during audit
There is a sharding edge case where additional CleaveContexts are
generated and stored in the sharding container DB. If enough
CleaveContexts build up in the DB, as in the linked bug, this can lead to
503s when attempting to list the container due to all the
`X-Container-Sysmeta-Shard-Context-*` headers.

This patch resolves this by tracking each CleaveContext's last-modified
time. During the sharding audit, any context that hasn't been touched
within reclaim_age is deleted.
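
A minimal sketch of the age check, assuming contexts are available as a
mapping of key to last-modified time in seconds since the epoch. The
function name and the mapping shape are hypothetical.

```python
import time


def stale_context_keys(contexts, reclaim_age, now=None):
    """Return the keys of contexts untouched for longer than reclaim_age.

    contexts: dict mapping a context key to its last-modified time
    (seconds since the epoch). reclaim_age is also in seconds.
    """
    now = time.time() if now is None else now
    return sorted(
        key for key, last_modified in contexts.items()
        if now - last_modified > reclaim_age)
```

The audit would then delete the metadata stored under each returned key.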

This, plus the skip-empty-ranges patch, should improve the handling of
these handoff shards.

Change-Id: I1e502c328be16fca5f1cca2186b27a0545fecc16
Closes-Bug: #1843313
2019-09-18 17:10:36 +10:00
Pete Zaitcev
575538b55b py3: port the container
This started with ShardRanges and its CLI. The sharder is at the
bottom of the dependency chain. Even container backend needs it.
Once we started tinkering with the sharder, it all snowballed to
include the rest of the container services.

Beware, this does affect some of the Python 2 code. Mostly it's trivial
and obviously correct, but it needs checking by reviewers.

About killing the stray "from __future__ import unicode_literals":
we do not do it in general. The specific problem it caused was
a failure of functional tests because unicode leaked into a field
that was supposed to be encoded. It is just too hard to track the
types when rules change from file to file, so off with its head.

Change-Id: Iba4e65d0e46d8c1f5a91feb96c2c07f99ca7c666
2019-02-20 21:30:46 -06:00
John Dickinson
c26d67efcf fixed _check_node() in the container sharder
Previously, _check_node() wouldn't catch the ValueError raised when
a drive was unmounted. Therefore the error would bubble up, uncaught,
and stop the shard cycle. The practical effect is that an unmounted
drive on a node would prevent sharding from happening.

This patch updates _check_node() to properly use the check_drive()
method. Furthermore, the _check_node() return value has been modified
to be more similar to what check_drive() actually returns. This
should help prevent similar errors from being introduced in the future.
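
The shape of the fix can be sketched as below. Both functions are
simplified, hypothetical stand-ins written for this illustration; they
are not Swift's real check_drive() or _check_node() and their signatures
are assumptions.

```python
import os


def check_drive(root, device, mount_check):
    """Stand-in validator: return the device path or raise ValueError."""
    path = os.path.join(root, device)
    if mount_check and not os.path.ismount(path):
        raise ValueError("%s is not mounted" % path)
    if not os.path.isdir(path):
        raise ValueError("%s is not a directory" % path)
    return path


def check_node(root, device, mount_check=True):
    """Return the device path, or None if the drive is unusable.

    Catching ValueError here lets the caller skip one bad drive
    instead of aborting the whole cycle.
    """
    try:
        return check_drive(root, device, mount_check)
    except ValueError:
        return None
```

A caller can then write `if check_node(...) is None: continue` and move
on to the next device.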

Closes-Bug: #1806500

Change-Id: I3da9b5b120a5980e77ef5c4dc8fa1697e462ce0d
2018-12-04 16:16:04 -08:00
Clay Gerrard
06cf5d298f Add databases_per_second to db daemons
Most daemons have a "go as fast as you can then sleep for 30 seconds"
strategy towards resource utilization; the object-updater and
object-auditor however have some "X_per_second" options that allow
operators much better control over how they spend their I/O budget.

This change extends that pattern into the account-replicator,
container-replicator, and container-sharder which have been known to peg
CPUs when they're not IO limited.
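
One way to picture an "X_per_second" option is a limiter that spaces out
the start of each unit of work. The class below is a hypothetical sketch
of that idea, not the helper Swift uses; the injectable clock and sleep
are there only to make it testable.

```python
import time


class RateLimiter:
    """Allow at most max_per_second calls to wait() to return per second."""

    def __init__(self, max_per_second, clock=time.monotonic,
                 sleep=time.sleep):
        # A non-positive rate means "no limit".
        self.interval = 1.0 / max_per_second if max_per_second > 0 else 0.0
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = clock()

    def wait(self):
        now = self.clock()
        if now < self.next_allowed:
            # Too early: sleep until the next slot opens.
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.interval
```

A daemon loop would call `limiter.wait()` before processing each
database, so the pace is bounded even when the work itself is cheap.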

Partial-Bug: #1784753
Change-Id: Ib7f2497794fa2f384a1a6ab500b657c624426384
2018-10-30 22:28:05 +00:00
Tim Burke
773b633118 Change default sharding threshold to 1,000,000 objects
...instead of 10,000,000. The sample configs were already using one
million, all of our testing with non-SAIO containers was done with
one million, and the resulting container DBs were around 100MB which
seems like a comfortable size. Pretty sure this was just a typo during
some code cleanup.

Change-Id: Icd31f9d8efaac2d5dc0f021cad550687859558b9
2018-05-29 10:48:51 -07:00
Matthew Oliver
2641814010 Add sharder daemon, manage_shard_ranges tool and probe tests
The sharder daemon visits container dbs and when necessary executes
the sharding workflow on the db.

The workflow is, in overview:

- perform an audit of the container for sharding purposes.

- move any misplaced objects that do not belong in the container
  to their correct shard.

- move shard ranges from FOUND state to CREATED state by creating
  shard containers.

- move shard ranges from CREATED to CLEAVED state by cleaving objects
  to shard dbs and replicating those dbs. By default this is done in
  batches of 2 shard ranges per visit.
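
The state progression in the last two steps can be sketched as follows.
This is a simplified illustration: shard ranges are plain dicts, and
`visit_container` plus the `create_shard` and `cleave` callbacks are
hypothetical names, not the daemon's API.

```python
FOUND, CREATED, CLEAVED = "found", "created", "cleaved"


def visit_container(shard_ranges, create_shard, cleave, batch_size=2):
    """Advance shard ranges one visit's worth; return the number cleaved."""
    # FOUND -> CREATED: create a shard container for each found range.
    for shard_range in shard_ranges:
        if shard_range["state"] == FOUND and create_shard(shard_range):
            shard_range["state"] = CREATED

    # CREATED -> CLEAVED: cleave in order, at most batch_size per visit.
    cleaved = 0
    for shard_range in shard_ranges:
        if cleaved >= batch_size:
            break
        if shard_range["state"] != CREATED:
            continue
        if not cleave(shard_range):
            break  # stop at the first failure and retry next visit
        shard_range["state"] = CLEAVED
        cleaved += 1
    return cleaved
```

Repeated visits move every range to CLEAVED, two at a time with the
default batch size.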

Additionally, when the auto_shard option is True (NOT yet recommended
in production), the sharder will identify shard ranges for containers
that have exceeded the threshold for sharding, and will also manage
the sharding and shrinking of shard containers.

The manage_shard_ranges tool provides a means to manually identify
shard ranges and merge them to a container in order to trigger
sharding. This is currently the recommended way to shard a container.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I7f192209d4d5580f5a0aa6838f9f04e436cf6b1f
2018-05-18 18:48:13 +01:00