swift/swift/container
Clay Gerrard 2558be42c8 Populate shrinking shards with shard ranges learnt from root
Shard shrinking can be instigated by a third party modifying shard
ranges, moving one shard to shrinking state and expanding the
namespace of one or more other shard(s) to act as acceptors. These
state and namespace changes must propagate to the shrinking and
acceptor shards. The shrinking shard must also discover the acceptor
shard(s) into which it will shard itself.

The sharder audit function already updates shards with their own state
and namespace changes from the root. However, there is currently no
mechanism for the shrinking shard to learn about the acceptor(s) other
than by a PUT request being made to the shrinking shard container.

This patch modifies the shard container audit function so that other
overlapping shards discovered from the root are merged into the
audited shard's db. In this way, the audited shard will have acceptor
shards to cleave to if shrinking.

This new behavior is restricted to when the shard is shrinking. In
general, a shard is responsible for processing its own sub-shard
ranges (if any) and reporting them to root. Replicas of a shard
container synchronise their sub-shard ranges via replication, and do
not rely on the root to propagate sub-shard ranges between shard
replicas. The exception to this is when a third party (or
auto-sharding) wishes to instigate shrinking by modifying the shard
and other acceptor shards in the root container.  In other
circumstances, merging overlapping shard ranges discovered from the
root is undesirable because it risks shards inheriting other unrelated
shard ranges. For example, if the root has become polluted by
split-brain shard range management, a sharding shard may have its
sub-shards polluted by an undesired shard from the root.

During the shrinking process a shard range's own shard range state may
be either shrinking or, prior to this patch, sharded. The sharded
state could occur when one replica of a shrinking shard completed
shrinking and moved the own shard range state to sharded before other
replica(s) had completed shrinking. This makes it impossible to
distinguish a shrinking shard (with sharded state), which we do want
to inherit shard ranges, from a sharding shard (with sharded state),
which we do not want to inherit shard ranges.

This patch therefore introduces a new shard range state, 'SHRUNK', and
applies this state to shard ranges that have completed shrinking.
Shards are now restricted to inherit shard ranges from the root only
when their own shard range state is either SHRINKING or SHRUNK.

This patch also:

 - Stops overlapping shrinking shards from generating audit warnings:
   overlaps are cured by shrinking and we therefore expect shrinking
   shards to sometimes overlap.

 - Extends an existing probe test to verify that overlapping shard
   ranges may be resolved by shrinking a subset of the shard ranges.

 - Adds a --no-auto-shard option to swift-container-sharder to enable the
   probe tests to disable auto-sharding.

 - Improves sharder logging, in particular by decrementing ranges_todo
   when a shrinking shard is skipped during cleaving.

 - Adds a ShardRange.sort_key class method to provide a single definition
   of ShardRange sort ordering.

 - Improves unit test coverage for sharder shard auditing.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Cherry-Picked-From: I9034a5715406b310c7282f1bec9625fe7acd57b6
Change-Id: I6543bfff5ca517ab4d6ffda881e3a50724078c65
2021-02-22 18:29:48 -06:00
..
__init__.py Initial commit of Swift code 2010-07-12 17:03:45 -05:00
auditor.py Pass logger instances to AccountBroker/ContainerBroker 2019-05-03 01:15:20 +09:00
backend.py Populate shrinking shards with shard ranges learnt from root 2021-02-22 18:29:48 -06:00
reconciler.py Allow direct and internal clients to use the replication network 2020-08-04 21:22:04 +00:00
replicator.py sharding: better handle get_shard_ranges failures 2019-07-12 17:37:37 -07:00
server.py Don't auto-create shard containers 2020-06-03 13:26:31 -07:00
sharder.py Populate shrinking shards with shard ranges learnt from root 2021-02-22 18:29:48 -06:00
sync.py Allow direct and internal clients to use the replication network 2020-08-04 21:22:04 +00:00
sync_store.py Container-Sync to iterate only over synced containers 2016-01-06 16:46:31 +02:00
updater.py Merge "container-updater: Always report zero objects/bytes used for shards" 2018-06-22 21:45:56 +00:00