sharder: avoid small tail shards

A container is typically sharded when it has grown to have an object
count of shard_container_threshold + N, where N <<
shard_container_threshold. If sharded using the default rows_per_shard
of shard_container_threshold / 2 then this would previously result in
3 shards: the tail shard would typically be small, having only N rows.
This behaviour caused more shards to be generated than desirable.

This patch adds a minimum-shard-size option to
swift-manage-shard-ranges, and a corresponding option in the sharder
config, which can be used to avoid small tail shards. If set to greater
than one then the final shard range may be extended to more than
rows_per_shard in order to avoid a further shard range with less than
minimum-shard-size rows. In the example given, if minimum-shard-size is
set to M > N then the container would shard into two shards having
rows_per_shard rows and rows_per_shard + N respectively.

The default value for minimum-shard-size is rows_per_shard // 5. If all
options have their default values this results in minimum-shard-size
being 100000.

Closes-Bug: #1928370
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Change-Id: I3baa278c6eaf488e3f390a936eebbec13f2c3e55
@@ -329,6 +329,18 @@ rows_per_shard 500000 This defines the initial
                                         containers. The default
                                         is shard_container_threshold // 2.
 
+minimum_shard_size         100000       Minimum size of the final
+                                        shard range. If this is
+                                        greater than one then the
+                                        final shard range may be
+                                        extended to more than
+                                        rows_per_shard in order
+                                        to avoid a further shard
+                                        range with less than
+                                        minimum_shard_size rows.
+                                        The default value is
+                                        rows_per_shard // 5.
+
 shrink_threshold                        This defines the
                                         object count below which
                                         a 'donor' shard container
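
The effect of minimum_shard_size on the final shard range can be illustrated with a short Python sketch. This is not the sharder's actual code: the function name shard_sizes and its signature are hypothetical, and it models only row counts, not object names or shard range bounds. It assumes shards are cut greedily at rows_per_shard rows, and that the last shard absorbs the remainder whenever a further shard would hold fewer than minimum_shard_size rows.

```python
def shard_sizes(object_count, rows_per_shard, minimum_shard_size=1):
    """Return the row count of each shard for a container.

    Hypothetical helper for illustration only; it is not part of Swift.
    Shards are cut at rows_per_shard rows, except that the final shard
    is extended if the rows left after it would number fewer than
    minimum_shard_size.
    """
    if rows_per_shard < 1:
        raise ValueError("rows_per_shard must be at least 1")
    # A minimum below one row is meaningless, so clamp it (assumption).
    minimum_shard_size = max(1, minimum_shard_size)

    sizes = []
    remaining = object_count
    while remaining > 0:
        if remaining - rows_per_shard < minimum_shard_size:
            # Cutting here would leave a tail that is too small (or
            # nothing at all), so this shard takes every remaining row.
            sizes.append(remaining)
            remaining = 0
        else:
            sizes.append(rows_per_shard)
            remaining -= rows_per_shard
    return sizes


# Default-like values: shard_container_threshold of 1000000 gives
# rows_per_shard = 500000 and minimum_shard_size = 500000 // 5 = 100000.
threshold = 1000000
rows = threshold // 2
n = 10  # objects beyond the threshold, N << threshold

old_behaviour = shard_sizes(threshold + n, rows)             # no minimum
new_behaviour = shard_sizes(threshold + n, rows, rows // 5)  # default minimum
print(old_behaviour)
print(new_behaviour)
```

With no effective minimum the container splits into three shards, the last holding only N rows. With the default minimum the tail is folded into the second shard, giving rows_per_shard and rows_per_shard + N rows, as the commit message describes.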