The region is one level above the zone; it is intended to represent a
chunk of machines that is distant from others with respect to
bandwidth and latency.
Old rings will default to having all their devices in region 1. Since
everything is in the same region by default, the ring builder will
simply distribute across zones as it did before, so your partition
assignment won't move because of this change. If you start adding
devices in other regions, of course, the assignment will change to
take that into account.
swift-ring-builder still accepts the same syntax as before, but will
default added devices to region 1 if no region is specified.
Examples:
$ swift-ring-builder foo.builder add r2z1-1.2.3.4:555/sda
$ swift-ring-builder foo.builder add r1z3-1.2.3.4:555/sda
$ swift-ring-builder foo.builder add z3-1.2.3.4:555/sda
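
The region default can be illustrated with a minimal sketch of how
such a device string might be split into fields. The regular
expression and the parse_device_spec name are assumptions made for
illustration; this is not swift-ring-builder's actual parser.

```python
import re

# Hypothetical pattern for illustration only: an optional "r<region>"
# prefix, then "z<zone>-<ip>:<port>/<device>".
DEVICE_SPEC = re.compile(
    r"^(?:r(?P<region>\d+))?z(?P<zone>\d+)-"
    r"(?P<ip>[\d.]+):(?P<port>\d+)/(?P<device>\w+)$"
)


def parse_device_spec(spec, default_region=1):
    """Split a device string into fields, defaulting the region to 1."""
    match = DEVICE_SPEC.match(spec)
    if match is None:
        raise ValueError("unrecognized device spec: %r" % spec)
    region = match.group("region")
    return {
        "region": int(region) if region is not None else default_region,
        "zone": int(match.group("zone")),
        "ip": match.group("ip"),
        "port": int(match.group("port")),
        "device": match.group("device"),
    }
```

Under these assumptions, the third example above (no "r" prefix)
would parse to region 1, matching the default described here.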
Also includes some updates to the ring-overview doc.
Change-Id: Ifefbb839cdcf033e6c9201fadca95224c7303a29
The Admin Guide now contains information about the ring serialization
change (and importantly, how to downgrade, if necessary).
Also added the container-server conf var "allow_versions" to the
Deployment Guide.
Also changed the description of the proxy-server conf var
"max_containers_whitelist" to say it contains "account names", not
"account hashes".
Change-Id: Ib23c6118cc5195cc04765afd28e442e4c735f0d4
This commit introduces a new algorithm for assigning partition
replicas to devices. Basically, the ring builder organizes the devices
into tiers (first zone, then IP/port, then device ID). When placing a
replica, the ring builder looks for the emptiest device (biggest
parts_wanted) in the furthest-away tier.
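
A rough sketch of the selection rule, assuming each device is a dict
with "zone", "ip", "port", "id" and "parts_wanted" keys. The helper
names (tiers_for, shared_depth, pick_device) are hypothetical, and the
real ring builder tracks more state than this:

```python
def tiers_for(dev):
    """Return the device's tiers from coarsest to finest."""
    zone = (dev["zone"],)
    node = zone + ((dev["ip"], dev["port"]),)
    return [zone, node, node + (dev["id"],)]


def shared_depth(dev, used):
    """Count how many of dev's tiers already hold a replica."""
    used_tiers = {tier for other in used for tier in tiers_for(other)}
    return sum(1 for tier in tiers_for(dev) if tier in used_tiers)


def pick_device(devices, used):
    """Pick the furthest-away device, breaking ties by parts_wanted.

    "Furthest away" here means sharing the fewest tiers with the
    devices that already hold a replica of this partition.
    """
    return min(
        devices,
        key=lambda d: (shared_depth(d, used), -d["parts_wanted"]),
    )
```

With one replica already placed in zone 1, this picks a device in
another zone even if it wants fewer partitions than a second device
in zone 1; only among equally distant devices does parts_wanted
decide.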
In the case where zone-count >= replica-count, the new algorithm will
give the same results as the one it replaces. Thus, no migration is
needed.
In the case where zone-count < replica-count, the new algorithm
behaves differently from the old algorithm. The new algorithm will
distribute things evenly at each tier so that the replication is as
high-quality as possible, given the circumstances. The old algorithm
would just crash, so again, no migration is needed.
Handoffs have also been updated to use the new algorithm. When
generating handoff nodes, first the ring looks for nodes in other
zones, then other IPs/ports, then any other drive. The first handoff
nodes (the ones in other zones) will be the same as before; this
commit just extends the list of handoff nodes.
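
The handoff ordering can be sketched as follows. This is a simplified
illustration rather than the ring's code: handoff_order is a
hypothetical name, devices are assumed to be dicts with "zone", "ip",
"port" and "id" keys, and the ranking is computed against the primary
nodes only.

```python
def handoff_order(devices, primaries):
    """Order non-primary devices: other zones, other IP/port, the rest."""
    used_zones = {d["zone"] for d in primaries}
    used_nodes = {(d["ip"], d["port"]) for d in primaries}
    used_ids = {d["id"] for d in primaries}

    def rank(dev):
        if dev["zone"] not in used_zones:
            return 0  # a zone with no primary
        if (dev["ip"], dev["port"]) not in used_nodes:
            return 1  # same zone, different IP/port
        return 2      # same IP/port, different drive

    candidates = [d for d in devices if d["id"] not in used_ids]
    # sorted() is stable, so devices keep their input order within a rank.
    return sorted(candidates, key=rank)
```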
The proxy server and replicators have been altered to avoid looking at
the ring's replica count directly. Previously, with a replica count of
C, RingData.get_nodes() and RingData.get_part_nodes() would return
lists of length C, so some other code used the replica count when it
needed the number of nodes. If two of a partition's replicas are on
the same device (e.g. with 3 replicas, 2 devices), then that
assumption is no longer true. Fortunately, all the proxy server and
replicators really needed was the number of nodes returned, which they
already had. (Bonus: now the only code that mentions replica_count
directly is in the ring and the ring builder.)
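
To illustrate why callers should count the returned nodes, here is a
small sketch. TinyRing and majority are hypothetical stand-ins, not
Swift classes or functions; the only point is that the node list can
be shorter than the replica count.

```python
class TinyRing:
    """Hypothetical stand-in for a ring holding a single partition."""

    def __init__(self, replica_to_device):
        # One entry per replica; the same device may appear twice.
        self.replica_count = len(replica_to_device)
        self._replica_to_device = replica_to_device

    def get_part_nodes(self):
        """Return each device holding the partition exactly once."""
        seen = set()
        nodes = []
        for dev in self._replica_to_device:
            if dev["id"] not in seen:
                seen.add(dev["id"])
                nodes.append(dev)
        return nodes


def majority(nodes):
    """Illustrative caller: derive a count from the nodes themselves."""
    return len(nodes) // 2 + 1
```

With 3 replicas on 2 devices, get_part_nodes() returns 2 nodes, so
code that sized its work by replica_count would expect a node that
is never returned.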
Change-Id: Iba2929edfc6ece89791890d0635d4763d821a3aa