The copy source must be container/object.
This patch keeps the server from returning
an internal server error when the user provides
a path without a container.
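
For illustration only, a minimal sketch of the intended check, assuming
the copy source arrives in an X-Copy-From header and that a 412
Precondition Failed (rather than a 500) is the desired client error;
the helper name is made up:

    def validate_copy_source(copy_from):
        # Require the form [/]container/object; anything missing the
        # container or object part is a client error, not a 500.
        parts = copy_from.lstrip('/').split('/', 1)
        if len(parts) != 2 or not all(parts):
            raise ValueError('copy source must be of the form '
                             '<container name>/<object name>')
        return tuple(parts)  # (container, object)

    validate_copy_source('mycontainer/myobject')  # ('mycontainer', 'myobject')
    validate_copy_source('myobject')              # raises ValueError -> 412
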
Fixes: bug #1255049
Change-Id: I1a85c98d9b3a78bad40b8ceba9088cf323042412
On GET, the proxy will go search the primary nodes plus some number of
handoffs for the account/container/object before giving up and
returning a 404. That number is, by default, twice the ring's replica
count. This was fine if your ring had an integral number of replicas,
but could lead to some slightly-odd behavior if you have fractional
replicas.
For example, imagine that you have 3.49 replicas in your object ring;
perhaps you're migrating a cluster from 3 replicas to 4, and you're
being smart and doing it a bit at a time.
On an object GET where all the primary nodes returned 404, the proxy would
then compute 2 * 3.49 = 6.98, round it up to 7, and go look at 7
handoff nodes. This is sort of weird; the intent was to look at 6
handoffs for objects with 3 replicas, and 8 handoffs for objects with
4, but the effect is 7 for everybody.
You also get little latency cliffs as you scale up replica counts. If,
instead of 3.49, you had 3.51 replicas, then the proxy would look at 8
handoff nodes in every case [ceil(2 * 3.51) = 8], so there'd be a
small-but-noticeable jump in the time it takes to produce a 404.
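
For illustration, the old rounding behavior (a sketch of the arithmetic,
not the proxy's actual code):

    import math

    for replicas in (3.0, 3.49, 3.51, 4.0):
        # Old behavior: search ceil(2 * replica_count) handoff nodes.
        handoffs = int(math.ceil(2 * replicas))
        print('replicas=%.2f -> %d handoffs searched' % (replicas, handoffs))

    # replicas=3.00 -> 6 handoffs searched
    # replicas=3.49 -> 7 handoffs searched
    # replicas=3.51 -> 8 handoffs searched
    # replicas=4.00 -> 8 handoffs searched
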
The fix is to compute the number of handoffs based on the number of
primary nodes for the partition, not the ring's replica count. This
gets rid of the little latency cliffs and makes the behavior more like
what you get with integral replica counts.
If your ring has an integral number of replicas, there's no behavior
change here.
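
A sketch of the change, assuming a ring object with Swift's
get_part_nodes() and replica_count attributes (the surrounding code is
illustrative, not the patch itself):

    import math

    def handoff_count_old(ring):
        # Old: ring-wide, possibly fractional, replica count rounded up.
        return int(math.ceil(2 * ring.replica_count))

    def handoff_count_new(ring, partition):
        # New: based on how many primaries this partition actually has,
        # so 3-replica partitions search 6 handoffs and 4-replica
        # partitions search 8, even in a 3.49-replica ring.
        return 2 * len(ring.get_part_nodes(partition))
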
Change-Id: I50538941e571135299fd6b86ecd9dc780cf649f5
The proxy can now be configured to prefer local object servers for PUT
requests, where "local" is governed by the "write_affinity" setting. The
"write_affinity_node_count" setting controls how many local object
servers to try before giving up and going on to remote ones.
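
As an illustration, a proxy-server.conf fragment using these settings
(the region and count values here are made up; see the admin guide for
the exact syntax):

    [app:proxy-server]
    use = egg:swift#proxy
    # Prefer object servers in region 1 for PUT requests ...
    write_affinity = r1
    # ... and try up to 6 of them before falling back to remote nodes.
    write_affinity_node_count = 6
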
I chose to simply re-order the object servers instead of filtering out
nonlocal ones so that, if all of the local ones are down, clients can
still get successful responses (just slower).
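
A sketch of the reordering idea (illustrative helper, not the patch's
code): local nodes move to the front, but nothing is dropped, so remote
nodes are still tried if every local one fails:

    def sort_nodes_for_put(nodes, is_local):
        # Stable sort: local nodes first, remote ones after, with the
        # original relative order preserved within each group.
        return sorted(nodes, key=lambda node: 0 if is_local(node) else 1)

    nodes = [{'region': 2, 'ip': '10.1.0.1'},
             {'region': 1, 'ip': '10.0.0.1'},
             {'region': 1, 'ip': '10.0.0.2'}]
    # With region 1 as "local", the region-2 node is tried last.
    print(sort_nodes_for_put(nodes, lambda n: n['region'] == 1))
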
The goal is to trade availability for throughput. By writing to local
object servers across fast LAN links, clients get better throughput
than if the object servers were far away over slow WAN links. The
downside, of course, is that data availability (not durability) may
suffer when drives fail.
The default configuration has no write affinity in it, so the default
behavior is unchanged.
Added some words about these settings to the admin guide.
DocImpact
Change-Id: I09a0bd00524544ff627a3bccdcdc48f40720a86e