swift/test/probe
Tim Burke 3189410f9d Ignore 404s from handoffs for objects when calculating quorum
We previously realized we needed to ignore handoff 404s for accounts and
containers, where the consequences of treating the 404 as authoritative
were more obvious: we'd cache the non-existence, which prevented writes
until it fell out of cache.

The same basic logic applies for objects, though: if we see

    (Timeout, Timeout, Timeout, 404, 404, 404)

on a triple-replica policy, we don't really have any reason to think
that a 404 is appropriate. In fact, it seems reasonably likely that
there's a thundering-herd problem where there are too many concurrent
requests for data that *definitely is there*. By responding with a 503,
we apply some back-pressure to clients, who hopefully have some
exponential backoff in their retries.
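
The reasoning above can be sketched as a small decision function. This is
only an illustration, not the proxy-server's actual code: `NodeResponse`,
`choose_status`, the simple-majority quorum, and the rule that any 2xx wins
are all assumptions made for this example.

```python
from typing import List, NamedTuple, Optional


class NodeResponse(NamedTuple):
    """One backend response (hypothetical type for this sketch)."""
    status: Optional[int]  # None models a timeout or connection error
    is_handoff: bool       # True if the node is a handoff, not a primary


def choose_status(responses: List[NodeResponse], replicas: int) -> int:
    """Pick a client-facing status, ignoring 404s that came from handoffs."""
    quorum = replicas // 2 + 1  # assumed: simple majority of the replica count

    # Assumed: any successful response means the object was found.
    if any(r.status is not None and 200 <= r.status < 300 for r in responses):
        return 200

    # Only 404s from primaries count as evidence that the object is absent.
    primary_404s = sum(
        1 for r in responses if r.status == 404 and not r.is_handoff
    )
    if primary_404s >= quorum:
        return 404

    # Otherwise we cannot tell; ask the client to back off and retry.
    return 503


# The case from the commit message: three primaries time out and
# three handoffs answer 404.
timeouts_then_handoff_404s = (
    [NodeResponse(None, False)] * 3 + [NodeResponse(404, True)] * 3
)
print(choose_status(timeouts_then_handoff_404s, replicas=3))
```

Under these assumptions the handoff 404s carry no weight, so the example
falls through to 503 rather than reporting the object as missing.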

The situation gets a bit more complicated with erasure-coded data, but
the same basic principle applies. We're just more likely to have
confirmation that there *is* data out there; we just can't reconstruct
it (right now).

Note that we *still want to check* those handoffs, of course. Our
fail-in-place strategy has us replicate (and, more recently,
reconstruct) to handoffs to maintain durability; it'd be silly *not* to
look.

UpgradeImpact:
--------------
Be aware that this may cause an increase in 503 Service Unavailable
responses served by proxy-servers. However, this should more accurately
reflect the state of the system.
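
Clients that see more 503s may want to retry with exponential backoff, as
the message above hopes they do. A minimal sketch follows, assuming a
hypothetical `fetch` callable that performs one request and returns its
HTTP status code; the helper names, default delays, and cap are
illustrative and not part of Swift or swiftclient.

```python
import time
from typing import Callable, List


def backoff_delays(base: float, retries: int, cap: float) -> List[float]:
    """Delays that double on each retry, limited to `cap` seconds."""
    return [min(cap, base * 2 ** i) for i in range(retries)]


def get_with_backoff(
    fetch: Callable[[], int],
    retries: int = 4,
    base: float = 0.5,
    cap: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `fetch` and retry only on 503, waiting longer each time."""
    status = fetch()
    for delay in backoff_delays(base, retries, cap):
        if status != 503:
            break
        sleep(delay)
        status = fetch()
    return status


# Usage with a fake backend that recovers on the third attempt.
fake_statuses = iter([503, 503, 200])
recorded_sleeps: List[float] = []
final_status = get_with_backoff(
    lambda: next(fake_statuses), sleep=recorded_sleeps.append
)
print(final_status, recorded_sleeps)
```

A real client would usually add random jitter to each delay so that many
clients do not retry in lockstep; it is left out here to keep the sketch
deterministic.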

Co-Authored-By: Thiago da Silva <thiagodasilva@gmail.com>
Change-Id: Ia832e9bab13167948f01bc50aa8a61974ce189fb
Closes-Bug: #1837819
Related-Bug: #1833612
Related-Change: I53ed04b5de20c261ddd79c98c629580472e09961
Related-Change: Ief44ed39d97f65e4270bf73051da9a2dd0ddbaec
2019-08-01 14:07:39 -07:00
__init__.py Add license in swift code file 2017-07-05 10:07:11 +08:00
brain.py Make the decision between primary/handoff sets more obvious 2018-05-22 12:12:42 -07:00
common.py Increase node_timeout in gate 2019-02-12 10:39:17 -06:00
test_account_failures.py Remove executable flag from some test modules 2016-10-31 21:22:10 +00:00
test_account_get_fake_responses_match.py Remove executable flag from some test modules 2016-10-31 21:22:10 +00:00
test_account_reaper.py Follow up delayed reap probe test 2016-08-18 14:08:56 +01:00
test_container_failures.py Increase node_timeout in gate 2019-02-12 10:39:17 -06:00
test_container_merge_policy_index.py add symlink to probetest for reconciler 2017-12-14 12:16:39 -08:00
test_container_sync.py Symlink implementation. 2017-12-13 21:26:12 +00:00
test_db_replicator.py Apply remote metadata in _handle_sync_response 2018-03-06 19:52:59 +00:00
test_empty_device_handoff.py Remove executable flag from some test modules 2016-10-31 21:22:10 +00:00
test_object_async_update.py No longer import nose 2017-11-07 15:39:25 +11:00
test_object_conditional_requests.py Make If-None-Match:* work properly with 0-byte PUTs 2018-02-26 13:12:44 +00:00
test_object_expirer.py Refactoring, test infrastructure changes and cleanup 2018-05-15 18:18:25 +01:00
test_object_failures.py Ignore 404s from handoffs for objects when calculating quorum 2019-08-01 14:07:39 -07:00
test_object_handoff.py Ignore 404s from handoffs for objects when calculating quorum 2019-08-01 14:07:39 -07:00
test_object_metadata_replication.py Remove all post_as_copy related code and configes 2017-09-16 05:50:41 +00:00
test_object_partpower_increase.py Add support to increase object ring partition power 2017-06-15 15:08:48 -07:00
test_reconstructor_rebuild.py Rebuild frags for unmounted disks 2019-02-08 18:04:55 +00:00
test_reconstructor_revert.py Tolerate swiftclient *not* mutatinng args 2017-08-25 12:27:41 -07:00
test_replication_servers_working.py Stop overwriting reserved term 2019-03-12 08:53:18 +00:00
test_sharder.py sharding: Cache shard ranges for object writes 2019-07-11 10:40:38 -07:00
test_signals.py Use latest eventlet in probe tests 2018-09-19 14:59:32 -07:00