12 Commits

Author SHA1 Message Date
John Dickinson
0c1b485ad6 exclude utf8 tests under py3
These are known to not work until https://bugs.python.org/issue37093
is addressed in CPython upstream.

Change-Id: I4a6877907d14b632a9a477c887913488427b62b7
2019-10-29 20:12:05 +00:00
Tim Burke
1d7e1558b3 py3: (mostly) port probe tests
There's still one problem, though: since swiftclient on py3 doesn't
support non-ASCII characters in metadata names, none of the tests in
TestReconstructorRebuildUTF8 will pass.

Change-Id: I4ec879ade534e09c3a625414d8aa1f16fd600fa4
2019-09-04 10:17:45 -07:00
Clay Gerrard
ea8e545a27 Rebuild frags for unmounted disks
Change the behavior of the EC reconstructor to perform a fragment
rebuild to a handoff node when a primary peer responds with 507 to the
REPLICATE request.

Each primary node in an EC ring will sync with exactly three primary
peers: in addition to the left and right nodes, we now select a third node
from the far side of the ring.  If any of these partners responds as
unmounted, the reconstructor will rebuild its fragments to a handoff
node with the appropriate index.
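
The partner selection described above can be sketched roughly as follows. This is an illustration only: the function name `pick_sync_partners` is hypothetical, and reading "the far side of the ring" as an offset of half the ring size is an assumption, not something this commit message states.

```python
def pick_sync_partners(frag_index, ring_size):
    """Return the indexes of the three primary peers to sync with.

    Hypothetical sketch: "far side" is assumed to mean the node roughly
    half a ring away from the local fragment index.
    """
    left = (frag_index - 1) % ring_size
    right = (frag_index + 1) % ring_size
    far = (frag_index + ring_size // 2) % ring_size
    return [left, right, far]


# Example with a ring of six primaries.
partners_for_first = pick_sync_partners(0, 6)
partners_for_last = pick_sync_partners(5, 6)
```

The modulo arithmetic keeps the left and right neighbours well defined at both ends of the index range.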

To prevent ssync (which is uninterruptible) receiving a 409 (Conflict)
we must give the remote handoff node the correct backend_index for the
fragments it will receive.  In the common case we will use
deterministically different handoffs for each fragment index to prevent
multiple unmounted primary disks from forcing a single handoff node to
hold more than one rebuilt fragment.

Handoff nodes will continue to attempt to revert rebuilt handoff
fragments to the appropriate primary until it is remounted or
rebalanced.  After a rebalance of EC rings (potentially removing
unmounted/failed devices), it's most IO efficient to run in
handoffs_only mode to avoid unnecessary rebuilds.

Closes-Bug: #1510342

Change-Id: Ief44ed39d97f65e4270bf73051da9a2dd0ddbaec
2019-02-08 18:04:55 +00:00
Alistair Coles
e109c7800f Add probe test for ssync of unexpired metadata to an expired object
Verify that metadata can be sync'd to a frag that has missed a POST
and consequently that frag appears to be expired, when in fact the
POST removed the X-Delete-At header.

Tests the fix added by the Related-Change.

Related-Bug: #1683689
Related-Change: I919994ead2b20dbb6c5671c208823e8b7f513715
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I9af9fc26098893db4043cc9a8d05d772772d4259
2017-08-25 15:36:00 -07:00
Romain LE DISEZ
69df458254 Allow to rebuild a fragment of an expired object
When a fragment of an expired object was missing, the reconstructor
ssync job would send a DELETE sub-request. This leads to a situation
where, for the same object and timestamp, some nodes have a data file,
while others can have a tombstone file.

This patch forces the reconstructor to reconstruct a data file, even
for expired objects. DELETE requests are only sent for tombstoned
objects.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Closes-Bug: #1652323
Change-Id: I7f90b732c3268cb852b64f17555c631d668044a8
2017-08-04 23:05:08 +02:00
Romain LE DISEZ
091157fc7f Fix encoding issue in ssync_sender.send_put()
EC object metadata can currently have a mixture of bytestrings and
unicode.  The ssync_sender.send_put() method raises a
UnicodeDecodeError when it attempts to concatenate the metadata
values, if any bytestring has non-ascii characters.

The root cause of this issue is that the object server uses unicode
for the keys of some object metadata items that are received in the
footer of an EC PUT request, whereas all other object metadata keys
and values are persisted as bytestrings.

This patch fixes the bug by changing diskfile write_metadata()
function to encode all unicode metadata keys and values as utf8
encoded bytes before writing to disk. To cope with existing objects
that have a mixture of unicode and bytestring metadata, the diskfile
read_metadata() function is also changed so that all returned unicode
metadata keys and values are utf8 encoded. This ensures that
ssync_sender.send_put() (and any other caller of diskfile
read_metadata) only reads bytestrings from object metadata.
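
The normalisation described above might look something like the sketch below. It is written for Python 3, where `str` is the unicode text type, whereas the original patch targeted Python 2; the helper name `encode_metadata` is hypothetical and is not the diskfile function itself.

```python
def encode_metadata(metadata):
    """Return a copy of metadata with text keys and values as UTF-8 bytes.

    Hypothetical helper: items that are already bytes are left untouched,
    so a mixed dict comes out as bytes only.
    """
    def to_bytes(item):
        return item.encode('utf-8') if isinstance(item, str) else item

    return {to_bytes(key): to_bytes(value) for key, value in metadata.items()}


# A mixture of text and bytes, as could be found on an existing object.
mixed = {'X-Object-Meta-Name': 'caf\u00e9', b'Content-Length': b'4'}
encoded = encode_metadata(mixed)
```

Applying the same conversion on both write and read means callers never have to concatenate text with bytes.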

Closes-Bug: #1678018
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Ic23c55754ee142f6f5388dcda592a3afc9845c39
2017-04-19 18:05:52 +01:00
Alistair Coles
83750cf79c Fix UnicodeDecodeError in reconstructor _full_path function
Object paths can have non-ascii characters. Device dicts will
have unicode values. Forming a string using both will cause the
object path to be coerced to UTF8, which currently causes a
UnicodeDecodeError. This causes _get_response() to not return
and the reconstructor hangs.

The call to _full_path() is moved outside of _get_response()
(where its result is used in the exception handler logging)
so that _get_response() will always return even if _full_path()
raises an exception.

Unit tests are refactored to split out a new class with those
tests using an object name and the _full_path method, so that
the class can be subclassed to use an object name with non-ascii
characters.

Existing probe tests are subclassed to repeat using non-ascii
chars in object paths.

Change-Id: I4c570c08c770636d57b1157e19d5b7034fd9ed4e
Closes-Bug: 1679175
2017-04-18 14:07:01 +01:00
Alistair Coles
6574ce31ee EC: reconstruct using non-durable fragments
Previously the reconstructor would only reconstruct a missing fragment
when a set of ec_ndata other fragments was available, *all* of which
were durable. Since change [1] it has been possible to retrieve
non-durable fragments from object servers. This patch changes the
reconstructor to take advantage of [1] and use non-durable fragments.

A new probe test is added to test scenarios with a mix of failed and
non-durable nodes. The existing probe tests in
test_reconstructor_rebuild.py and test_reconstructor_durable.py were
broken. These were intended to simulate cases where combinations of
nodes were either failed or had non-durable fragments, but the test
scenarios defined were not actually created - every test scenario
broke only one node instead of the intent of breaking multiple
nodes. The existing tests have been refactored to re-use most of their
setup and assertion code, and merged with the new test into a single
class in test_reconstructor_rebuild.py.

test_reconstructor_durable.py is removed.

[1] Related-Change: I2310981fd1c4622ff5d1a739cbcc59637ffe3fc3

Change-Id: Ic0cdbc7cee657cea0330c2eb1edabe8eb52c0567
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Closes-Bug: #1624088
2016-11-03 16:54:09 +00:00
Gábor Antal
300d388825 Use more specific asserts in test/probe tests
Replace generic asserts with more specific assert methods.

e.g.: from assertTrue(sth == None) to assertIsNone(sth), or
assertTrue(isinstance(inst, type)) to assertIsInstance(inst, type), or
assertTrue(not sth) to assertFalse(sth).

The code becomes more readable, and a better description is shown on failure.
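
As an illustration, the three replacements could be exercised like this. The test class and its values are made up for the example and do not come from the probe tests.

```python
import io
import unittest


class TestSpecificAsserts(unittest.TestCase):
    """Made-up test case showing the more specific assert methods."""

    def test_examples(self):
        value = None
        inst = {}
        items = []
        self.assertIsNone(value)           # was: assertTrue(value == None)
        self.assertIsInstance(inst, dict)  # was: assertTrue(isinstance(inst, dict))
        self.assertFalse(items)            # was: assertTrue(not items)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestSpecificAsserts)
result = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0).run(suite)
```

On failure, assertIsInstance reports the offending object and the expected type, which assertTrue cannot do.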

Change-Id: I3768faa568e3964e726ecc48ac8cb133cb088284
2016-11-02 18:13:22 +00:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.
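
A small example of the portable form follows. The helper `first_item` is hypothetical and only shows the builtin in use.

```python
def first_item(iterable, default=None):
    """Return the first item of an iterable, or default if it is empty."""
    it = iter(iterable)
    # The builtin next() dispatches to it.next() on Python 2 and to
    # it.__next__() on Python 3, so this line works on both.
    return next(it, default)


numbers = iter([1, 2, 3])
first = next(numbers)
second = next(numbers)
```

The optional second argument to next() avoids a StopIteration when the iterator is exhausted.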

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
Clay Gerrard
a3559edc23 Exclude local_dev from sync partners on failure
If the primary left or right hand partners are down, the next best thing
is to validate the rest of the primary nodes.  Where the rest should
exclude not just the left and right hand partners - but ourself as well.

This fixes an accidental noop when partner node is unavailable and
another node is missing data.
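
The exclusion can be sketched as below. The function name, the argument names and the `index` key on each node dict are assumptions made for the illustration, not the reconstructor's actual interface.

```python
def remaining_primaries(part_nodes, local_index, partner_indexes):
    """Return the primaries left to validate when a partner is down.

    Hypothetical sketch: both the left/right partners and the local
    node itself are excluded from the result.
    """
    excluded = set(partner_indexes) | {local_index}
    return [node for node in part_nodes if node['index'] not in excluded]


# Six primaries; the local node is index 2, its partners are 1 and 3.
nodes = [{'index': i} for i in range(6)]
rest = remaining_primaries(nodes, local_index=2, partner_indexes=[1, 3])
```

Leaving the local node in the list would mean syncing with ourselves, which is the accidental noop the message refers to.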

Validation:

Add probetests to cover ssync failures for the primary sync_to nodes for
sync jobs.

Drive-by:

Make additional plumbing for the check_mount and check_dir constraints into
the remaining daemons.

Change-Id: I4d1c047106c242bca85c94b569d98fd59bb255f4
2015-05-26 12:50:31 -07:00
paul luse
647b66a2ce Erasure Code Reconstructor
This patch adds the erasure code reconstructor. It follows the
design of the replicator but:
  - There is no notion of update() or update_deleted().
  - There is a single job processor
  - Jobs are processed partition by partition.
  - At the end of processing a rebalanced or handoff partition, the
    reconstructor will remove successfully reverted objects if any.

It also makes various ssync changes, such as adding the reconstruct_fa()
function, called from ssync_sender, which performs the actual
reconstruction while sending the object to the receiver.

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
blueprint ec-reconstructor
Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51
2015-04-14 00:52:17 -07:00