EC object metadata can currently have a mixture of bytestrings and
unicode. The ssync_sender.send_put() method raises an
UnicodeDecodeError when it attempts to concatenate the metadata
values, if any bytestring has non-ascii characters.
The root cause of this issue is that the object server uses unicode
for the keys of some object metadata items that are received in the
footer of an EC PUT request, whereas all other object metadata keys
and values are persisted as bytestrings.
This patch fixes the bug by changing diskfile write_metadata()
function to encode all unicode metadata keys and values as utf8
encoded bytes before writing to disk. To cope with existing objects
that have a mixture of unicode and bytestring metadata, the diskfile
read_metadata() function is also changed so that all returned unicode
metadata keys and values are utf8 encoded. This ensures that
ssync_sender.send_put() (and any other caller of diskfile
read_metadata) only reads bytestrings from object metadata.
Closes-Bug: #1678018
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: Ic23c55754ee142f6f5388dcda592a3afc9845c39
IMHO we shouldn't ever trust the invalidations file so much we try to
skip a listdir when creating a hashes.pkl for the first time. There may
be some subtle races looking back on the related patch, and it's related
patches.
This just makes some assertions to help demonstrate we should maintain
the invariant of setting hashes to valid via listdir.
Change-Id: I767e34a405de7911e9596e038e58a9a29f57a8f8
Related-Change-Id: I08c8cf09282f737103e580c1f57923b399abe58c
The refactoring in the Related-Change separated EC
specific object controller tests into EC specific TestCase
classes, but left two EC specific tests in the Replication
object controller test class. This patch moves them to the
appropriate test class.
Previously the tests were only executed once, now they are
executed in each of two subclasses using different EC
policies. As a result it was necessary to make the test
container name unique to the policy under test.
Related-Change: Ifd3d0fa66773e640bb61cc528f7a1b2358e97d91
Change-Id: Ie712ea91b5dd74c504a0dd6aa40c3d657277108c
Due to the refactoring of TestObjectController (related-change),
all of BaseTestECObjectController test methods are not being
needed to be unpatched because they are expected to run for test
setup-ed policies.
This patch works for items as follows:
- Move part of setUp/tearDown routines at BaseTestObjectController
needed by only TestReplicatedObjectController which affects
patch_policies
- Remove all unpatch_policies from BaseTestECObjectController
- Set up self.ec_policy to avoid to set policy index and
retrieve the policy for each test method.
The reason why I didn't squash this up to the related parent patch is
to clarify what was changed at those patches. The parent is for just
clustering the tests for each test class and this one attempts to
improve.
Related-Change: Idd155401982a2c48110c30b480966a863f6bd305
Change-Id: I25a3f8fc837706d78dca226fe282d9e5ead65a0d
This is harmless, but honestly annoying. Introduced by the big
ssync change If003dcc6f4109e2d2a42f4873a0779110fff16d6 .
Change-Id: I1a7c0171e03b900ca75da39b152cdffe60abe135
Object paths can have non-ascii characters. Device dicts will
have unicode values. Forming a string using both will cause the
object path to be coerced to UTF8, which currently causes a
UnicodeDecodeError. This causes _get_response() to not return
and the recosntructor hangs.
The call to _full_path() is moved outside of _get_response()
(where its result is used in the exception handler logging)
so that _get_response() will always return even if _full_path()
raises an exception.
Unit tests are refactored to split out a new class with those
tests using an object name and the _full_path method, so that
the class can be subclassed to use an object name with non-ascii
characters.
Existing probe tests are subclassed to repeat using non-ascii
chars in object paths.
Change-Id: I4c570c08c770636d57b1157e19d5b7034fd9ed4e
Closes-Bug: 1679175
Source code should be licensed under the Apache 2.0 license.
Add Apache License in swift/common/ring/__init__.py file.
[H102 H103]:https://docs.openstack.org/developer/hacking/
Change-Id: Ic0fb19e50220c6c91d338759f4eeb786de6084ec
SUSE tests their OpenStack packages on openSUSE Leap 42.2 and SLES 12
SP2, so this patch updates the install guide to address those newer
releases.
Change-Id: Ice6e85fdc8ac77029fde205dd9c63d3b62de3137
This patch uses a more specific string to match
the ring name in recon cli.
Almost all the places in the project where need to get the
suffix (like ring.gz, ts, data and .etc) will include the '.'
in front, if a file named 'object.sring.gz' in the swift_dir
will be added in the ring_names, which is not what we want.
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Closes-Bug: #1680704
Change-Id: Ida659fa71585f9b0cf36da75b58b28e6a25533df
When drive with container or account database is unmounted
replicator pushes database to handoff location. But this
handoff location finds replica with unmounted drive and
pushes database to the *next* handoff until all handoffs has
a replica - all container/account servers has replicas of
all unmounted drives.
This patch solves:
- Consumption of iterator on handoff location that results in
replication to the next and next handoff.
- StopIteration exception stopped not finished loop over
available handoffs if no more nodes exists for db replication
candidency.
Regression was introduced in 2.4.0 with rsync compression.
Co-Author: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: I344f9daaa038c6946be11e1cf8c4ef104a09e68b
Closes-Bug: 1675500
SSYNC is designed to limit concurrent incoming connections in order to
prevent IO contention. The reconstructor should expect remote
replication servers to fail ssync_sender when the remote is too busy.
When the remote rejects SSYNC - it should avoid forcing additional IO
against the remote with a REPLICATE request which causes suffix
rehashing.
Suffix rehashing via REPLICATE verbs takes two forms:
1) a initial pre-flight call to REPLICATE /dev/part will cause a remote
primary to rehash any invalid suffixes and return a map for the local
sender to compare so that a sync can be performed on any mis-matched
suffixes.
2) a final call to REPLICATE /dev/part/suf1-suf2-suf3[-sufX[...]] will
cause the remote primary to rehash the *given* suffixes even if they are
*not* invalid. This is a requirement for rsync replication because
after a suffix is synced via rsync the contents of a suffix dir will
likely have changed and the remote server needs to update it hashes.pkl
to reflect the new data.
SSYNC does not *need* to send a post-sync REPLICATE request. Any
suffixes that are modified by the SSYNC protocol will call _finalize_put
under the hood as it is syncing. It is however not harmful and
potentially useful to go ahead refresh hashes after an SSYNC while the
inodes of those suffixes are warm in the cache.
However, that only makes sense if the SSYNC conversation actually synced
any suffixes - if SSYNC is rejected for concurrency before it ever got
started there is no value in the remote performing a rehash. It may be
that *another* reconstructor is pushing data into that same partition
and the suffixes will become immediately invalidated.
If a ssync_sender does not successful finish a sync the reconstructor
should skip the REPLICATE call entirely and move on to the next
partition without causing any useless remote IO.
Closes-Bug: #1665141
Change-Id: Ia72c407247e4525ef071a1728750850807ae8231
Some public functions in the diskfile manager expect or return full
file paths. It implies a filesystem diskfile implementation.
To make it easier to plug alternate diskfile implementations, patch
functions to take more generic arguments.
This commit changes DiskFileManager _get_hashes() arguments from:
- partition_path, recalculate=None, do_listdir=False
to :
- device, partition, policy, recalculate=None, do_listdir=False
Callers are modified accordingly, in diskfile.py, reconstructor.py,
and replicator.py
Change-Id: I8e2d7075572e466ae2fa5ebef5e31d87eed90fec
Probably the most common format for documenting arguments is reST field
lists [1]. This change updates some docstrings to comply with the field
lists syntax.
[1] http://sphinx-doc.org/domains.html#info-field-lists
Change-Id: I87e77a9bbd5bcb834b35460ce0adff5bc59d9168
Previously, requests involving DLOs would bypass versioned_writes:
* Any existing DLOs wouldn't get copied to the archive container during
overwrites (or deletes, with history-mode), so there would be no
evidence they had ever existed.
* Any new DLOs wouldn't copy overwritten objects to the archive
container, potentially leading to data loss.
Now, DLOs will behave like every other type of object under
versioned_writes.
Change-Id: I488e13eead2f33dd272d03f6f898adc52fc7fdad
Related-Change: Ie899290b3312e201979eafefb253d1a60b65b837
Related-Change: Ib5b29a19e1d577026deb50fc9d26064a8da81cd7
Closes-Bug: #1626989
Some public functions in the diskfile manager expect or return full
file paths. It implies a filesystem diskfile implementation.
To make it easier to plug alternate diskfile implementations, patch
functions to take more generic arguments.
This commit changes DiskFileManager yield_hashes() returned values
from:
- object_path, object_hash, timestamps
to:
- object_hash, timestamps
object_path was not used by any caller.
Change-Id: I914fb1ec8ce7c9c26d22e1d07f03bd03f4504176
Deprecate swift-temp-url and call python-swiftclient's
implementation instead. This adds python-swiftclient as an
optional dependency of Swift which is noted in releasenotes.
Change-Id: I0404f16c21099cb7695430f5b63722729c305613
The current test for reload does not actually verify the order in
which stop and start are called. Tighten that up to ensure stop is
called before start.
Change-Id: Iede6eb2049e3ab8f3810a69a6b77713de3c71399
Related-Change: I4bf57c8cdba6773ddc1e4013e2b2a9736dacada8
swift-auth-server was removed in 2011. While there are still some lingering
references in test_manager, it's not clear whether removing them would
reduce test coverage for things like bad pid files.
See https://github.com/openstack/swift/commit/bd22dbe
Change-Id: I4bf57c8cdba6773ddc1e4013e2b2a9736dacada8