The previous locking method would leave the lock dir lying around
if the process died unexpectedly, preventing other swift-recon-cron
processes from running successfully and requiring manual cleanup.
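One way to avoid stale locks is an advisory flock() on a lock file,
which the kernel releases whenever the process exits. A minimal sketch,
assuming a POSIX system; the helper name is hypothetical and this is not
necessarily the code used by this change:

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Hold an exclusive, non-blocking advisory lock on `path`.

    The kernel drops the lock when the descriptor is closed, which
    also happens if the process dies, so nothing needs cleaning up.
    Raises BlockingIOError if another holder already has the lock.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield fd
    finally:
        os.close(fd)
```

The lock file itself may stay on disk; only the kernel-held lock
matters, so a leftover file does not block the next run.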
Change-Id: Icb328b2766057a2a4d126f63e2d6dfa5163dd223
Provides a simple, experimental, CLI tool to generate a
composite ring from a list of component builder files.
For example:
    swift-ring-composer <composite-file> compose \
        <builder-file> <builder-file> --output <ring-file>
Commands available:
- compose: compose a list of builder files into a composite ring
- show: show the metadata for a composite ring
Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Change-Id: I25a79e71c13af352e19e4358f60545265b51584f
The sharder daemon visits container dbs and when necessary executes
the sharding workflow on the db.
The workflow is, in overview:
- perform an audit of the container for sharding purposes.
- move any misplaced objects that do not belong in the container
to their correct shard.
- move shard ranges from FOUND state to CREATED state by creating
shard containers.
- move shard ranges from CREATED to CLEAVED state by cleaving objects
to shard dbs and replicating those dbs. By default this is done in
batches of 2 shard ranges per visit.
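The state transitions above can be illustrated with a toy model. This
is only a sketch of the FOUND -> CREATED -> CLEAVED progression and the
batch size of 2; the function and constant names are hypothetical and
no real container or db is touched:

```python
FOUND, CREATED, CLEAVED = 'found', 'created', 'cleaved'
CLEAVE_BATCH_SIZE = 2  # default batch size named in the text


def visit(shard_ranges, batch_size=CLEAVE_BATCH_SIZE):
    """Advance a list of shard-range states by one sharder visit."""
    # Every FOUND range gets a shard container and becomes CREATED.
    states = [CREATED if s == FOUND else s for s in shard_ranges]
    # At most `batch_size` CREATED ranges are cleaved per visit.
    cleaved = 0
    for i, state in enumerate(states):
        if state == CREATED and cleaved < batch_size:
            states[i] = CLEAVED
            cleaved += 1
    return states
```

In this model, five FOUND ranges need three visits before all of them
reach CLEAVED.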
Additionally, when the auto_shard option is True (NOT yet recommended
in production), the sharder will identify shard ranges for containers
that have exceeded the threshold for sharding, and will also manage
the sharding and shrinking of shard containers.
The manage_shard_ranges tool provides a means to manually identify
shard ranges and merge them to a container in order to trigger
sharding. This is currently the recommended way to shard a container.
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: I7f192209d4d5580f5a0aa6838f9f04e436cf6b1f
I'm not really clear on why a sqlite3.OperationalError should cause us to
retry with stale_reads_ok=True, but swift.common.exceptions.LockTimeout
*definitely* should.
Change-Id: I707dec1d11b8db80bc8fbee30662b319bf10d6a5
Similar to the object replicator and reconstructor, these arguments
are comma-separated lists of device names and partitions,
respectively, on which the account or container replicator will
operate. Other devices and partitions are ignored.
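The filtering described above could look roughly like this. A sketch
only; both helper names are hypothetical and the real replicator code
may differ:

```python
def parse_csv_option(value):
    """Split a comma-separated option into a list of non-empty items."""
    return [item.strip() for item in value.split(',') if item.strip()]


def should_process(device, partition, devices, partitions):
    """Return True unless an override list excludes this device/partition.

    An empty override list means "no restriction".
    """
    if devices and device not in devices:
        return False
    if partitions and str(partition) not in partitions:
        return False
    return True
```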
Change-Id: Ic108f5c38f700ac4c7bcf8315bf4c55306951361
Bring under test
- test/unit/cli/test_dispersion_report.py
- test/unit/cli/test_info.py and
- test/unit/cli/test_relinker.py
I've verified that swift-*-info (at least) behave reasonably under
py3, even swift-object-info when there's non-utf8 metadata on the
data/meta file.
Change-Id: Ifed4b8059337c395e56f5e9f8d939c34fe4ff8dd
This has been deprecated since Swift 2.10.0 (Newton) including a
message that it would go away. Let's actually remove it.
Change-Id: I7d3659761c71119363ff2c0c750e37b4c6374a39
Related-Change: Ifa8bf636f20f82db4845b02d1b58699edaa39356
...rather than only comparing the ETag from the last response over and
over again.
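The idea is to check the ETag of every response, not just the last one.
A sketch, assuming each response is represented by a dict of headers
(the function name is hypothetical):

```python
def etags_consistent(responses):
    """Return True if every response reports the same ETag.

    `responses` is an iterable of header dicts, one per replica.
    """
    etags = {headers.get('etag') for headers in responses}
    return len(etags) <= 1
```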
NB: This tool *does not* like EC data :-(
Change-Id: Idd37f94b07f607ab8a404dd986760361c39af029
Closes-Bug: 1266636
Add a --drop-prefixes flag to swift-account-info, swift-container-info,
and swift-object-info. This makes the output between the three more
consistent.
Change-Id: I98252ff74c4983eaad0a93d9a9fc527c74ffce68
This patch makes the dispersion report tool importable, and thus more
easily usable within other Python tools. This can also be used in a
follow-up patch to add some tests for the report tool.
It also fixes a bug when using the "--dump-json" option - until now it
returned the policy name and made the JSON invalid.
Change-Id: Ie0d52a1a54fc152bb72cbb3f84dcc36a8dad972a
Clarify in usage statement and man pages that CLI override options for
swift-object-reconstructor and swift-object-replicator only have
effect when --once is used.
Also add a link to object reconstructor source code docs to the doc
index page for consistency with the other object services.
Change-Id: If348b340d59a672d3a19d4df231ebdb74f4aed51
If swift-recon/swift-get-nodes/swift-object-info is used with the
swiftdir option they will read rings from the given directory; however
they are still using /etc/swift/swift.conf to find the policies on the
current node.
This makes it impossible to maintain a local swift.conf copy (if you
don't have write access to /etc/swift) or check multiple clusters from
the same node.
Until now swift-recon was also not usable with storage policy aliases;
this patch fixes this as well.
Closes-Bug: 1577582
Closes-Bug: 1604707
Closes-Bug: 1617951
Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Change-Id: I13188d42ec19e32e4420739eacd1e5b454af2ae3
This patch adds methods to increase the partition power of an existing
object ring without downtime for the users using a 3-step process. Data
won't be moved to other nodes; objects using the new increased partition
power will be located on the same device and are hardlinked to avoid
data movement.
1. A new setting "next_part_power" will be added to the rings, and once
the proxy server has reloaded the rings it will send this value to the
object servers on any write operation. Object servers will now create a
hard link in the new location to the original DiskFile object. Existing
data will be hard-linked into the new locations by a new relinker tool.
2. The actual partition power itself will be increased. Servers will now
use the new partition power for reads and writes. Hard links in the old
object locations are no longer required and have to be removed by the
relinker tool; the relinker tool reads the next_part_power setting
to find object locations that need to be cleaned up.
3. The "next_part_power" flag will be removed.
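Why no data has to move can be seen from how a partition is derived
from an object's hash. A simplified sketch: it takes the top bits of an
MD5 digest and leaves out the cluster-specific hash prefix/suffix; the
function name is hypothetical:

```python
import struct
from hashlib import md5


def partition_for(name, part_power):
    """Map a name to a partition using the top `part_power` bits
    of its MD5 digest (cluster-specific hash prefix/suffix omitted)."""
    top32 = struct.unpack_from('>I', md5(name.encode('utf-8')).digest())[0]
    return top32 >> (32 - part_power)


old_power, new_power = 10, 11
for name in ('/a/c/o1', '/a/c/o2', '/a/c/o3'):
    old_part = partition_for(name, old_power)
    new_part = partition_for(name, new_power)
    # Adding one bit of partition power splits each partition in two.
    assert new_part in (2 * old_part, 2 * old_part + 1)
```

Each old partition P maps to new partitions 2P and 2P+1, so a ring can
keep both on the device that held P and the objects only need a hard
link in the new directory.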
This mostly implements the spec in [1]; however it's not using an
"epoch" as described there. The idea of the epoch was to store data
using different partition powers in their own namespace to avoid
conflicts with auditors and replicators as well as being able to abort
such an operation and just remove the new tree. This would require some
heavy change of the on-disk data layout, and other object-server
implementations would be required to adopt this scheme too.
Instead the object-replicator is now aware that there is a partition
power increase in progress and will skip replication of data in that
storage policy; the relinker tool should simply be run and afterwards
the partition power will be increased. This shouldn't take that much
time (it's only walking the filesystem and hardlinking), so the impact
should be low. The relinker should be run on all storage nodes at the
same time in parallel to decrease the required time (though this is not
mandatory). Failures during relinking should not affect cluster
operations - relinking can even be aborted manually and restarted later.
Auditors do not quarantine objects written to a path with a different
partition power and therefore work as before (though in the worst case
they read each object twice until the no longer needed hard links are
removed).
Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
[1] https://specs.openstack.org/openstack/swift-specs/specs/in_progress/increasing_partition_power.html
Change-Id: I7d6371a04f5c1c4adbb8733a71f3c177ee5448bb
- Verify .ring.gz path exists if ring file is the first argument.
- Code Refactoring:
- swift/cli/info.parse_get_node_args()
- Respective test cases for info.parse_get_node_args()
Closes-Bug: #1539275
Change-Id: I0a41936d6b75c60336be76f8702fd616d74f1545
Signed-off-by: Sachin Patil <psachin@redhat.com>
Fixes this problem:
* swift-drive-audit needs to be run by root, because only root has
"umount" permission
* swift-object servers typically run as user swift
* if swift-drive-audit is run by root, /var/cache/swift/drive.recon is
owned by root, with 0o600
* recon middleware (inside swift-object-server) can't read this cache
file: swift-object: Error reading recon cache file
This patch adds a "user" option to the drive-audit config file. The
recon cache is chowned to this user.
Change-Id: Ibf20543ee690b7c5a37fabd1540fd5c0c7b638c9
swift-recon-cron looks at the drives mounted in directories below
/srv/node, but before this commit, it tried to call listdir() on
everything in this directory, even if it is not a directory.
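The fix amounts to skipping entries that are not directories before
calling listdir() on them. A sketch with a hypothetical helper name:

```python
import os


def mounted_device_dirs(devices_root):
    """Yield sub-directories of `devices_root`, skipping plain files."""
    for name in sorted(os.listdir(devices_root)):
        path = os.path.join(devices_root, name)
        if os.path.isdir(path):
            yield path
```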
Change-Id: Id281352f7ab6ecb520eb00f3649873d8c8678608
Signed-off-by: Stefan Majewsky <stefan.majewsky@sap.com>
F812 list comprehension redefines <variable> from line ...
While the current violations were benign, this sort of code can easily
lead to subtle bugs. Seems worth checking, especially given how cheap it
is to bring existing code in line with it.
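A small illustration of the pattern this check flags (the variable
names are made up for the example):

```python
x = 10
squares = [x * x for x in range(3)]
# Python 2: the comprehension's loop variable leaks, so x is now 2.
# Python 3: the comprehension has its own scope, so x is still 10.
print(x)
```

Renaming the loop variable removes the ambiguity under both versions.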
Change-Id: Ibdcf9f93b85a1f1411198001df6bdbfa8f92d114
python-swiftclient includes an improved and tested method to generate
tempurls. The command syntax is essentially the same, therefore we can
deprecate this one by importing that method.
python-swiftclient is not added as a requirement; if the import fails
due to a missing swiftclient module it will just raise a deprecation
warning.
Closes-Bug: #1607523
Closes-Bug: #1607519
Change-Id: Ifa8bf636f20f82db4845b02d1b58699edaa39356
If the curl command is used exactly as in the help, the ampersand
in the signature is interpreted as an operator and the curl
command breaks. I am aware of developers who have wasted a lot of
time because of this.
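The safe form quotes the whole URL so the shell does not treat "&" as
a control operator. A sketch using a placeholder URL (host, signature
and expiry are made up):

```python
import shlex

# Hypothetical temp URL; the signature and expiry are placeholders.
url = ('https://swift.example.com/v1/AUTH_test/c/o'
       '?temp_url_sig=abc123&temp_url_expires=1700000000')

# Unquoted, the shell would cut the command at "&" and run the first
# part in the background. Quoting keeps the query string intact.
print('curl ' + shlex.quote(url))
```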
Change-Id: I6468c9a098b56db8242a2cf2c23b7a4857bd8574
Running swift-init with -h, --help, or no arguments
displays help for the command. The help does not
document the 'main', 'all', and 'rest' options.
These are documented in the man page.
This patch adds all these server options in the
help of swift-init.
Change-Id: I8e27589912ae72ace14c955e66b86942bc23d9f7
Closes-Bug: #1580722
As much as anything, I'm just tired of seeing a bunch of piecemeal
fixes.
Note that we *need* to include
    from __future__ import print_function
in order to support things like
    print() # Would print "()" (the repr of an empty tuple) otherwise
    print(foo, end='') # Would SyntaxError
    print(bar, file=sys.stderr) # Would SyntaxError
Change-Id: I8fdf0740e292eb1ee785512d02e8c552781dcae1
If you have 2 swift regions served by the same keystone,
then the client cannot get the correct URL for the swift endpoint
without specifying a region_name.
Closes-Bug: 1587088
Change-Id: Iaab883386e125c3ca6b9554389e63df17267a135
Extended the use of the DatabaseBroker "stale_reads_ok" flag to the
AccountBroker and ContainerBroker. Now checks for an sqlite3 error
from the _commit_puts call that processes the pending files.
If this error is raised, then the stale_reads_ok flag will be checked
to determine how to proceed as opposed to simply raising.
The first time that print_info is attempted, the flag will be
false, but swift-[account|container]-info will check for the
raised exception. If it was raised, then a warning is reported
that the data may be stale, and another attempt will be
made using the stale_reads_ok=True flag.
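The retry flow can be sketched as follows. The wrapper name is
hypothetical, and `print_info` here is any callable with a
stale_reads_ok keyword standing in for the real function:

```python
import sqlite3


def print_info_with_fallback(print_info, db_path):
    """Call print_info, retrying once with stale_reads_ok=True.

    `print_info` is any callable taking (db_path, stale_reads_ok=...);
    it stands in for the real function here.
    """
    try:
        return print_info(db_path, stale_reads_ok=False)
    except sqlite3.OperationalError:
        print('Warning: pending updates could not be committed; '
              'data may be stale')
        return print_info(db_path, stale_reads_ok=True)
```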
Change-Id: I761526eef62327888c865d87a9caafa3e7eabab6
Closes-Bug: 1531302
An additional info log message was added for the case where
drive-audit runs without unmounting any failed device.
Change-Id: I11abee40a712b6c6de65e63626b6f7f0a9c9f4c7