For this commit, ssync is just a direct replacement for how
we use rsync. Assuming we switch over to ssync completely
someday and drop rsync, we will then be able to improve the
algorithms even further (removing local objects as we
successfully transfer each one rather than waiting for whole
partitions, using an index.db with hash-trees, etc., etc.)
For easier review, this commit can be thought of in distinct
parts:
1) New global_conf_callback functionality for allowing
services to perform setup code before workers, etc. are
launched. (This is then used by ssync in the object
server to create a cross-worker semaphore to restrict
concurrent incoming replication.)
2) A bit of shifting of items up from object server and
replicator to diskfile or DEFAULT conf sections for
better sharing of the same settings. conn_timeout,
node_timeout, client_timeout, network_chunk_size,
disk_chunk_size.
3) Modifications to the object server and replicator to
optionally use ssync in place of rsync. This is done in
a generic enough way that switching to FutureSync should
be easy someday.
4) The biggest part, and (at least for now) a completely
optional one, consists of the new ssync_sender and
ssync_receiver files. Nice and isolated for easier
testing and visibility into test coverage, etc.
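As a rough illustration of item 1, the sketch below shows a callback that
runs once before workers are launched and stores a shared semaphore in the
global conf. The function signatures, the 'replication_concurrency' and
'replication_semaphore' names, and the default of 4 are assumptions made
for this sketch, not details taken from this commit:

```python
import multiprocessing


def global_conf_callback(preloaded_app_conf, global_conf):
    """Runs once in the parent process, before any worker is launched."""
    # Hypothetical option name and default, for illustration only.
    limit = int(preloaded_app_conf.get('replication_concurrency', 4))
    global_conf['replication_semaphore'] = \
        multiprocessing.BoundedSemaphore(limit)


def try_incoming_replication(global_conf, handler):
    """Run handler() only if a replication slot is free."""
    semaphore = global_conf['replication_semaphore']
    if not semaphore.acquire(False):  # non-blocking attempt
        return None  # a real server would answer "busy" here
    try:
        return handler()
    finally:
        semaphore.release()
```

Because the semaphore is created before the workers exist, every worker
inherits the same object, which is what makes the limit cross-worker.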
All the usual logging, statsd, recon, etc. instrumentation
is still there when using ssync, just as it is when using
rsync.
Beyond the essential error and exceptional condition
logging, I have not added any additional instrumentation at
this time. Unless there is something someone finds super
pressing to have added to the logging, I think such
additions would be better as separate change reviews.
FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
CLUSTERS. Some of us will be using it in a limited fashion to look
for any subtle issues, tuning, etc. but generally ssync is
an experimental feature. In its current implementation it is
probably going to be a bit slower than rsync, but if all
goes according to plan it will end up much faster.
There are no comparisons yet between ssync and rsync other
than some raw virtual machine testing I've done to show it
should compete well enough once we can put it in use in the
real world.
If you Tweet, Google+, or whatever, be sure to indicate it's
experimental. It'd be best to keep it out of deployment
guides, howtos, etc. until we all figure out if we like it,
find it to be stable, etc.
Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32
Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62
Signed-off-by: Peter Portante <peter.portante@redhat.com>
Mea culpa: these two scripts were missed in commit:
https://review.openstack.org/46956
Fixes bug 1235441
Change-Id: I4303bc808448a79bddbb991526b0cca26150b392
Signed-off-by: Peter Portante <peter.portante@redhat.com>
except x,y: was deprecated and is removed in Python 3.x.
Use "except x as y:" instead which works in any Python
version >= 2.6.
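A minimal example of the replacement syntax (parse_port is a made-up
helper used only to show the form):

```python
def parse_port(value):
    """Return the port as an int, or an error string if it is invalid."""
    try:
        return int(value)
    except ValueError as err:  # was: except ValueError, err:
        return 'invalid port %r: %s' % (value, err)
```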
Change-Id: I7008c74b807340f3457d3a0c8bd0b83f23169d14
- Makes swift-dispersion-populate a bit faster when using a larger
dispersion_coverage with a larger part_power.
- Adds option to only run population for container OR objects
- Adds option to let you resume population at given point (useful if you
need to resume population after a previous run error'd out or the
like) by specifying which suffix to start at.
The original populate just randomly used uuid4().hex as a suffix on the
container/object names until all the required partitions were covered.
This isn't a big deal if you're only doing 1% coverage on a ring with a
small part power but takes ages if you're doing 100% on a larger ring.
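The idea of a predictable suffix that can be resumed can be sketched as
follows. This is a simplified stand-in, not the tool's code: get_part()
hashes the bare name with md5 instead of going through the ring, and the
'dispersion_' prefix and function names are hypothetical:

```python
import hashlib
import struct


def get_part(name, part_power):
    """Simplified partition lookup: top part_power bits of an md5 hash."""
    digest = hashlib.md5(name.encode('utf-8')).digest()
    return struct.unpack_from('>I', digest)[0] >> (32 - part_power)


def names_covering(parts_needed, part_power, prefix='dispersion_',
                   start_suffix=0):
    """Walk integer suffixes until every needed partition has a name.

    Returns ({partition: name}, next_suffix); next_suffix can be passed
    back as start_suffix to resume after an interrupted run.
    """
    remaining = set(parts_needed)
    found = {}
    suffix = start_suffix
    while remaining:
        name = '%s%d' % (prefix, suffix)
        part = get_part(name, part_power)
        if part in remaining:
            remaining.discard(part)
            found[part] = name
        suffix += 1
    return found, suffix
```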
Change-Id: I52f890a774412c1d6179f12db9081aedc58b6bc2
We're working on adding enforcement that things have appropriate
copyright license headers. In anticipation of that, fix the files that don't
have them.
Change-Id: Ie0a9fd5eece5b6671ff4389b07b69ca29be7d017
The swift-dispersion-populate and swift-dispersion-report tools now
accept a --insecure option.
Also, dispersion.conf now has a keystone_api_insecure option.
Default is obviously to use the secure path.
DocImpact
Change-Id: I4000352e547d9ce5b08ade54e0c886281caff891
With write affinity enabled, it's necessary to wait until
replication has moved things to their proper homes before
running a delete request. With write affinity turned on, only
nodes in the local region will get the object right after a PUT request.
Fix bug #1198926
Change-Id: I3aa8933d45c47a010ae05561e12176479e7c9bcc
Making it possible to override the default set of regexes
used to search for device block errors in the log file. Also making
the log file naming pattern configurable. Both are set in the
drive-audit.conf file.
Updating the "Detecting Failed Drives" section of the admin guide as well.
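A minimal sketch of how configurable regexes could be picked up and
applied. The 'regex_pattern_' option prefix, the default patterns and the
function names are assumptions made for this illustration; check the
sample drive-audit.conf for the real option names:

```python
import re

# Stand-in defaults; the real defaults live in the drive-audit script.
DEFAULT_PATTERNS = [r'\berror\b.*\b(sd[a-z]{1,2}\d?)\b',
                    r'\b(sd[a-z]{1,2}\d?)\b.*\berror\b']


def load_error_patterns(conf):
    """Use regexes from conf if any are set, else fall back to defaults."""
    custom = [value for key, value in sorted(conf.items())
              if key.startswith('regex_pattern_')]
    return [re.compile(p) for p in (custom or DEFAULT_PATTERNS)]


def find_error_devices(lines, patterns):
    """Count log lines per device; group 1 of each regex is the device."""
    counts = {}
    for line in lines:
        for pattern in patterns:
            match = pattern.search(line)
            if match:
                dev = match.group(1)
                counts[dev] = counts.get(dev, 0) + 1
                break
    return counts
```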
Change-Id: I7bd3acffed196da3e09db4c9dcbb48a20bdd1cf0
'devices' is set in object-server.conf on each node, not in the ring data,
and the output printed here is only for viewing, not for running, so
just leave a note here. (The change at https://review.openstack.org/#/c/23951/
is used for running, so a note alone is not enough there.)
This commit is marked as a bug fix because, as Chmouel said, this script
is the last place that uses /srv/node directly rather than taking it from conf.
Fixes import change on read_metadata.
fixes bug #885006
Change-Id: I727ec2d01c093af61fd3895e5701d87ef67cd9ff
Instead of blacklisting all Hacking checks globally,
only blacklist those whose violations currently occur frequently
(to be handled in a later follow-up patch), and fix the rest. In
detail:
H101 Use TODO(NAME)
H201 no 'except:' at least use 'except Exception:'
H231 octal number 022 should be written as 0o22
H401 docstring should not start with a space
H701 Empty localization string
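For instance, code that satisfies H201 and H231 looks like this (safe_int
and UMASK are made-up names for the example):

```python
def safe_int(value, default=0):
    """Convert value to int, falling back to default on any error."""
    try:
        return int(value)
    except Exception:  # H201: not a bare 'except:'
        return default


UMASK = 0o22  # H231: written as 0o22 rather than 022
```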
Change-Id: Ib3b3d56b68d1cf15d3b67ac9749fcbdb876dc52a
For systems with very large numbers of partitions, 1% dispersion
coverage may simply be too much/take too long. This fix allows <1
values to be used for dispersion_coverage.
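To give a feel for the numbers, here is the arithmetic as a small sketch.
The function name is hypothetical and the truncation to an int is an
assumption of this example:

```python
def partitions_to_cover(part_power, coverage_percent):
    """Number of partitions a given dispersion_coverage works out to."""
    partition_count = 2 ** part_power
    return int(partition_count * float(coverage_percent) / 100.0)
```

With a part power of 18 there are 262144 partitions, so 1% coverage means
2621 partitions while 0.1% means 262.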
DocImpact
Change-Id: I5ed35b69754d55a410e66e658b3854de57c7666b
The old format is still present and works just like it did before, so
your existing scripts won't break.
New format pros:
* it's readable even for Swift newcomers
* it's easy to extend
* it's familiar to anyone who's used a Unix command line
* we don't have to maintain the parser
New format cons:
* you can't add multiple devices in one go
Old format pros:
* you can add many devices with one command
* it's compact
Old format cons:
* it confuses newcomers
* "wait, is that zone dash IP colon port slash device, or zone slash
IP dash port colon meta underscore device?" Just try walking
someone through adding a device over voice chat.
* it's annoying to add new fields
Note that this only affects the command "swift-ring-builder
<builderfile> add". Other swift-ring-builder commands are unchanged.
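The trade-off can be seen in a small sketch that parses both styles into
the same dictionary. The regex reflects one reading of the old compact
syntax, and the long option names and the use of argparse are assumptions
of this example rather than a copy of the script:

```python
import argparse
import re

# Assumed old layout: [r<region>]z<zone>-<ip>:<port>/<device>[_<meta>]
OLD_FORMAT = re.compile(
    r'^(?:r(?P<region>\d+))?z(?P<zone>\d+)-(?P<ip>[\d.]+):(?P<port>\d+)'
    r'/(?P<device>[^_]+)(?:_(?P<meta>.*))?$')


def parse_old(spec, weight):
    """Parse the compact device string plus a separate weight."""
    match = OLD_FORMAT.match(spec)
    if match is None:
        raise ValueError('bad device spec: %r' % spec)
    return {'region': int(match.group('region') or 1),
            'zone': int(match.group('zone')),
            'ip': match.group('ip'),
            'port': int(match.group('port')),
            'device': match.group('device'),
            'meta': match.group('meta') or '',
            'weight': float(weight)}


def parse_new(argv):
    """Parse the long-option style; the library does the parsing for us."""
    parser = argparse.ArgumentParser(prog='add')
    parser.add_argument('--region', type=int, default=1)
    parser.add_argument('--zone', type=int, required=True)
    parser.add_argument('--ip', required=True)
    parser.add_argument('--port', type=int, required=True)
    parser.add_argument('--device', required=True)
    parser.add_argument('--meta', default='')
    parser.add_argument('--weight', type=float, required=True)
    return vars(parser.parse_args(argv))
```

Adding a field to parse_new is one add_argument call, whereas the old
style needs another delimiter and a regex change.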
DocImpact
Change-Id: I034b7f79eb6f4d81a5c4da193e1358741441c5b5
Two types of parallelism are added:
- concurrency to speed up what a single process does
- a way to run multiple daemons to work on different parts of the work
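The two kinds of parallelism can be sketched like this. The hash-modulo
split, the 'processes'/'process'/'concurrency' parameter names and the
thread pool are assumptions of this example (a stand-in for whatever the
daemon actually uses):

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def belongs_to(item, processes, process):
    """Decide whether this daemon (number `process`) handles the item."""
    if processes <= 0:
        return True  # a single daemon handles everything
    digest = hashlib.md5(item.encode('utf-8')).hexdigest()
    return int(digest, 16) % processes == process


def run_share(items, handler, processes=0, process=0, concurrency=1):
    """Handle this daemon's share of items, `concurrency` at a time."""
    mine = [item for item in items
            if belongs_to(item, processes, process)]
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(handler, mine))
```

Each item lands in exactly one daemon's share, so several daemons can run
side by side without coordinating with each other.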
DocImpact
Change-Id: I48997f68eb2fd8de19a5ee8b9fcdf76dde2ba0ab
When a new device is added to the builder, the add_dev function returns
its unique id.
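A toy version of the behaviour (MiniBuilder is hypothetical, and assigning
the next list index as the id is a simplification):

```python
class MiniBuilder(object):
    """Toy stand-in for a ring builder that only tracks devices."""

    def __init__(self):
        self.devs = []

    def add_dev(self, dev):
        """Add a device dict and return the id it was given."""
        dev = dict(dev)
        dev.setdefault('id', len(self.devs))
        self.devs.append(dev)
        return dev['id']
```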
blueprint argparse-in-swift-ring-builder
Change-Id: I57080bb625e812f6cea71199df907a44b332b552
The large part of RingBuilder.search_devs that parsed the complex format
of a device search string has moved to the swift-ring-builder script.
In its place, search_devs now has a simple interface for searching devices.
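One plausible shape for such a simple interface, written as a standalone
function that takes a dict of values to match (the dict-based signature
is an assumption of this sketch):

```python
def search_devs(devs, search_values):
    """Return the devices whose fields equal every given search value."""
    return [dev for dev in devs
            if all(dev.get(key) == value
                   for key, value in search_values.items())]
```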
blueprint argparse-in-swift-ring-builder
Change-Id: If3dd77b297b474fb9a058e4693fef2dfb11fca3d
Instances of the RingBuilder class can store their data to a disk file with
the save method and load it with the load method.
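A toy round trip, assuming pickle as the on-disk format and load as a
classmethod (SavableBuilder and both choices are assumptions of this
sketch):

```python
import os
import pickle
import tempfile


class SavableBuilder(object):
    """Toy builder holding just enough state to show save/load."""

    def __init__(self, part_power=0):
        self.part_power = part_power
        self.devs = []

    def save(self, path):
        """Write the builder state to a file."""
        state = {'part_power': self.part_power, 'devs': self.devs}
        with open(path, 'wb') as fp:
            pickle.dump(state, fp, protocol=2)

    @classmethod
    def load(cls, path):
        """Build a new instance from a file written by save()."""
        with open(path, 'rb') as fp:
            state = pickle.load(fp)
        builder = cls(state['part_power'])
        builder.devs = state['devs']
        return builder


path = os.path.join(tempfile.mkdtemp(), 'demo.builder')
original = SavableBuilder(part_power=10)
original.devs.append({'id': 0, 'device': 'sdb1', 'weight': 100.0})
original.save(path)
restored = SavableBuilder.load(path)
```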
blueprint argparse-in-swift-ring-builder
Change-Id: I69fdf0693ca9f520d235a795ecdd2da310dcd5d3
When swift-bench is run in direct mode, don't try to delete the
containers which weren't created.
Fixes bug 1177960.
Change-Id: Ice07e8729bb776e2b215894cf95fb80b64167a8d
Allow Swift daemons and servers to optionally accept a directory as the
configuration parameter. Directory based configuration leverages
ConfigParser's native multi-file support. Files ending in '.conf' in the
given directory are parsed in lexicographical order. Filenames starting with
'.' are ignored. A mixture of file and directory configuration paths is not
supported - if the configuration path is a file, behavior is unchanged.
* update swift-init to search for conf.d paths when building servers
(e.g. /etc/swift/proxy-server.conf.d/)
* new script swift-config can be used to inspect the cumulative configuration
* pull a little bit of code out of run_wsgi and test separately
* fix example config bug for the proxy server's client_disconnect option
* added section on directory based configuration to deployment guide
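The directory handling described above can be sketched with the standard
library alone. read_conf_dir is a hypothetical helper, the file names are
made up, and the Python 3 configparser module is used here for a runnable
example:

```python
import os
import tempfile
from configparser import ConfigParser


def read_conf_dir(path):
    """Read one conf file, or every visible *.conf file in a directory."""
    parser = ConfigParser()
    if os.path.isfile(path):
        parser.read(path)  # plain file: behavior unchanged
        return parser
    names = sorted(name for name in os.listdir(path)
                   if name.endswith('.conf') and not name.startswith('.'))
    # Later files override earlier ones, so sort order matters.
    parser.read([os.path.join(path, name) for name in names])
    return parser


conf_dir = tempfile.mkdtemp(suffix='.conf.d')
for name, body in [
        ('00_base.conf', '[DEFAULT]\nbind_port = 8080\nworkers = 1\n'),
        ('10_local.conf', '[DEFAULT]\nworkers = 8\n'),
        ('.hidden.conf', '[DEFAULT]\nworkers = 99\n'),
        ('notes.txt', 'not a conf file\n')]:
    with open(os.path.join(conf_dir, name), 'w') as fp:
        fp.write(body)
merged = read_conf_dir(conf_dir)
```

Here merged ends up with workers = 8 from 10_local.conf and bind_port =
8080 from 00_base.conf; the hidden file and the .txt file are skipped.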
DocImpact
Implements: blueprint confd
Change-Id: I89b0f48e538117f28590cf6698401f74ef58003b
If we set a device's weight to zero, the balance is currently set to a
special value (999.99) until the zero-weighted device gives up all
its partitions. So we cannot check whether the balance has changed.
Thus we need to check whether balance or last_balance is the special value.
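A minimal sketch of such a check. The function name, the 1.0 threshold
and the exact condition are assumptions for illustration and may differ
from the actual fix:

```python
MAX_BALANCE = 999.99  # special value reported while a device drains


def balance_changed_enough(last_balance, balance, threshold=1.0):
    """Decide whether a rebalance result is worth saving."""
    if MAX_BALANCE in (last_balance, balance):
        # The special value hides the real balance, so comparing the
        # two numbers says nothing; treat the rebalance as a change.
        return True
    return abs(last_balance - balance) >= threshold
```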
Change-Id: I5b7db8b8e48db0c4771c51a764bda689869817d5
Fixes: bug #1171731
Support separate replication ip address:
- Added a new function in utils. This function provides the ability
to select a separate IP address for the replication service.
- Db_replicator and the object replicators were changed.
The replication process now uses the new function.
Replication network parameters:
- Support for the replication network fields (replication_ip,
replication_port) was added to the device dictionary in the
swift-ring-builder script.
- Changes were made to support the new fields in the search, show and
set_info functions.
Implementation of replication servers:
- Separate replication servers use the same code as normal replication
servers, but with replication_server parameter = True. When using a
separate replication network, the non-replication servers set
replication_server = False. When there is no separate replication
network (the default case), replication_server is not included in the config.
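The address selection can be pictured with a small helper.
replication_endpoint is a hypothetical name, and falling back to the
regular ip/port when the replication fields are missing is an assumption
of this sketch:

```python
def replication_endpoint(dev):
    """Pick the (ip, port) that replication traffic should use."""
    ip = dev.get('replication_ip') or dev['ip']
    port = dev.get('replication_port') or dev['port']
    return ip, port
```

A device with replication_ip set is reached over the replication network;
one without it keeps using its normal address.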
DocImpact
Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0
Implements: blueprint dedicated-replication-network
* The format_device algorithm was changed to make it simpler to extend
with the new IP address parameters.
* Some print output was replaced by calls to the format_device function.
Change-Id: I8565d42fcdb62eeb398c4432bb6f499c27c05cf6
The indentation of swift-ring-builder output was changed to match
the old style (from before the implementation of the region tier).
Change-Id: I0d1cc7acdc5baf86f343745aea6fc2120838fd36
Folks have actually been asking for this. I think they're sending a
DELETE TempURL to someone way ahead of time and the someone issues it
when they're ready. Honestly, I'm not entirely sure of the use case,
but having the set of methods configurable wouldn't hurt.
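For context, a TempURL signature is an HMAC-SHA1 over the method, expiry
and path, so a DELETE URL has to be signed specifically for DELETE. The
sketch below shows that plus a configurable method check; method_allowed,
its space-separated option value and the 'GET HEAD PUT' default are
assumptions of this example:

```python
import hmac
from hashlib import sha1


def tempurl_sig(method, expires, path, key):
    """Sign method, expiry timestamp and path with the account key."""
    body = '%s\n%d\n%s' % (method, expires, path)
    return hmac.new(key.encode('utf-8'), body.encode('utf-8'),
                    sha1).hexdigest()


def method_allowed(method, conf_methods='GET HEAD PUT'):
    """Check a request method against the configured set."""
    return method.upper() in conf_methods.split()
```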
Change-Id: Ibdb48f8a72077b045eeedddfae4c0a1f56098d7a
swift-ring-builder uses an outdated unbound class method to handle
unsupported commands. This worked for Python 2.6 or older but not
for Python 2.7 or newer. This patch fixes the problem.
Change-Id: I7dbc681ef6be44f6d79ff93189ccca13c51eab74
Fixes: bug #1154882
Using the root logger makes logging from public modules fail on an unknown
keyword. Changing the logger name to the module's own name avoids this,
and disabling logger.propagate prevents duplicate output.
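In standard-library terms the change amounts to something like the
following sketch (get_module_logger and the 'demo.public_module' name are
made up for the example):

```python
import io
import logging


def get_module_logger(name, stream):
    """Return a logger named after the module, isolated from the root."""
    logger = logging.getLogger(name)
    logger.propagate = False  # records do not also reach root handlers
    if not logger.handlers:
        logger.addHandler(logging.StreamHandler(stream))
    logger.setLevel(logging.INFO)
    return logger


stream = io.StringIO()
log = get_module_logger('demo.public_module', stream)
log.warning('disk nearly full')
```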
Change-Id: I18696d124ebac9ca970d502558972e51de759097
Fixes: bug #1105133