54 Commits

Christian Schwede
9729bc83eb Don't delete misplaced dbs if not replicated
If one uses only a single replica and a database file is placed on the
wrong partition, it will be removed instead of being replicated to the
correct partition.

There are two reasons for this:
1. The list of nodes is empty when there is only a single replica.
2. all(responses) is True even when there are no responses at all, and
   the list of responses is always empty when there is no node to
   replicate to.

This patch fixes this by adding a special case to the node selection
loop for the single-replica case, and by ensuring that the list of
responses is not empty.  It also adds a test that fails on current
master and passes with this change.
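
A minimal sketch of the pitfall (names hypothetical, not the actual
replicator code):

    def delete_misplaced_db():
        print('removing handoff copy')  # stand-in for the real cleanup

    responses = []  # no nodes to replicate to -> no responses collected
    # all() over an empty sequence is vacuously True, so the old check
    # concluded the db was safely replicated everywhere:
    assert all(responses) is True
    # the fix requires at least one actual successful response:
    if responses and all(responses):
        delete_misplaced_db()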

Closes-Bug: 1568591

Change-Id: I028ea8c1928e8c9a401db31fb266ff82606f8371
2016-05-12 07:28:40 +00:00
Shashirekha Gundur
cf48e75c25 change default ports for servers
Changing the recommended ports for Swift services
from ports 6000-6002 to unused ports 6200-6202,
so they do not conflict with X Windows or other services.

Updated SAIO docs.
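
The resulting defaults, as a sketch (one [DEFAULT] section per server
config):

    # object-server.conf
    [DEFAULT]
    bind_port = 6200

    # container-server.conf
    [DEFAULT]
    bind_port = 6201

    # account-server.conf
    [DEFAULT]
    bind_port = 6202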

DocImpact
Closes-Bug: #1521339
Change-Id: Ie1c778b159792c8e259e2a54cb86051686ac9d18
2016-04-29 14:47:38 -04:00
Peter Lisák
16de32f168 Log error if a local device is not identified in replicator
Example:
* Different port in config and in ring file.
* Running daemon on server not in ring file.
In both cases the replication daemon is running but nothing is replicated.
An error log entry makes it easy to spot that a local device could not
be identified.
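
A rough sketch of the check this adds (wording and helper usage are
approximations, not the exact patch):

    from swift.common.ring.utils import is_local_device

    local_devs = [dev for dev in ring.devs if dev and is_local_device(
        my_ips, my_port, dev['replication_ip'], dev['replication_port'])]
    if not local_devs:
        logger.error('Cannot identify a local device from the ring '
                     '(ips=%s, port=%s); nothing will be replicated',
                     my_ips, my_port)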

Closes-Bug: 1508228
Change-Id: I99351b7d9946f250b7750df91c13d09352a145ce
2015-11-21 14:04:32 +01:00
Zack M. Davis
1b8b08039a remove remaining simplejson uses, prefer standard library import
a1c32702, 736cf54a, and 38787d0f remove uses of `simplejson` from
various parts of Swift in favor of the standard library `json`
module (introduced in Python 2.6). This commit performs the remaining
`simplejson` to `json` replacements, removes two comments highlighting
quirks of simplejson with respect to Unicode, and removes the references
to it in setup documentation and requirements.txt.

There were a lot of places where we were importing json from
swift.common.utils, which is less intuitive than a direct `import json`,
so that replacement is made as well.
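
The mechanical change, as a sketch:

    # before
    from swift.common.utils import json
    # after
    import json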

(And in two more tiny drive-bys, we add some pretty-indenting to an XML
fragment and use `super` rather than naming a base class explicitly.)

Change-Id: I769e88dda7f76ce15cf7ce930dc1874d24f9498a
2015-11-16 12:34:24 -08:00
Lisak, Peter
b6b7578190 node_timeout as float in configs
It is more convenient to use a float node_timeout for fine-tuning latency.
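
For example (node_timeout is the existing option; the value is
illustrative):

    [account-replicator]
    node_timeout = 0.5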

Change-Id: I7c57bba053711a27d3802efe6f2a0bf53483a54f
2015-10-19 21:14:34 +02:00
Matthew Oliver
4a13dcc4a8 Make db_replicator usync smaller containers
The current rule inside the db_replicator is to rsync+merge
containers during replication if the difference between rowids
differ by more than 50%:

  # if the difference in rowids between the two differs by
  # more than 50%, rsync then do a remote merge.
  if rinfo['max_row'] / float(info['max_row']) < 0.5:

This means that smaller containers, which have only a few rows and
differ by a small number, still rsync+merge rather than copying rows.

This change adds a new condition: the difference in the rowids must
be greater than the defined per_diff, otherwise usync will be used:

  # if the difference in rowids between the two differs by
  # more than 50% and the difference is greater than per_diff,
  # rsync then do a remote merge.
  # NOTE: difference > per_diff stops us from dropping to rsync
  # on smaller containers, who have only a few rows to sync.
  if rinfo['max_row'] / float(info['max_row']) < 0.5 and \
          info['max_row'] - rinfo['max_row'] > self.per_diff:
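
For example, with per_diff at its default of 1000, a local db at
max_row = 10 and a remote at max_row = 4 give a ratio of 0.4 (under the
0.5 cutoff), but the difference of 6 rows is far below per_diff, so
with this change the pair is usync'd instead of rsync+merged.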

Change-Id: I9e779f71bf37714919a525404565dd075762b0d4
Closes-bug: #1019712
2015-10-19 15:26:12 +01:00
janonymous
f5f9d791b0 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3, replacing it.
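
The mechanical change, as a runnable sketch:

    import unittest

    class ExampleTest(unittest.TestCase):
        def test_status(self):
            status = 200  # stand-in value
            # self.assertEquals(status, 200) is the deprecated alias
            self.assertEqual(status, 200)

    if __name__ == '__main__':
        unittest.main()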

Change-Id: Ida206abbb13c320095bb9e3b25a2b66cc31bfba8
Co-Authored-By: Ondřej Nový <ondrej.novy@firma.seznam.cz>
2015-10-11 12:57:25 +02:00
Romain LE DISEZ
71f6fd025e Allow configuring the rsync modules where the replicators send data
Currently, the rsync module where the replicators send data is static. This
prevents administrators from adapting the rsync configuration to their
deployment or needs.

As an example, the rsyncd configuration example encourages setting a
connection limit for the account, container and object modules. This
protects devices from excessive parallel connections, which would hurt
performance.

On a server with many devices, it is tempting to increase this number
proportionally, but nothing guarantees that the distribution of the
connections will be balanced. In the worst case, a single device can
receive all the connections, severely impacting performance.

This commit adds a new option named 'rsync_module' to the *-replicator sections
of the *-server configuration file. This configuration variable can be
interpolated with device attributes like ip, port, device, zone, ... by using
the format {NAME}. eg:
    rsync_module = {replication_ip}::object_{device}

With this configuration, an administrator can solve the problem of connection
distribution by creating one module per device in the rsyncd configuration,
as sketched below.
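
A hedged rsyncd.conf sketch of one module per device (paths and limits
illustrative):

    [object_sda1]
    path = /srv/node/sda1
    read only = false
    max connections = 4
    lock file = /var/lock/object_sda1.lock

    [object_sdb1]
    path = /srv/node/sdb1
    read only = false
    max connections = 4
    lock file = /var/lock/object_sdb1.lock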

The default values are backward compatible:
    {replication_ip}::account
    {replication_ip}::container
    {replication_ip}::object

Option vm_test_mode is deprecated by this commit, but backward compatibility is
maintained. The option is only effective when rsync_module is not set. In that
case, {replication_port} is appended to the default value of rsync_module.

Change-Id: Iad91df50dadbe96c921181797799b4444323ce2e
2015-09-07 08:00:18 +02:00
janonymous
c5b5cf91a9 test/unit: Replace python print statement with print function (pep H233, py33)
'print' function is compatible with 2.x and 3.x python versions
Link : https://www.python.org/dev/peps/pep-3105/

Python 2.6 has a __future__ import that removes print as language syntax,
letting you use the functional form instead
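
The mechanical change, as a sketch:

    from __future__ import print_function

    count = 3  # stand-in value
    # print 'replicated %d dbs' % count   # py2-only statement form
    print('replicated %d dbs' % count)    # works on 2.6+ and 3.x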

Change-Id: I94e1bc6bd83ad6b05695c7ebdf7cbfd8f6d9f9af
2015-07-28 21:03:05 +05:30
janonymous
cd7b2db550 unit tests: Replace "self.assert_" by "self.assertTrue"
The assert_() method is deprecated and can be safely replaced by assertTrue().
This patch makes sure that running the tests does not create undesired
warnings.

Change-Id: I0602ba39ef93263386644ee68088d5f65fcb4a71
2015-07-21 19:23:00 +05:30
Victor Stinner
1cc3eff958 Fixes for mock 1.1
The new release of mock 1.1 is more strict. It helped to find bugs in
tests.

Closes-Bug: #1473369
Change-Id: Id179513c6010d827cbcbdda7692a920e29213bcb
2015-07-10 16:37:11 +02:00
Darrell Bishop
df134df901 Allow 1+ object-servers-per-disk deployment
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs.  The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring.  In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket.  The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk.  The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring.  The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).
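
A hedged config sketch (servers_per_port is the option this change
adds; the value is illustrative):

    # object-server.conf
    [DEFAULT]
    servers_per_port = 3
    # with one unique port per disk in the ring, each port gets three
    # dedicated workers, so each worker only ever serves one disk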

Immediately stopping all swift-object-worker processes still works by
sending the parent a SIGTERM.  Likewise, a SIGHUP to the parent process
still causes the parent process to close all listen sockets and exit,
allowing existing children to finish serving their existing requests.
The drop_privileges helper function now has an optional param to
suppress the setsid() call, which otherwise screws up the child workers'
process management.

The class method RingData.load() can be told to only load the ring
metadata (i.e. everything except replica2part2dev_id) with the optional
kwarg, header_only=True.  This is used to keep the parent and all
forked off workers from unnecessarily having full copies of all storage
policy rings in memory.

A new helper class, swift.common.storage_policy.BindPortsCache,
provides a method to return a set of all device ports in all rings for
the server on which it is instantiated (identified by its set of IP
addresses).  The BindPortsCache instance will track mtimes of ring
files, so they are not opened more frequently than necessary.

This patch includes enhancements to the probe tests and
object-replicator/object-reconstructor config plumbing to allow the
probe tests to work correctly both in the "normal" config (same IP but
unique ports for each SAIO "server") and a server-per-port setup where
each SAIO "server" must have a unique IP address and unique port per
disk within each "server".  The main probe tests only work with 4
servers and 4 disks, but you can see the difference in the rings for the
EC probe tests where there are 2 disks per server for a total of 8
disks.  Specifically, swift.common.ring.utils.is_local_device() will
ignore the ports when the "my_port" argument is None.  Then,
object-replicator and object-reconstructor both set self.bind_port to
None if server_per_port is enabled.  Bonus improvement for IPv6
addresses in is_local_device().

This PR for vagrant-swift-all-in-one will aid in testing this patch:
https://github.com/swiftstack/vagrant-swift-all-in-one/pull/16/

Also allow SAIO to answer is_local_device() better; common SAIO setups
have multiple "servers" all on the same host with different ports for
the different "servers" (which happen to match the IPs specified in the
rings for the devices on each of those "servers").

However, you can configure the SAIO to have different localhost IP
addresses (e.g. 127.0.0.1, 127.0.0.2, etc.) in the ring and in the
servers' config files' bind_ip setting.

This new whataremyips() implementation combined with a little plumbing
allows is_local_device() to accurately answer, even on an SAIO.

In the default case (an unspecified bind_ip defaults to '0.0.0.0') as
well as an explicit "bind to everything" like '0.0.0.0' or '::',
whataremyips() behaves as it always has, returning all IP addresses for
the server.

Also updated probe tests to handle each "server" in the SAIO having a
unique IP address.

For some (noisy) benchmarks that show servers_per_port=X is at least as
good as the same number of "normal" workers:
https://gist.github.com/dbishop/c214f89ca708a6b1624a#file-summary-md

Benchmarks showing the benefits of I/O isolation with a small number of
slow disks:
https://gist.github.com/dbishop/fd0ab067babdecfb07ca#file-results-md

If you were wondering what the overhead of threads_per_disk looks like:
https://gist.github.com/dbishop/1d14755fedc86a161718#file-tabular_results-md

DocImpact

Change-Id: I2239a4000b41a7e7cc53465ce794af49d44796c6
2015-06-18 12:43:50 -07:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.
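
The mechanical change, as a sketch:

    it = iter([1, 2, 3])
    # it.next()      # py2-only method
    print(next(it))  # builtin, works on py2 and py3 -> prints 1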

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
Jenkins
2ea8bae389 Merge "Allow rsync to use compression" 2015-05-13 10:58:08 +00:00
Prashanth Pai
a6f630f27c fsync() on directories
The renamer() method now does an fsync on the containing directory of the
target path, and also on the parent dirs of newly created directories, by
default. This can be explicitly turned off in cases where it is not
necessary (for example, quarantines).

The following article explains why this is necessary:
http://lwn.net/Articles/457667/
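
A minimal sketch of fsyncing a directory (the helper name is
hypothetical; this is not the actual renamer() code):

    import os

    def fsync_dir(dirpath):
        # a directory must be opened read-only to fsync it
        fd = os.open(dirpath, os.O_RDONLY)
        try:
            os.fsync(fd)  # persist the directory entries themselves
        finally:
            os.close(fd)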

Although it is the right thing to do, this change does come at a
performance penalty; even so, no configurable option is provided to
turn it off.

Also, lock_path() inside invalidate_hash() was always creating part of the
object path in the filesystem, and those directories were never fsync'd.
This has been fixed.

Change-Id: Id8e02f84f48370edda7fb0c46e030db3b53a71e3
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2015-03-04 12:33:56 +05:30
Prashanth Pai
9c33bbde69 Allow rsync to use compression
From rsync's man page:
-z, --compress
With this option, rsync compresses the file data as it is sent to the
destination machine, which reduces the amount of data being transmitted --
something that is useful over a slow connection.

A configurable option has been added to allow rsync to compress, but only
if the remote node is in a different region than the local one.
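
A sketch of the new setting in a replicator section (assuming the
option is named rsync_compress and defaults to off):

    [container-replicator]
    rsync_compress = true  # pass --compress only for cross-region peers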

NOTE: Objects that are already compressed (for example: .tar.gz, .mp3)
might slow down the syncing process.

On wire compression can also be extended to ssync later in a different
change if required. In case of ssync, we could explore faster
compression libraries like lz4. rsync uses zlib which is slow but offers
higher compression ratio.

Change-Id: Ic9b9cbff9b5e68bef8257b522cc352fc3544db3c
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2015-03-02 14:39:58 +05:30
Jenkins
8cf9107022 Merge "Fix large out of sync out of date containers" 2015-01-13 13:58:39 +00:00
Harshit
99fa8b3f8e Removing commented out test in test_db_replicator
It removes the test_dispatch test from test_db_replicator,
which had been commented out for a while.

Change-Id: Ia28fa923a65ad7d85804cbf6f7acef244741bab1
Closes-Bug: #1408502
2015-01-10 01:07:45 -08:00
Clay Gerrard
404ac092d1 Fix large out of sync out of date containers
As I understand it, db replication starts with a preflight sync request
to the remote container server, whose response will include the last
synced row_id that it has on file for the sending node's database id.

If the difference in the last sync point returned is more than 50% of
the local sending db's rows, it'll fall back to sending the whole db
over rsync and let the remote end merge items locally - but generally
there's just a few rows missing and they're shipped over the wire as
json and stuffed into some rather normal looking merge_items calls.

The one thing that's a bit different with these remote merge_items calls
(compared to your average run of the mill eat a bunch of entries out of
a .pending file) is the source kwarg.  When this optional kwarg comes
into merge_items it's the remote sending db's uuid, and after we eat all
the rows it sent us we update our local incoming_sync table for that
uuid so that next time when it makes its pre-flight sync request we can
tell it where it left off.

Now normally the sending db is going to push out its rows up from the
returned sync_point in 1000 item diffs, up to 10 batches total (per_diff
and max_diffs options) - 10K rows.  If that goes well then everything is
in sync up to at least the point it started, and the sending db will
*also* ship over *its* incoming_sync rows to merge_syncs on the remote
end.  Since the sending db is in sync with those other dbs up to those
points, so is the remote db now, by way of the transitive property.  Also
note that, through some weird artifact that I'm not entirely convinced
isn't an unrelated and possibly benign bug, the incoming_sync table on the
sending db will often also happen to include its own uuid - maybe it
got pushed back to it from another node?

Anyway, that seemed to work well enough until a sending db got diff
capped (i.e. sent its 10K rows and wasn't finished); when this happened
the final merge_syncs call never got sent, because the remote end is
definitely *not* up to date with the other databases that the sending db
is - it's not even up-to-date with the sending db yet!  But the hope is
certainly that on the next pass it'll be able to finish sending the
remaining items.  But since the remote end is the one that decides what
the last successfully synced row with this local sending db was - it's
super important that the incoming_sync table is getting updated in
merge_items when that source kwarg is there.

I observed this simple and straightforward process wasn't working well
in one case - which is weird considering it didn't have much in the way
of tests.  After I had the test and started looking into it, it seemed
maybe the source kwarg handling got over-indented a bit in the bulk insert
merge_items refactor.  I think this is correct - maybe we could send
someone up to the mountain temple to seek out gholt?
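
A hedged sketch of the bug class described above (helper names are
illustrative, not the actual merge_items code):

    def merge_items(items, source=None):
        for item in items:
            insert_row(item)  # bulk insert of the shipped rows
            # BUG: when the block below is indented into this loop (or
            # an inner branch), the sync point update is skipped or
            # misapplied
        if source:
            # FIX: runs exactly once, after all rows are merged, so the
            # remote end records where this sender left off
            update_incoming_sync(source, items[-1]['ROWID'])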

Change-Id: I4137388a97925814748ecc36b3ab5f1ac3309659
2015-01-07 17:20:35 -08:00
Clay Gerrard
233e0aebf7 Fix reclaim on deleted containers
The common db replicator's code path for reclaiming deleted dbs beyond the
reclaim age was not covered by unittests, and an AttributeError snuck in.  In
writing the test to cover the common code both for accounts and
containers, I discovered another KeyError in the container conditional for
validating the container's fully reported status.

This fixes both those issues and adds additional tests for the cleanup of
empty account and container partition and suffix directories.

Change-Id: I2a1bfaefebd05b01231bf71dd908fcc49adb4c36
2014-12-03 17:10:15 -08:00
Caleb Tennis
d40cebfe55 Clean up empty account and container partitions directories.
Because we iterate over these directories on a replication run,
and they are not (previously) cleaned up, the time to start the
replication increases incrementally for each stale directory
lying around.  Thousands of directories across dozens of disks
on a single machine can make for non-trivial startup times.

Plus it just seems like good housekeeping.
Closes-Bug: #1396152

Change-Id: Iab607b03b7f011e87b799d1f9af7ab3b4ff30019
2014-12-02 16:24:32 -05:00
Takashi Kajinami
7a0c4d2482 Remove invalid connection checking in db_replicator
Account/Container-replicator checks connection generation and timeout
in the HTTP REPLICATE request in _repl_to_node, but it doesn't actually
check the connection, only the construction of the ReplConnection class.
This patch removes that invalid check.

Change-Id: Ie6b4062123d998e69c15638b741e7d1ba8a08b62
Closes-Bug: #1359018
2014-11-25 00:00:05 +09:00
Clay Gerrard
a14d2c857c Enqueue misplaced objects during container replication
After a container database is replicated, a _post_replicate_hook will enqueue
misplaced objects for the container-reconciler into the .misplaced_objects
containers.  Items to be reconciled are "batch loaded" into the reconciler
queue at the end of a container replication cycle by leveraging container
replication itself.

DocImpact
Implements: blueprint storage-policies
Change-Id: I3627efcdea75403586dffee46537a60add08bfda
2014-06-18 21:09:50 -07:00
Clay Gerrard
81bc31e6ec Merge container storage_policy_index
Keep status_changed_at in container databases current with status changes that
occur as a result of container creation, deletion, or re-creation.

Merge container put/delete/created timestamps when handling replicate
responses from remote servers in addition to during the handling of the
REPLICATE request.

When storage policies are configured on a cluster send status_changed_at,
object_count and storage_policy_index as part of container replication sync
args.

Use status_changed_at during replication to determine the oldest active
container and merge storage_policy_index.

DocImpact
Implements: blueprint storage-policies
Change-Id: Ib9a0dd42c271145e641437dc04d0ebea1e11fc47
2014-06-18 20:57:09 -07:00
Clay Gerrard
7624b198cf Update FakeRing and FakeLogger
FakeLogger gets better log level handling

Parameterize the logger on some daemons which were previously
unparameterized and try to use the interface in tests.

FakeRing use more real code

The existing FakeRing mock's implementation bit me with a pretty subtle
character encoding issue by bypassing the hash_path code that is normally
part of get_part_nodes.  This change tries to exercise more of the real
ring code paths when it makes sense and provide a better Fake for use in
testing.

Add write_fake_ring helper to test.unit for when you need a real ring.

DocImpact
Implements: blueprint storage-policies
Change-Id: Id2e3740b1dd569050f4e083617e7dd6a4249027e
2014-06-18 17:31:37 -07:00
Pete Zaitcev
a7cfcc3d7a Relocate DATADIR to backends
It simply makes sense that the definition of DATADIR belongs to
backends. After all, some of them may not even have any.

Coincidentally, a few unnecessary imports are dropped.

By the way, on the object server side, diskfile.py provides DATADIR
in the same way already.

Change-Id: I60bfd522c77c4a0ee13697a2e31141777c7e2398
2014-04-01 23:22:22 -06:00
Samuel Merritt
09ef06fd99 Convert all old-style classes to new-style
This cleanup has been slowly happening for a while; let's finish it.

Change-Id: I1561e3540d524834e0cc5bc725ab80936eae1f0e
2014-03-03 17:28:48 -08:00
Cristian A Sanchez
fdc775d6d5 Increases the UT coverage of db_replicator.py
Adds 20 unit tests to increase the coverage of db_replicator.py
from 71% to 90%

Change-Id: Ia63cb8f2049fb3182bbf7af695087bfe15cede54
Closes-Bug: #948179
2013-12-03 14:00:22 -03:00
Gonéri Le Bouder
14c5b547f2 test: improve db_replicator coverage
This patch adds a test for ReplicatorRpc.complete_rsync()
and completes extract_device() coverage.

test_extract_device:
  tests the case where the parameter is invalid

test_complete_rsync_with_bad_input:
  ensures that invalid parameters return a 404 error

test_complete_rsync:
  validates the returned status code on success

Change-Id: I59e0d26a1efe59d8beff1e81c2a7edc6de0872e9
2013-11-21 17:28:07 +01:00
Peter Portante
9411a24ba7 Revert "Refactor common/utils methods to common/ondisk"
This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32

Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-10-07 17:18:09 -04:00
ZhiQiang Fan
f72704fc82 Change OpenStack LLC to Foundation
Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58
Closes-bug: #1214176
2013-09-20 01:02:31 +08:00
Peter Portante
7760f41c3c Refactor common/utils methods to common/ondisk
Place all the methods related to on-disk layout and / or configuration
into a new common module that can be shared by the various modules
using the same on-disk layout.

Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-17 17:32:04 -04:00
Peter Portante
56593a1323 Pep8 unit test modules w/ <= 20 violations (6 of 12)
Change-Id: I7317beb97e1530cb18c62da55ccf4c64206ff362
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-01 16:12:42 -04:00
Alex Gaynor
ff5a6d0111 Corrected many style violations in the tests.
I focused primarily on F-category violations; they are all but eliminated
with this patch.

Change-Id: I343f6945c97984ed1093bc347b6def6994297041
2013-07-24 10:18:47 -07:00
Vladimir Vechkanov
1f7d2a60d6 Refactor and add tests for db_replicator
* Create classes for testing the _repl_to_node and replicate_object functions
  to prevent code duplication by moving all preparation into a setUp function.
* Move the existing test functions which test _repl_to_node and
  replicate_object into the created classes.
* Add tests for the replicate_object and _repl_to_node functions.

Change-Id: I75ac7c6f0230e71bfb24328e44c33734b520b4cd
2013-07-02 17:03:28 +04:00
gholt
fef2afd927 Fixed Bug 1187200
See Bug 1187200 for a full description of the problem.

Part 1:

X-Delete-At-Container added to X-Delete-At-* info

This fixes the bug by passing the expiring-objects-account's
container name onward to the backend object servers. This is in case
the object servers' expiring_objects_container_divisor happens to be
different from the proxy server's; we want to make sure the host,
partition, and device match up with the container name. Different
container names would be fine, but not with mismatched host,
partition, and device info.

Part 2:

The db_replicator now double checks the disk path's partition against
the partition the ring gives back. If they don't match, it logs the
problem but continues to replicate the database to where it should be
and, on success to all proper nodes, removes the local out of place
database.

Bug 1187200

Change-Id: Id0873a3f2198ce285fe0b0c777738eff38bc2438
2013-06-08 20:00:32 +00:00
Vladimir Vechkanov
fd3b64bb16 Fix problem with changing class attribute
The get_repl_missing_table attribute of the FakeBroker class was changed in
the test_replicate_object_quarantine function and never restored. That's
why subsequent test cases got unexpected values from FakeBroker.

fixes bug 1180354
Change-Id: Iba55255771e6483832c7782fcbe331e20e818f4e
2013-05-21 23:21:50 +04:00
Sergey Kraynev
ea7858176b Implementation of replication servers
Support a separate replication IP address:
- Added new function in utils. This function provides ability
  to select separate IP address for replication service.
- Db_replicator and object replicators were changed.
  Replication process uses new function now.

Replication network parameters:
- Replication network fields (replication_ip, replication_port)
  support was added to device dictionary in swift-ring-builder script.
- Changes were made to support new fields in search, show and set_info
  functions.

Implementation of replication servers:
- Separate replication servers use the same code as normal replication
  servers, but with replication_server parameter = True.  When using a
  separate replication network, the non-replication servers set
  replication_server = False.  When there is no separate replication
  network (the default case), replication_server is not included in the config.
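
A hedged swift-ring-builder example of a device with a separate
replication endpoint (the rNzN-ip:portRip:port/device add syntax
follows the ring-builder conventions; values illustrative):

    swift-ring-builder container.builder add \
        r1z1-192.168.1.10:6001R10.0.0.10:6001/sdb1 100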

DocImpact
Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0
Implements: blueprint dedicated-replication-network
2013-04-21 18:14:42 -04:00
Greg Lange
30e88fd676 add unit tests for db_replicator
Change-Id: I9002fa193a51f40523e7936e3117a2f3f2b2f7f8
2013-04-04 18:45:24 +00:00
Greg Lange
44f00a23c1 fixed some minor things in tests that pyflakes complained about
Change-Id: Ifeab56a964630bcf941e932fcbe39e6572e62975
2013-03-26 20:42:26 +00:00
gholt
4e5889d6ce Refactor db_replicator's roundrobin_datadirs
roundrobin_datadirs was returning any .db file at any depth in the
accounts/containers structure. Since xfs corruption can cause such
files to appear in odd places at times (only happened on one drive of
ours so far, but still...), I've refactored this function to only
return .db files at the proper depth.
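
A rough sketch of the depth constraint (hypothetical helper, not the
actual implementation): only yield .db files found exactly at
partition/suffix/hash depth:

    import os

    def walk_dbs(datadir):
        # expected layout: <datadir>/<partition>/<suffix>/<hash>/<hash>.db
        for part in sorted(os.listdir(datadir)):
            part_path = os.path.join(datadir, part)
            if not os.path.isdir(part_path):
                continue
            for suffix in os.listdir(part_path):
                suff_path = os.path.join(part_path, suffix)
                if not os.path.isdir(suff_path):
                    continue
                for hsh in os.listdir(suff_path):
                    db = os.path.join(suff_path, hsh, hsh + '.db')
                    if os.path.isfile(db):
                        yield db  # stray .db files elsewhere are ignored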

Change-Id: Id06ef6584941f8a572e286f69dfa3d96fe451355
2012-11-15 21:44:14 +00:00
gholt
b5509b1bee Db reclamation should remove empty suffix dirs
When a db is reclaimed it removes the hash dir the db files are in,
but it does not try to remove the parent suffix dir though it might
be empty now. This eventually leads to a bunch of empty suffix dirs
lying around. This patch fixes that by attempting to remove the
parent suffix dir after a hash dir reclamation.

Here's a quick script to see how bad a given drive might be:

import os, os.path, sys
if len(sys.argv) != 2:
    sys.exit('%s <mount-point>' % sys.argv[0])
in_use = 0
empty = 0
containers = os.path.join(sys.argv[1], 'containers')
# walk <mount-point>/containers/<partition>/<suffix> and count suffix
# dirs that still contain hash dirs vs. ones left empty by reclamation
for p in os.listdir(containers):
    partition = os.path.join(containers, p)
    for s in os.listdir(partition):
        suffix = os.path.join(partition, s)
        if os.listdir(suffix):
            in_use += 1
        else:
            empty += 1
print in_use, 'in use,', empty, 'empty,', '%.02f%%' % (
    100.0 * empty / (in_use + empty)), 'empty'

And here's a quick script to clean up a drive:
NOTE THAT I HAVEN'T ACTUALLY RUN THIS ON A LIVE NODE YET!

import errno, os, os.path, sys
if len(sys.argv) != 2:
    sys.exit('%s <mount-point>' % sys.argv[0])
containers = os.path.join(sys.argv[1], 'containers')
for p in os.listdir(containers):
    partition = os.path.join(containers, p)
    for s in os.listdir(partition):
        suffix = os.path.join(partition, s)
        try:
            # rmdir only succeeds on empty dirs, so in-use suffixes are safe
            os.rmdir(suffix)
        except OSError, err:
            if err.errno not in (errno.ENOENT, errno.ENOTEMPTY):
                print err

Change-Id: I2e6463a4cd40597fc236ebe3e73b4b31347f2309
2012-10-25 19:42:56 +00:00
lrqrun
7b664c99e5 Fix PEP8 issues in ./test/unit/common .
Fix some pep8 issues in
       modified:   test_bufferedhttp.py
       modified:   test_constraints.py
       modified:   test_db.py
       modified:   test_db_replicator.py
       modified:   test_init.py
to make the code look prettier.

Change-Id: I1c374b1ccd4f028c4e4b2e8194a6d1c201d50571
2012-08-31 11:24:46 +08:00
Darrell Bishop
66400b7337 Add device name to *-replicator.removes for DBs
To tell when replication for a device has finished, it's important to
know when the replicator is removing objects.  This was previously
handled for the object-replicator
(object-replicator.partition.delete.count.<device> and
object-replicator.partition.update.count.<device> metrics) but not the
account and container replicators.

This patch extends the existing DB removal count metrics to make them
per-device.  The new metrics are:
 account-replicator.removes.<device>
 container-replicator.removes.<device>

There's also a bonus refactoring and increased test coverage of the DB
replicator code.

Change-Id: I2067317d4a5f8ad2a496834147954bdcdfc541c1
2012-08-22 13:35:09 -07:00
Jenkins
6682138b0a Merge "Make ring class interface slightly more abstracted from implementation." 2012-03-22 20:25:06 +00:00
John Dickinson
1ecf5ebba1 updated copyright date for all files
Change-Id: Ifd909d3561c2647770a7e0caa3cd91acd1b4f298
2012-03-19 13:45:34 -05:00
Michael Barton
e008c2ebb8 Make ring class interface slightly more abstracted from implementation.
Change-Id: I0f55d61c7b8de30460f17a69e5d9946494dbda6e
2012-03-14 22:00:30 +00:00
David Goetz
2d9103f9e0 adding double quarantine support for db replication 2011-04-18 15:00:59 -07:00
Anne Gentle
8823427161 Changed copyright notices on py files and the single rst file with a copyright notice 2011-01-04 17:34:43 -06:00
Chuck Thier
158e6c3ae9 refactored bins to by more DRY 2010-08-31 23:12:59 +00:00