74 Commits

Author SHA1 Message Date
Clay Gerrard
a38f63e1c6 Use correct Storage-Policy header for REPLICATE requests
Under some concurrency the object-replicator could potentially send the
wrong X-Backend-Storage-Policy-Index header to its partner nodes during
replication if there were multiple storage policies on the same node,
because of a race where multiple jobs being processed concurrently would
mutate some shared state on the ObjectReplicator instance.

Instead of mutating shared state on the ObjectReplicator instance when
updating the default headers sent with REPLICATE requests, each job now
copies them into a local dict where they can safely be updated.
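
For illustration, a minimal sketch of the pattern (names assumed, not
the actual patch code):

    # racy: concurrent jobs mutate one shared dict
    # self.headers['X-Backend-Storage-Policy-Index'] = job['policy']

    # safe: each job updates its own copy of the default headers
    headers = dict(self.default_headers)
    headers['X-Backend-Storage-Policy-Index'] = int(job['policy'])
    # ...pass `headers` to this job's REPLICATE requests only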

Change-Id: I5522db57af7e308b1f9d4181f14ea14e386a71fd
2015-08-24 11:20:02 -07:00
Jenkins
2d41ff7b45 Merge "Enable Object Replicator's failure count in recon" 2015-08-24 07:32:08 +00:00
Pradeep Kumar Singh
ab163702de Emit warning log in object replicator
When the object-replicator finds the handoffs_first or handoff_delete
options enabled, it should emit a warning log indicating that they
should be changed back to their defaults before the next "normal"
rebalance.
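
A minimal sketch of such a check (names assumed, not the actual patch
code):

    if self.handoffs_first or self.handoff_delete:
        self.logger.warning(
            'Handoff only mode: restore handoffs_first and '
            'handoff_delete to their defaults before the next '
            '"normal" rebalance.')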

Closes-Bug: #1457262

Change-Id: If9dc2796c18ed3cf13da920831e2d5c2ae9f12a0
2015-08-21 02:47:04 +00:00
Hisashi Osanai
79ba4a8598 Enable Object Replicator's failure count in recon
This patch adds the count of object replication failures to recon,
and adds "failure_nodes" to the Account Replicator and
Container Replicator.

Recon shows the count of object replication failures as follows:
$ curl http://<ip>:<port>/recon/replication/object
{
    "replication_last": 1416334368.60865,
    "replication_stats": {
        "attempted": 13346,
        "failure": 870,
	"failure_nodes": {
            "192.168.0.1": {"sdb1": 3},
            "192.168.0.2": {"sdb1": 851,
                            "sdc1": 1,
                            "sdd1": 8},
            "192.168.0.3": {"sdb1": 3,
                            "sdc1": 4}
        },
        "hashmatch": 0,
        "remove": 0,
        "rsync": 0,
        "start": 1416354240.9761429,
        "success": 1908
    },
    "replication_time": 2316.5563162644703,
    "object_replication_last": 1416334368.60865,
    "object_replication_time": 2316.5563162644703
}

Note that 'object_replication_last' and 'object_replication_time' are
considered to be transitional and will be removed in the subsequent
releases. Use 'replication_last' and 'replication_time' instead.

Additionally, this patch adds the count to swift-recon; it will be
shown as follows:
$ swift-recon object -r
===============================================================================
--> Starting reconnaissance on 4 hosts
===============================================================================
[2014-11-27 16:14:09] Checking on replication
[replication_failure] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 4
[replication_success] low: 3, high: 3, avg: 3.0, total: 12, Failed: 0.0%, no_result: 0, reported: 4
[replication_time] low: 0, high: 0, avg: 0.0, total: 0, Failed: 0.0%, no_result: 0, reported: 4
[replication_attempted] low: 1, high: 1, avg: 1.0, total: 4, Failed: 0.0%, no_result: 0, reported: 4
Oldest completion was 2014-11-27 16:09:45 (4 minutes ago) by 192.168.0.4:6002.
Most recent completion was 2014-11-27 16:14:19 (-10 seconds ago) by 192.168.0.1:6002.
===============================================================================

In a cluster where one server runs with this patch and the other
servers run without it, executing swift-recon against the patched
server shows entries such as [failure], [success] and [attempted]
that carry no useful information, because the servers running without
this patch are not able to send a response with the information this
patch needs. Therefore, once you apply this patch, apply it to the
other servers as well before you execute swift-recon.
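
For completeness, a minimal sketch of consuming the endpoint shown
above (host and port are assumptions):

    import json
    import urllib.request  # Python 3; the 2015-era equivalent is urllib2

    url = 'http://127.0.0.1:6000/recon/replication/object'
    stats = json.load(urllib.request.urlopen(url))['replication_stats']
    for ip, devices in sorted(stats.get('failure_nodes', {}).items()):
        for device, count in sorted(devices.items()):
            print('%s/%s: %d failures' % (ip, device, count))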

DocImpact
Change-Id: Iecd33655ae2568482833131f422679996c374d78
Co-Authored-By: Kenichiro Matsuda <matsuda_kenichi@jp.fujitsu.com>
Co-Authored-By: Brian Cline <bcline@softlayer.com>
Implements: blueprint enable-object-replication-failure-in-recon
2015-08-18 11:40:02 +09:00
janonymous
9456af35a2 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3; changes are in these dirs:
*test/unit/obj/*
*test/unit/test_locale/*

Change-Id: I3dd0c1107165ac529f1cd967363e5cf408a1d02b
2015-08-07 19:28:35 +05:30
Alistair Coles
bcd00d9461 Refactor diskfile
This patch mostly eliminates the duplicate code that was
deliberately left in place during EC review to avoid major
churn of the diskfile module prior to the kilo release.

This focuses on obvious de-duplication and shuffling code
between classes. It deliberately does not attempt to
hammer out every last piece of de-duplication where that
would introduce more complex changes - that can come later.

Code is moved from the module level and from ECDiskFile*
classes into new BaseDiskFile* classes.

Concrete classes for replication and EC policy retain their
existing names i.e. DiskFile[Manager|Writer|Reader|] and
ECDiskFile[Manager|Writer|Reader|] respectively.
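
A sketch of the resulting layout (illustrative only; the real classes
carry the shared logic):

    class BaseDiskFileManager(object):
        """Code common to all policy types."""

    class DiskFileManager(BaseDiskFileManager):
        """Replication policy type."""

    class ECDiskFileManager(BaseDiskFileManager):
        """EC policy type."""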

Knock-on changes:

- fix bug whereby get_hashes was ignoring self.reclaim_age
  and always using the default arg value.

- replication diskfile manager now deletes a tombstone that is older
  than reclaim_age even when there is a newer .meta file.

- replication diskfile manager will no longer raise an
  AssertionError if only a .meta file is found during
  hash_cleanup_listdir.

- fix stale test in test_auditor.py: the test_with_tombstone test
  setup was convoluted (probably dates back to when object puts did
  not clean up the object dir). Now that they do, you have to try
  harder to create a dir with both a tombstone and a data file.

Change-Id: I963e0d0ae0d6569ad1de605034c529529cbb4f9a
2015-07-30 12:21:00 +01:00
janonymous
c907107fe4 cPickle is deprecated in py3, replace it with six.moves
cPickle is deprecated and should be replaced with six.moves
to provide py2 and py3 compatibility.
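
The replacement pattern, for reference:

    # the fast C pickle on py2, the stdlib pickle on py3
    from six.moves import cPickle as pickle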

Change-Id: Ibad990708722360d188c641e61444d50a16a1e93
2015-07-07 22:46:37 +05:30
Darrell Bishop
df134df901 Allow 1+ object-servers-per-disk deployment
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs.  The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring.  In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket.  The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk.  The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring.  The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).
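
As a sketch, enabling this mode is a single option (value
illustrative):

    [DEFAULT]
    # fork 3 object-server workers per unique local port in the ring
    servers_per_port = 3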

Immediately stopping all swift-object-worker processes still works by
sending the parent a SIGTERM.  Likewise, a SIGHUP to the parent process
still causes the parent process to close all listen sockets and exit,
allowing existing children to finish serving their existing requests.
The drop_privileges helper function now has an optional param to
suppress the setsid() call, which otherwise screws up the child workers'
process management.

The class method RingData.load() can be told to only load the ring
metadata (i.e. everything except replica2part2dev_id) with the optional
kwarg, header_only=True.  This is used to keep the parent and all
forked off workers from unnecessarily having full copies of all storage
policy rings in memory.
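
A sketch of the metadata-only load (kwarg as named above; import path
an assumption):

    from swift.common.ring import RingData

    # loads everything except replica2part2dev_id
    ring_meta = RingData.load('/etc/swift/object.ring.gz',
                              header_only=True)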

A new helper class, swift.common.storage_policy.BindPortsCache,
provides a method to return a set of all device ports in all rings for
the server on which it is instantiated (identified by its set of IP
addresses).  The BindPortsCache instance will track mtimes of ring
files, so they are not opened more frequently than necessary.

This patch includes enhancements to the probe tests and
object-replicator/object-reconstructor config plumbing to allow the
probe tests to work correctly both in the "normal" config (same IP but
unique ports for each SAIO "server") and a server-per-port setup where
each SAIO "server" must have a unique IP address and unique port per
disk within each "server".  The main probe tests only work with 4
servers and 4 disks, but you can see the difference in the rings for the
EC probe tests where there are 2 disks per server for a total of 8
disks.  Specifically, swift.common.ring.utils.is_local_device() will
ignore the ports when the "my_port" argument is None.  Then,
object-replicator and object-reconstructor both set self.bind_port to
None if server_per_port is enabled.  Bonus improvement for IPv6
addresses in is_local_device().

This PR for vagrant-swift-all-in-one will aid in testing this patch:
https://github.com/swiftstack/vagrant-swift-all-in-one/pull/16/

Also allow SAIO to answer is_local_device() better; common SAIO setups
have multiple "servers" all on the same host with different ports for
the different "servers" (which happen to match the IPs specified in the
rings for the devices on each of those "servers").

However, you can configure the SAIO to have different localhost IP
addresses (e.g. 127.0.0.1, 127.0.0.2, etc.) in the ring and in the
servers' config files' bind_ip setting.

This new whataremyips() implementation combined with a little plumbing
allows is_local_device() to accurately answer, even on an SAIO.

In the default case (an unspecified bind_ip defaults to '0.0.0.0') as
well as an explicit "bind to everything" like '0.0.0.0' or '::',
whataremyips() behaves as it always has, returning all IP addresses for
the server.

Also updated probe tests to handle each "server" in the SAIO having a
unique IP address.

For some (noisy) benchmarks that show servers_per_port=X is at least as
good as the same number of "normal" workers:
https://gist.github.com/dbishop/c214f89ca708a6b1624a#file-summary-md

Benchmarks showing the benefits of I/O isolation with a small number of
slow disks:
https://gist.github.com/dbishop/fd0ab067babdecfb07ca#file-results-md

If you were wondering what the overhead of threads_per_disk looks like:
https://gist.github.com/dbishop/1d14755fedc86a161718#file-tabular_results-md

DocImpact

Change-Id: I2239a4000b41a7e7cc53465ce794af49d44796c6
2015-06-18 12:43:50 -07:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.
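
The pattern, for reference:

    it = iter([1, 2, 3])
    # Python 2 only:  it.next()
    # Python 2 and 3:
    next(it)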

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
Jenkins
2ea8bae389 Merge "Allow rsync to use compression" 2015-05-13 10:58:08 +00:00
paul luse
647b66a2ce Erasure Code Reconstructor
This patch adds the erasure code reconstructor. It follows the
design of the replicator but:
  - There is no notion of update() or update_deleted().
  - There is a single job processor
  - Jobs are processed partition by partition.
  - At the end of processing a rebalanced or handoff partition, the
    reconstructor will remove successfully reverted objects if any.

It also includes various ssync changes, such as the addition of the
reconstruct_fa() function, called from ssync_sender, which performs
the actual reconstruction while sending the object to the receiver.

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
blueprint ec-reconstructor
Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51
2015-04-14 00:52:17 -07:00
Alistair Coles
fa89064933 Per-policy DiskFile classes
Adds specific disk file classes for EC policy types.

The new ECDiskFile and ECDiskFileWriter classes are used by the
ECDiskFileManager.

ECDiskFileManager is registered with the DiskFileRouter for use with
EC_POLICY type policies.
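
A usage sketch (conf values illustrative):

    from swift.common.storage_policy import POLICIES
    from swift.common.utils import get_logger
    from swift.obj.diskfile import DiskFileRouter

    conf = {'devices': '/srv/node', 'mount_check': 'false'}
    logger = get_logger(conf)
    # the router returns the manager registered for each policy type
    df_mgr = DiskFileRouter(conf, logger)[POLICIES[0]]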

Refactors diskfile tests into BaseDiskFileMixin and BaseDiskFileManagerMixin
classes which are then extended in subclasses for the legacy
replication-type DiskFile* and ECDiskFile* classes.

Refactor to prefer use of a policy instance reference over a policy_index
int to refer to a policy.

Add additional verification to DiskFileManager.get_dev_path to validate the
device root with common.constraints.check_dir, even when mount_check is
disabled, for use on a virtual swift-all-in-one.

Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I22f915160dc67a9e18f4738c1ddf068344e8ad5d
2015-04-14 00:52:16 -07:00
Clay Gerrard
a707829334 Update test infrastructure
* Get FakeConn ready for expect 100 continue
* Use debug_logger more and with better interfaces
* Fix patch_policies to be less annoying

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Paul Luse <paul.e.luse@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
Change-Id: I28c0a3539d994cbb8e6b94d63a23ed4ea6cb956d
2015-04-13 22:57:42 -07:00
Prashanth Pai
9c33bbde69 Allow rsync to use compression
From rsync's man page:
-z, --compress
With this option, rsync compresses the file data as it is sent to the
destination machine, which reduces the amount of data being transmitted --
something that is useful over a slow connection.

A configurable option has been added to allow rsync to compress, but only
if the remote node is in a different region than the local one.
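
As a sketch (option name as introduced by this change; value
illustrative):

    [object-replicator]
    # compress rsync traffic, but only to nodes in another region
    rsync_compress = true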

NOTE: Objects that are already compressed (for example: .tar.gz, .mp3)
might slow down the syncing process.

On-wire compression can also be extended to ssync later in a different
change if required. In the case of ssync, we could explore faster
compression libraries like lz4. rsync uses zlib, which is slow but
offers a higher compression ratio.

Change-Id: Ic9b9cbff9b5e68bef8257b522cc352fc3544db3c
Signed-off-by: Prashanth Pai <ppai@redhat.com>
2015-03-02 14:39:58 +05:30
Clay Gerrard
2ff66a532c Fix object replicator partition cleanup
Probetests discovered two issues with the current state of the
object-replicator as a result of the attempts to clean up changes
related to efficient cross-region replication.

Known failures are:

  * rsync replication when configured with no sync_method in the config
    fails to clean up a handoff partition
  * ssync replication when there is only one region fails to cleanup a
    handoff partition

In both cases the path resulting in the failure moved through the
implicit else clause (dangling elif) of the partition cleanup code path.
In the ssync case the failure came from a miss on the first if branch,
where delete_objs would be None if there are no remote regions. In the
rsync case the failure came from a miss on the second elif condition,
which looked up an entry in the conf dict without setting a default.
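
A sketch of the bug pattern (helper names hypothetical, not the actual
patch code):

    if delete_objs is not None:                    # misses for ssync when
        delete_handoff_objs(job, delete_objs)      # there is no remote region
    elif self.conf.get('sync_method') == 'rsync':  # misses when the key is
        delete_partition(job['path'])              # absent from the conf dict
    # implicit else: the handoff partition escapes cleanup entirely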

This change adds unittests for both failures that should fail in a
reasonable way against master without requiring a probetest run against
other configs, as well as rephrasing the logic in the partition cleanup
handling to try and make the logic flow more explicit.

Change-Id: Ic59d998a3e36a3eb3e509d9fdf7096e812281357
2015-02-26 18:31:41 -08:00
Kota Tsuyuzaki
f578a35100 Fix efficient replication handoff delete
The current code might delete local handoff objects incorrectly when
the remote node requires all of the objects at poke time, because an
empty cand_objs won't be applied to the delete-candidate objects
list.

This patch ensures the delete-candidate objects list will always be
updated (i.e. it will be an empty list when the poke job finds that
all local objects are required by the remote node), and then handles
deleting objects correctly according to that candidate list.

This patch includes a test written by Clay Gerrard at [1].

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

1: https://review.openstack.org/#/c/155542/

Change-Id: Ie8f75ed65c7bfefbb18ddccd9fe0e41b72dca0a4
2015-02-19 00:09:31 -08:00
Clay Gerrard
596042b9c7 Minor cleanup post efficient multi-region replication
One log line had a typo, and I refactored the per-object cleanup code out
of update_deleted into the per-object hashdir cleanup method.

Change-Id: I19d03d0706a75bd8ec2fe327a1eb1b5ec36de6d2
2015-02-13 08:04:56 -08:00
Kota Tsuyuzaki
20ca279d74 Efficient Replication for Distributed Regions
This change provides an efficient way of replicating between regions
of a globally distributed cluster.

This approach makes the object-replicator push replicas to a primary
node in a remote region and then skip pushing them to the next primary
nodes in that region, expecting replication within the region to
happen asynchronously.

This implementation includes a couple of changes to ssync_sender to
allow the object-replicator to delete local handoff objects correctly.
One is to return a list of existing objects in the remote region; the
list includes local paths of the objects which exist on both the local
device and the remote device. The other is support for an existence
check of specified objects, which requires the object list built by
the first change. When the object list is given, ssync_sender performs
only the missing_check based on that list. These changes are needed
because current Swift cannot handle existence checks at the object
level.

Note that this feature will work partially (i.e. only when
primary-to-primary) with rsync.

Implements: blueprint efficient-replication
Change-Id: I5d990444d7977f4127bb37f9256212c893438df1
2015-02-10 12:52:15 -08:00
Samuel Merritt
3ac43e8299 Allow per-policy overrides in object replicator.
The replicator already supports --devices and --partitions to restrict
its operation to a subset of devices and partitions. However,
operators don't always want to replicate a partition in all policies
since different policies (usually) have different rings.

For example, if I know that policy 0's partition 1234 is has no
replicas on primary nodes due to over-aggressive rebalancing, I really
want to find a node where the partition isa and make the replicator
push it onto the primaries. However, if I haven't been messing with
policy 1's ring, its partition 1234 is fine. With the existing
replicator args, I get both or neither; this commit lets me get just
the useful one.

Change-Id: Ib1d58fdd228a6ee7865321e65d7c04a891fa5c49
2015-01-22 16:10:22 -08:00
Samuel Merritt
ba8114a513 Improve object-replicator startup time.
The object replicator checks each partition directory to ensure it's
really a directory and not a zero-byte file. This was happening in
collect_jobs(), which is the first thing that the object replicator
does.

The effect was that, at startup, the object-replicator process would
list each "objects" or "objects-N" directory on each object device,
then stat() every single thing in there. On devices with lots of
partitions on them, this makes the replicator take a long time before
it does anything useful.

If you have a cluster with a too-high part_power plus some failing
disks elsewhere, you can easily get thousands of partition directories
on each disk. If you've got 36 disks per node, that turns into a very
long wait for the object replicator to do anything. Worse yet, if you
add in a configuration management system that pushes new rings every
couple hours, the object replicator can spend the vast majority of its
time collecting jobs, then only spend a short time doing useful work
before the ring changes and it has to start all over again.

This commit moves the stat() call (os.path.isfile) to the loop that
processes jobs. In a complete pass, the total work done is about the
same, but the replicator starts doing useful work much sooner.
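
A sketch of the change's shape (job dict and helper hypothetical):

    import os

    def process_jobs(jobs):
        for job in jobs:
            # deferred from collect_jobs(): a zero-byte file where a
            # partition dir should be is skipped at processing time
            if os.path.isfile(job['path']):
                continue
            replicate_partition(job)  # hypothetical helper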

Change-Id: I5ed4cd09dde514ec7d1e74afe35feaab0cf28a10
2014-12-08 15:05:29 -08:00
YummyBian
64aa8062bc Some statements are evaluated twice in the setUp of the TestObjectReplicator

Remove the duplicated statements.

Closes-Bug: #1374783

Change-Id: If2b55e864fea497d7a7b218adf11eb7749c27765
2014-10-01 11:02:48 +00:00
Rafael Rivero
c1f6569c00 Fixes several typos (Swift)
Corrects spelling errors found in comments.

Change-Id: I228a888e3f256569ea32ef1613092dbd63e13c62
2014-09-18 21:18:50 -07:00
Steven Lang
7573fbd498 Object services user-agent string uses full name
Aside from the user-agent string, the strings "obj-server",
"obj-updater", and "obj-replicator" (or "obj-<anything>"*) do not
appear in the swift code base, other than the directory containing
the object services code being named "obj".

Furthermore, the container, account, and proxy services construct their
user-agent string, as reported in the logs, using their full name. In
addition, this full name also shows up as the name of the process via
"ps" or "top", etc., which can make it easier for admins to match log
entries with other tools.

For consistency, we update the object services to use an "object-"
prefix rather than "obj-" in their user-agent strings.

* obj-etag does appear in a unit test, but not part of the regular
code.

Change-Id: I914fc189514207df2535731eda10cb4b3d30cc6c
2014-07-02 18:35:49 -07:00
Paul Luse
873c52e608 Replace POLICY and POLICY_INDEX with string literals
Replaced throughout code base & tox'd. Functional as well
as probe tests pass with and without policies defined.

POLICY --> 'X-Storage-Policy'
POLICY_INDEX --> 'X-Backend-Storage-Policy-Index'

Change-Id: Iea3d06de80210e9e504e296d4572583d7ffabeac
2014-06-23 12:52:50 -07:00
Paul Luse
04f2970362 Add storage policy support for the Replicator
This makes it so that objects stored in all policies get replicated
properly. This is only for rsync replication, not ssync.

DocImpact
Implements: blueprint storage-policies
Change-Id: Ifdb4624841f35953ba80189e669d3ef15d5563fd
2014-06-18 17:31:38 -07:00
Samuel Merritt
7829ca07d7 Remove some debugging prints from tests
Change-Id: I7116aa75ea5c8e1ef85a799a6e1ddf0d6edffb4f
2014-03-18 12:50:51 -07:00
Samuel Merritt
b5b0b78fc7 Remove obsolete future imports
The with statement has been standard since Python 2.5, so we can get
rid of these imports.
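
The now-redundant import, for reference:

    # standard since Python 2.5; safe to delete
    from __future__ import with_statement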

Change-Id: I280971c3d8c01e94cc2c17cacaedcbe9d9c8a3c3
2013-11-22 12:23:58 -08:00
gholt
a80c720af5 Object replication ssync (an rsync alternative)
For this commit, ssync is just a direct replacement for how
we use rsync. Assuming we switch over to ssync completely
someday and drop rsync, we will then be able to improve the
algorithms even further (removing local objects as we
successfully transfer each one rather than waiting for whole
partitions, using an index.db with hash-trees, etc., etc.)

For easier review, this commit can be thought of in distinct
parts:

1)  New global_conf_callback functionality for allowing
    services to perform setup code before workers, etc. are
    launched. (This is then used by ssync in the object
    server to create a cross-worker semaphore to restrict
    concurrent incoming replication.)

2)  A bit of shifting of items up from object server and
    replicator to diskfile or DEFAULT conf sections for
    better sharing of the same settings. conn_timeout,
    node_timeout, client_timeout, network_chunk_size,
    disk_chunk_size.

3)  Modifications to the object server and replicator to
    optionally use ssync in place of rsync. This is done in
    a generic enough way that switching to FutureSync should
    be easy someday.

4)  The biggest part, and (at least for now) completely
    optional part, are the new ssync_sender and
    ssync_receiver files. Nice and isolated for easier
    testing and visibility into test coverage, etc.

All the usual logging, statsd, recon, etc. instrumentation
is still there when using ssync, just as it is when using
rsync.
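
As a sketch, opting in is a one-line config change (option name an
assumption here; see the warning below before using it in production):

    [object-replicator]
    # default is rsync; ssync is experimental
    sync_method = ssync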

Beyond the essential error and exceptional condition
logging, I have not added any additional instrumentation at
this time. Unless there is something someone finds super
pressing to have added to the logging, I think such
additions would be better as separate change reviews.

FOR NOW, IT IS NOT RECOMMENDED TO USE SSYNC ON PRODUCTION
CLUSTERS. Some of us will be in a limited fashion to look
for any subtle issues, tuning, etc. but generally ssync is
an experimental feature. In its current implementation it is
probably going to be a bit slower than rsync, but if all
goes according to plan it will end up much faster.

There are no comparisons yet between ssync and rsync other
than some raw virtual machine testing I've done to show it
should compete well enough once we can put it in use in the
real world.

If you Tweet, Google+, or whatever, be sure to indicate it's
experimental. It'd be best to keep it out of deployment
guides, howtos, etc. until we all figure out if we like it,
find it to be stable, etc.

Change-Id: If003dcc6f4109e2d2a42f4873a0779110fff16d6
2013-11-07 16:52:01 +00:00
Peter Portante
5202b0e586 DiskFile API, with reference implementation
Refactor on-disk knowledge out of the object server by pushing the
async update pickle creation to the new DiskFileManager class (name is
not the best, so suggestions welcome), along with the REPLICATE
method logic. We also move the mount checking and thread pool storage
to the new ondisk.Devices object, which then also becomes the new home
of the audit_location_generator method.

For the object server, a new setup() method is now called at the end
of the controller's construction, and the _diskfile() method has been
renamed to get_diskfile(), to allow implementation specific behavior.

We then hide the need for the REST API layer to know how and where
quarantining needs to be performed. There are now two places it is
checked internally, on open() where we verify the content-length,
name, and x-timestamp metadata, and in the reader on close where the
etag metadata is checked if the entire file was read.

We add a reader class to allow implementations to isolate the WSGI
handling code for that specific environment (it is used no-where else
in the REST APIs). This simplifies the caller's code to just use a
"with" statement once open to avoid multiple points where close needs
to be called.

For a full historical comparison, including the usage patterns see:
https://gist.github.com/portante/5488238

(as of master, 2b639f5, Merge      This Commit
 "Fix 500 from account-quota
 middleware")
--------------------------------+------------------------------------
                                 DiskFileManager(conf)

                                   Methods:
                                     .pickle_async_update()
                                     .get_diskfile()
                                     .get_hashes()

                                   Attributes:
                                     .devices
                                     .logger
                                     .disk_chunk_size
                                     .keep_cache_size
                                     .bytes_per_sync

DiskFile(a,c,o,keep_data_fp=)    DiskFile(a,c,o)

  Methods:                         Methods:
   *.__iter__()
    .close(verify_file=)
    .is_deleted()
    .is_expired()
    .quarantine()
    .get_data_file_size()
                                     .open()
                                     .read_metadata()
    .create()                        .create()
                                     .write_metadata()
    .delete()                        .delete()

  Attributes:                      Attributes:
    .quarantined_dir
    .keep_cache
    .metadata
                                *DiskFileReader()

                                   Methods:
                                     .__iter__()
                                     .close()

                                   Attributes:
                                    +.was_quarantined

DiskWriter()                     DiskFileWriter()

  Methods:                         Methods:
    .write()                         .write()
    .put()                           .put()

* Note that the DiskFile class   * Note that the DiskReader() object
  implements all the methods       returned by the
  necessary for a WSGI app         DiskFileOpened.reader() method
  iterator                         implements all the methods
                                   necessary for a WSGI app iterator

                                 + Note that if the auditor is
                                   refactored to not use the DiskFile
                                   class, see
                                   https://review.openstack.org/44787
                                   then we don't need the
                                   was_quarantined attribute

A reference "in-memory" object server implementation of a backend
DiskFile class is provided in swift/obj/mem_server.py and
swift/obj/mem_diskfile.py.

One can also reference
https://github.com/portante/gluster-swift/commits/diskfile for the
proposed integration with the gluster-swift code based on these
changes.

Change-Id: I44e153fdb405a5743e9c05349008f94136764916
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-10-17 15:03:31 -04:00
Peter Portante
9411a24ba7 Revert "Refactor common/utils methods to common/ondisk"
This reverts commit 7760f41c3ce436cb23b4b8425db3749a3da33d32

Change-Id: I95e57a2563784a8cd5e995cc826afeac0eadbe62
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-10-07 17:18:09 -04:00
ZhiQiang Fan
f72704fc82 Change OpenStack LLC to Foundation
Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58
Closes-bug: #1214176
2013-09-20 01:02:31 +08:00
Peter Portante
7760f41c3c Refactor common/utils methods to common/ondisk
Place all the methods related to on-disk layout and/or configuration
into a new common module that can be shared by the various modules
using the same on-disk layout.

Change-Id: I27ffd4665d5115ffdde649c48a4d18e12017e6a9
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-17 17:32:04 -04:00
Chuck Thier
a30a7ced9c Add handoffs_first and handoff_delete to obj-repl
If handoffs_first is True, then the object replicator will give
priority to partitions that are not supposed to be on the node.

If handoff_delete is set to a number (n), then it will delete a handoff
partition if at least n replicas were successfully replicated.
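
For example (values illustrative):

    [object-replicator]
    handoffs_first = True
    # delete a handoff partition once at least 2 replicas have been
    # successfully replicated
    handoff_delete = 2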

Also fixed a couple of things in the object replicator unit tests and
added some more.

DocImpact

Change-Id: Icb9968953cf467be2a52046fb16f4b84eb5604e4
2013-09-13 15:44:07 +00:00
Peter Portante
56593a1323 Pep8 unit test modules w/ <= 20 violations (6 of 12)
Change-Id: I7317beb97e1530cb18c62da55ccf4c64206ff362
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-09-01 16:12:42 -04:00
Alex Gaynor
358a03325b Ensure http_connect is mocked out in tests
Otherwise this test will try to connect to several addresses that aren't
mapped. On Linux the entire 127.0.0.* block is mapped to localhost, so this
often isn't noticed, but on OS X only 127.0.0.1 is mapped by default.

Change-Id: I44034d5006ff25e052afdb6599ab5838e4c5ae5b
2013-08-05 15:35:34 -07:00
Chmouel Boudjnah
e448b2f5f7 Fake http_connect in test_replicator test
- mock http_connect in the replicator test_delete_partition test to not
  connect directly to 127.0.0.1 (which is not always available on
  non-Linux systems).
- Fixes bug 1203907

Change-Id: I2622223c9fe5a3db2a113b6cd8d028a5db0915a7
2013-07-31 10:07:26 +02:00
Peter Portante
1e3ad44784 Merge object base module into diskfile.
All of the module methods of the (now defunct) base module were really
concerned with the on-disk layout, which is what the DiskFile module is
really about.

Change-Id: I96e022c5f96e31537ced74139185851a2751701c
Signed-off-by: Peter Portante <peter.portante@redhat.com>
2013-07-30 09:38:22 -04:00
Alex Gaynor
0e3103c0dd Corrected a number of style violations in the tests.
Change-Id: Ib5e81ad0476c56cf84d222d67f55b8db3eb0249e
2013-07-22 15:27:54 -07:00
Alex Gaynor
c1f8f266d0 Ensure that files in tests are closed.
This is needed on Pythons which do not have
reference counting GCs (e.g. PyPy).

Change-Id: I5a613e832e9a7a149b3e9317c053c3048f34afcb
2013-07-20 16:08:53 -07:00
gholt
69cf78bb16 Moved tests for moved obj.base code
Follow-on to https://review.openstack.org/#/c/28895/

Moved the tests for the code that was moved to obj.base and also made
the new test file flake8 compliant.

Change-Id: I4be718927b6cd2de8efe32f8e54b458a4e05291b
2013-05-17 14:27:13 +00:00
Jenkins
d754b59cf8 Merge "Moved some code out of swift.obj.replicator" 2013-05-17 04:47:21 +00:00
gholt
9fe15dd15a Moved some code out of swift.obj.replicator
This will be needed in future replication work to avoid circular
imports.

I used swift.obj.base as the module name just because we seemed to
avoid putting code in __init__.py files so far and I didn't want to
buck the trend.

I would love to see other obj things like *_metadata and DiskFile
move into swift.obj.base as well and swift.obj.server just be the
WSGI server logic, but I'll leave that for the future.

I have changed the tests as little as possible (just the references
to where they get the code to test) to show the refactor has not
broken anything. I did add a test for tpool_reraise since there was
none before.

There will be a follow on patch for moving the tests to their new
location(s). I figured I'd wait to put the bikes in the shed until
everyone's done painting it.

Change-Id: I32b4ac88be21eb76c877d3f4cc1e6ac33304835b
2013-05-11 21:48:00 +00:00
Sergey Kraynev
ea7858176b Implementation of replication servers
Support separate replication ip addresses:
- Added a new function in utils. This function provides the ability
  to select a separate IP address for the replication service.
- Db_replicator and object replicators were changed.
  The replication process uses the new function now.

Replication network parameters:
- Replication network fields (replication_ip, replication_port)
  support was added to device dictionary in swift-ring-builder script.
- Changes were made to support new fields in search, show and set_info
  functions.
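
For example, a device can be added to a ring with a separate
replication endpoint (addresses illustrative):

    $ swift-ring-builder object.builder add \
        r1z1-10.0.0.1:6000R10.1.0.1:6010/sdb1 100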

Implementation of replication servers:
- Separate replication servers use the same code as normal replication
  servers, but with replication_server parameter = True.  When using a
  separate replication network, the non-replication servers set
  replication_server = False.  When there is no separate replication
  network (the default case), replication_server is not included in the config.

DocImpact
Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0
Implements: blueprint dedicated-replication-network
2013-04-21 18:14:42 -04:00
Greg Lange
44f00a23c1 fixed some minor things in tests that pyflakes complained about
Change-Id: Ifeab56a964630bcf941e932fcbe39e6572e62975
2013-03-26 20:42:26 +00:00
David Hadas
a979c8007b Add support for Hash Prefix
A new configuration parameter is added to /etc/swift/swift.conf
[swift-hash]
swift_hash_path_prefix = 'random unique string'

New installations are advised to set this parameter to a random secret,
which should not be disclosed outside the organization.
The same secret needs to be used by all swift servers of the same cluster.

Existing installations should set this parameter to an empty string
(the default).
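
A py2-era sketch of how the prefix (and existing suffix) enters path
hashing, simplified from swift.common.utils.hash_path:

    from hashlib import md5

    def hash_path(account, container=None, obj=None, prefix='', suffix=''):
        # the secret prefix/suffix keep on-disk placement unguessable
        paths = [p for p in (account, container, obj) if p]
        return md5(prefix + '/' + '/'.join(paths) + suffix).hexdigest()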

DocImpact

Fixes: Bug #1157454

Change-Id: I63b10d0b7d6dd3f74e0f10bb41b5f240fa03578a
2013-03-22 19:41:55 +02:00
Joe Gordon
45f0502b52 Fix spelling mistakes
git ls-files | misspellings -f -
Source: https://github.com/lyda/misspell-check

Change-Id: I4132e6a276e44e2a8985238358533d315ee8d9c4
2013-02-12 16:39:40 -08:00
gholt
95d5cf851b Fixed bug in object replicator
If the object replicator couldn't create a device's object directory
(due to permissions or whatever) it wouldn't do any work at all. This
fixes that.

Change-Id: I6a30439d036b29c9cfdb660428d13668e0dc8632
2013-01-12 07:25:15 +00:00
Darrell Bishop
ea95d0092a Avoid infinite recursion in swift.obj.replicator.get_hashes.
Fixes bug 1089140.

Turns out that if an exception bails out of the pickle loading (e.g. a
zero-byte hashes_file), the if clause to determine whether or not to
write out a fresh hashes_file can evaluate to false, leading to an
infinite loop.

This patch fixes this infinite loop generally, by ensuring that if any
exception is thrown, a new hashes_file is written.
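
A sketch of the fix's shape (function name hypothetical, not the
actual patch code):

    import pickle

    def read_hashes(hashes_file):
        # any load failure forces a fresh hashes_file to be written,
        # instead of leaving state that retriggers the same path
        try:
            with open(hashes_file, 'rb') as fp:
                return pickle.load(fp), False
        except Exception:
            return {}, True  # (hashes, must_rewrite)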

Change-Id: I344c5f8e261ce7c667bdafe1687263a4150b21dc
2012-12-11 15:32:09 -08:00
David Goetz
a6c44d2764 allow replicator run_once to check specific devices/partitions
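Illustrative invocation (flag spellings assumed):

    $ swift-object-replicator /etc/swift/object-server.conf once \
        --devices sdb1,sdb2 --partitions 1234,5678
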
Change-Id: If45f77fda269ae6e251579542e70eb71bd11fe2a
2012-09-28 12:24:15 -07:00
David Goetz
d24e280bf4 obj replicator speed up
Change-Id: If02b573353dedea9c2368ce4733fe97599229b2e
2012-09-10 15:12:39 -07:00