38 Commits

Author SHA1 Message Date
Tim Burke
1d7e1558b3 py3: (mostly) port probe tests
There's still one problem, though: since swiftclient on py3 doesn't
support non-ASCII characters in metadata names, none of the tests in
TestReconstructorRebuildUTF8 will pass.

Change-Id: I4ec879ade534e09c3a625414d8aa1f16fd600fa4
2019-09-04 10:17:45 -07:00
Tim Burke
3189410f9d Ignore 404s from handoffs for objects when calculating quorum
We previously realized we needed to do that for accounts and containers
where the consequences of treating the 404 as authoritative were more
obvious: we'd cache the non-existence which prevented writes until it
fell out of cache.

The same basic logic applies for objects, though: if we see

    (Timeout, Timeout, Timeout, 404, 404, 404)

on a triple-replica policy, we don't really have any reason to think
that a 404 is appropriate. In fact, it seems reasonably likely that
there's a thundering-herd problem where there are too many concurrent
requests for data that *definitely is there*. By responding with a 503,
we apply some back-pressure to clients, who hopefully have some
exponential backoff in their retries.

The situation gets a bit more complicated with erasure-coded data, but
the same basic principle applies. We're just more likely to have
confirmation that there *is* data out there, we just can't reconstruct
it (right now).

Note that we *still want to check* those handoffs, of course. Our
fail-in-place strategy has us replicate (and, more recently,
reconstruct) to handoffs to maintain durability; it'd be silly *not* to
look.

UpgradeImpact:
--------------
Be aware that this may cause an increase in 503 Service Unavailable
responses served by proxy-servers. However, this should more accurately
reflect the state of the system.

Co-Authored-By: Thiago da Silva <thiagodasilva@gmail.com>
Change-Id: Ia832e9bab13167948f01bc50aa8a61974ce189fb
Closes-Bug: #1837819
Related-Bug: #1833612
Related-Change: I53ed04b5de20c261ddd79c98c629580472e09961
Related-Change: Ief44ed39d97f65e4270bf73051da9a2dd0ddbaec
2019-08-01 14:07:39 -07:00
Thiago da Silva
8d88209537 Return 404 on a GET if tombstone is newer
Currently the proxy keeps iterating through
the connections in hope of finding a success even
if it already has found a tombstone (404).

This change changes the code a little bit to compare
the timestamp of a 200 and a 404, if the tombstone is
newer, then it should be returned, instead of returning
a stale 200.

Closes-Bug: #1560574

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Change-Id: Ia81d6832709d18fe9a01ad247d75bf765e8a89f4
Signed-off-by: Thiago da Silva <thiago@redhat.com>
2017-10-24 14:35:06 -04:00
Clay Gerrard
116462d8a5 Add probetest for response with duplicate frags
When an SSYNC connection fails after shipping a fragment, but before
cleaning itself up - it can lead to multiple replicas of the same
fragment index serviceable in the node_iter at the same time.  Or
perhaps more likely if a partner nodes win a race to rebuild waiting
on a handoff.  In this case, the proxy need not fail to service a GET
just because of extra information.  A similar guard was added to the
reconstructor in a related change [1].

This underlying bug is acctually fixed by the related change [2], this
probetest is just encoding the failure to help with future maintenance.

1. Related-Change: I91f3d4af52cbc66c9f7ce00726f247b5462e66f9
2. Related-Change: I2310981fd1c4622ff5d1a739cbcc59637ffe3fc3

Closes-Bug: 1624176

Change-Id: I06fc381a25613585dd18916d50e7ca2c68d406b6
2016-11-04 17:29:04 +00:00
Ondřej Nový
33c18c579e Remove executable flag from some test modules
Change-Id: I36560c2b54c43d1674b007b8105200869b5f7987
2016-10-31 21:22:10 +00:00
Janie Richling
96a0e07753 Enable object body and metadata encryption
Adds encryption middlewares.

All object servers and proxy servers should be upgraded before
introducing encryption middleware.

Encryption middleware should be first introduced with the
encryption middleware disable_encryption option set to True.
Once all proxies have encryption middleware installed this
option may be set to False (the default).

Increases constraints.py:MAX_HEADER_COUNT by 4 to allow for
headers generated by encryption-related middleware.

Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Christian Cachin <cca@zurich.ibm.com>
Co-Authored-By: Mahati Chamarthy <mahati.chamarthy@gmail.com>
Co-Authored-By: Peter Chng <pchng@ca.ibm.com>
Co-Authored-By: Alistair Coles <alistair.coles@hpe.com>
Co-Authored-By: Jonathan Hinson <jlhinson@us.ibm.com>
Co-Authored-By: Hamdi Roumani <roumani@ca.ibm.com>

UpgradeImpact

Change-Id: Ie6db22697ceb1021baaa6bddcf8e41ae3acb5376
2016-06-30 23:31:15 -07:00
Kota Tsuyuzaki
e56a1a550a pids in probe is no longer used
Change-Id: I1fd76004257a8c05ce8bb1f3ca0e45000509f833
2016-06-01 23:53:35 -07:00
Samuel Merritt
99305b9300 Fix probe tests from commit cf48e75
Commit cf48e75 changed the default account/container/object ports in a
lot of places, including the probetests. However, it didn't change
them in doc/saio/bin/remakerings, and since the probe tests must match
the rings, they started failing.

This commit just backs out the changes to the test/probe directory so
that remakerings and the probe tests match again.

Change-Id: I316a09e6ee1a911f37ce9df3d641644739f88eeb
2016-05-02 17:29:32 -07:00
Shashirekha Gundur
cf48e75c25 change default ports for servers
Changing the recommended ports for Swift services
from ports 6000-6002 to unused ports 6200-6202;
so they do not conflict with X-Windows or other services.

Updated SAIO docs.

DocImpact
Closes-Bug: #1521339
Change-Id: Ie1c778b159792c8e259e2a54cb86051686ac9d18
2016-04-29 14:47:38 -04:00
Clay Gerrard
f87a5487b5 Make rsync ignore it's own temporary files
In situations where rsync may inadvertently be unable to cleanup it's
temporary files we shouldn't spread them around the cluster.

By asking our rsync subexec to --exclude patterns that match it's own
convention for temporary naming we'll only ever transfer real replicated
artifacts and never temporary artifacts which should always be ignored
until they are fully transfered.

Cleanup of stale rsync droppings should be performed by the auditor and
will be addressed in a separate change related to lp bug #1554005.

Closes-Bug: #1553995

Change-Id: Ibe598b339af024d05e4d89c34d696e972d8189ff
2016-03-14 17:30:10 -07:00
paul luse
893f30c61d EC GET path: require fragments to be of same set
And if they are not, exhaust the node iter to go get more.  The
problem without this implementation is a simple overwrite where
a GET follows before the handoff has put the newer obj back on
the 'alive again' node such that the proxy gets n-1 fragments
of the newest set and 1 of the older.

This patch bucketizes the fragments by etag and if it doesn't
have enough continues to exhaust the node iterator until it
has a large enough matching set.

Change-Id: Ib710a133ce1be278365067fd0d6610d80f1f7372
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Closes-Bug: 1457691
2015-08-27 21:09:41 -07:00
Bill Huber
239e94e625 pep8 fix: assertEquals -> assertEqual
assertEquals is deprecated in py3 in the following dir:
test/probe/*

Change-Id: Ie08dd7a8a6c48e3452dfe4f2b41676330ce455d5
2015-08-06 09:28:51 -05:00
Darrell Bishop
df134df901 Allow 1+ object-servers-per-disk deployment
Enabled by a new > 0 integer config value, "servers_per_port" in the
[DEFAULT] config section for object-server and/or replication server
configs.  The setting's integer value determines how many different
object-server workers handle requests for any single unique local port
in the ring.  In this mode, the parent swift-object-server process
continues to run as the original user (i.e. root if low-port binding
is required), binds to all ports as defined in the ring, and forks off
the specified number of workers per listen socket.  The child, per-port
servers drop privileges and behave pretty much how object-server workers
always have, except that because the ring has unique ports per disk, the
object-servers will only be handling requests for a single disk.  The
parent process detects dead servers and restarts them (with the correct
listen socket), starts missing servers when an updated ring file is
found with a device on the server with a new port, and kills extraneous
servers when their port is found to no longer be in the ring.  The ring
files are stat'ed at most every "ring_check_interval" seconds, as
configured in the object-server config (same default of 15s).

Immediately stopping all swift-object-worker processes still works by
sending the parent a SIGTERM.  Likewise, a SIGHUP to the parent process
still causes the parent process to close all listen sockets and exit,
allowing existing children to finish serving their existing requests.
The drop_privileges helper function now has an optional param to
suppress the setsid() call, which otherwise screws up the child workers'
process management.

The class method RingData.load() can be told to only load the ring
metadata (i.e. everything except replica2part2dev_id) with the optional
kwarg, header_only=True.  This is used to keep the parent and all
forked off workers from unnecessarily having full copies of all storage
policy rings in memory.

A new helper class, swift.common.storage_policy.BindPortsCache,
provides a method to return a set of all device ports in all rings for
the server on which it is instantiated (identified by its set of IP
addresses).  The BindPortsCache instance will track mtimes of ring
files, so they are not opened more frequently than necessary.

This patch includes enhancements to the probe tests and
object-replicator/object-reconstructor config plumbing to allow the
probe tests to work correctly both in the "normal" config (same IP but
unique ports for each SAIO "server") and a server-per-port setup where
each SAIO "server" must have a unique IP address and unique port per
disk within each "server".  The main probe tests only work with 4
servers and 4 disks, but you can see the difference in the rings for the
EC probe tests where there are 2 disks per server for a total of 8
disks.  Specifically, swift.common.ring.utils.is_local_device() will
ignore the ports when the "my_port" argument is None.  Then,
object-replicator and object-reconstructor both set self.bind_port to
None if server_per_port is enabled.  Bonus improvement for IPv6
addresses in is_local_device().

This PR for vagrant-swift-all-in-one will aid in testing this patch:
https://github.com/swiftstack/vagrant-swift-all-in-one/pull/16/

Also allow SAIO to answer is_local_device() better; common SAIO setups
have multiple "servers" all on the same host with different ports for
the different "servers" (which happen to match the IPs specified in the
rings for the devices on each of those "servers").

However, you can configure the SAIO to have different localhost IP
addresses (e.g. 127.0.0.1, 127.0.0.2, etc.) in the ring and in the
servers' config files' bind_ip setting.

This new whataremyips() implementation combined with a little plumbing
allows is_local_device() to accurately answer, even on an SAIO.

In the default case (an unspecified bind_ip defaults to '0.0.0.0') as
well as an explict "bind to everything" like '0.0.0.0' or '::',
whataremyips() behaves as it always has, returning all IP addresses for
the server.

Also updated probe tests to handle each "server" in the SAIO having a
unique IP address.

For some (noisy) benchmarks that show servers_per_port=X is at least as
good as the same number of "normal" workers:
https://gist.github.com/dbishop/c214f89ca708a6b1624a#file-summary-md

Benchmarks showing the benefits of I/O isolation with a small number of
slow disks:
https://gist.github.com/dbishop/fd0ab067babdecfb07ca#file-results-md

If you were wondering what the overhead of threads_per_disk looks like:
https://gist.github.com/dbishop/1d14755fedc86a161718#file-tabular_results-md

DocImpact

Change-Id: I2239a4000b41a7e7cc53465ce794af49d44796c6
2015-06-18 12:43:50 -07:00
janonymous
09e7477a39 Replace it.next() with next(it) for py3 compat
The Python 2 next() method of iterators was renamed to __next__() on
Python 3. Use the builtin next() function instead which works on Python
2 and Python 3.

Change-Id: Ic948bc574b58f1d28c5c58e3985906dee17fa51d
2015-06-15 22:10:45 +05:30
paul luse
647b66a2ce Erasure Code Reconstructor
This patch adds the erasure code reconstructor. It follows the
design of the replicator but:
  - There is no notion of update() or update_deleted().
  - There is a single job processor
  - Jobs are processed partition by partition.
  - At the end of processing a rebalanced or handoff partition, the
    reconstructor will remove successfully reverted objects if any.

And various ssync changes such as the addition of reconstruct_fa()
function called from ssync_sender which performs the actual
reconstruction while sending the object to the receiver

Co-Authored-By: Alistair Coles <alistair.coles@hp.com>
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: John Dickinson <me@not.mn>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Co-Authored-By: Tushar Gohad <tushar.gohad@intel.com>
Co-Authored-By: Samuel Merritt <sam@swiftstack.com>
Co-Authored-By: Christian Schwede <christian.schwede@enovance.com>
Co-Authored-By: Yuan Zhou <yuan.zhou@intel.com>
blueprint ec-reconstructor
Change-Id: I7d15620dc66ee646b223bb9fff700796cd6bef51
2015-04-14 00:52:17 -07:00
John Dickinson
da6f8d8f13 fixed ugly code pattern in probe tests
Change-Id: I242f095ea0ca8d6d69c3b2258cce6b51c7963dce
2015-02-26 14:32:31 -08:00
Leah Klearman
a4ffd1d1c6 move test comments around
should make later refactoring easier

Change-Id: I7af399a14c8bc78fcfc438e4440d2f023c8aa5db
2015-02-12 20:36:12 -08:00
Leah Klearman
2c1b5af062 refactor probe tests
* refactor probe tests to use probe.common.ProbeTest
* move reset_environment functionality to ProbeTest.setUp()
* choose rings and policies that meet the criteria - raise SkipTest if
nothing matches
* replace all AssertionErrors in setup with SkipTest

Change-Id: Id56c497d58083f5fd55f5283cdd346840df039d3
2015-02-12 11:30:21 -08:00
Paul Luse
873c52e608 Replace POLICY and POLICY_INDEX with string literals
Replaced throughout code base &  tox'd. Functional as well
as probe tests pass with and without policies defined.

POLICY --> 'X-Storage-Policy'
POLICY_INDEX --> 'X-Backend-Storage-Policy-Index'

Change-Id: Iea3d06de80210e9e504e296d4572583d7ffabeac
2014-06-23 12:52:50 -07:00
Yuan Zhou
ad2a9cefe5 Fixes probe tests with non-zero default storage policy
Add headers param to direct_client.direct_get_object, which is used in
probetests to passthrough the X-Storage-Policy-Index header.

DocImpact
Implements: blueprint storage-policies
Change-Id: I19adbbcefbc086c8467bd904a275d55cde596412
2014-06-18 21:09:53 -07:00
Clay Gerrard
0efac0cac2 make probetests work with conf.d configs
Change-Id: I451ff4629730a334ac1bd8fc6cd75de95314e153
2014-03-12 12:04:45 -07:00
Chmouel Boudjnah
150f338fc2 Remove swiftclient dep on direct_client
Partial Implements: blueprint remove-swiftclient-dependency
Change-Id: I9af7150e5d21d50e5f880e57796314b8f05822d2
2013-12-24 03:11:43 -08:00
ZhiQiang Fan
f72704fc82 Change OpenStack LLC to Foundation
Change-Id: I7c3df47c31759dbeb3105f8883e2688ada848d58
Closes-bug: #1214176
2013-09-20 01:02:31 +08:00
Dirk Mueller
3d36a76156 Use Python 3.x compatible except construct
except x,y: was deprected and is removed in Python 3.x.
Use "except x as y:" instead which works in any Python
version >= 2.6.

Change-Id: I7008c74b807340f3457d3a0c8bd0b83f23169d14
2013-09-07 10:50:54 +02:00
Sergey Kraynev
ea7858176b Implementation of replication servers
Support separate replication ip address:
- Added new function in utils. This function provides ability
  to select separate IP address for replication service.
- Db_replicator and object replicators were changed.
  Replication process uses new function now.

Replication network parameters:
- Replication network fields (replication_ip, replication_port)
  support was added to device dictionary in swift-ring-builder script.
- Changes were made to support new fields in search, show and set_info
  functions.

Implementation of replication servers:
- Separate replication servers use the same code as normal replication
  servers, but with replication_server parameter = True.  When using a
  separate replication network, the non-replication servers set
  replication_server = False.  When there is no separate replication
  network (the default case), replication_server is not included in the config.

DocImpact
Change-Id: Ie9af5bdcdf9241c355e36053ca4adfe49dc35bd0
Implements: blueprint dedicated-replication-network
2013-04-21 18:14:42 -04:00
gholt
4a9d19197c Updated probe tests
The probe tests were woefully out of date with all the changes that
have ocurred since they were written. I've updated most of them and
removed some that are hopeless outdated.

I also greatly improved the timing issues (hopefully completely
solved them? I ran them 25 times with no problems) and made them pep8
1.3.1 safe.

Change-Id: I8e9dbb6e7d6e04e293843b1dce1ded99d84e0348
2012-07-06 23:30:12 +00:00
gholt
be79b0884e Fixes for probe tests
Updated the imports and added a head_account to the "is the cluster
started yet?" checks. Hopefully this fixes the notorious timing
issues of these tests where auth answers requests just a bit before
the rest of the cluster is ready.

Fixes bug 1014931

Change-Id: Iea1d62db2317560371da49af5e94a0279b646294
2012-06-27 05:09:53 +00:00
Paul McMillan
92fbf44d10 Fixed grammar and improve docs.
Corrected its/it's mistakes, harmonized line wrapping within some docs
and clarified doc wording in several places.

Change-Id: Ib9ac6d5e859f770a702e1fad6de8d4abe0390b47
2012-04-10 12:27:14 -07:00
John Dickinson
1ecf5ebba1 updated copyright date for all files
Change-Id: Ifd909d3561c2647770a7e0caa3cd91acd1b4f298
2012-03-19 13:45:34 -05:00
gholt
7c9e542c02 Implemented object POST as COPY 2011-06-08 04:19:34 +00:00
David Goetz
86cb12036b removing blank excepts 2011-01-26 14:31:33 -08:00
Anne Gentle
8823427161 Changed copyright notices on py files and the single rst file with a copyright notice 2011-01-04 17:34:43 -06:00
gholt
af99fb17e0 Fixed probe tests to not use relativity (on imports) 2010-12-22 11:35:11 -08:00
gholt
d0367fdf19 Updated direct_client to match the changes in client 2010-09-05 21:06:16 -07:00
gholt
bb01c22440 Updated tools and client.py to work with ACLs 2010-09-03 21:39:44 -07:00
kapil.foss@gmail.com
65358e873f pyflakes cleanups, unused modules, and variables 2010-07-30 08:20:07 -04:00
Chuck Thier
63538345f2 Modified probe tests to work with setup.py develop installs
Updated SAIO instructions to note that probe tests will reset your environment
Added the swift.egg-info directory to .bzrignore
2010-07-19 03:00:28 +00:00
Chuck Thier
001407b969 Initial commit of Swift code 2010-07-12 17:03:45 -05:00