265 Commits

Author SHA1 Message Date
Zuul
68924d920c Merge "Have slo tell the object-server that it wants whole manifests" 2020-01-18 13:31:32 +00:00
Zuul
e32689a96d Merge "Deprecate per-service auto_create_account_prefix" 2020-01-07 01:30:20 +00:00
Clay Gerrard
4601548dab Deprecate per-service auto_create_account_prefix
If we move it to constraints it's more globally accessible in our code,
but more importantly it's more obvious to ops that everything breaks if
you try to misconfigure different values per service.
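
A minimal sketch of the cluster-wide form, assuming it lands in the
existing [swift-constraints] section of swift.conf with the historical
'.' default:

    [swift-constraints]
    auto_create_account_prefix = .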

Change-Id: Ib8f7d08bc48da12be5671abe91a17ae2b49ecfee
2020-01-05 09:53:30 -06:00
Tim Burke
3f88907012 sharding: Better-handle newlines in container names
Previously, if you were on Python 2.7.10+ [0], a newline in a container
name would cause the sharder to fail, complaining about invalid header
values when trying to create the shard containers. On older versions of
Python, it would most likely cause a parsing error in the
container-server that was trying to handle the PUT.

Now, quote all places that we pass around container paths. This includes:

  * The X-Container-Sysmeta-Shard-(Quoted-)Root sent when creating the (empty)
    remote shards
  * The X-Container-Sysmeta-Shard-(Quoted-)Root included when initializing the
    local handoff for cleaving
  * The X-Backend-(Quoted-)Container-Path the proxy sends to the object-server
    for container updates
  * The Location header the container-server sends to the object-updater

Note that a new header was required in requests so that servers would
know whether the value should be unquoted or not. We can get away with
reusing Location in responses by having clients opt-in to quoting with
a new X-Backend-Accept-Quoted-Location header.
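
A minimal sketch of the quote/unquote dance, using six.moves.urllib for
py2/py3 compatibility (the helper names here are illustrative, not the
tree's actual ones):

    from six.moves.urllib.parse import quote, unquote

    def send_container_path(headers, path):
        # New-style senders always quote, and signal it via the new
        # header name so receivers know to unquote.
        headers['X-Backend-Quoted-Container-Path'] = quote(path)

    def read_container_path(headers):
        if 'X-Backend-Quoted-Container-Path' in headers:
            return unquote(headers['X-Backend-Quoted-Container-Path'])
        return headers.get('X-Backend-Container-Path')  # legacy, unquoted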

During a rolling upgrade,

  * old object-servers servicing requests from new proxy-servers will
    not know about the container path override and so will try to update
    the root container,
  * in general, object updates are more likely to land in the root
    container; the sharder will deal with them as misplaced objects, and
  * shard containers created by new code on servers running old code
    will think they are root containers until the server is running new
    code, too; during this time they'll fail the sharder audit and report
    stats to their account, but both of these should get cleared up upon
    upgrade.

Drive-by: fix a "conainer_name" typo that prevented us from testing that
we can shard a container with unicode in its name. Also, add more UTF8
probe tests.

[0] See https://bugs.python.org/issue22928

Change-Id: Ie08f36e31a448a547468dd85911c3a3bc30e89f1
Closes-Bug: 1856894
2020-01-03 16:04:57 -08:00
Tim Burke
e8b654f318 Have slo tell the object-server that it wants whole manifests
Otherwise, we waste a request on some 416/206 response that won't be
helpful.

To do this, add a new X-Backend-Ignore-Range-If-Metadata-Present header
whose value is a comma-separated list of header names. Middlewares may
include this header to tell object-servers to send the whole object
(rather than a 206 or 416) if *any* of the metadata are present.
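
A hedged sketch of the server-side check (helper and variable names are
illustrative only):

    def should_ignore_range(req_headers, metadata):
        # If any metadata named in the new header is present on the
        # object, drop the Range and serve the whole thing.
        ignore = req_headers.get(
            'X-Backend-Ignore-Range-If-Metadata-Present', '')
        names = [h.strip().title() for h in ignore.split(',') if h.strip()]
        return any(name in metadata for name in names)

So slo sends 'X-Backend-Ignore-Range-If-Metadata-Present:
X-Static-Large-Object', and a ranged GET of a manifest comes back whole.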

Have dlo and symlink use it, too; it won't save us any round-trips, but
it should clean up some object-server logging.

Change-Id: I4ff2a178d0456e7e37d561109ef57dd0d92cbd4e
2020-01-02 15:48:39 -08:00
Clay Gerrard
698717d886 Allow internal clients to use reserved namespace
Reserve the namespace starting with the NULL byte for internal
use-cases.  Backend services will allow path names to include the NULL
byte in urls and validate names in the reserved namespace.  Database
services will filter all names starting with the NULL byte from
responses unless the request includes the header:

    X-Backend-Allow-Reserved-Names: true

The proxy server will not allow path names to include the NULL byte in
urls unless a middleware has set the X-Backend-Allow-Reserved-Names
header.  Middlewares can use the reserved namespace to create objects
and containers that cannot be directly manipulated by clients.  Any
objects and bytes created in the reserved namespace will be aggregated
to the user's account totals.
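
An illustrative proxy-side gate, assuming the reserved marker is a
leading NULL byte as described above:

    RESERVED = '\x00'

    def name_allowed(name, req_headers):
        # Client requests may only use reserved names if a middleware
        # has vouched for them via the backend header.
        if not name.startswith(RESERVED):
            return True
        return req_headers.get('X-Backend-Allow-Reserved-Names') == 'true'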

When deploying internal proxies, developers and operators may configure
the gatekeeper middleware to translate the X-Allow-Reserved-Names header
to the Backend header so they can manipulate the reserved namespace
directly through the normal API.

UpgradeImpact: it's not safe to roll back from this change

Change-Id: If912f71d8b0d03369680374e8233da85d8d38f85
2019-11-27 11:22:00 -06:00
Alexandre Lécuyer
4927b1f29c Specify pickle protocol in REPLICATE()
The default pickle protocol in python3 is version 3. This is not
readable by a python2 interpreter.

Force the use of version 2 in the object server REPLICATE() function,
for compatibility with python 2.
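
The fix boils down to pinning the protocol, e.g.:

    import pickle

    hashes = {'abc': 'd41d8cd98f00b204e9800998ecf8427e'}  # example payload
    # python3 defaults to protocol 3+, which python2 cannot read; pin
    # protocol 2 so mixed-version clusters can still replicate.
    payload = pickle.dumps(hashes, protocol=2)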

Change-Id: I19d23570ff3a084d288de1308e059cfd8134d6ad
2019-05-22 13:12:32 -07:00
Zuul
f1e2a21efe Merge "Wait longer for log lines in unit test" 2019-05-17 18:30:23 +00:00
Tim Burke
9290f29e1c Wait longer for log lines in unit test
We've seen a noteworthy (literally -- there's a comment about it) number
of failures for test_multiphase_put_drains_extra_commit_junk_disconnect;
I'd rather waste an extra few hundredths of a second every run than have
to recheck most patches we want merged.

Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>
Change-Id: Iebec6c18866e4f8ef40faab539b173f017d658e3
2019-05-04 15:56:55 -07:00
Gilles Biannic
a4cc353375 Make log format for requests configurable
Add the log_msg_template option in proxy-server.conf and log_format in
a/c/o-server.conf. It is a string parsable by Python's format()
function. Some fields containing user data might be anonymized by using
log_anonymization_method and log_anonymization_salt.
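
An illustrative proxy-server.conf snippet (the placeholder fields shown
are assumptions about the available format() names; see the sample
config for the real set):

    [app:proxy-server]
    log_msg_template = {client_ip} {method} {path} {status_int}
    log_anonymization_method = md5
    log_anonymization_salt = some-site-secret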

Change-Id: I29e30ef45fe3f8a026e7897127ffae08a6a80cd9
2019-05-02 17:43:25 -06:00
Tim Burke
5409c4f347 Make test_multiphase_put_drains_extra_commit_junk_disconnect less flaky
Change-Id: I82503b13e4541196ad056e861221e9429c7f2c1c
2019-02-22 14:23:33 -08:00
Tim Burke
2bd7b7a109 py3 object-server follow-ups
Change-Id: Ief7d85af8d3e1d5e03a6484a889c9146d69f1377
Related-Change: I203a54fddddbd4352be0e6ea476a628e3f747dc1
2019-02-04 09:36:16 -08:00
Pete Zaitcev
5b5ed29ab4 py3: object server
This does not do anything about replicator or other daemons,
only ports the server.

- wsgi_to_bytes(something.get('X-Foo')) assumes that None is possible
- Dunno if del-in-for-in-dict calls for a clone or list(); using list() -- see the sketch after this list
- Fixed the zero-copy with EventletPlungerString(bytes)
- Yet another reminder that Request.blank() takes a WSGI string path
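
The list() point from above, as a runnable sketch:

    metadata = {'X-Object-Meta-A': '1', 'Content-Length': '0'}
    # py3 raises RuntimeError if keys are deleted while iterating the
    # dict itself; iterating over list(metadata) snapshots the keys.
    for key in list(metadata):
        if key.startswith('X-Object-Meta-'):
            del metadata[key]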

Curiously enough, the sleep(0) before checking for logging was
already present in the tests. The py3 scheduling merely forces us
to do it consistently throughout.

Change-Id: I203a54fddddbd4352be0e6ea476a628e3f747dc1
2019-01-11 09:18:08 -06:00
Tim Burke
3420921a33 Clean up HASH_PATH_* patching
Previously, we'd sometimes shove strings into HASH_PATH_PREFIX or
HASH_PATH_SUFFIX, which would blow up on py3. Now, always use bytes.

Change-Id: Icab9981e8920da505c2395eb040f8261f2da6d2e
2018-11-01 20:52:33 +00:00
Tim Burke
781da0704b Stop accommodating 5+ year old object-server code in unit tests
Change-Id: Id2597d6a26fa8a1769e8cab1ba5cae29eb1e2942
2018-10-18 12:50:41 -07:00
Clay Gerrard
cbfa585d3b Refactor obj.server.ObjectController.PUT
Change-Id: Iebc5cd4c22db159db3b26685e02b37a028eb2be6
2018-09-13 16:42:20 -05:00
Zuul
3de21d945b Merge "Remove empty part dirs during ssync replication" 2018-06-23 02:19:18 +00:00
Samuel Merritt
7a7677868d Use X-Timestamp when checking object expiration
In the object server's PUT, POST, and DELETE handlers, we use the
request's X-Timestamp value for checking object expiration. In the GET
and HEAD handlers, we use it if present, but default to the current
time. That way, one can still use curl to make direct object GET or
HEAD requests as before.
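
A sketch of the intended check (an illustrative helper, not the
diskfile code verbatim):

    import time

    def is_expired(metadata, req_headers):
        # PUT/POST/DELETE always carry X-Timestamp; GET/HEAD may not,
        # so fall back to the local clock for direct curl requests.
        now = float(req_headers.get('X-Timestamp', time.time()))
        x_delete_at = metadata.get('X-Delete-At')
        return x_delete_at is not None and int(x_delete_at) <= now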

If one object server's clock is ahead of the proxy server's clock for
some reason, and a client makes a POST request to update X-Delete-At,
then the skewed-time object server may refuse the new X-Delete-At
value.

In a cluster where two of the three replicas for an object live on the
same time-skewed node, this can result in confusing behavior for
clients. A client can make a POST request to update X-Delete-At,
receive a 400, and then discover later that the X-Delete-At value was
updated anyway, since one object server accepted the POST and
replication spread the new metadata around.

DELETE is somewhat less confusing. The client might get a spurious 404
in the above case, but the object will still be removed.

For PUT, an object server with a slow clock might refuse to overwrite
an object with an "older" one because it believes the on-disk object
is newer than the current time.

Change-Id: I10c28f97d4c6aca1d64bef3b93506cfbb50ade30
2018-05-22 16:42:53 -07:00
Alistair Coles
4a3efe61a9 Redirect object updates to shard containers
Enable the proxy to fetch a shard container location from the
container server in order to redirect an object update to the shard.

Enable the container server to redirect object updates to shard
containers.

Enable object updater to accept redirection of an object update.

Co-Authored-By: Matthew Oliver <matt@oliver.net.au>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>
Co-Authored-By: Clay Gerrard <clay.gerrard@gmail.com>

Change-Id: I6ff85827eecdea746b3626c0d401f68139cce19d
2018-05-18 18:48:13 +01:00
Samuel Merritt
a19548b3e6 Remove empty part dirs during ssync replication
When we're pushing data to a remote node using ssync, we end up
walking the entire partition's directory tree. We were already
removing reclaimable (i.e. old) tombstones and non-durable EC data
files plus their containing hash dirs, but we were leaving the suffix
dirs around for future removal, and we weren't cleaning up partition
dirs at all. Now we remove as much of the directory structure as we
can, even up to the partition dir, as soon as we observe that it's
empty.
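
The cleanup amounts to walking back up the tree, stopping at the first
non-empty directory -- a hedged sketch:

    import errno
    import os

    def prune_upward(path, stop_at):
        # Remove now-empty hash, suffix and partition dirs; stop once a
        # directory still has entries or we reach the datadir root.
        while path != stop_at:
            try:
                os.rmdir(path)
            except OSError as err:
                if err.errno in (errno.ENOTEMPTY, errno.EEXIST):
                    return
                raise
            path = os.path.dirname(path)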

Change-Id: I2849a757519a30684646f3a6f4467c21e9281707
Closes-Bug: 1706321
2018-05-01 17:18:22 -07:00
Tim Burke
57b632fbb5 Fix object-server to not 400 all expirer DELETEs
In the related changes, we switched to using
Timestamp.normal representations for the X-If-Delete-At header.
However, the object-server required that the header be an int,
and the trailing '.00000' would trip the
"Bad X-If-Delete-At header value" error handling.

Now, we'll convert both the expirer header and the stored X-Delete-At to
Timestamps, even though we expect them to have no fractional value.
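
That is, the comparison becomes (a sketch using swift's own Timestamp):

    from swift.common.utils import Timestamp

    def if_delete_at_matches(header_value, stored_delete_at):
        # '1519999999' and '1519999999.00000' normalize to the same
        # Timestamp, so the trailing '.00000' no longer 400s.
        return Timestamp(header_value) == Timestamp(stored_delete_at)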

Note that we *could* have changed the expirer to continue sending
headers that are valid ints, but Timestamps are already the normal
Swift way of passing and comparing times -- we should use that.

Related-Change: Ibf61eb1f767a48cb457dd494e1f7c12acfd205de
Related-Change: Ie82622625d13177e08a363686ec632f63d24f4e9
Change-Id: Ida22c1c8c5bf21bdc72c33e225e75fb750f8444b
2018-03-02 15:25:38 +00:00
Zuul
ee282c1166 Merge "Fix suffix-byte-range responses for zero-byte replicated objects." 2018-02-20 07:38:10 +00:00
Tim Burke
f4cff9fbba object-server can 409 in response to x-if-delete-at
Previously, if the expirer had a stale work item (because the object
was overwritten or deleted, or some other process handled the delete),
then it would keep retrying for reclaim_age, but every time it'd get
back a 412.

Now, have the object-server be smart enough to say, "I have more recent
information than you" and let the expirer accept that as success.

Change-Id: I0a94482ed16cb30ce79074e053e6177fe97bcaa9
2018-01-18 10:16:48 -08:00
Alistair Coles
dfa0c4e604 Preserve expiring object behaviour with old proxy-server
The related change [1] causes expiring object records to no longer be
created if the X-Delete-At-Container header is not sent to the object
server, but old proxies prior to [2] (i.e. releases prior to 1.9.0)
did not send this header.

The goal of [1] can be alternatively achieved by making expiring
object record creation be conditional on the X-Delete-At-Host header.

[1] Related-Change: I20fc2f42f590fda995814a2fa7ba86019f9fddc1
[2] Related-Change: Id0873a3f2198ce285fe0b0c777738eff38bc2438

Change-Id: Ia0081693f01631d3f2a59612308683e939ced76a
2018-01-17 11:42:22 -08:00
Clay Gerrard
d707fc7b6d DRY out tests until the stone bleeds
Can we go deeper!?

Change-Id: Ibd3b06542aa1bfcbcb71cc98e6bb21a6a67c12f4
2018-01-17 16:11:25 +00:00
Zuul
9e6e9fd1bf Merge "Send correct number of X-Delete-At-* headers" 2018-01-17 12:06:08 +00:00
Zuul
7227d9967b Merge "Add tests for X-Backend-Clean-Expiring-Object-Queue true" 2018-01-17 05:30:45 +00:00
Christopher Bartz
d8f9045518 Send correct number of X-Delete-At-* headers
Send just as many requests with X-Delete-At-* as we do X-Container-* to
the object server.  Furthermore, stop the object server from making an
update to the expirer queue when it wasn't told to do so, and remove the
log warning which would have been produced.

Reason:

It can be the case that the number of object replicas (OR) is larger
than the number of container replicas (CR) for a given storage policy
(most likely in case of EC).  Before this commit, only CR object servers
received the x-delete-at-* headers, which means that OR - CR object
servers did not receive the headers.  The servers missing the header
would produce a log warning and create the x-delete-at-container header
and async update on their own, which could lead to a bug if the
expiring_objects_container_divisor option was misconfigured.

Change-Id: I20fc2f42f590fda995814a2fa7ba86019f9fddc1
Closes-Bug: #1733588
2018-01-17 01:36:28 +00:00
Zuul
eaf056154e Merge "Limit object-expirer queue updates on object DELETE, PUT, POST" 2018-01-12 20:39:40 +00:00
Alistair Coles
35ad4e8745 Add tests for X-Backend-Clean-Expiring-Object-Queue true
Check that when X-Backend-Clean-Expiring-Object-Queue is true
the object server does indeed call async_update.

Change-Id: I0a87979147591f15349b868a12ac6dd15ac4e37f
Related-Change: I4d64f4d1d107c437fd3c23e19160157fdafbcd42
2018-01-12 17:26:26 +00:00
Samuel Merritt
48da3c1ed7 Limit object-expirer queue updates on object DELETE, PUT, POST
Currently, on deletion of an expiring object, each object server
writes an async_pending to update the expirer queue and remove the row
for that object. Each async_pending is processed by the object updater
and results in all container replicas being updated. This is also true
for PUT and POST requests for existing expiring objects.

If you have Rc container replicas and Ro object replicas (or EC
pieces), then the number of expirer-queue requests made is Rc * Ro [1].

For a 3-replica cluster, that number is 9, which is not terrible. For
a cluster with 3 container replicas and a 15+4 EC scheme, that number
is 57, which is terrible.

This commit makes it so at most two object servers will write out the
async_pending files needed to update the queue, dropping the request
count to 2 * Rc [2]. The object server now looks for a header
"X-Backend-Clean-Expiring-Object-Queue: <true|false>" and writes or
does not write expirer-queue async_pendings as appropriate. The proxy
sends that header to 2 object servers.
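
Illustratively, the proxy-side assignment might look like:

    outgoing_headers = [{} for _ in range(19)]  # e.g. a 15+4 EC policy
    # Only the first two backend requests are told to clean the queue;
    # the rest are explicitly told not to.
    for i, headers in enumerate(outgoing_headers):
        headers['X-Backend-Clean-Expiring-Object-Queue'] = (
            'true' if i < 2 else 'false')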

The queue update is not necessary for the proper functioning of the
object expirer; if the queue update fails, then the object expirer
will try to delete the object, receive 404s or 412s, and remove the
queue entry. Removal on object PUT/POST/DELETE is helpful but not
required.

[1] assuming no retries needed by the object updater

[2] or Rc, if a cluster has only one object replica

Change-Id: I4d64f4d1d107c437fd3c23e19160157fdafbcd42
2018-01-11 12:07:28 -08:00
Samuel Merritt
31c294de79 Fix time skew when using X-Delete-After
When a client sent "X-Delete-After: <n>", the proxy and all object
servers would each compute X-Delete-At as "int(time.time() +
n)". However, since they don't all compute it at exactly the same
time, the objects stored on disk can end up with differing values for
X-Delete-At, and in that case, the object-expirer queue has multiple
entries for the same object (one for each distinct X-Delete-At value).

This commit makes two changes, either one of which is sufficient to
fix the bug.

First, after computing X-Delete-At from X-Delete-After, X-Delete-After
is removed from the request's headers. Thus, the proxy computes
X-Delete-At, and the object servers don't, so there's only a single
value.

Second, computation of X-Delete-At now uses the request's X-Timestamp
instead of time.time(). In the proxy, these values are essentially the
same; the proxy is responsible for setting X-Timestamp. In the object
server, this ensures that all computed X-Delete-At values are
identical, even if the object servers' clocks are not, or if one
object server takes an extra second to respond to a PUT request.
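
Both fixes together, as a hedged sketch over a plain headers dict:

    def normalize_delete_headers(headers):
        # Compute X-Delete-At once, from the request's own X-Timestamp,
        # then drop X-Delete-After so object servers can't recompute it
        # with their local clocks.
        if 'X-Delete-After' in headers:
            after = int(headers.pop('X-Delete-After'))
            headers['X-Delete-At'] = str(
                int(float(headers['X-Timestamp'])) + after)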

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Change-Id: I9a1b6826c4c553f0442cfe2bb78cdf49508fa4a5
Closes-Bug: 1741371
2018-01-05 10:06:33 -08:00
Samuel Merritt
dd9bc82826 Fix suffix-byte-range responses for zero-byte replicated objects.
Previously, given a GET request like "Range: bytes=-12345" for a
zero-byte object, Swift would return a 206 response with the header
"Content-Range: bytes 0--1/0". This is clearly incorrect. Now Swift
returns a 200 with the whole (zero-byte) object.
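
A toy model of the corrected behavior (not the actual response code
path):

    def suffix_range_response(obj_length, suffix_len):
        # A suffix range cannot be satisfied against a zero-byte object;
        # rather than fabricating 'bytes 0--1/0', ignore the Range and
        # return 200 with the whole (empty) body.
        if obj_length == 0:
            return 200, None
        start = max(0, obj_length - suffix_len)
        return 206, (start, obj_length - 1)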

Note: this does not fix the bug for EC objects, only for replicated
ones. The fix for EC objects will follow in a separate commit.

Change-Id: If1edb665b0ae000da78c4efff6faddd94d75da6b
Partial-Bug: 1736840
2017-12-07 12:10:02 -08:00
Steve Kowalik
5a06e3da3b No longer import nose
Since Python 2.7, unittest in the standard library has included multiple
facilities for skipping tests, via decorators as well as an exception.
Switch to those directly, rather than importing nose.
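
Both stdlib forms, for the record:

    import unittest

    class TestExample(unittest.TestCase):
        @unittest.skip('decorator form replaces nose.SkipTest')
        def test_skipped(self):
            pass

        def test_conditionally_skipped(self):
            raise unittest.SkipTest('the exception form also works')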

Change-Id: I4009033473ea24f0d0faed3670db844f40051f30
2017-11-07 15:39:25 +11:00
Samuel Merritt
728b4ba140 Add checksum to object extended attributes
Currently, our integrity checking for objects is pretty weak when it
comes to object metadata. If the extended attributes on a .data or
.meta file get corrupted in such a way that we can still unpickle it,
we don't have anything that detects that.
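
The shape of the scheme (illustrative only -- the real layout of the
checksummed xattrs differs):

    import hashlib
    import pickle

    def wrap_metadata(metadata):
        # Store a checksum alongside the pickled metadata so flipped
        # bits are detectable on read instead of silently unpickling.
        blob = pickle.dumps(metadata, protocol=2)
        return blob, hashlib.md5(blob).hexdigest()

    def unwrap_metadata(blob, checksum):
        if hashlib.md5(blob).hexdigest() != checksum:
            raise IOError('metadata checksum mismatch; quarantine')
        return pickle.loads(blob)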

This could be especially bad with encrypted etags; if the encrypted
etag (X-Object-Sysmeta-Crypto-Etag or whatever it is) gets some bits
flipped, then we'll cheerfully decrypt the cipherjunk into plainjunk,
then send it to the client. Net effect is that the client sees a GET
response with an ETag that doesn't match the MD5 of the object *and*
Swift has no way of detecting and quarantining this object.

Note that, with an unencrypted object, if the ETag metadatum gets
mangled, then the object will be quarantined by the object server or
auditor, whichever notices first.

As part of this commit, I also ripped out some mocking of
getxattr/setxattr in tests. It appears to be there to allow unit tests
to run on systems where /tmp doesn't support xattrs. However, since
the mock is keyed off of inode number and inode numbers get re-used,
there's lots of leakage between different test runs. On a real FS,
unlinking a file and then creating a new one of the same name will
also reset the xattrs; this isn't the case with the mock.

The mock was pretty old; Ubuntu 12.04 and up all support xattrs in
/tmp, and recent Red Hat / CentOS releases do too. The xattr mock was
added in 2011; maybe it was to support Ubuntu Lucid Lynx?

Bonus: now you can pause a test with the debugger, inspect its files
in /tmp, and actually see the xattrs along with the data.

Since this patch now uses a real filesystem for testing filesystem
operations, tests are skipped if the underlying filesystem does not
support setting xattrs (e.g. tmpfs, or more than 4k of xattrs on ext4).

References to "/tmp" have been replaced with calls to
tempfile.gettempdir(). This will allow setting the TMPDIR envvar in
test setup and getting an XFS filesystem instead of ext4 or tmpfs.

THIS PATCH SIGNIFICANTLY CHANGES TESTING ENVIRONMENTS

With this patch, every test environment will require TMPDIR to be
using a filesystem that supports at least 4k of extended attributes.
Neither ext4 nor tmpfs supports this. XFS is recommended.

So why all the SkipTests? Why not simply raise an error? We still need
the tests to run on the base image for OpenStack's CI system. Since
we were previously mocking out xattr, there wasn't a problem, but we
also weren't actually testing anything. This patch adds functionality
to validate xattr data, so we need to drop the mock.

`test.unit.skip_if_no_xattrs()` is also imported into `test.functional`
so that functional tests can import it from the functional test
namespace.

The related OpenStack CI infrastructure changes are made in
https://review.openstack.org/#/c/394600/.

Co-Authored-By: John Dickinson <me@not.mn>

Change-Id: I98a37c0d451f4960b7a12f648e4405c6c6716808
2017-11-03 13:30:05 -04:00
Clay Gerrard
36a843be73 Preserve X-Static-Large-Object from .data file after POST
You can't modify the X-Static-Large-Object metadata with a POST, an
object being a SLO is a property of the .data file.  Revert the change
from 4500ff which attempts to correctly handle X-Static-Large-Object
metadata on a POST, but is subject to a race if the most recent SLO
.data isn't available during the POST.  Instead this change adjusts the
reading of metadata such that the X-Static-Large-Object metadata is
always preserved from the metadata on the .data file and bleeds through
any .meta overlay.

Closes-bug: #1453807
Closes-bug: #1634723

Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Change-Id: Ie48a38442559229a2993443ab0a04dc84717ca59
2017-09-26 15:14:57 -04:00
Pavel Kvasnička
163fb4d52a Always require device dir for containers
For test purposes (e.g. saio probetests), even if mount_check is False,
still require check_dir for account/container server storage when real
mount points are not used.

This behavior is consistent with the object-server's checks in diskfile.

Co-Author: Clay Gerrard <clay.gerrard@gmail.com>
Related lp bug #1693005
Related-Change-Id: I344f9daaa038c6946be11e1cf8c4ef104a09e68b
Depends-On: I52c4ecb70b1ae47e613ba243da5a4d94e5adedf2
Change-Id: I3362a6ebff423016bb367b4b6b322bb41ae08764
2017-09-01 10:32:12 -07:00
Clay Gerrard
2e0ca543e8 Make X-Backend-Replication consistent for HEAD
Currently an X-Backend-Replication GET request for an expired object
will still open the expired datafile and return the object, but a HEAD
request with the same headers will 404.  This can lead to some bad
assumptions in probetests and other places where we make direct HEAD
requests.

N.B. Because SSYNC replication does not make any HEAD requests this
change is immaterial to the correctness of the consistency engine.

Related-Change-Id: I7f90b732c3268cb852b64f17555c631d668044a8
Change-Id: Idc01970b37d1b77e1d48f9c4f4979f63ee771093
2017-08-25 15:08:21 -07:00
Tim Burke
15e339da1f Fix more X-Delete-At timing issues
Seen in a gate:

======================================================================
2017-08-17 23:10:29.662540 | FAIL: test_GET_but_expired
(test.unit.obj.test_server.TestObjectController)
2017-08-17 23:10:29.662577 |
----------------------------------------------------------------------
2017-08-17 23:10:29.662600 | Traceback (most recent call last):
2017-08-17 23:10:29.662651 |   File "/home/jenkins/workspace/gate-cross-swift-python27-ubuntu-xenial/test/unit/obj/test_server.py", line 5754, in test_GET_but_expired
2017-08-17 23:10:29.662677 |     self.assertEqual(resp.status_int, 201)
2017-08-17 23:10:29.662697 | AssertionError: 400 != 201
2017-08-17 23:10:29.662729 | -------------------- >> begin captured stdout << ---------------------
2017-08-17 23:10:29.662769 | test INFO: None - - [17/Aug/2017:23:08:45 +0000] "PUT /sda1/p/a/c/o" 201 - "-" "-" "-" 0.0072 "-" 6413 -
2017-08-17 23:10:29.662811 | test INFO: None - - [17/Aug/2017:23:08:45 +0000] "GET /sda1/p/a/c/o" 200 4 "-" "-" "-" 0.0011 "-" 6413 -
2017-08-17 23:10:29.662852 | test INFO: None - - [17/Aug/2017:23:08:45 +0000] "PUT /sda1/p/a/c/o" 400 19 "-" "-" "-" 0.0004 "-" 6413 -
2017-08-17 23:10:29.662865 |
2017-08-17 23:10:29.662896 | --------------------- >> end captured stdout << ----------------------
2017-08-17 23:10:29.662925 |     '400 != 201' = '%s != %s' % (safe_repr(400), safe_repr(201))
2017-08-17 23:10:29.662956 |     '400 != 201' = self._formatMessage('400 != 201', '400 != 201')
2017-08-17 23:10:29.662981 | >>  raise self.failureException('400 != 201')

Change-Id: I643be9af8f054f33897dd74071027a739eaa2c5c
Related-Change: I10d3b9fcbefff3c415a92fa284a1ea1eda458581
Related-Bug: #1597520
2017-08-18 01:12:39 +00:00
Jenkins
a5955140e3 Merge "Allow rebuilding a fragment of an expired object" 2017-08-17 22:37:31 +00:00
Jenkins
53defc8b9e Merge "Fix intermittent failure in test_POST_but_expired" 2017-08-07 07:16:45 +00:00
Romain LE DISEZ
69df458254 Allow rebuilding a fragment of an expired object
When a fragment of an expired object was missing, the reconstructor
ssync job would send a DELETE sub-request. This leads to a situation
where, for the same object and timestamp, some nodes have a data file
while others have a tombstone file.

This patch forces the reconstructor to reconstruct a data file, even
for expired objects. DELETE requests are only sent for tombstoned
objects.

Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
Closes-Bug: #1652323
Change-Id: I7f90b732c3268cb852b64f17555c631d668044a8
2017-08-04 23:05:08 +02:00
Jenkins
83b62b4f39 Merge "Add Timestamp.now() helper" 2017-07-18 03:27:50 +00:00
Tim Burke
fa83dd1ab4 Fix intermittent failure in test_POST_but_expired
It was rare (I saw it once in 8000 runs), but occasionally the clock would
roll over and cause us to PUT an X-Delete-At that matched the current time.

Change-Id: I10d3b9fcbefff3c415a92fa284a1ea1eda458581
Closes-Bug: #1597520
2017-07-12 01:05:45 +00:00
Jenkins
f1e1dbb80a Merge "Make eventlet.tpool's thread count configurable in object server" 2017-07-04 11:49:24 +00:00
Samuel Merritt
d9c4913e3b Make eventlet.tpool's thread count configurable in object server
If you're running servers_per_port > 0 and threads_per_disk = 0 (as it
should be with servers_per_port on), each object-server process will
have 20 IO threads waiting around to service eventlet.tpool
calls. This is far too many; with servers_per_port, there's no real
benefit to having so many IO threads.

This commit makes it so that, when servers_per_port > 0, each object
server defaults to having one main thread and one IO thread.

Also, eventlet's tpool size is now configurable via the object-server
config file. If a tpool size is set, that's what we'll use regardless
of servers_per_port. This allows operators with an excess of threads
to remove some regardless of servers_per_port.
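
Under the hood this maps to eventlet's own knob (the exact config
option name -- eventlet_tpool_num_threads in this sketch -- is an
assumption; check the sample object-server config):

    import eventlet.tpool

    # e.g. object-server.conf: eventlet_tpool_num_threads = 1
    # (option name assumed).  Must run before the tpool services its
    # first call.
    eventlet.tpool.set_num_threads(1)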

Change-Id: I8f8914b7e70f2510393eb7c5e6be9708631ac027
Closes-Bug: 1554233
2017-06-23 16:16:03 +10:00
lingyongxu
ee9458a250 Using assertIsNone() instead of assertEqual(None)
Following OpenStack Style Guidelines:
[1] http://docs.openstack.org/developer/hacking/#unit-tests-and-assertraises
[H203] Unit test assertions tend to give better messages for more specific
assertions. As a result, assertIsNone(...) is preferred over
assertEqual(None, ...) and assertIs(..., None)
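
For example:

    import unittest

    class TestStyle(unittest.TestCase):
        def test_preferred_assertion(self):
            value = None
            # H203: clearer failure message than assertEqual(None, value)
            self.assertIsNone(value)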

Change-Id: If4db8872c4f5705c1fff017c4891626e9ce4d1e4
2017-06-07 14:05:53 +08:00
Pete Zaitcev
5dfc3a75fb Open-code eventlet.listen()
Recently our gate started blowing up intermittently with a strange
case of mixed-up ports. Sometimes a functional test tries to
authorize on a port that's clearly an object server port, and
the like. As it turns out, eventlet developers added an unavoidable
SO_REUSEPORT into listen(), which makes listen(("localhost", 0))
reuse ports.

There's an issue about it:
 https://github.com/eventlet/eventlet/issues/411

This patch is working around the problem while eventlet people
consider the issue.
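
The workaround amounts to building the listener by hand, minus the
forced SO_REUSEPORT -- a plain-socket sketch (the real helper
presumably returns an eventlet green socket):

    import socket

    def listen(addr, backlog=50):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # SO_REUSEADDR is still fine; it's SO_REUSEPORT that let two
        # listen(("localhost", 0)) calls land on the same port.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(addr)
        sock.listen(backlog)
        return sock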

Change-Id: I67522909f96495a6a30e1acdb79835dce2189549
2017-05-11 01:39:14 -06:00
Jenkins
2abffb99b9 Merge "Fix sporadic failure in TestObjectController.test_container_update_async" 2017-05-10 14:51:35 +00:00
Tim Burke
50357de575 Fix sporadic failure in TestObjectController.test_container_update_async
Change-Id: Ie4d58626ebe97049703802a43c669cc78cf60f8b
Related-Change: I15f36e191cfe3ee6c82b4be56e8618ec0230e328
Closes-Bug: #1589994
2017-05-05 00:11:39 +00:00