s3api bucket listing elements currently have LastModified values with
millisecond precision. This is inconsistent with the value of the
Last-Modified header returned with an object GET or HEAD response
which has second precision. This patch reduces the precision to
seconds in bucket listings and upload part listings. This is also
consistent with observed AWS listing responses.
The last modified values in the swift native listing are rounded *up*
to the nearest second to be consistent with the seconds-precision
Last-Modified time header that is returned with an object GET or HEAD.
However, we continue to include millisecond digits set to 0 in the
last-modified string, e.g.: '2014-06-10T22:47:32.000Z'.
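As a sketch of the intended rounding and formatting (the helper name
here is illustrative, not the actual s3api code):

    import math
    from datetime import datetime, timezone

    def listing_last_modified(float_timestamp):
        # Round a Swift float timestamp *up* to whole seconds, but keep
        # the (always zero) millisecond digits in the listing string.
        dt = datetime.fromtimestamp(math.ceil(float_timestamp),
                                    timezone.utc)
        return dt.strftime('%Y-%m-%dT%H:%M:%S.000Z')

    # listing_last_modified(1402440451.123) -> '2014-06-10T22:47:32.000Z'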
Also, fix the last modified time returned in an object copy response
to be consistent with the last modified time of the object that was
created. Previously it was rounded down, but it should be rounded up.
Change-Id: I8c98791a920eeedfc79e8a9d83e5032c07ae86d3
Specifically, parameters that may contain non-ASCII characters,
such as the prefix and marker to list current uploads.
Change-Id: Icfae68825f94ddf2412c0274c3d500e265117e8e
Previous problems included:
- returning wsgi strings quoted assuming UTF-8 on py3 when initiating
or completing multipart uploads
- trying to str() some unicode on py2 when listing parts, leading to
UnicodeEncodeErrors
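For context, on py3 Swift passes WSGI strings (str objects whose code
points are really latin-1-decoded bytes), so quoting has to go through
the underlying bytes rather than assuming UTF-8. A minimal illustration
(not the actual s3api helpers):

    from urllib.parse import quote

    def quote_wsgi_str(wsgi_str):
        # Recover the raw bytes before percent-encoding, instead of
        # letting quote() re-encode the str as UTF-8.
        return quote(wsgi_str.encode('latin-1'))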
Change-Id: Ibc1d42c8deffe41c557350a574ae80751e9bd565
md5 is not an approved algorithm in FIPS mode, and trying to
instantiate a hashlib.md5() will fail when the system is running in
FIPS mode.
md5 is allowed when used in a non-security context. There is a plan to
add a keyword parameter (usedforsecurity) to hashlib.md5() to annotate
whether or not the instance is being used in a security context; when
it is not, instantiation of md5 will be allowed even in FIPS mode.
See https://bugs.python.org/issue9216 for more details.
Some downstream python versions already support this parameter. To
support these versions, a new encapsulation of md5() is added to
swift/common/utils.py. This encapsulation is identical to the one being
added to oslo.utils, but is recreated here to avoid adding a dependency.
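A minimal sketch of such an encapsulation (the actual helper in
swift/common/utils.py may differ in detail):

    import hashlib

    def md5(string=b'', usedforsecurity=True):
        # Forward usedforsecurity on Python builds whose hashlib
        # supports that keyword; otherwise fall back to plain md5(),
        # which is fine outside FIPS mode.
        try:
            return hashlib.md5(string, usedforsecurity=usedforsecurity)
        except TypeError:
            return hashlib.md5(string)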
This patch replaces the instances of hashlib.md5() with this new
encapsulation, adding an annotation indicating whether or not the usage
is in a security context.
While this patch seems large, it is really just the same change over
and over again. Reviewers need to pay particular attention to whether
the keyword parameter (usedforsecurity) is set correctly. Right now,
none of them appear to be used in a security context.
Now that all the instances have been converted, we can update the bandit
run to look for these instances and ensure that new invocations do not
creep in.
With this latest patch, the functional and unit tests all pass
on a FIPS-enabled system.
Co-Authored-By: Pete Zaitcev
Change-Id: Ibb4917da4c083e1e094156d748708b87387f2d87
When completing a multipart-upload, include the upload-id in sysmeta.
If we can't find the upload marker, check the final object name; if it
has an upload-id in sysmeta and it matches the upload-id that we're
trying to complete, allow the complete to continue.
Also add an early return if the already-completed upload's ETag matches
the computed ETag for the user's request. This should help clients that
can't take advantage of how we dribble out whitespace to try to keep
the connection alive: the client times out, retries, and if the upload
actually completed, it gets a fast 200 response.
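Roughly, the completion check described above looks like this (the
names are illustrative, not the actual s3api internals):

    def may_complete(upload_id, marker_meta, final_obj_meta):
        # Normal case: the in-progress upload marker still exists.
        if marker_meta is not None:
            return True
        # Marker gone: maybe an earlier attempt at this same complete
        # already succeeded and recorded the upload-id in sysmeta.
        recorded = final_obj_meta.get('x-object-sysmeta-s3api-upload-id')
        return recorded == upload_id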
Change-Id: I38958839be5b250c9d268ec7c50a56cdb56c2fa2
The repo supports both Python 2 and 3 now, so update hacking to
version 2.0, which supports both. Note that the latest hacking release,
3.0, only supports Python 3.
Fix problems found.
Remove hacking and friends from lower-constraints, they are not needed
for installation.
Change-Id: I9bd913ee1b32ba1566c420973723296766d1812f
unittest2 was only needed for Python versions <= 2.6, so it hasn't been
needed for quite some time. See the unittest2 note at:
https://docs.python.org/2.7/library/unittest.html
This drops unittest2 in favor of the standard unittest module.
Change-Id: I2e787cfbf1709b7f9c889230a10c03689e032957
Signed-off-by: Sean McGinnis <sean.mcginnis@gmail.com>
Previously, we'd preserve the sysmeta that we wrote down with the
original multipart-upload to track its S3-style etag on the new part,
causing it to have an ETag like `<MD5>-<N>`. Later, when the client
tried to complete the new multipart-upload, it would send that etag back
to the server, which would reject the request because the ETag didn't
look like a normal MD5.
Now, have s3api include blank values in the copy request to overwrite
the source sysmeta, and treat a blank etag override the same as a
missing one.
Change-Id: Id33a7ab9d0b8f33fede73eae540d6137708e1218
Closes-Bug: #1829959
We have code that's *supposed* to do it, but we weren't reading the
result of the bulk-delete, so we never actually deleted anything!
Change-Id: I5c972749cadf903161456f34371a6f83ebc05eb9
Closes-Bug: 1810567
Previously, we would list the segments container before completing a
multipart upload so that we could verify ETags and sizes before
attempting to create the SLO. However, container listings are only
eventually-consistent, which meant that clients could receive a 400
response complaining that parts could not be found, even though all
parts were uploaded successfully.
Now, use the new SLO validator callback to validate segment sizes, and
use the existing SLO checks to validate ETags.
Change-Id: I57ae6756bd5f06b80cf03a6b40bf58c845f710fe
Closes-Bug: #1636663
S3 docs say:
> Processing of a Complete Multipart Upload request could
> take several minutes to complete. After Amazon S3 begins
> processing the request, it sends an HTTP response header
> that specifies a 200 OK response. While processing is in
> progress, Amazon S3 periodically sends whitespace
> characters to keep the connection from timing out. Because
> a request could fail after the initial 200 OK response has
> been sent, it is important that you check the response
> body to determine whether the request succeeded.
Let's do that, too!
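A sketch of that pattern (illustrative only, not the actual s3api
implementation; do_complete() is assumed to return either a success or
an <Error> XML document):

    import threading

    def complete_upload_body(do_complete):
        # Yield the body of an already-started 200 response: whitespace
        # while the real completion work runs, then the true result.
        result = {}
        worker = threading.Thread(
            target=lambda: result.update(body=do_complete()))
        worker.start()
        yield b'<?xml version="1.0" encoding="UTF-8"?>\n'
        while worker.is_alive():
            worker.join(timeout=10)
            if worker.is_alive():
                yield b' '  # keeps the connection from timing out
        yield result['body']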
Change-Id: Iaf420983c41256ee9a4c43cfd74025d2ca069ae6
Closes-Bug: 1718811
Related-Change: I65cee5f629c87364e188aa05a06d563c3849c8f3
Multipart uploads in AWS (seem to) have ETags like:
'"' + MD5_hex(MD5(part1) + ... + MD5(partN)) + '-' + N + '"'
On the other hand, Swift SLOs have Etags like:
MD5_hex(MD5_hex(part1) + ... + MD5_hex(partN))
(In both examples, MD5 yields the raw 16-byte digest while MD5_hex
yields the 32-character hex-encoded digest.)
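In Python terms, the two schemes look roughly like this (a sketch over
in-memory parts; real parts are of course streamed):

    import hashlib

    def aws_mpu_etag(parts):
        # hex MD5 of the concatenated raw part digests, plus '-<N>'
        raw = b''.join(hashlib.md5(p).digest() for p in parts)
        return '"%s-%d"' % (hashlib.md5(raw).hexdigest(), len(parts))

    def slo_etag(parts):
        # hex MD5 of the concatenated *hex* part digests, no suffix
        hexes = ''.join(hashlib.md5(p).hexdigest() for p in parts)
        return hashlib.md5(hexes.encode('ascii')).hexdigest()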
Some clients (such as aws-sdk-java) use the presence of a dash
to decide whether to perform client-side validation of downloads.
Other clients (like s3cmd) use the presence of a dash *in bucket
listings* to decide whether or not to perform additional HEAD requests
to look for MD5 metadata that can be used to compare against the MD5s
of local files.
Now we include a dash as well, to prevent spurious errors like
> Unable to verify integrity of data download. Client calculated
> content hash didn't match hash calculated by Amazon S3. The data
> may be corrupt.
or unnecessary uploads/downloads because the client assumes data has
changed that hasn't.
For new multipart-uploads via the S3 API, the ETag that is stored will
be calculated in the same way that AWS uses. This ETag will be used in
GET/HEAD responses, bucket listings, and conditional requests via the S3
API. Accessing the same object via the Swift API will use the SLO Etag;
however, in JSON container listings the multipart upload etag will be
exposed in a new "s3_etag" key.
New SLOs and pre-existing multipart-uploads will continue to behave as
before; there is no data migration or mitigation as part of this patch.
Change-Id: Ibe68c44bef6c17605863e9084503e8f5dc577fab
Closes-Bug: 1522578
This attempts to import the openstack/swift3 package into the swift
upstream repository and namespace. This is mostly a straightforward
port, except for the following items:
1. Rename the swift3 namespace to swift.common.middleware.s3api
1.1 Also rename some conflicting class names (e.g. Request/Response)
2. Port unittests to the test/unit/s3api dir so they can run on the gate
3. Port functests to test/functional/s3api and set up in-process testing
4. Port docs to the doc dir, then address the namespace change
5. Use get_logger() instead of a global logger instance
6. Avoid a global conf instance
Additionally, fix various minor issues in those steps (e.g. packages,
dependencies, deprecated things).
The details and patch references in the work on feature/s3api are listed
at https://trello.com/b/ZloaZ23t/s3api (completed board)
Note that, because this is just a port, no new features have been
developed since the last swift3 release. In future work, Swift upstream
may continue working on the remaining items for further improvements
and the best possible compatibility with Amazon S3. Please read the new
docs for your deployment and keep track of what may change in future
releases.
Change-Id: Ib803ea89cfee9a53c429606149159dd136c036fd
Co-Authored-By: Thiago da Silva <thiago@redhat.com>
Co-Authored-By: Tim Burke <tim.burke@gmail.com>