86 Commits

Author SHA1 Message Date
Zuul
25e23988b3 Merge "Treat 404 as success when deleting segments" 2018-07-12 01:46:49 +00:00
Kota Tsuyuzaki
e65070964c Add force auth retry mode in swiftclient
This patch attempts to add an option that forces a get_auth call when retrying
an operation, even if the errors encountered are something other than 401 Unauthorized.

Why we need this:
The main reason we need this is that current python-swiftclient requests can
never succeed in certain situations where third-party proxies/load balancers
sit between the client and the swift-proxy server. I think this is a fairly
common deployment.

Taking nginx as a concrete case: nginx may close the socket to the client
when the response code from swift is not in the 2xx series. By default, nginx
waits a while (30s by default)[1] for the client to drain its buffers, but once
that time has passed it closes the socket immediately. Unfortunately, if
python-swiftclient is still writing data into the socket, it gets a socket error
(EPIPE, BrokenPipe). From the swiftclient perspective this is not an auth error
at all, so the current python-swiftclient keeps retrying without re-authenticating.
However, if the root cause is effectively a 401 (i.e. nginx got 401 Unauthorized from the
swift-proxy because the token expired), swiftclient will loop 401 -> EPIPE -> 401...
until it exhausts the maximum number of retries.

In particular, with a short token time-to-live, a multipart object upload with large
segments can never succeed, as below:

Connection Model:

python-swiftclient -> nginx -> swift-proxy -> swift-backend

Case: Try to create an SLO with large segments while the auth token expires after 1 hour

1. The client creates a connection to nginx, gets a successful response from the
   swift-proxy, and authenticates.
2. The client continues putting large segment objects
   (e.g. 1~5GB each, 20~30GB in total, i.e. 20~30 segments).
3. After some of the segments have been uploaded, 1 hour passes, but the client is
   still trying to send the remaining segment objects.
4. nginx gets a 401 from the swift-proxy for a request and waits for the client to
   close the connection, but the timeout passes because python-swiftclient is still
   pushing a large amount of data into the socket before it reads the 401 response.
5. The client gets a socket error because nginx closed the connection while the
   buffer was still being sent.
6. The client retries on a new connection to nginx without re-authenticating...

<loop 4-6>

7. Finally, python-swiftclient fails with a socket error (Broken Pipe).

From an operational perspective, setting a longer lingering-close timeout would be an
option, but it is not a complete solution because other proxies/LBs may not support
such an option.

If we wanted to actually do THE RIGHT THING in python-swiftclient, we would send an
Expect: 100-continue header and handle the first response so that we re-auth correctly.

HOWEVER, the httplib and requests modules currently used by python-swiftclient do not
support the Expect: 100-continue header [2], and the thread proposing a fix [3] is not
very active. We also know that the reason we depend on that library is to fix a security
issue that existed in older python-swiftclient [4], so we should touch that area very carefully.

Realistically, as a hot fix, this patch tries to mitigate the unfortunate situation
described above WITHOUT the 100-continue fix: users can simply force a re-auth whenever
an error occurs during the retries, which should be acceptable upstream.

1: http://nginx.org/en/docs/http/ngx_http_core_module.html#lingering_close
2: https://github.com/requests/requests/issues/713
3: https://bugs.python.org/issue1346874
4: https://review.openstack.org/#/c/69187/
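
A minimal sketch of how the new mode might be enabled through the library API
(assuming the option is exposed as a force_auth_retry keyword on Connection;
the exact name is not confirmed in this message):

    from swiftclient.client import Connection

    conn = Connection(
        authurl='http://swift.example.com/auth/v1.0',  # hypothetical endpoint
        user='test:tester', key='testing',
        retries=5,
        force_auth_retry=True,  # assumed kwarg: re-run get_auth on every retried error
    )
    # Even if a segment PUT dies with EPIPE behind nginx, the retry fetches a
    # fresh token first instead of reusing the expired one.
    with open('segment-01', 'rb') as f:
        conn.put_object('container', 'large/segment-01', contents=f)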

Change-Id: I3470b56e3f9cf9cdb8c2fc2a94b2c551927a3440
2018-03-13 12:29:48 +09:00
Tim Burke
2901e1e9ef Treat 404 as success when deleting segments
Change-Id: I76be70ddb289bd4f1054a684a247279ab16ca34a
2018-01-26 14:14:18 -08:00
Timur Alperovich
2faea93287 Allow for object uploads > 5GB from stdin.
When uploading from standard input, swiftclient should turn the upload
into an SLO in the case of large objects. This patch picks the
threshold as 10MB (and uses that as the default segment size). Consumers
can also supply the --segment-size option to alter that threshold and the
SLO segment size. The patch does buffer one segment in memory (which is why
the 10MB default was chosen).

(test is updated)
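
A rough sketch of the segmentation idea described above (buffer one segment at a
time, switch to SLO behaviour once the input exceeds the threshold); illustrative
only, not the patch's actual code:

    import sys

    SEGMENT_SIZE = 10 * 1024 * 1024  # 10MB default threshold / segment size

    def read_segments(stream=sys.stdin.buffer, segment_size=SEGMENT_SIZE):
        """Yield at-most-segment_size chunks, buffering one segment at a time."""
        while True:
            chunk = stream.read(segment_size)
            if not chunk:
                break
            yield chunk

    # If only one chunk comes back, upload it as a plain object; otherwise
    # upload each chunk as a segment and finish with an SLO manifest.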

Change-Id: Ib13e0b687bc85930c29fe9f151cf96bc53b2e594
2018-01-18 04:56:12 +00:00
Zuul
cde257de5c Merge "Allow --meta on upload" 2017-12-08 19:51:05 +00:00
Timur Alperovich
0982791db2 Allow for uploads from standard input.
If "-" is passed in for the source, python-swiftclient will upload
the object by reading the contents of the standard input. The object
name option must be set, as well, and this cannot be used in
conjunction with other files.

This approach stores the entire contents as one object. A follow-on
patch will change this behavior to upload from standard input as an SLO,
unless the segment size is larger than the content size.
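
For reference, the whole-object behaviour is roughly equivalent to handing the
standard-input stream to put_object (illustrative only; names and URLs are made up):

    import sys
    from swiftclient.client import Connection

    conn = Connection(preauthurl='http://swift.example.com/v1/AUTH_test',  # hypothetical
                      preauthtoken='token')
    # Streams stdin into a single object; the CLI's --object-name plays the
    # same role as 'my-object' here.
    conn.put_object('my-container', 'my-object', contents=sys.stdin.buffer)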

Change-Id: I1a8be6377de06f702e0f336a5a593408ed49be02
2017-07-26 17:04:19 -07:00
Tim Burke
638d7c789c Buffer reads from disk
Otherwise, Python defaults to 8k reads which seems kinda terrible.
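
As a minimal illustration (not the patch itself), reading in larger chunks cuts
the number of read() calls dramatically for multi-GB uploads:

    CHUNK = 65536  # 64k instead of the 8k default
    with open('large.dat', 'rb') as f:
        for chunk in iter(lambda: f.read(CHUNK), b''):
            pass  # hand each chunk to the upload machinery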

Change-Id: I3160626e947083af487fd1c3cb0aa6a62646527b
Closes-Bug: #1671621
2017-07-11 17:04:49 -07:00
Tim Burke
484d7ee9b2 Allow --meta on upload
Previously, the --meta option was only allowed on post or copy subcommands.
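
For context, --meta Key:Value on upload maps to an X-Object-Meta-Key header; a
library-level equivalent might look like this (illustrative names and URLs):

    from swiftclient.client import Connection

    conn = Connection(preauthurl='http://swift.example.com/v1/AUTH_test',  # hypothetical
                      preauthtoken='token')
    conn.put_object('my-container', 'photo.jpg', contents=b'...',
                    headers={'X-Object-Meta-Color': 'blue'})  # like --meta Color:blue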

Change-Id: I87bf0338c34b5e89aa946505bee68dbeb37d784c
Closes-Bug: #1616238
2017-07-06 12:43:11 -07:00
Christopher Bartz
cde73c196d Option to ignore mtime metadata entry.
Currently, the swiftclient upload command passes a custom metadata
header for each object (called object-meta-mtime), whose value is
the current UNIX timestamp. When downloading such an object with the
swiftclient, the mtime header is parsed and passed as the atime and
mtime for the newly created file.

There are use-cases where this is not desired, for example when using
tmp or scratch directories in which files older than a specific date
are deleted. This commit provides a boolean option for ignoring the
mtime header.
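
A sketch of the download-side decision this introduces (the option name below is
an assumption, not taken from the patch):

    import os

    def apply_mtime(path, headers, ignore_mtime=False):
        """Set atime/mtime from the object-meta-mtime header unless disabled."""
        mtime = headers.get('x-object-meta-mtime')
        if mtime is None or ignore_mtime:
            return  # leave the file's timestamps alone
        t = float(mtime)
        os.utime(path, (t, t))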

Change-Id: If60b389aa910c6f1969b999b5d3b6d0940375686
2017-07-06 10:19:12 -07:00
Jenkins
1d57403668 Merge "Skip checksum validation on partial downloads" 2017-06-22 01:13:26 +00:00
Jenkins
bc3171cfac Merge "Tolerate RFC-compliant ETags" 2017-06-22 01:13:04 +00:00
Jenkins
4515002d78 Merge "Stop sending X-Static-Large-Object headers" 2017-06-14 23:35:32 +00:00
Tim Burke
527f2ff687 Skip checksum validation on partial downloads
If we get back some partial content, we can't validate the MD5.
That's OK.
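
Conceptually (not the patch's exact code), the check becomes:

    def should_validate_md5(status, headers):
        # A 206 Partial Content response carries only part of the object, so
        # its body can never match the full-object ETag; skip validation.
        return status != 206 and 'content-range' not in headers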

Change-Id: Ic1d65272190af0d3d982f3cd06833cac5c791a1e
Closes-Bug: 1642021
2017-04-21 10:30:01 -07:00
Tim Burke
64da481ccd Tolerate RFC-compliant ETags
Since time immemorial, Swift has returned unquoted ETags for plain-old
Swift objects -- I hear tell that we once tried to change this, but
quickly backed it out when some clients broke.

However, some proxies (such as nginx) apparently may force the ETag to
adhere to the RFC, which states [1]:

    An entity-tag consists of an opaque *quoted* string

(emphasis mine). See the related bug for an instance of this happening.

Since we can still get the original ETag easily, we should tolerate the
more-compliant format.

[1] https://tools.ietf.org/html/rfc2616.html#section-3.11 or, if you
    prefer the new ones, https://tools.ietf.org/html/rfc7232#section-2.3
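
A sketch of the tolerant comparison (illustrative only):

    def etags_match(received_etag, body_md5):
        # Accept both Swift's historical unquoted form and the RFC-style
        # quoted form, e.g. d41d8cd98f00b204e9800998ecf8427e vs "d41d8...".
        return received_etag.strip('"') == body_md5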

Change-Id: I7cfacab3f250a9443af4b67111ef8088d37d9171
Closes-Bug: 1681529
Related-Bug: 1678976
2017-04-21 10:28:33 -07:00
John Dickinson
0cc4d8af18 respect bulk delete page size and fix logic error
Previously, using SwiftService to delete "many" objects would use
bulk delete if available, but it would not respect the bulk delete
page size. If the number of objects to delete exceeded the bulk delete
page size, SwiftService would ignore the error and nothing would be
deleted.

This patch changes _should_bulk_delete() to be _bulk_delete_page_size();
instead of returning a simple True/False, it returns the page size for
the bulk deleter, or 1 if objects should be deleted one at a time.
Delete SDK calls are then spread across multiple bulk DELETEs if the
requested number of objects to delete exceeds the returned page size.

Fixed the logic in _should_bulk_delete() so that if the object list
is exactly 2x the thread count, it will not bulk delete. This is the
natural conclusion following the logic that existed previously: if
the delete request can be satisfied by every worker thread doing one
or two tasks, don't bulk delete. But if it requires a worker thread
to do three or more tasks, do a bulk delete instead. Previously, the
logic would mean that if every worker thread did exactly two tasks, it
would bulk delete. This patch changes a "<" to a "<=".
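
The decision described above, as a small sketch (names and thresholds follow this
message, not the actual source):

    def bulk_delete_page_size(objects, thread_count, bulk_page_size):
        # One or two deletes per worker thread: plain per-object DELETEs.
        if len(objects) <= 2 * thread_count:
            return 1
        # Otherwise spread the work across bulk DELETE requests of at most
        # bulk_page_size objects each.
        return bulk_page_size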

Closes-Bug: 1679851
Change-Id: I3c18f89bac1170dc62187114ef06dbe721afcc2e
2017-04-20 09:41:53 -07:00
Tim Burke
aaaed55cd4 Stop sending X-Static-Large-Object headers
If we were to include this in a normal PUT, it would 400, but only if
slo is actually in the pipeline. If it's *not*, we'll create a normal
Swift object and the header sticks.

- This is really confusing for users; see the related bug.
- If slo is later enabled in the cluster, Swift starts responding 500
  with a KeyError because the client and on-disk formats don't match!

Change-Id: I1d80c76af02f2ca847123349224ddc36d2a6996b
Related-Change: I986c1656658f874172860469624118cc63bff9bc
Related-Bug: #1680083
2017-04-10 15:40:35 -07:00
liuyamin
2710ff255b Fix some reST field lists in docstrings
Probably the most common format for documenting arguments is
reST field lists [1]. This change updates some docstrings to
comply with the field lists syntax.

[1] http://sphinx-doc.org/domains.html#info-field-lists

Change-Id: Ic011fd3e3a8c5bafa24a3438a6ed5bb126b50e95
2017-03-29 09:28:46 +08:00
Kazufumi Noto
809e4cf98f Close file handle after upload job
The file opened for upload is not closed.
This fix prevents a possible file handle leak.

Closes-Bug: #1559079
Change-Id: Ibc58667789e8f54c74ae2bbd32717a45f7b30550
2017-03-16 18:03:13 +00:00
Tim Burke
a1e2bcde4a Accept more types of input for headers/meta
Previously, we only accepted iterables of strings like 'Header: Value'.
Now, we'll also accept lists of tuples like ('Header', 'Value') as well
as dictionaries like {'Header': 'Value'}.

This should be more intuitive for application developers, who are
already used to being able to pass dicts or lists of tuples to libraries
like requests.
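
For illustration, the three input shapes described above:

    # Three equivalent ways to express the same header after this change
    # (illustrative values; the accepting entry points are those this patch touches):
    hdrs_as_strings = ['X-Object-Meta-Color: blue']      # legacy 'Header: Value'
    hdrs_as_tuples  = [('X-Object-Meta-Color', 'blue')]  # list of tuples
    hdrs_as_dict    = {'X-Object-Meta-Color': 'blue'}    # plain dict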

Change-Id: I93ed2f1e8305f0168b7a4bd90c205b04730da836
2016-11-18 11:47:14 -08:00
Charles Hsu
6cf2bd6626 Add additional headers for HEAD/GET/DELETE requests.
Change-Id: I69276ba711057c122f97deac412e492e313c34dd
Closes-Bug: 1615830
2016-11-07 13:18:29 +08:00
Clay Gerrard
96f0031e64 boolean logic cleanup in service.Swift[Copy|Post]Object
Change-Id: I07e74536502ec2479b22a825f684703d65924563
2016-08-25 13:15:44 -07:00
Jenkins
5acefd27e4 Merge "Strip leading/trailing whitespace from headers" 2016-08-25 04:24:16 +00:00
Jenkins
b57044a853 Merge "Add copy object method" 2016-08-24 23:59:34 +00:00
Marek Kaleta
4a2465fb12 Add copy object method
Implement copy object method in swiftclient Connection, Service and CLI.

Although COPY functionality can be accomplished with the 'X-Copy-From'
header in a PUT request, using COPY is more convenient, especially when
updating object metadata non-destructively.
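
A small usage sketch of the Connection-level method (argument names as I
understand them; treat as illustrative):

    from swiftclient.client import Connection

    conn = Connection(preauthurl='http://swift.example.com/v1/AUTH_test',  # hypothetical
                      preauthtoken='token')
    # Copy onto itself with extra headers: a non-destructive metadata update.
    conn.copy_object('my-container', 'report.csv',
                     destination='/my-container/report.csv',
                     headers={'X-Object-Meta-Reviewed': 'yes'})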

Closes-Bug: 1474939
Change-Id: I1338ac411f418f4adb3d06753d044a484a7f32a4
2016-08-23 14:37:11 -07:00
Tim Burke
8fbe118cea Strip leading/trailing whitespace from headers
New versions of requests will raise an InvalidHeader error otherwise.

Change-Id: Idf3bcd8ac359bdda9a847bf99a78988943374134
Closes-Bug: #1614280
Closes-Bug: #1613814
2016-08-19 19:53:11 +00:00
Cheng Li
69bf4634b9 Add an option: disable etag check on downloads
This patch adds an option to disable the etag check on downloads.

Change-Id: I9ad389dd691942dea6db470ca3f0543eb6e9703e
Closes-bug: #1581147
2016-06-02 22:53:18 +08:00
Jenkins
f9d0657e70 Merge "Support client certificate/key" 2016-05-19 22:20:17 +00:00
Jenkins
8ffc5c11ae Merge "Use application/directory content-type for dir markers" 2016-05-19 08:53:50 +00:00
Tim Burke
67f629cdeb Default to v3 auth if we find a (user|project)-domain-(name|id) option
Change-Id: I4616492752b620de0bf90672142f1071ec9bac83
2016-05-03 14:41:07 -07:00
Tim Burke
c3766319b9 Pull option processing out to service.py
...because it seems silly that we do nearly the same thing in two
different places

Change-Id: Iafafe3c553d00652adb91ceefd0de1479cbcb5da
2016-05-03 14:18:34 -07:00
Jenkins
6b065dd4a6 Merge "Identify segments uploaded via swiftclient" 2016-05-02 10:35:20 +00:00
Cedric Brandily
450f505c35 Support client certificate/key
This change makes it possible to specify a client certificate/key with:
 * usual CLI options (--os-cert/--os-key)
 * usual environment variables ($OS_CERT/$OS_KEY)
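
At the library level this might look like the following (the keyword names are an
assumption here; the message only documents the CLI options and environment variables):

    from swiftclient.client import Connection

    conn = Connection(authurl='https://keystone.example.com/v3',  # hypothetical
                      user='demo', key='secret', auth_version='3',
                      cert='/etc/ssl/client.pem',      # assumed kwarg
                      cert_key='/etc/ssl/client.key')  # assumed kwarg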

Closes-Bug: #1565112
Change-Id: I12e151adcb6084d801c6dfed21d82232a3259aea
2016-04-10 23:20:49 +02:00
Sergey Gotliv
909bdf8954 Fix downloading from "marker" item
The documentation of "swift download" hints that the "marker" option
is supported, but in reality we forgot to plumb it through, so
all downloads were always done with the default, empty marker.

Closes-Bug: #1565393
Change-Id: I38bd29d2baa9188b61397dec75ce1d864041653c
2016-04-08 20:55:18 -06:00
Tim Burke
9fd537a082 Use application/directory content-type for dir markers
Previously, we were using a content-type of text/directory, but that is
already defined in RFC 2425 and doesn't reflect our usage:

   The text/directory Content-Type is defined for holding a variety
   of directory information, for example, name, or email address,
   or logo.

(From there it goes on to describe a superset of the vCard format
defined in RFC 2426.)

application/directory, on the other hand, is used by Static Web [1] and
is used by cloudfuse [2]. Seems like as sane a choice as any to
standardize on.

[1] https://github.com/openstack/swift/blob/2.5.0/swift/common/middleware/staticweb.py#L71-L75
[2] https://github.com/redbo/cloudfuse/blob/1.0/README#L105-L106

Change-Id: I19e30484270886292d83f50e7ee997b6e1623ec7
2016-04-08 16:57:51 -07:00
Tim Burke
4a6fa02c28 Identify segments uploaded via swiftclient
...using a new "application/swiftclient-segment" content-type.

Segments uploaded by swiftclient are expected to have a many-to-one
relationship to large objects, rather than the more-general many-to-many
relationship that SLO and DLO generally allow. Later, we may use this
information to make more intelligent decisions, such as when to
automatically clean up segments.

Change-Id: Ie56a3aa10065db754ac572cc37d93f2c901aac60
2016-04-08 13:41:29 -07:00
Marek Kaleta
51a8a5a7ae Fix SwiftPostObject options usage in SwiftService
SwiftService().post(cont, [SwiftPostObject(obj, options)]) currently
ignores options['header'], raises an exception when options['headers']
is set, and produces malformed metadata when options['meta'] is set.

Fix typos in code; add a unit test for SwiftService().post.
Closes-Bug: #1560052

Change-Id: Ie460f753492e9b73836b4adfc7c9c0f2130a8a91
2016-03-23 15:38:10 +01:00
Hu Bing
b7d20b8a18 download method shouldn't download all objects
In python-swiftclient/swiftclient/service.py there is a method:

def download(self, container=None, objects=None, options=None):

If container is specified but objects is not, it downloads all
objects in the specified container.
If both container and objects are specified, it downloads the
specified objects in the container.

But when the objects argument is specified and turns out to be an
empty list [ ], the download method downloads all the objects in
the specified container. That is not reasonable.

For example, the objects list may have been non-empty when it came
from the command line, but was then filtered, perhaps by the --prefix
argument, and ended up as an empty list.

When the download method is called with an empty objects list, we
should download nothing instead of all the objects in the specified
container.
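
In other words, after this change the following should be a no-op rather than a
full-container download (illustrative):

    from swiftclient.service import SwiftService

    with SwiftService() as swift:
        # An explicitly empty object list means "nothing matched the filters",
        # so no downloads should be issued for the container.
        results = list(swift.download(container='my-container', objects=[]))
        assert results == []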

Change-Id: I81aab935533a50b40679c8b3575f298c285233a8
Closes-bug: #1549881
2016-03-01 03:10:34 +08:00
Jenkins
50f3396620 Merge "Fix segmented upload to pseudo-dir via <container>" 2016-02-10 19:55:30 +00:00
James Nzomo
5ed02345d3 Fix segmented upload to pseudo-dir via <container>
This fix ensures creation and use of the correct default segment
container when pseudo-folder paths are passed via <container> arg.

Change-Id: I90356b041dc9dfbd55eb341271975621759476b9
Closes-Bug: 1532981
Related-Bug: 1478210
2016-02-04 05:02:17 +03:00
Tim Burke
337570a03a Don't trust X-Object-Meta-Mtime
Still use it if we can, but stop throwing ValueErrors if it's garbage.

Change-Id: I2cf25b535ad62cfacb7561954a92a4a73d91000a
2016-01-29 16:50:15 -08:00
Jenkins
6d52264c30 Merge "Error with uploading large object includes unicode path" 2016-01-28 00:22:35 +00:00
Jude Job
47f673ed9f Error when uploading a large object with a unicode path
This patch includes a test case to test a unicode path.

Change-Id: I7697679f0034ce65b068791d7d5145286d992bd1
Closes-Bug: #1532096
2016-01-18 12:56:27 +05:30
Tim Burke
7a1e192803 Use bulk-delete middleware when available
When issuing `delete` commands that would require three or more
individual deletes, check whether the cluster supports bulk deletes
and use that if it's available.

Additionally, a new option is added to the `delete` command:

  * --prefix <prefix>

    Delete all objects that start with <prefix>. This is similar to the
    --prefix option for the `list` command.

Example:

$ swift delete c --prefix obj_prefix/

    ...will delete from container "c" all objects whose name begins with
    "obj_prefix/", such as "obj_prefix/foo" and "obj_prefix/bar".

Change-Id: I6b9504848d6ef562cf4f570bbcd17db4e3da8264
2016-01-12 15:40:57 -08:00
James Nzomo
2c6f367035 Fix upload to pseudo-dir passed by <container> arg
This fix makes it possible to upload objects to pseudo-folders by
passing the upload paths via the <container> arg, regardless of whether the
container or folder path exists.

Change-Id: I575e58aa12adcf71cdaa70d025a0ea5c63f46903
Closes-Bug: #1478210
Partial-Bug: #1432734
Related-Bug: #1432734
2016-01-04 22:48:07 +03:00
Joel Wright
a3a78be87b New API documentation for python-swiftclient
New documentation for python-swiftclient that introduces
the APIs available and gives some opinionated advice about
when to use the shell, the client API and the service API.

Change-Id: I19020f041fab2e72469979f712ffe3951c431d24
2015-11-25 15:15:09 +00:00
Jenkins
50978ddf63 Merge "Centralize header parsing" 2015-11-12 02:53:20 +00:00
Zack M. Davis
6b3638ecec enable autodocumentation for utils module; docstring fixes
This commit adds the utils module to those for which Sphinx
automatically generates documentation from docstrings. (Many of the
functions here may be of little interest to users, but
`generate_temp_url`, at least, definitely deserves to be in the
documentation; in this way, this commit can be seen as a spiritual
companion to ca70dd9e.)

Also, a few markup errors and perceived infelicities in existing
docstrings are amended.

Change-Id: I8c66a23cb359d7dd9302a16459fad9825fedb690
2015-09-10 15:19:26 -07:00
Tim Burke
ce569f4651 Centralize header parsing
All response headers are now exposed as unicode objects. Any
url-encoding is interpreted as UTF-8; if that causes decoding to fail,
the url-encoded form is returned.

As a result, deleting DLOs with unicode characters will no longer raise
UnicodeEncodeErrors under Python 2.
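
A sketch of the decoding rule described (illustrative; Python 2-era details simplified):

    from urllib.parse import unquote

    def parse_header_value(raw):
        """Return a unicode header value; fall back to the url-encoded form."""
        try:
            return unquote(raw, errors='strict')  # interpret %-escapes as UTF-8
        except UnicodeDecodeError:
            return raw  # keep the url-encoded form if it is not valid UTF-8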

Related-Bug: #1431866
Change-Id: Idb111c5bf3ac1f5ccfa724b3f4ede8f37d5bfac4
2015-09-03 13:46:03 -07:00
Joel Wright
3c0289844f Log and report trace on service operation fails
This patch adds exception logging to the swift service API. Any
operation that results in a failure will now log the exception and
report a timestamp and full stack trace in the results returned by
the service API calls.

Change-Id: I7336b28354e7740ea7d048bdf355e3c1a1b4436c
2015-08-31 22:03:26 +01:00
Joel Wright
a8c4df98ee Reduce memory usage for download/delete and add --no-shuffle option to st_download
The current code builds a full object listing before performing either a multi-object
download or delete operation (and also shuffles this complete list in the case of
a download). This patch removes the creation of the full object list and adds the
ability to turn off shuffling of files when downloading. Also added is a limit on
the number of list results that can be queued by a single call to service.list
without consuming results (this reduces memory overhead for large listings).

Some tests added for service.py download and list.

Change-Id: Ie737cbb7f8b1fa8a79bbb88914730b05aa7f2906
2015-07-20 20:44:51 +01:00