Change-Id: I3b3c0e7e4d5c8a1934378e2083a64bb686754f18
31 KiB
The swiftclient.SwiftService API
A higher-level API aimed at allowing developers an easy way to
perform multiple operations asynchronously using a configurable thread
pool. Documentation for each service method call can be found here:
swiftclient.service
.
Authentication
This section covers the various options for authenticating with a swift object store. The combinations of options required for each authentication version are detailed below. Once again, these are just a subset of those that can be used to successfully authenticate, but they are the most common and recommended.
The relevant authentication options are presented as python
dictionaries that should be added to any other options you are supplying
to your SwiftService
instance. As indicated in the python
code, you can also set these options as environment variables that will
be loaded automatically if the relevant option is not specified.
The SwiftService
authentication attempts to
automatically select the auth version based on the combination of
options specified, but supplying options from multiple different auth
versions can cause unexpected behaviour.
Note
Leftover environment variables are a common source of confusion when authorization fails.
Keystone V3
{
..."auth_version": environ.get('ST_AUTH_VERSION'), # Should be '3'
"os_username": environ.get('OS_USERNAME'),
"os_password": environ.get('OS_PASSWORD'),
"os_project_name": environ.get('OS_PROJECT_NAME'),
"os_project_domain_name": environ.get('OS_PROJECT_DOMAIN_NAME'),
"os_auth_url": environ.get('OS_AUTH_URL'),
... }
{
..."auth_version": environ.get('ST_AUTH_VERSION'), # Should be '3'
"os_username": environ.get('OS_USERNAME'),
"os_password": environ.get('OS_PASSWORD'),
"os_project_id": environ.get('OS_PROJECT_ID'),
"os_project_domain_id": environ.get('OS_PROJECT_DOMAIN_ID'),
"os_auth_url": environ.get('OS_AUTH_URL'),
... }
Keystone V2
{
..."auth_version": environ.get('ST_AUTH_VERSION'), # Should be '2.0'
"os_username": environ.get('OS_USERNAME'),
"os_password": environ.get('OS_PASSWORD'),
"os_tenant_name": environ.get('OS_TENANT_NAME'),
"os_auth_url": environ.get('OS_AUTH_URL'),
... }
Legacy Auth
{
..."auth_version": environ.get('ST_AUTH_VERSION'), # Should be '1.0'
"auth": environ.get('ST_AUTH'),
"user": environ.get('ST_USER'),
"key": environ.get('ST_KEY'),
... }
Configuration
When you create an instance of a SwiftService
, you can
override a collection of default options to suit your use case.
Typically, the defaults are sensible to get us started, but depending on
your needs you might want to tweak them to improve performance (options
affecting large objects and thread counts can significantly alter
performance in the right situation).
Service level defaults and some extra options can also be overridden on a per-operation (or even in some cases per-object) basis, and you will call out which options affect which operations later in the document.
The configuration of the service API is performed using an options
dictionary passed to the SwiftService
during
initialisation. The options available in this dictionary are described
below, along with their defaults:
Options
retries
:5
The number of times that the library should attempt to retry HTTP actions before giving up and reporting a failure.
container_threads
:10
object_dd_threads
:10
object_uu_threads
:10
segment_threads
:10
The above options determine the size of the available thread pools for performing swift operations. Container operations (such as listing a container) operate in the container threads, and a similar pattern applies to object and segment threads.
Note
Object threads are separated into two separate thread pools:
uu
anddd
. This stands for "upload/update" and "download/delete", and the corresponding actions will be run on separate threads pools.segment_size
:None
If specified, this option enables uploading of large objects. Should the object being uploaded be larger than 5G in size, this option is mandatory otherwise the upload will fail. This option should be specified as a size in bytes.
use_slo
:False
Used in combination with the above option,
use_slo
will upload large objects as static rather than dynamic. Only static large objects provide error checking for the downloaded object, so we recommend this option.segment_container
:None
Allows the user to select the container into which large object segments will be uploaded. We do not recommend changing this value as it could make locating orphaned segments more difficult in the case of errors.
leave_segments
:False
Setting this option to true means that when deleting or overwriting a large object, its segments will be left in the object store and must be cleaned up manually. This option can be useful when sharing large object segments between multiple objects in more advanced scenarios, but must be treated with care, as it could lead to ever increasing storage usage.
changed
:None
This option affects uploads and simply means that those objects which already exist in the object store will not be overwritten if the
mtime
and size of the source is the same as the existing object.skip_identical
:False
A slightly more thorough case of the above, but rather than
mtime
and size uses an object'sMD5 sum
.yes_all
:False
This options affects only download and delete, and in each case must be specified in order to download/delete the entire contents of an account. This option has no effect on any other calls.
no_download
:False
This option only affects download and means that all operations proceed as normal with the exception that no data is written to disk.
header
:[]
Used with upload and post operations to set headers on objects. Headers are specified as colon separated strings, e.g. "content-type:text/plain".
meta
:[]
Used to set metadata on an object similarly to headers.
Note
Setting metadata is a destructive operation, so when updating one of many metadata values all desired metadata for an object must be re-applied.
long
:False
Affects only list operations, and results in more metrics being made available in the results at the expense of lower performance.
fail_fast
:False
Applies to delete and upload operations, and attempts to abort queued tasks in the event of errors.
prefix
:None
Affects list operations; only objects with the given prefix will be returned/affected. It is not advisable to set at the service level, as those operations that call list to discover objects on which they should operate will also be affected.
delimiter
:None
Affects list operations, and means that listings only contain results up to the first instance of the delimiter in the object name. This is useful for working with objects containing '/' in their names to simulate folder structures.
dir_marker
:False
Affects uploads, and allows empty 'pseudofolder' objects to be created when the source of an upload is
None
.checksum
:True
Affects uploads and downloads. If set check md5 sum for the transfer.
shuffle
:False
When downloading objects, the default behaviour of the CLI is to shuffle lists of objects in order to spread the load on storage drives when multiple clients are downloading the same files to multiple locations (e.g. in the event of distributing an update). When using the
SwiftService
directly, object downloads are scheduled in the same order as they appear in the container listing. When combined with a single download thread this means that objects are downloaded in lexically-sorted order. Setting this option toTrue
gives the same shuffling behaviour as the CLI.destination
:None
When copying objects, this specifies the destination where the object will be copied to. The default of None means copy will be the same as source.
fresh_metadata
:None
When copying objects, this specifies that the object metadata on the source will not be applied to the destination object - the destination object will have a new fresh set of metadata that includes only the metadata specified in the meta option if any at all.
Other available options can be found in
swiftclient/service.py
in the source code for
python-swiftclient
. Each SwiftService
method
also allows for an optional dictionary to override those specified at
init time, and the appropriate docstrings show which options modify each
method's behaviour.
Available Operations
Each operation provided by the service API may raise a
SwiftError
or ClientException
for any call
that fails completely (or a call which performs only one operation at an
account or container level). In the case of a successful call an
operation returns one of the following:
- A dictionary detailing the results of a single operation.
- An iterator that produces result dictionaries (for calls that perform multiple sub-operations).
A result dictionary can indicate either the success or failure of an
individual operation (detailed in the success
key), and
will either contain the successful result, or an error
key
detailing the error encountered (usually an instance of Exception).
An example result dictionary is given below:
= {
result 'action': 'download_object',
'success': True,
'container': container,
'object': obj,
'path': path,
'start_time': start_time,
'finish_time': finish_time,
'headers_receipt': headers_receipt,
'auth_end_time': conn.auth_end_time,
'read_length': bytes_read,
'attempts': conn.attempts
}
All the possible action
values are detailed below:
['stat_account',
'stat_container',
'stat_object',
'post_account',
'post_container',
'post_object',
'list_part', # list yields zero or more 'list_part' results
'download_object',
'create_container', # from upload
'create_dir_marker', # from upload
'upload_object',
'upload_segment',
'delete_container',
'delete_object',
'delete_segment', # from delete_object operations
'capabilities',
]
Stat
Stat can be called against an account, a container, or a list of objects to get account stats, container stats or information about the given objects. In the first two cases a dictionary is returned containing the results of the operation, and in the case of a list of object names being supplied, an iterator over the results generated for each object is returned.
Information returned includes the amount of data used by the given object/container/account and any headers or metadata set (this includes user set data as well as content-type and modification times).
See swiftclient.service.SwiftService.stat
for docs
generated from the method docstring.
Valid calls for this method are as follows:
stat([options])
: Returns stats for the configured account.stat(<container>, [options])
: Returns stats for the given container.stat(<container>, <object_list>, [options])
: Returns stats for each of the given objects in the given container (through the returned iterator).
Results from stat are dictionaries indicating the success or failure of each operation. In the case of a successful stat against an account or container, the method returns immediately with one of the following results:
{'action': 'stat_account',
'success': True,
'items': items,
'headers': headers
}
{'action': 'stat_container',
'container': <container>,
'success': True,
'items': items,
'headers': headers
}
In the case of stat called against a list of objects, the method returns a generator that returns the results of individual object stat operations as they are performed on the thread pool:
{'action': 'stat_object',
'object': <object_name>,
'container': <container>,
'success': True,
'items': items,
'headers': headers
}
In the case of a failure the dictionary returned will indicate that the operation was not successful, and will include the keys below:
{'action': <'stat_object'|'stat_container'|'stat_account'>,
'object': <'object_name'>, # Only for stat with objects list
'container': <container>, # Only for stat with objects list or container
'success': False,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>
}
Example
The code below demonstrates the use of stat
to retrieve
the headers for a given list of objects in a container using 20 threads.
The code creates a mapping from object name to headers which is then
pretty printed to the log.
../../examples/stat.py
List
List can be called against an account or a container to retrieve the containers or objects contained within them. Each call returns an iterator that returns pages of results (by default, up to 10000 results in each page).
See swiftclient.service.SwiftService.list
for docs
generated from the method docstring.
If the given container or account does not exist, the list method
will raise a SwiftError
, but for all other success/failures
a dictionary is returned. Each successfully listed page returns a
dictionary as described below:
{'action': <'list_account_part'|'list_container_part'>,
'container': <container>, # Only for listing a container
'prefix': <prefix>, # The prefix of returned objects/containers
'success': True,
'listing': [Item], # A list of results
# (only in the event of success)
'marker': <marker> # The last item name in the list
# (only in the event of success)
}
Where an item contains the following keys:
{'name': <name>,
'bytes': 10485760,
'last_modified': '2014-12-11T12:02:38.774540',
'hash': 'fb938269cbeabe4c234e1127bbd3b74a',
'content_type': 'application/octet-stream',
'meta': <metadata> # Full metadata listing from stat'ing each object
# this key only exists if 'long' is specified in options
}
Any failure listing an account or container that exists will return a failure dictionary as described below:
{'action': <'list_account_part'|'list_container_part'>,,
'container': container, # Only for listing a container
'prefix': options['prefix'],
'success': success,
'marker': marker,
'error': error,
'traceback': <trace>,
'error_timestamp': <timestamp>
}
Example
The code below demonstrates the use of list
to list all
items in a container that are over 10MiB in size:
../../examples/list.py
Post
Post can be called against an account, container or list of objects in order to update the metadata attached to the given items. In the first two cases a single dictionary is returned containing the results of the operation, and in the case of a list of objects being supplied, an iterator over the results generated for each object post is returned.
Each element of the object list may be a plain string of the object
name, or a SwiftPostObject
that allows finer control over
the options and metadata applied to each of the individual post
operations. When a string is given for the object name, the options and
metadata applied are a combination of those supplied to the call to
post()
and the defaults of the SwiftService
object.
If the given container or account does not exist, the
post
method will raise a SwiftError
.
Successful metadata update results are dictionaries as described
below:
{'action': <'post_account'|'post_container'|'post_object'>,
'success': True,
'container': <container>,
'object': <object>,
'headers': {},
'response_dict': <HTTP response details>
}
Note
Updating user metadata keys will not only add any specified keys, but will also remove user metadata that has previously been set. This means that each time user metadata is updated, the complete set of desired key-value pairs must be specified.
Example
The code below demonstrates the use of post
to set an
archive folder in a given container to expire after a 24 hour delay:
../../examples/post.py
Download
Download can be called against an entire account, a single container, or a list of objects in a given container. Each element of the object list is a string detailing the full name of an object to download.
In order to download the full contents of an entire account, you must
set the value of yes_all
to True
in the
options
dictionary supplied to either the
SwiftService
instance or the call to
download
.
If the given container or account does not exist, the
download
method will raise a SwiftError
,
otherwise an iterator over the results generated for each object
download is returned.
See swiftclient.service.SwiftService.download
for docs
generated from the method docstring.
For each successfully downloaded object, the results returned by the iterator will be a dictionary as described below (results are not returned for completed container or object segment downloads):
{'action': 'download_object',
'container': <container>,
'object': <object name>,
'success': True,
'path': <local path to downloaded object>,
'pseudodir': <if true, the download created an empty directory>,
'start_time': <time download started>,
'end_time': <time download completed>,
'headers_receipt': <time the headers from the object were retrieved>,
'auth_end_time': <time authentication completed>,
'read_length': <bytes_read>,
'attempts': <attempt count>,
'response_dict': <HTTP response details>
}
Any failure uploading an object will return a failure dictionary as described below:
{'action': 'download_object',
'container': <container>,
'object': <object name>,
'success': False,
'path': <local path of the failed download>,
'pseudodir': <if true, the failed download was an empty directory>,
'attempts': <attempt count>,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>,
'response_dict': <HTTP response details>
}
Example
The code below demonstrates the use of download
to
download all PNG images from a dated archive folder in a given
container:
../../examples/download.py
Upload
Upload is always called against an account and container and with a
list of objects to upload. Each element of the object list may be a
plain string detailing the path of the object to upload, or a
SwiftUploadObject
that allows finer control over some
aspects of the individual operations.
When a simple string is supplied to specify a file to upload, the
name of the object uploaded is the full path of the specified file and
the options used for the upload are those supplied to the call to
upload
.
Constructing a SwiftUploadObject
allows the user to
supply an object name for the uploaded file, and modify the options used
by upload
at the granularity of individual files.
If the given container or account does not exist, the
upload
method will raise a SwiftError
,
otherwise an iterator over the results generated for each object upload
is returned.
See swiftclient.service.SwiftService.upload
for docs
generated from the method docstring.
For each successfully uploaded object (or object segment), the results returned by the iterator will be a dictionary as described below:
{'action': 'upload_object',
'container': <container>,
'object': <object name>,
'success': True,
'status': <'uploaded'|'skipped-identical'|'skipped-changed'>,
'attempts': <attempt count>,
'response_dict': <HTTP response details>
}
{'action': 'upload_segment',
'for_container': <container>,
'for_object': <object name>,
'segment_index': <segment_index>,
'segment_size': <segment_size>,
'segment_location': <segment_path>
'segment_etag': <etag>,
'log_line': <object segment n>
'success': True,
'response_dict': <HTTP response details>,
'attempts': <attempt count>
}
Any failure uploading an object will return a failure dictionary as described below:
{'action': 'upload_object',
'container': <container>,
'object': <object name>,
'success': False,
'attempts': <attempt count>,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>,
'response_dict': <HTTP response details>
}
{'action': 'upload_segment',
'for_container': <container>,
'for_object': <object name>,
'segment_index': <segment_index>,
'segment_size': <segment_size>,
'segment_location': <segment_path>,
'log_line': <object segment n>,
'success': False,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>,
'response_dict': <HTTP response details>,
'attempts': <attempt count>
}
Example
The code below demonstrates the use of upload
to upload
all files and folders in a given directory, and rename each object by
replacing the root directory name with 'my-<d>-objects', where
<d> is the name of the uploaded directory:
../../examples/upload.py
Delete
Delete can be called against an account or a container to remove the
containers or objects contained within them. Each call to
delete
returns an iterator over results of each resulting
sub-request.
If the number of requested delete operations is large and the target
swift cluster is running the bulk middleware, the call to
SwiftService.delete
will make use of bulk operations and
the returned result iterator will return bulk_delete
results rather than individual delete_object
,
delete_container
or delete_segment
results.
See swiftclient.service.SwiftService.delete
for docs
generated from the method docstring.
For each successfully deleted container, object or segment, the results returned by the iterator will be a dictionary as described below:
{'action': <'delete_object'|'delete_segment'>,
'container': <container>,
'object': <object name>,
'success': True,
'attempts': <attempt count>,
'response_dict': <HTTP response details>
}
{'action': 'delete_container',
'container': <container>,
'success': True,
'response_dict': <HTTP response details>,
'attempts': <attempt count>
}
{'action': 'bulk_delete',
'container': <container>,
'objects': <[objects]>,
'success': True,
'attempts': <attempt count>,
'response_dict': <HTTP response details>
}
Any failure in a delete operation will return a failure dictionary as described below:
{'action': ('delete_object'|'delete_segment'),
'container': <container>,
'object': <object name>,
'success': False,
'attempts': <attempt count>,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>,
'response_dict': <HTTP response details>
}
{'action': 'delete_container',
'container': <container>,
'success': False,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>,
'response_dict': <HTTP response details>,
'attempts': <attempt count>
}
{'action': 'bulk_delete',
'container': <container>,
'objects': <[objects]>,
'success': False,
'attempts': <attempt count>,
'error': <error>,
'traceback': <trace>,
'error_timestamp': <timestamp>,
'response_dict': <HTTP response details>
}
Example
The code below demonstrates the use of delete
to remove
a given list of objects from a specified container. As the objects are
deleted the transaction ID of the relevant request is printed along with
the object name and number of attempts required. By printing the
transaction ID, the printed operations can be easily linked to events in
the swift server logs:
../../examples/delete.py
Copy
Copy can be called to copy an object or update the metadata on the given items.
Each element of the object list may be a plain string of the object
name, or a SwiftCopyObject
that allows finer control over
the options applied to each of the individual copy operations
(destination, fresh_metadata, options).
Destination should be in format /container/object; if not set, the object will be copied onto itself. Fresh_metadata sets mode of operation on metadata. If not set, current object user metadata will be copied/preserved; if set, all current user metadata will be removed.
Returns an iterator over the results generated for each object copy (and may also include the results of creating destination containers).
When a string is given for the object name, destination and fresh metadata will default to None and None, which result in adding metadata to existing objects.
Successful copy results are dictionaries as described below:
{'action': 'copy_object',
'success': True,
'container': <container>,
'object': <object>,
'destination': <destination>,
'headers': {},
'fresh_metadata': <boolean>,
'response_dict': <HTTP response details>
}
Any failure in a copy operation will return a failure dictionary as described below:
{'action': 'copy_object',
'success': False,
'container': <container>,
'object': <object>,
'destination': <destination>,
'headers': {},
'fresh_metadata': <boolean>,
'response_dict': <HTTP response details>,
'error': <error>,
'traceback': <traceback>,
'error_timestamp': <timestamp>
}
Example
The code below demonstrates the use of copy
to add new
user metadata for objects a and b, and to copy object c to d (with added
metadata).
../../examples/copy.py
Capabilities
Capabilities can be called against an account or a particular proxy URL in order to determine the capabilities of the swift cluster. These capabilities include details about configuration options and the middlewares that are installed in the proxy pipeline.
See swiftclient.service.SwiftService.capabilities
for docs
generated from the method docstring.
For each successful call to list capabilities, a result dictionary will be returned with the contents described below:
{'action': 'capabilities',
'timestamp': <time of the call>,
'success': True,
'capabilities': <dictionary containing capability details>
}
The contents of the capabilities dictionary contain the core swift
capabilities under the key swift
; all other keys show the
configuration options for additional middlewares deployed in the proxy
pipeline. An example capabilities dictionary is given below:
{'account_quotas': {},
'bulk_delete': {
'max_deletes_per_request': 10000,
'max_failed_deletes': 1000
},'bulk_upload': {
'max_containers_per_extraction': 10000,
'max_failed_extractions': 1000
},'container_quotas': {},
'container_sync': {'realms': {}},
'formpost': {},
'keystoneauth': {},
'slo': {
'max_manifest_segments': 1000,
'max_manifest_size': 2097152,
'min_segment_size': 1048576
},'swift': {
'account_autocreate': True,
'account_listing_limit': 10000,
'allow_account_management': True,
'container_listing_limit': 10000,
'extra_header_count': 0,
'max_account_name_length': 256,
'max_container_name_length': 256,
'max_file_size': 5368709122,
'max_header_size': 8192,
'max_meta_count': 90,
'max_meta_name_length': 128,
'max_meta_overall_size': 4096,
'max_meta_value_length': 256,
'max_object_name_length': 1024,
'policies': [
'default': True, 'name': 'Policy-0'}
{
],'strict_cors_mode': False,
'version': '2.2.2'
},'tempurl': {
'methods': ['GET', 'HEAD', 'PUT']
} }
Example
The code below demonstrates the use of capabilities
to
determine if the Swift cluster supports static large objects, and if so,
the maximum number of segments that can be described in a single
manifest file, along with the size restrictions on those objects:
../../examples/capabilities.py