
=============
Large objects
=============

By default, the content of an object cannot be greater than 5 GB.
However, you can use a number of smaller objects to construct a large
object. The large object is composed of two types of objects:

- **Segment objects** store the object content. You can divide your
  content into segments, and upload each segment into its own segment
  object. Segment objects do not have any special features. You create,
  update, download, and delete segment objects just as you would normal
  objects.

- A **manifest object** links the segment objects into one logical
  large object. When you download a manifest object, Object Storage
  concatenates and returns the contents of the segment objects in the
  response body of the request. This behavior extends to the response
  headers returned by **GET** and **HEAD** requests. The
  ``Content-Length`` response header value is the total size of all
  segment objects. Object Storage calculates the ``ETag`` response
  header value by taking the ``ETag`` value of each segment,
  concatenating them together, and returning the MD5 checksum of the
  result. The manifest object types are:

  **Static large objects**
     The manifest object content is an ordered list of the names of
     the segment objects in JSON format.

  **Dynamic large objects**
     The manifest object has an ``X-Object-Manifest`` metadata header.
     The value of this header is ``{container}/{prefix}``,
     where ``{container}`` is the name of the container where the
     segment objects are stored, and ``{prefix}`` is a string that all
     segment objects have in common. The manifest object should have
     no content. However, this is not enforced.

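The header arithmetic described for manifest objects can be sketched in a few
lines of Python. This is a minimal illustration only, assuming that the segment
``ETag`` values are plain hexadecimal MD5 strings; the function names
``large_object_etag`` and ``large_object_length`` are hypothetical and are not
part of any Object Storage client library.

```python
import hashlib


def large_object_etag(segment_etags):
    """Return the MD5 hex digest of the concatenated segment ETag strings.

    The ETag strings themselves are concatenated, in segment order,
    not the segment contents.
    """
    joined = "".join(segment_etags)
    return hashlib.md5(joined.encode("ascii")).hexdigest()


def large_object_length(segment_sizes):
    """Return the Content-Length of the large object: the sum of segment sizes."""
    return sum(segment_sizes)


# Illustrative sample values for three segments.
sample_etags = [
    "0228c7926b8b642dfb29554cd1f00963",
    "5bfc9ea51a00b790717eeb934fb77b9b",
    "b9c3da507d2557c1ddc51f27c54bae51",
]
sample_sizes = [1468006, 1572864, 256]

total_length = large_object_length(sample_sizes)
combined_etag = large_object_etag(sample_etags)
```

Because the checksum is taken over the ordered ``ETag`` strings, reordering the
segments changes the resulting value even when the set of segments is the same.
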
Note
~~~~

If you make a **COPY** request by using a manifest object as the source,
the new object is a normal object, not a manifest object. If the total size
of the source segment objects exceeds 5 GB, the **COPY** request fails.
However, you can make a duplicate of the manifest object and this new
object can be larger than 5 GB.

Static large objects
~~~~~~~~~~~~~~~~~~~~

To create a static large object, divide your content into pieces and
create (upload) a segment object to contain each piece.

You must record the ``ETag`` response header that the **PUT** operation
returns. Alternatively, you can calculate the MD5 checksum of the
segment prior to uploading and include this in the ``ETag`` request
header. If you supply the header, an upload whose content does not match
the checksum is rejected, so corruption in transit cannot go unnoticed.

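As a rough sketch of the first two steps, the following Python splits content
held in memory into fixed-size pieces and computes the MD5 checksum of each
piece before upload. The function name ``split_into_segments`` and the
dictionary layout are assumptions made for this illustration; real content
larger than memory would be read from a file in chunks instead.

```python
import hashlib


def split_into_segments(data, segment_size):
    """Split bytes into segments and compute each segment's MD5 and size.

    Returns a list of dicts with the keys "data", "etag", and "size_bytes".
    The final segment may be shorter than segment_size.
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be a positive integer")
    segments = []
    for offset in range(0, len(data), segment_size):
        chunk = data[offset:offset + segment_size]
        segments.append({
            "data": chunk,
            # Value to send in the ETag request header for this segment.
            "etag": hashlib.md5(chunk).hexdigest(),
            "size_bytes": len(chunk),
        })
    return segments


example_segments = split_into_segments(b"abcdefghij", 4)
```

Each ``etag`` value can be sent as the ``ETag`` request header of the
corresponding segment upload and recorded for later use in the manifest.
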
List the name of each segment object, along with its size and MD5
checksum, in order.

Create a manifest object. Include the ``?multipart-manifest=put``
query string at the end of the manifest object name to indicate that
this is a manifest object.

The body of the **PUT** request on the manifest object comprises a JSON
list, where each element contains the following attributes:

- ``path``. The container and object name in the format:
  ``{container-name}/{object-name}``

- ``etag``. The MD5 checksum of the content of the segment object. This
  value must match the ``ETag`` of that object.

- ``size_bytes``. The size of the segment object. This value must match
  the ``Content-Length`` of that object.

**Example Static large object manifest list**

This example shows three segment objects. You can use several containers
and the object names do not have to conform to a specific pattern, in
contrast to dynamic large objects.

.. code::

    [
        {
            "path": "mycontainer/objseg1",
            "etag": "0228c7926b8b642dfb29554cd1f00963",
            "size_bytes": 1468006
        },
        {
            "path": "mycontainer/pseudodir/seg-obj2",
            "etag": "5bfc9ea51a00b790717eeb934fb77b9b",
            "size_bytes": 1572864
        },
        {
            "path": "other-container/seg-final",
            "etag": "b9c3da507d2557c1ddc51f27c54bae51",
            "size_bytes": 256
        }
    ]

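A manifest list of this shape can be generated rather than written by hand.
The sketch below assumes that the segment contents are available locally as
bytes; ``build_slo_manifest`` is a hypothetical helper name, and the container
and object names used with it are placeholders.

```python
import hashlib
import json


def build_slo_manifest(segments):
    """Build the JSON manifest body for a static large object.

    segments is an ordered iterable of (container, object_name, content)
    tuples, where content is the bytes uploaded for that segment.
    """
    entries = []
    for container, object_name, content in segments:
        entries.append({
            "path": f"{container}/{object_name}",
            "etag": hashlib.md5(content).hexdigest(),
            "size_bytes": len(content),
        })
    return json.dumps(entries, indent=4)


example_manifest = build_slo_manifest([
    ("mycontainer", "objseg1", b"first piece"),
    ("other-container", "seg-final", b"last piece"),
])
```

The order of the tuples determines the order in which the segments are
concatenated on download.
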
The ``Content-Length`` request header must contain the length of the
JSON content, not the length of the segment objects. However, after the
**PUT** operation completes, the ``Content-Length`` metadata is set to
the total length of all the object segments. A similar situation applies
to the ``ETag``. If used in the **PUT** operation, it must contain the
MD5 checksum of the JSON content. The ``ETag`` metadata value is then
set to be the MD5 checksum of the concatenated ``ETag`` values of the
object segments. You can also set the ``Content-Type`` request header
and custom object metadata.

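The difference between the request values and the stored metadata can be made
concrete with a short sketch. It assumes a manifest body encoded as UTF-8; the
helper names ``manifest_request_headers`` and ``stored_content_length`` are
hypothetical and only illustrate the arithmetic.

```python
import hashlib
import json


def manifest_request_headers(manifest_json):
    """Return the encoded body and the headers that describe the JSON itself."""
    body = manifest_json.encode("utf-8")
    headers = {
        # Length and checksum of the JSON document, not of the segments.
        "Content-Length": str(len(body)),
        "ETag": hashlib.md5(body).hexdigest(),
    }
    return body, headers


def stored_content_length(manifest_json):
    """Return the Content-Length reported after the PUT: the sum of segment sizes."""
    return sum(entry["size_bytes"] for entry in json.loads(manifest_json))


sample_manifest = json.dumps([
    {"path": "c/seg1", "etag": "0" * 32, "size_bytes": 10},
    {"path": "c/seg2", "etag": "1" * 32, "size_bytes": 5},
])
sample_body, sample_headers = manifest_request_headers(sample_manifest)
```

For the sample manifest, the request ``Content-Length`` is the size of the JSON
text, while the stored value is 15, the combined size of the two segments.
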
When the **PUT** operation sees the ``?multipart-manifest=put`` query
parameter, it reads the request body and verifies that each segment
object exists and that the sizes and ETags match. If there is a
mismatch, the **PUT** operation fails.

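The kind of check described here can be reproduced locally before you send the
manifest. The following is a simplified sketch and not the Object Storage
implementation: it assumes you have recorded each uploaded segment's ``ETag``
and size in a dictionary keyed by ``{container-name}/{object-name}``, and the
function name ``find_manifest_mismatches`` is hypothetical.

```python
def find_manifest_mismatches(manifest_entries, stored_objects):
    """Compare manifest entries against recorded segment metadata.

    manifest_entries is a list of dicts with "path", "etag", "size_bytes".
    stored_objects maps a path to an (etag, size) tuple.
    Returns a list of (path, problem) tuples; an empty list means no mismatch.
    """
    problems = []
    for entry in manifest_entries:
        path = entry["path"]
        stored = stored_objects.get(path)
        if stored is None:
            problems.append((path, "segment not found"))
            continue
        etag, size = stored
        if etag != entry["etag"]:
            problems.append((path, "etag mismatch"))
        if size != entry["size_bytes"]:
            problems.append((path, "size mismatch"))
    return problems
```

Running such a check first avoids a failed **PUT** caused by a mistyped path or
a stale checksum.
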
If everything matches, the manifest object is created. The
``X-Static-Large-Object`` metadata is set to ``true``, indicating that
this is a static object manifest.

Normally when you perform a **GET** operation on the manifest object,
the response body contains the concatenated content of the segment
objects. To download the manifest list, use the
``?multipart-manifest=get`` query parameter. The resulting list is not
formatted the same as the manifest you originally used in the **PUT**
operation.

If you use the **DELETE** operation on a manifest object, the manifest
object is deleted. The segment objects are not affected. However, if you
add the ``?multipart-manifest=delete`` query parameter, the segment
objects are deleted and, if all are successfully deleted, the manifest
object is also deleted.

To change the manifest, use a **PUT** operation with the
``?multipart-manifest=put`` query parameter. This request creates a
new manifest object that replaces the existing one. You can also update
the object metadata in the usual way.

Dynamic large objects
~~~~~~~~~~~~~~~~~~~~~

You must segment objects that are larger than 5 GB before you can upload
them. You then upload the segment objects as you would any other
object and create a dynamic large manifest object. The manifest object
tells Object Storage how to find the segment objects that make up the
large object. The segments remain individually addressable, but
retrieving the manifest object streams all the segments concatenated.
There is no limit to the number of segments that can be a part of a
single large object.

To ensure the download works correctly, you must upload all the object
segments to the same container and ensure that each object name is
prefixed in such a way that it sorts in the order in which it should be
concatenated. You also create and upload a manifest file. The manifest
file is a zero-byte file with the extra
``X-Object-Manifest: {container}/{prefix}`` header, where ``{container}``
is the container the object segments are in and ``{prefix}`` is the common
prefix for all the segments. You must UTF-8-encode and then URL-encode the
container and common prefix in the ``X-Object-Manifest`` header.

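One way to satisfy both requirements is to number the segments with zero-padded
suffixes and to build the header value with ``urllib.parse.quote``. The sketch
below is an illustration under stated assumptions: the helper names
``segment_names`` and ``object_manifest_header`` are hypothetical, the padding
width of 8 digits is arbitrary, and leaving ``/`` unencoded inside the prefix
is a choice made here for pseudo-directory style names.

```python
from urllib.parse import quote


def segment_names(prefix, count, width=8):
    """Return segment names whose lexical order matches their numeric order."""
    return [f"{prefix}{index:0{width}d}" for index in range(count)]


def object_manifest_header(container, prefix):
    """Return a URL-encoded value for the X-Object-Manifest header.

    quote() encodes the text as UTF-8 before percent-encoding it.
    """
    encoded_container = quote(container, safe="")
    encoded_prefix = quote(prefix)  # keeps "/" unencoded by default
    return f"{encoded_container}/{encoded_prefix}"


names = segment_names("bigfile/seg-", 12)
header_value = object_manifest_header("my container", "bigfile/seg-")
```

Without padding, a name such as ``seg-10`` sorts before ``seg-2``, which would
place the segments in the wrong order when they are concatenated.
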
It is best to upload all the segments first and then create or update
the manifest. With this method, the full object is not available for
downloading until the upload is complete. Also, you can upload a new set
of segments to a second location and update the manifest to point to
this new location. During the upload of the new segments, the original
manifest is still available to download the first set of segments.

**Example Upload segment of large object request: HTTP**

.. code::

    PUT /{api_version}/{account}/{container}/{object} HTTP/1.1
    Host: storage.clouddrive.com
    X-Auth-Token: eaaafd18-0fed-4b3a-81b4-663c99ec1cbb
    ETag: 8a964ee2a5e88be344f36c22562a6486
    Content-Length: 1
    X-Object-Meta-PIN: 1234

No response body is returned. A status code of ``2nn`` (between 200
and 299, inclusive) indicates a successful write; status 411 Length
Required denotes a missing ``Content-Length`` or ``Content-Type`` header
in the request. If the MD5 checksum of the data written to the storage
system does NOT match the (optionally) supplied ETag value, a 422
Unprocessable Entity response is returned.

You can continue uploading segments as this example shows, prior to
uploading the manifest.

**Example Upload next segment of large object request: HTTP**

.. code::

    PUT /{api_version}/{account}/{container}/{object} HTTP/1.1
    Host: storage.clouddrive.com
    X-Auth-Token: eaaafd18-0fed-4b3a-81b4-663c99ec1cbb
    ETag: 8a964ee2a5e88be344f36c22562a6486
    Content-Length: 1
    X-Object-Meta-PIN: 1234

Next, upload the manifest that identifies the container and prefix of the
object segments. Note that if you upload additional segments after the
manifest is created, the concatenated object grows by the size of those
segments, but you do not need to recreate the manifest file for
subsequent additional segments.

**Example Upload manifest request: HTTP**

.. code::

    PUT /{api_version}/{account}/{container}/{object} HTTP/1.1
    Host: storage.clouddrive.com
    X-Auth-Token: eaaafd18-0fed-4b3a-81b4-663c99ec1cbb
    Content-Length: 0
    X-Object-Meta-PIN: 1234
    X-Object-Manifest: {container}/{prefix}

**Example Upload manifest response: HTTP**

.. code::

    [...]

The ``Content-Type`` in the response for a **GET** or **HEAD** on the
manifest is the same as the ``Content-Type`` set during the **PUT**
request that created the manifest. You can easily change the
``Content-Type`` by reissuing the **PUT** request.

Comparison of static and dynamic large objects
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

While static and dynamic objects have similar behavior, here are
their differences:

**Comparing static and dynamic large objects**

Static large object: Assured end-to-end integrity. The list of segments
includes the MD5 checksum (``ETag``) of each segment. You cannot upload the
manifest object if the ``ETag`` in the list differs from the uploaded segment
object. If a segment is somehow lost, an attempt to download the manifest
object results in an error. You must upload the segment objects before you
upload the manifest object. You cannot add or remove segment objects from the
manifest. However, you can create a completely new manifest object of the same
name with a different manifest list.

With static large objects, the segment objects must be at least 1 MB in size
(by default). The final segment object can be any size. At most, 1000 segments
are supported (by default). The manifest list includes the container name of
each object. Segment objects can be in different containers.

Dynamic large object: End-to-end integrity is not guaranteed. The eventual
consistency model means that although you have uploaded a segment object, it
might not appear in the container listing until later. If you download the
manifest before the segment appears in the container listing, the segment does
not form part of the content returned in response to a **GET** request.

With dynamic large objects, you can upload manifest and segment objects
in any order. We recommend that you upload the manifest object after the
segments, in case a premature download of the manifest occurs. However,
the system does not enforce the order. You can upload new segment objects or
remove existing segments. The names must simply match the ``{prefix}`` supplied
in ``X-Object-Manifest``. Segment objects can be any size. All
segment objects must be in the same container.

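The dynamic behavior can be pictured with a small local simulation. This is
not how Object Storage is implemented; it only assumes that a download
resolves the manifest against whatever names the container listing holds at
that moment, and that Python's default string ordering stands in for the
listing order. The function name ``resolve_dynamic_object`` is hypothetical.

```python
def resolve_dynamic_object(listing, prefix):
    """Concatenate every listed object whose name starts with prefix.

    listing maps object names to their content (bytes) and represents
    the container listing at download time.
    """
    matching_names = sorted(name for name in listing if name.startswith(prefix))
    return b"".join(listing[name] for name in matching_names)


# Simulated container listing: two segments are visible, one unrelated object.
container_listing = {
    "video/seg-0001": b"AAA",
    "video/seg-0002": b"BBB",
    "notes.txt": b"unrelated",
}
before = resolve_dynamic_object(container_listing, "video/seg-")

# A third segment becomes visible later; no manifest change is needed.
container_listing["video/seg-0003"] = b"CCC"
after = resolve_dynamic_object(container_listing, "video/seg-")
```

In this simulation ``before`` holds only the first two segments and ``after``
holds all three, which mirrors both the flexibility of dynamic large objects
and the reason their end-to-end integrity is not guaranteed.
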
Manifest object metadata
------------------------

For static large objects, the object has ``X-Static-Large-Object`` set to
``true``. You do not set this metadata directly. Instead, the system sets
it when you **PUT** a static manifest object.

For dynamic large objects, the ``X-Object-Manifest`` value is the
``{container}/{prefix}``, which indicates where the segment objects are
located. You supply this request header in the **PUT** operation.

Copying the manifest object
---------------------------

With static large objects, you include the ``?multipart-manifest=get``
query string in the **COPY** request. The new object contains the same
manifest as the original. The segment objects are not copied. Instead,
both the original and new manifest objects share the same set of segment
objects.

With dynamic large objects, the **COPY** operation does not create
a manifest object but a normal object whose content is the same as what you
would get from a **GET** request to the original manifest object.

To duplicate a manifest object:

* Use the **GET** operation to read the value of ``X-Object-Manifest`` and
  use this value in the ``X-Object-Manifest`` request header in a **PUT**
  operation.
* Alternatively, you can include the ``?multipart-manifest=get`` query
  string in the **COPY** request.

This creates a new manifest object that shares the same set of segment
objects as the original manifest object.