[admin-guide] Fix the rst markups for Object Storage files
Change-Id: I67fa565e6d6826d5486146278186997ce0dbe7d0
parent a68ac6d95a
commit 610345f2cc
@@ -146,29 +146,29 @@ method (see the list below) overrides it. Currently, no logging calls
override the sample rate, but it is conceivable that some meters may
require accuracy (sample\_rate == 1) while others may not.

-.. code::
+.. code-block:: ini

    [DEFAULT]
    ...
    log_statsd_host = 127.0.0.1
    log_statsd_port = 8125
    log_statsd_default_sample_rate = 1

Then the LogAdapter object returned by ``get_logger()``, usually stored
in ``self.logger``, has these new methods:

- ``set_statsd_prefix(self, prefix)`` Sets the client library stat
  prefix value which gets prefixed to every meter. The default prefix
-  is the "name" of the logger such as "object-server",
-  "container-auditor", and so on. This is currently used to turn
-  "proxy-server" into one of "proxy-server.Account",
-  "proxy-server.Container", or "proxy-server.Object" as soon as the
+  is the ``name`` of the logger such as ``object-server``,
+  ``container-auditor``, and so on. This is currently used to turn
+  ``proxy-server`` into one of ``proxy-server.Account``,
+  ``proxy-server.Container``, or ``proxy-server.Object`` as soon as the
  Controller object is determined and instantiated for the request.

- ``update_stats(self, metric, amount, sample_rate=1)`` Increments
  the supplied meter by the given amount. This is used when you need
  to add or subtract more than one from a counter, like incrementing
-  "suffix.hashes" by the number of computed hashes in the object
+  ``suffix.hashes`` by the number of computed hashes in the object
  replicator.

- ``increment(self, metric, sample_rate=1)`` Increments the given counter
@@ -189,49 +189,49 @@ logger object. If StatsD logging has not been configured, the methods
are no-ops. This avoids messy conditional logic each place a meter is
recorded. These example usages show the new logging methods:

-.. code-block:: bash
+.. code-block:: python

    # swift/obj/replicator.py
    def update(self, job):
        # ...
        begin = time.time()
        try:
            hashed, local_hash = tpool.execute(tpooled_get_hashes, job['path'],
                do_listdir=(self.replication_count % 10) == 0,
                reclaim_age=self.reclaim_age)
            # See tpooled_get_hashes "Hack".
            if isinstance(hashed, BaseException):
                raise hashed
            self.suffix_hash += hashed
            self.logger.update_stats('suffix.hashes', hashed)
            # ...
        finally:
            self.partition_times.append(time.time() - begin)
            self.logger.timing_since('partition.update.timing', begin)

-.. code-block:: bash
+.. code-block:: python

    # swift/container/updater.py
    def process_container(self, dbfile):
        # ...
        start_time = time.time()
        # ...
            for event in events:
                if 200 <= event.wait() < 300:
                    successes += 1
                else:
                    failures += 1
            if successes > failures:
                self.logger.increment('successes')
                # ...
            else:
                self.logger.increment('failures')
                # ...
            # Only track timing data for attempted updates:
            self.logger.timing_since('timing', start_time)
        else:
            self.logger.increment('no_changes')
            self.no_changes += 1

The development team of StatsD wanted to use the
`pystatsd <https://github.com/sivy/py-statsd>`__ client library (not to
@@ -240,7 +240,7 @@ project <https://github.com/sivy/py-statsd>`__ also hosted on GitHub),
but the released version on PyPI was missing two desired features the
latest version in GitHub had: the ability to configure a meter prefix
in the client object and a convenience method for sending timing data
-between "now" and a "start" timestamp you already have. So they just
+between ``now`` and a ``start`` timestamp you already have. So they just
implemented a simple StatsD client library from scratch with the same
interface. This has the nice fringe benefit of not introducing another
external library dependency into Object Storage.
@@ -2,8 +2,8 @@
Troubleshoot Object Storage
===========================

-For Object Storage, everything is logged in :file:`/var/log/syslog` (or
-:file:`messages` on some distros). Several settings enable further
+For Object Storage, everything is logged in ``/var/log/syslog`` (or
+``messages`` on some distros). Several settings enable further
customization of logging, such as ``log_name``, ``log_facility``, and
``log_level``, within the object server configuration files.
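For context, a minimal sketch of how those logging options might appear in an
object server configuration file is shown below. The option names come from
the paragraph above; the section placement and the values are illustrative
assumptions, not taken from this guide.

.. code-block:: ini

   # Illustrative object-server.conf fragment; values are assumptions.
   [DEFAULT]
   log_name = object-server
   log_facility = LOG_LOCAL0
   log_level = INFO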
@@ -22,7 +22,7 @@ replicas that were on that drive to be replicated elsewhere until the
drive is replaced. Once the drive is replaced, it can be re-added to the
ring.

-You can look at error messages in :file:`/var/log/kern.log` for hints of
+You can look at error messages in the ``/var/log/kern.log`` file for hints of
drive failure.

Server failure

@@ -49,7 +49,7 @@ Detect failed drives
~~~~~~~~~~~~~~~~~~~~

It has been our experience that when a drive is about to fail, error
-messages appear in :file:`/var/log/kern.log`. There is a script called
+messages appear in the ``/var/log/kern.log`` file. There is a script called
``swift-drive-audit`` that can be run via cron to watch for bad drives. If
errors are detected, it will unmount the bad drive, so that Object
Storage can work around it. The script takes a configuration file with
@@ -79,7 +79,7 @@ the following settings:
   * - ``log_to_console = False``
     - No help text available for this option.
   * - ``minutes = 60``
-    - Number of minutes to look back in :file:`/var/log/kern.log`
+    - Number of minutes to look back in ``/var/log/kern.log``
   * - ``recon_cache_path = /var/cache/swift``
     - Directory where stats for a few items will be stored
   * - ``regex_pattern_1 = \berror\b.*\b(dm-[0-9]{1,2}\d?)\b``
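Taken together, the settings in this table would normally live in the
``swift-drive-audit`` configuration file. A minimal sketch follows; the option
names and defaults are the ones tabled above, while the ``[drive-audit]``
section name and the file path are assumptions.

.. code-block:: ini

   # /etc/swift/drive-audit.conf (illustrative)
   [drive-audit]
   log_to_console = False
   minutes = 60
   recon_cache_path = /var/cache/swift
   regex_pattern_1 = \berror\b.*\b(dm-[0-9]{1,2}\d?)\b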
@@ -100,7 +100,7 @@ an emergency occurs, this procedure may assist in returning your cluster
to an operational state.

Using existing swift tools, there is no way to recover a builder file
-from a :file:`ring.gz` file. However, if you have a knowledge of Python, it
+from a ``ring.gz`` file. However, if you have a knowledge of Python, it
is possible to construct a builder file that is pretty close to the one
you have lost.

@@ -161,13 +161,13 @@ you have lost.

      >>> pickle.dump(builder.to_dict(), open('account.builder', 'wb'), protocol=2)
      >>> exit ()

-#. You should now have a file called :file:`account.builder` in the current
+#. You should now have a file called ``account.builder`` in the current
   working directory. Run
   :command:`swift-ring-builder account.builder write_ring` and compare the new
-  :file:`account.ring.gz` to the :file:`account.ring.gz` that you started
+  ``account.ring.gz`` to the ``account.ring.gz`` that you started
   from. They probably are not byte-for-byte identical, but if you load them
   in a REPL and their ``_replica2part2dev_id`` and ``devs`` attributes are
   the same (or nearly so), then you are in good shape.

-#. Repeat the procedure for :file:`container.ring.gz` and
-   :file:`object.ring.gz`, and you might get usable builder files.
+#. Repeat the procedure for ``container.ring.gz`` and
+   ``object.ring.gz``, and you might get usable builder files.

@@ -21,9 +21,10 @@ not only separate devices but possibly even entire nodes dedicated for erasure
coding.

.. important::

   The erasure code support in Object Storage is considered beta in Kilo.
   Most major functionality is included, but it has not been tested or
-   validated at large scale. This feature relies on `ssync` for durability.
+   validated at large scale. This feature relies on ``ssync`` for durability.
   We recommend deployers do extensive testing and not deploy production
   data using an erasure code storage policy.
   If any bugs are found during testing, please report them to

@@ -12,7 +12,7 @@ the account\_stat table in the account database and replicas to

Typically, a specific retention time or undelete are not provided.
However, you can set a ``delay_reaping`` value in the
-``[account-reaper]`` section of the :file:`account-server.conf` file to
+``[account-reaper]`` section of the ``account-server.conf`` file to
delay the actual deletion of data. At this time, to undelete you have to update
the account database replicas directly, set the status column to an
empty string and update the put\_timestamp to be greater than the
@@ -41,10 +41,12 @@ which point the database reclaim process within the db\_replicator will
remove the database files.

A persistent error state may prevent the deletion of an object or
-container. If this happens, you will see a message in the log, for example::
+container. If this happens, you will see a message in the log, for example:

-    "Account <name> has not been reaped since <date>"
+.. code-block:: console
+
+   Account <name> has not been reaped since <date>

You can control when this is logged with the ``reap_warn_after`` value in the
-``[account-reaper]`` section of the :file:`account-server.conf` file.
+``[account-reaper]`` section of the ``account-server.conf`` file.
The default value is 30 days.
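To show the two reaper options side by side, here is a minimal sketch of the
``[account-reaper]`` stanza. The option and section names come from the text
above; the values, and the assumption that both options are expressed in
seconds, are illustrative only.

.. code-block:: ini

   # account-server.conf (illustrative values)
   [account-reaper]
   # Hold off actual deletion after an account is marked deleted
   # (assumed to be in seconds; three days shown here).
   delay_reaping = 259200
   # Warn in the log when an account still has not been reaped
   # (2592000 seconds corresponds to the 30-day default noted above).
   reap_warn_after = 2592000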
@@ -12,18 +12,17 @@ and authentication services. It runs the (distributed) brain of the
Object Storage system: the proxy server processes.

.. note::

   If you want to use OpenStack Identity API v3 for authentication, you
-   have the following options available in :file:`/etc/swift/dispersion.conf`:
+   have the following options available in ``/etc/swift/dispersion.conf``:
   ``auth_version``, ``user_domain_name``, ``project_domain_name``,
   and ``project_name``.
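To make that note concrete, a minimal sketch of the relevant
``/etc/swift/dispersion.conf`` fragment follows. The option names are the ones
listed above; the ``[dispersion]`` section name and the values are illustrative
assumptions rather than settings taken from this guide.

.. code-block:: ini

   # Illustrative fragment; adjust names and values for your deployment.
   [dispersion]
   auth_version = 3
   user_domain_name = Default
   project_domain_name = Default
   project_name = service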

**Object Storage architecture**

-.. image:: figures/objectstorage-arch.png
+.. figure:: figures/objectstorage-arch.png
-

Because access servers are collocated in their own tier, you can scale
out read/write access regardless of the storage capacity. For example,

@@ -39,11 +38,12 @@ Typically, the tier consists of a collection of 1U servers. These
machines use a moderate amount of RAM and are network I/O intensive.
Since these systems field each incoming API request, you should
provision them with two high-throughput (10GbE) interfaces - one for the
-incoming "front-end" requests and the other for the "back-end" access to
+incoming ``front-end`` requests and the other for the ``back-end`` access to
the object storage nodes to put and fetch data.

Factors to consider
-------------------

For most publicly facing deployments as well as private deployments
available across a wide-reaching corporate network, you use SSL to
encrypt traffic to the client. SSL adds significant processing load to

@@ -53,6 +53,7 @@ deployments on trusted networks.

Storage nodes
~~~~~~~~~~~~~

In most configurations, each of the five zones should have an equal
amount of storage capacity. Storage nodes use a reasonable amount of
memory and CPU. Metadata needs to be readily available to return objects

@@ -64,11 +65,10 @@ workload and desired performance.

**Object Storage (swift)**

-.. image:: figures/objectstorage-nodes.png
+.. figure:: figures/objectstorage-nodes.png
-

Currently, a 2 TB or 3 TB SATA disk delivers good performance for the
price. You can use desktop-grade drives if you have responsive remote

@@ -76,6 +76,7 @@ hands in the datacenter and enterprise-grade drives if you don't.

Factors to consider
-------------------

You should keep in mind the desired I/O performance for single-threaded
requests. This system does not use RAID, so a single disk handles each
request for an object. Disk performance impacts single-threaded response

@@ -6,20 +6,26 @@ On system failures, the XFS file system can sometimes truncate files it is
trying to write and produce zero-byte files. The object-auditor will catch
these problems but in the case of a system crash it is advisable to run
an extra, less rate limited sweep, to check for these specific files.
-You can run this command as follows::
+You can run this command as follows:

-    swift-object-auditor /path/to/object-server/config/file.conf once -z 1000
+.. code-block:: console
+
+   $ swift-object-auditor /path/to/object-server/config/file.conf once -z 1000

.. note::

   "-z" means to only check for zero-byte files at 1000 files per second.

It is useful to run the object auditor on a specific device or set of devices.
-You can run the object-auditor once as follows::
+You can run the object-auditor once as follows:

-    swift-object-auditor /path/to/object-server/config/file.conf once /
-    --devices=sda,sdb
+.. code-block:: console
+
+   $ swift-object-auditor /path/to/object-server/config/file.conf once \
+     --devices=sda,sdb

.. note::

-   This will run the object auditor on only the sda and sdb devices.
+   This will run the object auditor on only the ``sda`` and ``sdb`` devices.
   This parameter accepts a comma-separated list of values.
@@ -25,7 +25,6 @@ high durability, and high concurrency are:
  container databases and helps manage locations where data lives in
  the cluster.
-

.. _objectstorage-building-blocks-figure:

@@ -33,7 +32,6 @@ high durability, and high concurrency are:

.. figure:: figures/objectstorage-buildingblocks.png
-

Proxy servers
-------------

@@ -85,7 +83,6 @@ sized drives are used in a cluster.
The ring is used by the proxy server and several background processes
(like replication).
-

.. _objectstorage-ring-figure:

@@ -93,7 +90,6 @@ The ring is used by the proxy server and several background processes

.. figure:: figures/objectstorage-ring.png
-

These rings are externally managed, in that the server processes
themselves do not modify the rings, they are instead given new rings

@@ -134,7 +130,6 @@ durability. This means that when choosing a replica location, Object
Storage chooses a server in an unused zone before an unused server in a
zone that already has a replica of the data.
-

.. _objectstorage-zones-figure:

@@ -142,7 +137,6 @@ zone that already has a replica of the data.

.. figure:: figures/objectstorage-zones.png
-

When a disk fails, replica data is automatically distributed to the
other zones to ensure there are three copies of the data.

@@ -155,7 +149,6 @@ distributed across the cluster. An account database contains the list of
containers in that account. A container database contains the list of
objects in that container.
-

.. _objectstorage-accountscontainers-figure:

@@ -163,7 +156,6 @@ objects in that container.

.. figure:: figures/objectstorage-accountscontainers.png
-

To keep track of object data locations, each account in the system has a
database that references all of its containers, and each container

@@ -190,7 +182,6 @@ Implementing a partition is conceptually simple, a partition is just a
directory sitting on a disk with a corresponding hash table of what it
contains.
-

.. _objectstorage-partitions-figure:

@@ -198,7 +189,6 @@ contains.

.. figure:: figures/objectstorage-partitions.png
-

Replicators
-----------

@@ -224,7 +214,6 @@ of hashes to compare.
The cluster eventually has a consistent behavior where the newest data
has a priority.
-

.. _objectstorage-replication-figure:

@@ -232,7 +221,6 @@ has a priority.

.. figure:: figures/objectstorage-replication.png
-

If a zone goes down, one of the nodes containing a replica notices and
proactively copies data to a handoff location.

@@ -263,7 +251,6 @@ successful before the client is notified that the upload was successful.
Next, the container database is updated asynchronously to reflect that
there is a new object in it.
-

.. _objectstorage-usecase-figure:

@@ -271,7 +258,6 @@ there is a new object in it.

.. figure:: figures/objectstorage-usecase.png
-

Download
~~~~~~~~
@@ -2,7 +2,7 @@
Introduction to Object Storage
==============================

-OpenStack Object Storage (code-named swift) is open source software for
+OpenStack Object Storage (code-named swift) is an open source software for
creating redundant, scalable data storage using clusters of standardized
servers to store petabytes of accessible data. It is a long-term storage
system for large amounts of static data that can be retrieved,

@@ -20,8 +20,8 @@ that data gets to where it belongs. The ring handles replica placement.

To replicate deletions in addition to creations, every deleted record or
file in the system is marked by a tombstone. The replication process
-cleans up tombstones after a time period known as the *consistency
-window*. This window defines the duration of the replication and how
+cleans up tombstones after a time period known as the ``consistency
+window``. This window defines the duration of the replication and how
long transient failure can remove a node from the cluster. Tombstone
cleanup must be tied to replication to reach replica convergence.

@@ -47,6 +47,7 @@ The main replication types are:

Database replication
~~~~~~~~~~~~~~~~~~~~

Database replication completes a low-cost hash comparison to determine
whether two replicas already match. Normally, this check can quickly
verify that most databases in the system are already synchronized. If

@@ -73,6 +74,7 @@ performed.

Object replication
~~~~~~~~~~~~~~~~~~

The initial implementation of object replication performed an rsync to
push data from a local partition to all remote servers where it was
expected to reside. While this worked at small scale, replication times

@@ -25,6 +25,7 @@ loss is possible, but data would be unreachable for an extended time.

Ring data structure
~~~~~~~~~~~~~~~~~~~

The ring data structure consists of three top level fields: a list of
devices in the cluster, a list of lists of device ids indicating
partition to device assignments, and an integer indicating the number of

@@ -32,6 +33,7 @@ bits to shift an MD5 hash to calculate the partition for the hash.

Partition assignment list
~~~~~~~~~~~~~~~~~~~~~~~~~

This is a list of ``array('H')`` of device ids. The outermost list
contains an ``array('H')`` for each replica. Each ``array('H')`` has a
length equal to the partition count for the ring. Each integer in the
|
||||
list is known internally to the Ring class as ``_replica2part2dev_id``.
|
||||
|
||||
So, to create a list of device dictionaries assigned to a partition, the
|
||||
Python code would look like::
|
||||
Python code would look like:
|
||||
|
||||
devices = [self.devs[part2dev_id[partition]] for
|
||||
part2dev_id in self._replica2part2dev_id]
|
||||
.. code-block:: python
|
||||
|
||||
devices = [self.devs[part2dev_id[partition]] for
|
||||
part2dev_id in self._replica2part2dev_id]
|
||||
|
||||
That code is a little simplistic because it does not account for the
|
||||
removal of duplicate devices. If a ring has more replicas than devices,
|
||||
@ -91,6 +95,7 @@ and B's disks are only 72.7% full.
|
||||
|
||||
Replica counts
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
To support the gradual change in replica counts, a ring can have a real
|
||||
number of replicas and is not restricted to an integer number of
|
||||
replicas.
|
||||
@ -100,12 +105,12 @@ partitions. It indicates the average number of replicas for each
|
||||
partition. For example, a replica count of 3.2 means that 20 percent of
|
||||
partitions have four replicas and 80 percent have three replicas.
|
||||
|
||||
The replica count is adjustable.
|
||||
The replica count is adjustable. For example:
|
||||
|
||||
Example::
|
||||
.. code-block:: console
|
||||
|
||||
$ swift-ring-builder account.builder set_replicas 4
|
||||
$ swift-ring-builder account.builder rebalance
|
||||
$ swift-ring-builder account.builder set_replicas 4
|
||||
$ swift-ring-builder account.builder rebalance
|
||||
|
||||
You must rebalance the replica ring in globally distributed clusters.
|
||||
Operators of these clusters generally want an equal number of replicas
|
||||
@ -114,42 +119,46 @@ operator adds or removes a replica. Removing unneeded replicas saves on
|
||||
the cost of disks.
|
||||
|
||||
You can gradually increase the replica count at a rate that does not
|
||||
adversely affect cluster performance.
|
||||
adversely affect cluster performance. For example:
|
||||
|
||||
For example::
|
||||
.. code-block:: console
|
||||
|
||||
$ swift-ring-builder object.builder set_replicas 3.01
|
||||
$ swift-ring-builder object.builder rebalance
|
||||
<distribute rings and wait>...
|
||||
$ swift-ring-builder object.builder set_replicas 3.01
|
||||
$ swift-ring-builder object.builder rebalance
|
||||
<distribute rings and wait>...
|
||||
|
||||
$ swift-ring-builder object.builder set_replicas 3.02
|
||||
$ swift-ring-builder object.builder rebalance
|
||||
<distribute rings and wait>...
|
||||
$ swift-ring-builder object.builder set_replicas 3.02
|
||||
$ swift-ring-builder object.builder rebalance
|
||||
<distribute rings and wait>...
|
||||
|
||||
Changes take effect after the ring is rebalanced. Therefore, if you
|
||||
intend to change from 3 replicas to 3.01 but you accidentally type
|
||||
2.01, no data is lost.
|
||||
|
||||
Additionally, the ``swift-ring-builder X.builder create`` command can now
|
||||
take a decimal argument for the number of replicas.
|
||||
Additionally, the :command:`swift-ring-builder X.builder create` command can
|
||||
now take a decimal argument for the number of replicas.
|
||||
|
||||
Partition shift value
|
||||
~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The partition shift value is known internally to the Ring class as
|
||||
``_part_shift``. This value is used to shift an MD5 hash to calculate
|
||||
the partition where the data for that hash should reside. Only the top
|
||||
four bytes of the hash is used in this process. For example, to compute
|
||||
the partition for the :file:`/account/container/object` path using Python::
|
||||
the partition for the ``/account/container/object`` path using Python:
|
||||
|
||||
partition = unpack_from('>I',
|
||||
md5('/account/container/object').digest())[0] >>
|
||||
self._part_shift
|
||||
.. code-block:: python
|
||||
|
||||
partition = unpack_from('>I',
|
||||
md5('/account/container/object').digest())[0] >>
|
||||
self._part_shift
|
||||
|
||||
For a ring generated with part\_power P, the partition shift value is
|
||||
``32 - P``.
|
||||
|
||||
Build the ring
|
||||
~~~~~~~~~~~~~~
|
||||
|
||||
The ring builder process includes these high-level steps:
|
||||
|
||||
#. The utility calculates the number of partitions to assign to each
|
||||
|
@@ -15,9 +15,9 @@ created image:

**To configure tenant-specific image locations**

#. Configure swift as your ``default_store`` in the
-   :file:`glance-api.conf` file.
+   ``glance-api.conf`` file.

-#. Set these configuration options in the :file:`glance-api.conf` file:
+#. Set these configuration options in the ``glance-api.conf`` file:

   - swift_store_multi_tenant

     Set to ``True`` to enable tenant-specific storage locations.
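A hedged sketch of how those two steps could look in ``glance-api.conf``
follows. The option names come from the steps above; the section placement
(``[DEFAULT]`` here, possibly ``[glance_store]`` on newer releases) is an
assumption.

.. code-block:: ini

   # glance-api.conf (illustrative fragment)
   [DEFAULT]
   default_store = swift
   swift_store_multi_tenant = True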