Merge "[User Guides] Object Storage chapter edits"

This commit is contained in:
Jenkins 2016-03-23 04:29:55 +00:00 committed by Gerrit Code Review
commit 3214af2522
4 changed files with 106 additions and 136 deletions

@ -2,8 +2,11 @@
Object Storage monitoring
=========================

.. note::

   This section was excerpted from a blog post by `Darrell
   Bishop <http://swiftstack.com/blog/2012/04/11/swift-monitoring-with-statsd>`_ and
   has since been edited.

An OpenStack Object Storage cluster is a collection of many daemons that
work together across many nodes. With so many different components, you
@ -11,30 +14,22 @@ must be able to tell what is going on inside the cluster. Tracking
server-level meters like CPU utilization, load, memory consumption, disk
usage and utilization, and so on is necessary, but not sufficient.
What are the different daemons doing on each server? What is the volume
of object replication on node8? How long is it taking? Are there errors?
If so, when did they happen?
In such a complex ecosystem, multiple approaches can provide the answers
to these questions. This section describes several of them.
Swift Recon
~~~~~~~~~~~
The Swift Recon middleware (see
`Cluster Telemetry and Monitoring <http://swift.openstack.org/admin_guide.html#cluster-telemetry-and-monitoring>`_)
provides general machine statistics, such as load average, socket
statistics, ``/proc/meminfo`` contents, as well as Swift-specific meters:

- The ``MD5`` sum of each ring file.
- The most recent object replication time.
- Count of each type of quarantined file: account, container, or
  object.
- Count of "async_pendings" (deferred container updates) on disk.

Swift Recon is middleware that is installed in the object server's
pipeline and takes one required option: a local cache directory. To
@ -43,24 +38,23 @@
track ``async_pendings``, you must set up an additional cron job for
each object server. You access data by either sending HTTP requests
directly to the object server or using the ``swift-recon`` command-line
client.
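
For example, a minimal sketch of enabling the Recon middleware in
``object-server.conf`` (pipeline contents and the cache path vary by
deployment):

.. code-block:: ini

   [pipeline:main]
   pipeline = healthcheck recon object-server

   [filter:recon]
   use = egg:swift#recon
   recon_cache_path = /var/cache/swift
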
There are Object Storage cluster statistics, but the typical
server meters overlap with existing server monitoring systems. To get
the Swift-specific meters into a monitoring system, they must be polled.
Swift Recon acts as a middleware meters collector. The
process that feeds meters to your statistics system, such as
``collectd`` and ``gmond``, should already run on the storage node.
You can choose to either talk to Swift Recon or collect the meters
directly.
Swift-Informant
~~~~~~~~~~~~~~~
The Swift-Informant middleware (see
`swift-informant <https://github.com/pandemicsyn/swift-informant>`_) gives
real-time visibility into Object Storage client requests. It sits in the
pipeline for the proxy server, and after each request to the proxy server it
sends three meters to a StatsD server:

- A counter increment for a meter like ``obj.GET.200`` or
  ``cont.PUT.404``.
@ -77,26 +71,24 @@
- A counter increase by the bytes transferred for a meter like
  ``tfer.obj.PUT.201``.
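
For concreteness, the three meters for a successful object ``GET`` might
look like this on the wire (standard StatsD formats; the values here are
made up):

.. code-block:: none

   obj.GET.200:1|c           # counter increment
   obj.GET.200:112|ms        # timing, in milliseconds
   tfer.obj.GET.200:4096|c   # bytes transferred
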
The timing meters show the quality of service that clients experience,
while the counters show the volume of the various permutations of
request server type, command, and response code.
Swift-Informant requires no change to core Object
Storage code because it is implemented as middleware. However, it gives
no insight into the workings of the cluster past the proxy server.
If the responsiveness of one storage node degrades, you can only see
that some of the requests are bad, either as high latency or error
status codes.
Statsdlog
~~~~~~~~~
The `Statsdlog <https://github.com/pandemicsyn/statsdlog>`_
project increments StatsD counters based on logged events. Like
Swift-Informant, it is also non-intrusive; however, statsdlog can track
events from all Object Storage daemons, not just proxy-server. The
daemon listens to a UDP stream of syslog messages, and StatsD counters
are incremented when a log line matches a regular expression. Meter
names are mapped to regex match patterns in a JSON file, allowing
flexible configuration of what meters are extracted from the log stream.
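
The exact schema is defined by the statsdlog project itself; as a purely
hypothetical illustration, such a mapping file could pair meter names
with patterns like this:

.. code-block:: json

   [
       {"name": "proxy-server.errors", "pattern": "proxy-server.*ERROR"},
       {"name": "object-server.timeouts", "pattern": "object-server.*Timeout"}
   ]
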
@ -123,7 +115,7 @@
Swift StatsD logging
~~~~~~~~~~~~~~~~~~~~
StatsD (see
http://codeascraft.etsy.com/2011/02/15/measure-anything-measure-everything/)
was designed for application code to be deeply instrumented. Meters are
sent in real-time by the code that just noticed or did something. The
overhead of sending a meter is extremely low: a ``sendto`` of one UDP
packet. If that overhead is still too high, the StatsD client library
@ -133,10 +125,10 @@
can send only a random portion of samples and StatsD approximates the
actual number when flushing meters upstream.
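
As a minimal sketch of that sampling idea (the helper is illustrative;
the ``metric:value|c|@rate`` suffix is the standard StatsD convention
for sampled counters):

.. code-block:: python

   import random
   import socket

   def increment(sock, addr, metric, sample_rate=1.0):
       # Send a packet for only a random fraction of events; the
       # "@rate" suffix lets StatsD scale the count back up on flush.
       if sample_rate >= 1 or random.random() < sample_rate:
           payload = '%s:1|c' % metric
           if sample_rate < 1:
               payload = '%s|@%s' % (payload, sample_rate)
           sock.sendto(payload.encode('utf-8'), addr)

   sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
   increment(sock, ('127.0.0.1', 8125), 'async_pendings', sample_rate=0.5)
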
To avoid the problems inherent with middleware-based monitoring and
after-the-fact log processing, the sending of StatsD meters is
integrated into Object Storage itself. The submitted change set (see
`<https://review.openstack.org/#change,6058>`_) currently reports 124 meters
across 15 Object Storage daemons and the tempauth middleware. Details of
the meters tracked are in the `Administrator's
Guide <http://docs.openstack.org/developer/swift/admin_guide.html>`_.
The sending of meters is integrated with the logging framework. To
enable, configure ``log_statsd_host`` in the relevant config file. You
@ -144,7 +136,7 @@ can also specify the port and a default sample rate. The specified
default sample rate is used unless a specific call to a statsd logging
method (see the list below) overrides it. Currently, no logging calls
override the sample rate, but it is conceivable that some meters may
require accuracy (sample_rate == 1) while others may not.
.. code-block:: ini
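
   [DEFAULT]
   # ...
   log_statsd_host = 127.0.0.1
   log_statsd_port = 8125
   log_statsd_default_sample_rate = 1
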
@ -184,63 +176,53 @@
Then the ``LogAdapter`` object returned by ``get_logger()``, usually stored
in ``self.logger``, has these new methods:

``timing_since(self, metric, orig_time, sample_rate=1)``
    Convenience method to record a timing meter whose value is "now"
    minus an existing timestamp.

.. note::

   These logging methods may safely be called anywhere you have a
   logger object. If StatsD logging has not been configured, the methods
   are no-ops. This avoids messy conditional logic each place a meter is
   recorded. These example usages show the new logging methods:

.. code-block:: python

   # swift/obj/replicator.py
   def update(self, job):
       # ...
       begin = time.time()
       try:
           hashed, local_hash = tpool.execute(tpooled_get_hashes, job['path'],
                   do_listdir=(self.replication_count % 10) == 0,
                   reclaim_age=self.reclaim_age)
           # See tpooled_get_hashes "Hack".
           if isinstance(hashed, BaseException):
               raise hashed
           self.suffix_hash += hashed
           self.logger.update_stats('suffix.hashes', hashed)
           # ...
       finally:
           self.partition_times.append(time.time() - begin)
           self.logger.timing_since('partition.update.timing', begin)

.. code-block:: python

   # swift/container/updater.py
   def process_container(self, dbfile):
       # ...
       start_time = time.time()
       # ...
           for event in events:
               if 200 <= event.wait() < 300:
                   successes += 1
               else:
                   failures += 1
           if successes > failures:
               self.logger.increment('successes')
               # ...
           else:
               self.logger.increment('failures')
               # ...
           # Only track timing data for attempted updates:
           self.logger.timing_since('timing', start_time)
       else:
           self.logger.increment('no_changes')
           self.no_changes += 1

@ -2,12 +2,11 @@
Account reaper
==============

The account reaper removes data from deleted accounts in the background.

A reseller marks an account for deletion by issuing a ``DELETE`` request
on the account's storage URL. This action sets the ``status`` column of
the ``account_stat`` table in the account database and replicas to
``DELETED``, marking the account's data for deletion.
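
As an illustrative sketch only (the storage URL and token below are
hypothetical; the proxy must also have ``allow_account_management``
enabled and the token must carry reseller admin rights):

.. code-block:: python

   import requests

   # Hypothetical values for illustration.
   storage_url = 'https://swift.example.com/v1/AUTH_acme'
   token = '<reseller-admin-token>'

   resp = requests.delete(storage_url, headers={'X-Auth-Token': token})
   print(resp.status_code)  # 204 once the account is marked for deletion
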
Typically, neither a specific retention time nor an undelete feature is provided.
@ -15,8 +14,8 @@ However, you can set a ``delay_reaping`` value in the
``[account-reaper]`` section of the ``account-server.conf`` file to
delay the actual deletion of data. At this time, to undelete you have to update
the account database replicas directly, set the ``status`` column to an
empty string and update the ``put_timestamp`` to be greater than the
``delete_timestamp``.
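
For example, to hold off reaping for one day (the value is in seconds):

.. code-block:: ini

   [account-reaper]
   # Wait 24 hours after an account is marked DELETED before reaping it.
   delay_reaping = 86400
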
.. note::

@ -2,26 +2,26 @@
Components
==========
Object Storage uses the following components to deliver high
availability, high durability, and high concurrency:

- **Proxy servers** - Handle all of the incoming API requests.
- **Rings** - Map logical names of data to locations on particular
  disks.
- **Zones** - Isolate data from other zones. A failure in one zone
  does not impact the rest of the cluster because data replicates
  across zones.
- **Accounts and containers** - Each account and container is an
  individual database that is distributed across the cluster. An
  account database contains the list of containers in that account. A
  container database contains the list of objects in that container.
- **Objects** - The data itself.
- **Partitions** - A partition stores objects, account databases, and
  container databases and helps manage locations where data lives in
  the cluster.
@ -38,7 +38,7 @@
Proxy servers
-------------
Proxy servers are the public face of Object Storage and handle all of
the incoming API requests. Once a proxy server receives a request, it
determines the storage node based on the object's URL, for example:
https://swift.example.com/v1/account/container/object. Proxy servers
also coordinate responses, handle failures, and coordinate timestamps.
@ -47,14 +47,14 @@ needed based on projected workloads. A minimum of two proxy servers
should be deployed for redundancy. If one proxy server fails, the others
take over.
For more information concerning proxy server configuration, see
`Configuration Reference
<http://docs.openstack.org/liberty/config-reference/content/proxy-server-configuration.html>`_.
Rings
-----
A ring represents a mapping between the names of entities stored on disks
and their physical locations. There are separate rings for accounts,
containers, and objects. When other components need to perform any
operation on an object, container, or account, they need to interact
@ -90,15 +90,14 @@ The ring is used by the proxy server and several background processes
.. figure:: figures/objectstorage-ring.png

These rings are externally managed, in that the server processes
themselves do not modify the rings; instead, they are given new rings
modified by other tools.

The ring uses a configurable number of bits from an ``MD5`` hash for a path
as a partition index that designates a device. The number of bits kept
from the hash is known as the partition power, and 2 to the partition
power indicates the partition count. Partitioning the full ``MD5`` hash ring
allows other parts of the cluster to work in batches of items at once,
which ends up being either more efficient or at least less complex than
working with each item separately or with the entire cluster all at once.
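
A simplified sketch of that mapping follows. Real Swift also mixes a
cluster-wide hash path prefix and suffix into the hash and applies a
stored partition shift; the function below is illustrative only:

.. code-block:: python

   import hashlib

   def get_partition(account, container, obj, part_power=18):
       # Hash the object path and keep the top `part_power` bits of
       # the first four bytes as the partition index, so there are
       # 2 ** part_power partitions in total.
       path = '/%s/%s/%s' % (account, container, obj)
       digest = hashlib.md5(path.encode('utf-8')).digest()
       top32 = int.from_bytes(digest[:4], 'big')
       return top32 >> (32 - part_power)

   print(get_partition('AUTH_test', 'photos', 'cat.jpg'))
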
@ -115,7 +114,7 @@
Zones
-----
Object Storage allows configuring zones in order to isolate failure
boundaries. If possible, each data replica resides in a separate zone.
At the smallest level, a zone could be a single drive or a grouping of a
few drives. If there were five object storage servers, then each server
would represent its own zone. Larger deployments would have an entire
@ -123,13 +122,6 @@ rack (or multiple racks) of object servers, each representing a zone.
The goal of zones is to allow the cluster to tolerate significant
outages of storage servers without losing all replicas of the data.
.. _objectstorage-zones-figure:
@ -138,9 +130,6 @@
.. figure:: figures/objectstorage-zones.png
Accounts and containers
-----------------------
@ -164,7 +153,7 @@ database references each object.
Partitions
----------
A partition is a collection of stored data. This includes account databases,
container databases, and objects. Partitions are core to the replication
system.

@ -2,11 +2,11 @@
Introduction to Object Storage
==============================

OpenStack Object Storage (swift) provides redundant, scalable data
storage using clusters of standardized servers to store petabytes of
accessible data. It is a long-term storage system for large amounts of
static data that can be retrieved and updated. Object Storage uses a
distributed architecture
with no central point of control, providing greater scalability,
redundancy, and permanence. Objects are written to multiple hardware
devices, with the OpenStack software responsible for ensuring data