Restructure Object Storage chapter of Cloud Admin Guide
Restores Troubleshoot Object Storage. Removes Monitoring section, which was based on a blog.

backport: havana
Closes-Bug: #1251515
author: nermina miller
Change-Id: I580b077a0124d7cd54dced6c0d340e05d5d5f983
@ -5,6 +5,13 @@
|
||||
xml:id="ch_admin-openstack-object-storage">
|
||||
<?dbhtml stop-chunking?>
|
||||
<title>Object Storage</title>
|
||||
<xi:include href="../common/section_about-object-storage.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-intro.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-features.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-characteristics.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-components.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-ringbuilder.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-arch.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-replication.xml"/>
|
||||
<xi:include href="section_object-storage-monitoring.xml"/>
|
||||
<xi:include href="../common/section_objectstorage-troubleshoot.xml"/>
|
||||
</chapter>
|
||||
|
@ -3,6 +3,7 @@
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
|
||||
xml:id="ch_introduction-to-openstack-object-storage-monitoring">
|
||||
<!-- ... Based on a blog, should be replaced with original material... -->
|
||||
<title>Object Storage monitoring</title>
|
||||
<?dbhtml stop-chunking?>
|
||||
<para>Excerpted from a blog post by <link
|
||||
|
BIN
doc/common/figures/objectstorage-accountscontainers.png
Normal file
After Width: | Height: | Size: 32 KiB |
BIN
doc/common/figures/objectstorage-arch.png
Normal file
After Width: | Height: | Size: 56 KiB |
BIN
doc/common/figures/objectstorage-buildingblocks.png
Normal file
After Width: | Height: | Size: 48 KiB |
BIN
doc/common/figures/objectstorage-nodes.png
Normal file
After Width: | Height: | Size: 58 KiB |
BIN
doc/common/figures/objectstorage-partitions.png
Normal file
After Width: | Height: | Size: 28 KiB |
BIN
doc/common/figures/objectstorage-replication.png
Normal file
After Width: | Height: | Size: 45 KiB |
BIN
doc/common/figures/objectstorage-ring.png
Normal file
After Width: | Height: | Size: 23 KiB |
BIN
doc/common/figures/objectstorage-usecase.png
Normal file
After Width: | Height: | Size: 61 KiB |
BIN
doc/common/figures/objectstorage-zones.png
Normal file
After Width: | Height: | Size: 10 KiB |
BIN
doc/common/figures/objectstorage.png
Normal file
After Width: | Height: | Size: 23 KiB |
40
doc/common/section_objectstorage-account-reaper.xml
Normal file
@ -0,0 +1,40 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage-account-reaper">
|
||||
<!-- ... Old module003-ch008-account-reaper edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Account reaper</title>
|
||||
<para>In the background, the account reaper removes data from the deleted accounts.</para>
|
||||
<para>A reseller marks an account for deletion by issuing a <code>DELETE</code> request on the account’s
|
||||
storage URL. This action sets the <code>status</code> column of the account_stat table in the account
|
||||
database and replicas to <code>DELETED</code>, marking the account's data for deletion.</para>
|
||||
<para>Typically, no specific retention time or undelete facility is provided. However, you can set a
|
||||
<code>delay_reaping</code> value in the <code>[account-reaper]</code> section of the
|
||||
account-server.conf to delay the actual deletion of data. At this time, to undelete you have
|
||||
to update the account database replicas directly, setting the status column to an empty
|
||||
string and updating the put_timestamp to be greater than the delete_timestamp.
|
||||
<note><para>It's on the developers' to-do list to write a utility that performs this task, preferably
|
||||
through a REST call.</para></note>
|
||||
</para>
|
||||
<para>The account reaper runs on each account server and scans the server occasionally for
|
||||
account databases marked for deletion. It only fires up on the accounts for which the server
|
||||
is the primary node, so that multiple account servers aren’t trying to do it simultaneously.
|
||||
Using multiple servers to delete one account might improve the deletion speed but requires
|
||||
coordination to avoid duplication. Speed really is not a big concern with data deletion, and
|
||||
large accounts aren’t deleted often.</para>
|
||||
<para>Deleting an account is simple. For each account container, all objects are deleted and
|
||||
then the container is deleted. Deletion requests that fail will not stop the overall process
|
||||
but will cause the overall process to fail eventually (for example, if an object delete
|
||||
times out, you will not be able to delete the container or the account). The account reaper
|
||||
keeps trying to delete an account until it is empty, at which point the database reclaim
|
||||
process within the db_replicator will remove the database files.</para>
|
||||
<para>A persistent error state may prevent the deletion of an object
|
||||
or container. If this happens, you will see
|
||||
a message such as <code>“Account &lt;name&gt; has not been reaped
since &lt;date&gt;”</code> in the log. You can control when this is
|
||||
logged with the <code>reap_warn_after</code> value in the <code>[account-reaper]</code>
|
||||
section of the account-server.conf file. The default value is 30
|
||||
days.</para>
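<para>The following is a minimal sketch of the relevant <code>[account-reaper]</code>
options in the account-server.conf file; the values shown are illustrative assumptions,
not recommendations:</para>
<programlisting># Example values only; tune for your deployment.
[account-reaper]
# Wait one day (in seconds) after an account is marked DELETED before reaping it.
delay_reaping = 86400
# Warn in the log if an account has still not been reaped after 30 days.
reap_warn_after = 2592000</programlisting>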
|
||||
</section>
|
75
doc/common/section_objectstorage-arch.xml
Normal file
@ -0,0 +1,75 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage-cluster-architecture">
|
||||
<!-- ... Old module003-ch007-swift-cluster-architecture edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Cluster architecture</title>
|
||||
<section xml:id="section_access-tier">
|
||||
<title>Access tier</title>
|
||||
<para>Large-scale deployments segment off an access tier, which is considered the Object Storage
|
||||
system's central hub. The access tier fields the incoming API requests from clients and
|
||||
moves data in and out of the system. This tier consists of front-end load balancers,
|
||||
ssl-terminators, and authentication services. It runs the (distributed) brain of the
|
||||
Object Storage system—the proxy server processes.</para>
|
||||
<figure>
|
||||
<title>Object Storage architecture</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-arch.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>Because access servers are collocated in their own tier, you can scale out read/write
|
||||
access regardless of the storage capacity. For example, if a cluster is on the public
|
||||
Internet, requires SSL termination, and has a high demand for data access, you can
|
||||
provision many access servers. However, if the cluster is on a private network and used
|
||||
primarily for archival purposes, you need fewer access servers.</para>
|
||||
<para>Since this is an HTTP addressable storage service, you may incorporate a load balancer
|
||||
into the access tier.</para>
|
||||
<para>Typically, the tier consists of a collection of 1U servers. These machines use a
|
||||
moderate amount of RAM and are network I/O intensive. Since these systems field each
|
||||
incoming API request, you should provision them with two high-throughput (10GbE)
|
||||
interfaces: one for the incoming "front-end" requests and the other for the "back-end"
|
||||
access to the object storage nodes to put and fetch data.</para>
|
||||
<section xml:id="section_access-tier-considerations">
|
||||
<title>Factors to consider</title>
|
||||
<para>For most publicly facing deployments as well as private deployments available
|
||||
across a wide-reaching corporate network, you use SSL to encrypt traffic to the
|
||||
client. SSL adds significant processing load to establish sessions between clients,
|
||||
which is why you have to provision more capacity in the access layer. SSL may not be
|
||||
required for private deployments on trusted networks.</para>
|
||||
</section>
|
||||
</section>
|
||||
<section xml:id="section_storage-nodes">
|
||||
<title>Storage nodes</title>
|
||||
<para>In most configurations, each of the five zones should have an equal amount of storage
|
||||
capacity. Storage nodes use a reasonable amount of memory and CPU. Metadata needs to be
|
||||
readily available to return objects quickly. The object stores run services not only to
|
||||
field incoming requests from the access tier, but to also run replicators, auditors, and
|
||||
reapers. You can provision object stores with a single gigabit or 10 gigabit
network interface, depending on the expected workload and desired performance.</para>
|
||||
<figure>
|
||||
<title>Object Storage (Swift)</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-nodes.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>Currently, 2TB or 3TB SATA disks deliver good price/performance value. You can use
|
||||
desktop-grade drives if you have responsive remote hands in the datacenter and
|
||||
enterprise-grade drives if you don't.</para>
|
||||
<section xml:id="section_storage-nodes-considerations">
|
||||
<title>Factors to consider</title>
|
||||
<para>You should keep in mind the desired I/O performance for single-threaded requests.
|
||||
This system does not use RAID, so a single disk handles each request for an object.
|
||||
Disk performance impacts single-threaded response rates.</para>
|
||||
<para>To achieve apparent higher throughput, the object storage system is designed to
|
||||
handle concurrent uploads/downloads. The network I/O capacity (1GbE, bonded 1GbE
|
||||
pair, or 10GbE) should match your desired concurrent throughput needs for reads and
|
||||
writes.</para>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
59
doc/common/section_objectstorage-characteristics.xml
Normal file
@ -0,0 +1,59 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="objectstorage_characteristics">
|
||||
<!-- ... Old module003-ch003-obj-store-capabilities edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Object Storage characteristics</title>
|
||||
<para>The key characteristics of Object Storage are:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>All objects stored in Object Storage have a URL.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>All objects stored are replicated 3✕ in as-unique-as-possible zones, which
|
||||
can be defined as a group of drives, a node, a rack, and so on.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>All objects have their own metadata.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Developers interact with the object storage system through a RESTful HTTP
|
||||
API.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Object data can be located anywhere in the cluster.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>The cluster scales by adding additional nodes without sacrificing performance,
|
||||
which allows a more cost-effective linear storage expansion than fork-lift
|
||||
upgrades.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Data doesn't have to be migrated to an entirely new storage system.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>New nodes can be added to the cluster without downtime.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Failed nodes and disks can be swapped out without downtime.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>It runs on industry-standard hardware, such as Dell, HP, and Supermicro.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<figure>
|
||||
<title>Object Storage (Swift)</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>Developers can either write directly to the Swift API or use one of the many client
|
||||
libraries that exist for all of the popular programming languages, such as Java, Python,
|
||||
Ruby, and C#. Amazon S3 and Rackspace Cloud Files users should be very familiar with Object
|
||||
Storage. Users new to object storage systems will have to adjust to a different approach and
|
||||
mindset than those required for a traditional filesystem.</para>
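<para>For example, a minimal upload using the python-swiftclient library might look like the
following sketch. The authentication endpoint, credentials, container, and object names are
placeholder assumptions, not defaults:</para>
<programlisting language="python">import swiftclient

# Placeholder TempAuth-style credentials; substitute your own auth URL, user, and key.
conn = swiftclient.client.Connection(
    authurl='http://127.0.0.1:8080/auth/v1.0',
    user='myaccount:myuser',
    key='mykey')

conn.put_container('photos')                       # create (or reuse) a container
with open('cat.jpg', 'rb') as f:
    conn.put_object('photos', 'cat.jpg', contents=f)
print(conn.head_object('photos', 'cat.jpg'))       # object metadata, including ETag</programlisting>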
|
||||
</section>
|
236
doc/common/section_objectstorage-components.xml
Normal file
@ -0,0 +1,236 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage-components">
|
||||
<!-- ... Old module003-ch004-swift-building-blocks edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Components</title>
|
||||
<para>The components that enable Object Storage to deliver high availability, high
|
||||
durability, and high concurrency are:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para><emphasis role="bold">Proxy servers—</emphasis>Handle all of the incoming
|
||||
API requests.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><emphasis role="bold">Rings—</emphasis>Map logical names of data to
|
||||
locations on particular disks.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><emphasis role="bold">Zones—</emphasis>Isolate data from other zones. A
|
||||
failure in one zone doesn’t impact the rest of the cluster because data is
|
||||
replicated across zones.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><emphasis role="bold">Accounts and containers—</emphasis>Each account and
|
||||
container is an individual database that is distributed across the cluster. An
|
||||
account database contains the list of containers in that account. A container
|
||||
database contains the list of objects in that container.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><emphasis role="bold">Objects—</emphasis>The data itself.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><emphasis role="bold">Partitions—</emphasis>A partition stores objects,
|
||||
account databases, and container databases and helps manage locations where data
|
||||
lives in the cluster.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<figure>
|
||||
<title>Object Storage building blocks</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-buildingblocks.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<section xml:id="section_proxy-servers">
|
||||
<title>Proxy servers</title>
|
||||
<para>Proxy servers are the public face of Object Storage and handle all of the incoming API
|
||||
requests. Once a proxy server receives a request, it determines the storage node based
|
||||
on the object's URL, for example, https://swift.example.com/v1/account/container/object.
|
||||
Proxy servers also coordinate responses, handle failures, and coordinate
|
||||
timestamps.</para>
|
||||
<para>Proxy servers use a shared-nothing architecture and can be scaled as needed based on
|
||||
projected workloads. A minimum of two proxy servers should be deployed for redundancy.
|
||||
If one proxy server fails, the others take over.</para>
|
||||
</section>
|
||||
<section xml:id="section_ring">
|
||||
<title>Rings</title>
|
||||
<para>A ring represents a mapping between the names of entities stored on disk and their
|
||||
physical locations. There are separate rings for accounts, containers, and objects. When
|
||||
other components need to perform any operation on an object, container, or account, they
|
||||
need to interact with the appropriate ring to determine their location in the
|
||||
cluster.</para>
|
||||
<para>The ring maintains this mapping using zones, devices, partitions, and replicas. Each
|
||||
partition in the ring is replicated, by default, three times across the cluster, and
|
||||
partition locations are stored in the mapping maintained by the ring. The ring is also
|
||||
responsible for determining which devices are used for handoff in failure
|
||||
scenarios.</para>
|
||||
<para>Data can be isolated into zones in the ring. Each partition replica is guaranteed to
|
||||
reside in a different zone. A zone could represent a drive, a server, a cabinet, a
|
||||
switch, or even a data center.</para>
|
||||
<para>The partitions of the ring are equally divided among all of the devices in the Object
|
||||
Storage installation. When partitions need to be moved around (for example, if a device
|
||||
is added to the cluster), the ring ensures that a minimum number of partitions are moved
|
||||
at a time, and only one replica of a partition is moved at a time.</para>
|
||||
<para>Weights can be used to balance the distribution of partitions on drives across the
|
||||
cluster. This can be useful, for example, when differently sized drives are used in a
|
||||
cluster.</para>
|
||||
<para>The ring is used by the proxy server and several background processes (like
|
||||
replication).</para>
|
||||
<figure>
|
||||
<title>The <emphasis role="bold">ring</emphasis></title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-ring.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>These rings are externally managed, in that the server processes themselves do not
|
||||
modify the rings; instead, they are given new rings modified by other tools.</para>
|
||||
<para>The ring uses a configurable number of bits from a
|
||||
path’s MD5 hash as a partition index that designates a
|
||||
device. The number of bits kept from the hash is known as
|
||||
the partition power, and 2 to the partition power
|
||||
indicates the partition count. Partitioning the full MD5
|
||||
hash ring allows other parts of the cluster to work in
|
||||
batches of items at once which ends up either more
|
||||
efficient or at least less complex than working with each
|
||||
item separately or the entire cluster all at once.</para>
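<para>As a rough sketch of the arithmetic described above (the partition power here is an
assumed example value):</para>
<programlisting language="python">partition_power = 18                      # number of bits kept from the MD5 hash
partition_count = 2 ** partition_power    # 262144 partitions for this example ring</programlisting>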
|
||||
<para>Another configurable value is the replica count, which indicates how many of the
|
||||
partition-device assignments make up a single ring. For a given partition number, each
|
||||
replica’s device will not be in the same zone as any other replica's device. Zones can
|
||||
be used to group devices based on physical locations, power separations, network
|
||||
separations, or any other attribute that would improve the availability of multiple
|
||||
replicas at the same time.</para>
|
||||
</section>
|
||||
<section xml:id="section_zones">
|
||||
<title>Zones</title>
|
||||
<para>Object Storage allows configuring zones in order to isolate failure boundaries.
|
||||
Each data replica resides in a separate zone, if possible. At the smallest level, a zone
|
||||
could be a single drive or a grouping of a few drives. If there were five object storage
|
||||
servers, then each server would represent its own zone. Larger deployments would have an
|
||||
entire rack (or multiple racks) of object servers, each representing a zone. The goal of
|
||||
zones is to allow the cluster to tolerate significant outages of storage servers without
|
||||
losing all replicas of the data.</para>
|
||||
<para>As mentioned earlier, everything in Object Storage is stored, by default, three
|
||||
times. Swift will place each replica "as-uniquely-as-possible" to ensure both high
|
||||
availability and high durability. This means that when choosing a replica location,
|
||||
Object Storage chooses a server in an unused zone before an unused server in a zone that
|
||||
already has a replica of the data.</para>
|
||||
<figure>
|
||||
<title>Zones</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-zones.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>When a disk fails, replica data is automatically distributed to the other zones to
|
||||
ensure there are three copies of the data.</para>
|
||||
</section>
|
||||
<section xml:id="section_accounts-containers">
|
||||
<title>Accounts and containers</title>
|
||||
<para>Each account and container is an individual SQLite
|
||||
database that is distributed across the cluster. An
|
||||
account database contains the list of containers in
|
||||
that account. A container database contains the list
|
||||
of objects in that container.</para>
|
||||
<figure>
|
||||
<title>Accounts and containers</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-accountscontainers.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>To keep track of object data locations, each account in the system has a database
|
||||
that references all of its containers, and each container database references each
|
||||
object.</para>
|
||||
</section>
|
||||
<section xml:id="section_partitions">
|
||||
<title>Partitions</title>
|
||||
<para>A partition is a collection of stored data, including account databases, container
|
||||
databases, and objects. Partitions are core to the replication system.</para>
|
||||
<para>Think of a partition as a bin moving throughout a fulfillment center warehouse.
|
||||
Individual orders get thrown into the bin. The system treats that bin as a cohesive
|
||||
entity as it moves throughout the system. A bin is easier to deal with than many little
|
||||
things. It makes for fewer moving parts throughout the system.</para>
|
||||
<para>System replicators and object uploads/downloads operate on partitions. As the
|
||||
system scales up, its behavior continues to be predictable because the number of
|
||||
partitions is a fixed number.</para>
|
||||
<para>Implementing a partition is conceptually simple—a partition is just a
|
||||
directory sitting on a disk with a corresponding hash table of what it contains.</para>
|
||||
<figure>
|
||||
<title>Partitions</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-partitions.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
</section>
|
||||
<section xml:id="section_replicators">
|
||||
<title>Replicators</title>
|
||||
<para>In order to ensure that there are three copies of the data everywhere, replicators
|
||||
continuously examine each partition. For each local partition, the replicator compares
|
||||
it against the replicated copies in the other zones to see if there are any
|
||||
differences.</para>
|
||||
<para>The replicator knows whether replication needs to take place by examining hashes. A hash
file is created for each partition, which contains hashes of each directory in the
partition. For a given partition, the hash files for each of the partition's copies are
compared. If the hashes are different, then it is time to replicate, and the directory that
needs to be replicated is copied over.</para>
|
||||
<para>This is where partitions come in handy. With fewer things in the system, larger
|
||||
chunks of data are transferred around (rather than lots of little TCP connections, which
|
||||
is inefficient) and there is a consistent number of hashes to compare.</para>
|
||||
<para>The cluster eventually reaches a consistent state in which the newest data takes
precedence.</para>
|
||||
<figure>
|
||||
<title>Replication</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-replication.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
<para>If a zone goes down, one of the nodes containing a replica notices and proactively
|
||||
copies data to a handoff location.</para>
|
||||
</section>
|
||||
<section xml:id="section_usecases">
|
||||
<title>Use cases</title>
|
||||
<para>The following sections show use cases for object uploads and downloads and how the components interact.</para>
|
||||
<section xml:id="upload">
|
||||
<title>Upload</title>
|
||||
<para>A client uses the REST API to make an HTTP request to PUT an object into an existing
|
||||
container. The cluster receives the request. First, the system must figure out where
|
||||
the data is going to go. To do this, the account name, container name, and object
|
||||
name are all used to determine the partition where this object should live.</para>
|
||||
<para>Then a lookup in the ring figures out which storage nodes contain the partitions in
|
||||
question.</para>
|
||||
<para>The data is then sent to each storage node where it is placed in the appropriate
|
||||
partition. At least two of the three writes must be successful before the client is
|
||||
notified that the upload was successful.</para>
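<para>A minimal sketch of the majority rule described above (an illustration of the
arithmetic, not Swift's exact quorum code):</para>
<programlisting language="python">replica_count = 3
write_quorum = replica_count // 2 + 1    # 2 of 3 writes must succeed</programlisting>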
|
||||
<para>Next, the container database is updated asynchronously to reflect that there is a new
|
||||
object in it.</para>
|
||||
<figure>
|
||||
<title>Object Storage in use</title>
|
||||
<mediaobject>
|
||||
<imageobject>
|
||||
<imagedata fileref="../common/figures/objectstorage-usecase.png"/>
|
||||
</imageobject>
|
||||
</mediaobject>
|
||||
</figure>
|
||||
</section>
|
||||
<section xml:id="section_swift-component-download">
|
||||
<title>Download</title>
|
||||
<para>A request comes in for an account/container/object. Using the same consistent hashing,
|
||||
the partition name is generated. A lookup in the ring reveals which storage nodes
|
||||
contain that partition. A request is made to one of the storage nodes to fetch the
|
||||
object and, if that fails, requests are made to the other nodes.</para>
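<para>For illustration, the lookup described above roughly corresponds to the following use
of the ring interface; the ring file path and the account, container, and object names are
assumptions:</para>
<programlisting language="python">from swift.common.ring import Ring

# Load the same object ring that the proxy server uses (path is an assumption).
object_ring = Ring('/etc/swift/object.ring.gz')

# Map account/container/object to a partition and its primary storage nodes.
partition, nodes = object_ring.get_nodes('AUTH_account', 'container', 'object')
for node in nodes:
    print(node['ip'], node['port'], node['device'])</programlisting>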
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
180
doc/common/section_objectstorage-features.xml
Normal file
@ -0,0 +1,180 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage_features">
|
||||
<!-- ... Old module003-ch002-features-benefits edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Features and benefits</title>
|
||||
<para>
|
||||
<informaltable class="c19">
|
||||
<tbody>
|
||||
<tr>
|
||||
<th rowspan="1" colspan="1">Features</th>
|
||||
<th rowspan="1" colspan="1">Benefits</th>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Leverages commodity
|
||||
hardware</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>No
|
||||
lock-in, lower
|
||||
price/GB</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>HDD/node failure agnostic</emphasis></td>
|
||||
<td rowspan="1" colspan="1">Self-healing, reliable, data redundancy protects
|
||||
from failures</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Unlimited storage</emphasis></td>
|
||||
<td rowspan="1" colspan="1">Large and flat namespace, highly scalable read/write
|
||||
access, able to serve content directly from storage system</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Multi-dimensional scalability</emphasis>
|
||||
</td>
|
||||
<td rowspan="1" colspan="1">Scale-out architecture—Scale vertically and
|
||||
horizontally-distributed storage. Backs up and archives large amounts of data
|
||||
with linear performance</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold">Account/container/object
|
||||
structure</emphasis></td>
|
||||
<td rowspan="1" colspan="1">No nesting, not a traditional file
|
||||
system—Optimized for scale, it scales to multiple petabytes and
|
||||
billions of objects</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold">Built-in replication 3✕
|
||||
+ data redundancy (compared with 2✕ on RAID)</emphasis></td>
|
||||
<td rowspan="1" colspan="1">A configurable number of accounts, containers and
|
||||
object copies for high availability</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Easily add capacity (unlike
|
||||
RAID resize)</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Elastic
|
||||
data scaling with
|
||||
ease</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>No central database</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Higher
|
||||
performance, no
|
||||
bottlenecks</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>RAID not required</emphasis></td>
|
||||
<td rowspan="1" colspan="1">Handle many small, random reads and writes
|
||||
efficiently</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Built-in management
|
||||
utilities</emphasis></td>
|
||||
<td rowspan="1" colspan="1">Account management—Create, add, verify, and
|
||||
delete users; Container management—Upload, download, and verify;
|
||||
Monitoring—Capacity, host, network, log trawling, and cluster
|
||||
health</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Drive auditing</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Detect
|
||||
drive failures preempting data
|
||||
corruption</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Expiring objects</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Users
|
||||
can set an expiration time or a TTL on an
|
||||
object to control
|
||||
access</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Direct object access</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Enable
|
||||
direct browser access to content, such as for
|
||||
a control
|
||||
panel</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Realtime visibility into client
|
||||
requests</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Know
|
||||
what users are
|
||||
requesting</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Supports S3 API</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Utilize
|
||||
tools that were designed for the popular S3
|
||||
API</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Restrict containers per
|
||||
account</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Limit
|
||||
access to control usage by
|
||||
user</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Support for NetApp, Nexenta,
|
||||
SolidFire</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Unified
|
||||
support for block volumes using a variety of
|
||||
storage
|
||||
systems</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Snapshot and backup API for block
|
||||
volumes</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Data
|
||||
protection and recovery for VM
|
||||
data</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Standalone volume API
|
||||
available</emphasis></td>
|
||||
<td rowspan="1" colspan="1"
|
||||
>Separate
|
||||
endpoint and API for integration with other
|
||||
compute
|
||||
systems</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td rowspan="1" colspan="1"><emphasis role="bold"
|
||||
>Integration with Compute</emphasis></td>
|
||||
<td rowspan="1" colspan="1">Fully integrated with Compute for attaching block
|
||||
volumes and reporting on usage</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</informaltable>
|
||||
</para>
|
||||
</section>
|
23
doc/common/section_objectstorage-intro.xml
Normal file
@ -0,0 +1,23 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage-intro">
|
||||
<!-- ... Old module003-ch001-intro-objstore edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Introduction to Object Storage</title>
|
||||
<para>OpenStack Object Storage (code-named Swift) is open source software for creating
|
||||
redundant, scalable data storage using clusters of standardized servers to store petabytes
|
||||
of accessible data. It is a long-term storage system for large amounts of static data that
|
||||
can be retrieved, leveraged, and updated. Object Storage uses a distributed architecture
|
||||
with no central point of control, providing greater scalability, redundancy, and permanence.
|
||||
Objects are written to multiple hardware devices, with the OpenStack software responsible
|
||||
for ensuring data replication and integrity across the cluster. Storage clusters scale
|
||||
horizontally by adding new nodes. Should a node fail, OpenStack works to replicate its
|
||||
content from other active nodes. Because OpenStack uses software logic to ensure data
|
||||
replication and distribution across different devices, inexpensive commodity hard drives and
|
||||
servers can be used in lieu of more expensive equipment.</para>
|
||||
<para>Object Storage is ideal for cost effective, scale-out storage. It provides a fully
|
||||
distributed, API-accessible storage platform that can be integrated directly into
|
||||
applications or used for backup, archiving, and data retention.</para>
|
||||
</section>
|
99
doc/common/section_objectstorage-replication.xml
Normal file
@ -0,0 +1,99 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage-replication">
|
||||
<!-- ... Old module003-ch009-replication edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Replication</title>
|
||||
<para>Because each replica in Object Storage functions independently, and clients generally
|
||||
require only a simple majority of nodes responding to consider an operation successful,
|
||||
transient failures like network partitions can quickly cause replicas to diverge. These
|
||||
differences are eventually reconciled by asynchronous, peer-to-peer replicator processes.
|
||||
The replicator processes traverse their local filesystems, concurrently performing
|
||||
operations in a manner that balances load across physical disks.</para>
|
||||
<para>Replication uses a push model, with records and files
|
||||
generally only being copied from local to remote replicas.
|
||||
This is important because data on the node may not belong
|
||||
there (as in the case of handoffs and ring changes), and a
|
||||
replicator can’t know what data exists elsewhere in the
|
||||
cluster that it should pull in. It’s the duty of any node that
|
||||
contains data to ensure that data gets to where it belongs.
|
||||
Replica placement is handled by the ring.</para>
|
||||
<para>Every deleted record or file in the system is marked by a
|
||||
tombstone, so that deletions can be replicated alongside
|
||||
creations. The replication process cleans up tombstones after
|
||||
a time period known as the consistency window. The consistency
|
||||
window encompasses replication duration and how long transient
|
||||
failure can remove a node from the cluster. Tombstone cleanup
|
||||
must be tied to replication to reach replica
|
||||
convergence.</para>
|
||||
<para>If a replicator detects that a remote drive has failed, the
|
||||
replicator uses the get_more_nodes interface for the ring to
|
||||
choose an alternate node with which to synchronize. The
|
||||
replicator can maintain desired levels of replication in the
|
||||
face of disk failures, though some replicas may not be in an
|
||||
immediately usable location. Note that the replicator doesn’t
|
||||
maintain desired levels of replication when other failures,
|
||||
such as entire node failures, occur because most failures are
|
||||
transient.</para>
|
||||
<para>Replication is an area of active development, and likely
|
||||
rife with potential improvements to speed and
|
||||
correctness.</para>
|
||||
<para>There are two major classes of replicator—the db replicator, which replicates
|
||||
accounts and containers, and the object replicator, which replicates object data.</para>
|
||||
<section xml:id="section_database-replication">
|
||||
<title>Database replication</title>
|
||||
<para>The first step performed by db replication is a low-cost
|
||||
hash comparison to determine whether two replicas already
|
||||
match. Under normal operation, this check is able to
|
||||
verify that most databases in the system are already
|
||||
synchronized very quickly. If the hashes differ, the
|
||||
replicator brings the databases in sync by sharing records
|
||||
added since the last sync point.</para>
|
||||
<para>This sync point is a high water mark noting the last
|
||||
record at which two databases were known to be in sync,
|
||||
and is stored in each database as a tuple of the remote
|
||||
database id and record id. Database ids are unique amongst
|
||||
all replicas of the database, and record ids are
|
||||
monotonically increasing integers. After all new records
|
||||
have been pushed to the remote database, the entire sync
|
||||
table of the local database is pushed, so the remote
|
||||
database can guarantee that it is in sync with everything
|
||||
with which the local database has previously
|
||||
synchronized.</para>
|
||||
<para>If a replica is found to be missing entirely, the whole
|
||||
local database file is transmitted to the peer using
|
||||
rsync(1) and vested with a new unique id.</para>
|
||||
<para>In practice, DB replication can process hundreds of
|
||||
databases per concurrency setting per second (up to the
|
||||
number of available CPUs or disks) and is bound by the
|
||||
number of DB transactions that must be performed.</para>
|
||||
</section>
|
||||
<section xml:id="section_object-replication">
|
||||
<title>Object replication</title>
|
||||
<para>The initial implementation of object replication simply
|
||||
performed an rsync to push data from a local partition to
|
||||
all remote servers it was expected to exist on. While this
|
||||
performed adequately at small scale, replication times
|
||||
skyrocketed once directory structures could no longer be
|
||||
held in RAM. We now use a modification of this scheme in
|
||||
which a hash of the contents for each suffix directory is
|
||||
saved to a per-partition hashes file. The hash for a
|
||||
suffix directory is invalidated when the contents of that
|
||||
suffix directory are modified.</para>
|
||||
<para>The object replication process reads in these hash
|
||||
files, calculating any invalidated hashes. It then
|
||||
transmits the hashes to each remote server that should
|
||||
hold the partition, and only suffix directories with
|
||||
differing hashes on the remote server are rsynced. After
|
||||
pushing files to the remote server, the replication
|
||||
process notifies it to recalculate hashes for the rsynced
|
||||
suffix directories.</para>
|
||||
<para>Performance of object replication is generally bound by the number of uncached
|
||||
directories it has to traverse, usually as a result of invalidated suffix directory
|
||||
hashes. Using write volume and partition counts from our running systems, it was
|
||||
designed so that around 2 percent of the hash space on a normal node will be invalidated
|
||||
per day, which has experimentally given us acceptable replication speeds.</para>
|
||||
</section>
|
||||
</section>
|
129
doc/common/section_objectstorage-ringbuilder.xml
Normal file
@ -0,0 +1,129 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink"
|
||||
version="5.0"
|
||||
xml:id="section_objectstorage-ringbuilder">
|
||||
<!-- ... Old module003-ch005-the-ring edited, renamed, and stored in doc/common for use by both Cloud Admin and Operator Training Guides... -->
|
||||
<title>Ring-builder</title>
|
||||
<para>Rings are built and managed manually by a utility called the ring-builder. The
|
||||
ring-builder assigns partitions to devices and writes an optimized Python structure to a
|
||||
gzipped, serialized file on disk for shipping out to the servers. The server processes just
|
||||
check the modification time of the file occasionally and reload their in-memory copies of
|
||||
the ring structure as needed. Because of how the ring-builder manages changes to the ring,
|
||||
using a slightly older ring usually just means one of the three replicas for a subset of the
|
||||
partitions will be incorrect, which can be easily worked around.</para>
|
||||
<para>The ring-builder also keeps its own builder file with the ring information and additional
|
||||
data required to build future rings. It is very important to keep multiple backup copies of
|
||||
these builder files. One option is to copy the builder files out to every server while
|
||||
copying the ring files themselves. Another is to upload the builder files into the cluster
|
||||
itself. If you lose the builder file, you have to create a new ring from scratch. Nearly all
|
||||
partitions would be assigned to different devices and, therefore, nearly all of the stored
|
||||
data would have to be replicated to new locations. So, recovery from a builder file loss is
|
||||
possible, but data would be unreachable for an extended time.</para>
|
||||
<section xml:id="section_ring-data-structure">
|
||||
<title>Ring data structure</title>
|
||||
<para>The ring data structure consists of three top level
|
||||
fields: a list of devices in the cluster, a list of lists
|
||||
of device ids indicating partition to device assignments,
|
||||
and an integer indicating the number of bits to shift an
|
||||
MD5 hash to calculate the partition for the hash.</para>
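<para>A minimal sketch of that structure, with made-up values for a tiny ring of four
partitions and two replicas (illustration only):</para>
<programlisting language="python">devs = [   # list of devices in the cluster
    {'id': 0, 'zone': 1, 'ip': '10.0.0.1', 'port': 6000, 'device': 'sdb1', 'weight': 100},
    {'id': 1, 'zone': 2, 'ip': '10.0.0.2', 'port': 6000, 'device': 'sdb1', 'weight': 100},
]
replica2part2dev_id = [   # one list of device ids per replica, one entry per partition
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
part_shift = 32 - 2       # partition power of 2, so shift the MD5 hash by 30 bits</programlisting>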
|
||||
</section>
|
||||
<section xml:id="section_partition-assignment">
|
||||
<title>Partition assignment list</title>
|
||||
<para>This is a list of <literal>array(‘H’)</literal> of devices ids. The
|
||||
outermost list contains an <literal>array(‘H’)</literal> for each
|
||||
replica. Each <literal>array(‘H’)</literal> has a length equal to the
|
||||
partition count for the ring. Each integer in the
|
||||
<literal>array(‘H’)</literal> is an index into the above list of devices.
|
||||
The partition list is known internally to the Ring
|
||||
class as <literal>_replica2part2dev_id</literal>.</para>
|
||||
<para>So, to create a list of device dictionaries assigned to a partition, the Python
|
||||
code would look like:
|
||||
<programlisting>devices = [self.devs[part2dev_id[partition]] for
|
||||
part2dev_id in self._replica2part2dev_id]</programlisting></para>
|
||||
<para>That code is a little simplistic, as it does not account for the removal of
|
||||
duplicate devices. If a ring has more replicas than devices, then a partition will have
|
||||
more than one replica on one device.</para>
|
||||
<para><literal>array(‘H’)</literal> is used for memory conservation as there
|
||||
may be millions of partitions.</para>
|
||||
</section>
|
||||
<section xml:id="section_fractional-replicas">
|
||||
<title>Fractional replicas</title>
|
||||
<para>A ring is not restricted to having an integer number
|
||||
of replicas. In order to support the gradual changing
|
||||
of replica counts, the ring is able to have a real
|
||||
number of replicas.</para>
|
||||
<para>When the number of replicas is not an integer, then the last element of
|
||||
<literal>_replica2part2dev_id</literal> will have a length that is less than the
|
||||
partition count for the ring. This means that some partitions will have more replicas
|
||||
than others. For example, if a ring has 3.25 replicas, then 25 percent of its partitions
|
||||
will have four replicas, while the remaining 75 percent will have just three.</para>
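<para>A small sketch of that arithmetic, using the 3.25-replica example from above:</para>
<programlisting language="python">replica_count = 3.25
partition_count = 1024                                   # assumed example ring size
full_replicas = int(replica_count)                       # 3 arrays cover every partition
partial_length = int(partition_count * (replica_count - full_replicas))
# partial_length == 256, so the fourth array covers 25 percent of the partitions</programlisting>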
|
||||
</section>
|
||||
<section xml:id="section_partition-shift-value">
|
||||
<title>Partition shift value</title>
|
||||
<para>The partition shift value is known internally to the
|
||||
Ring class as <literal>_part_shift</literal>. This value is used to shift an
|
||||
MD5 hash to calculate the partition on which the data
|
||||
for that hash should reside. Only the top four bytes
|
||||
of the hash are used in this process. For example, to
|
||||
compute the partition for the path
|
||||
/account/container/object the Python code might look
|
||||
like:
|
||||
<programlisting>from hashlib import md5
from struct import unpack_from

partition = unpack_from('>I',
    md5('/account/container/object').digest())[0] >> self._part_shift</programlisting></para>
|
||||
<para>For a ring generated with part_power P, the
|
||||
partition shift value is <literal>32 - P</literal>.</para>
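<para>For instance (an assumed example value):</para>
<programlisting language="python">part_power = 20
part_shift = 32 - part_power    # 12; hashes are shifted right by 12 bits</programlisting>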
|
||||
</section>
|
||||
<section xml:id="section_build-ring">
|
||||
<title>Build the ring</title>
|
||||
<para>The initial building of the ring first calculates the
|
||||
number of partitions that should ideally be assigned to
|
||||
each device based the device’s weight. For example, given
|
||||
a partition power of 20, the ring will have 1,048,576
|
||||
partitions. If there are 1,000 devices of equal weight
|
||||
they will each desire 1,048.576 partitions. The devices
|
||||
are then sorted by the number of partitions they desire
|
||||
and kept in order throughout the initialization
|
||||
process.</para>
|
||||
<note><para>Each device is also assigned a random tiebreaker
|
||||
value that is used when two devices desire the same number
|
||||
of partitions. This tiebreaker is not stored on disk
|
||||
anywhere, and so two different rings created with the same
|
||||
parameters will have different partition assignments. For
|
||||
repeatable partition assignments, <literal>RingBuilder.rebalance()</literal>
|
||||
takes an optional seed value that will be used to seed
|
||||
Python’s pseudo-random number generator.</para></note>
|
||||
<para>Then, the ring builder assigns each replica of each partition to the device that
|
||||
requires most partitions at that point while keeping it as far away as possible from
|
||||
other replicas. The ring builder prefers to assign a replica to a device in a region
|
||||
that does not already have a replica. If no such region is available, the ring builder tries
|
||||
to find a device in a different zone. If that's not possible, it will look on a
|
||||
different server. If it doesn't find one there, it will just look for a device that has
|
||||
no replicas. Finally, if all of the other options are exhausted, the ring builder
|
||||
assigns the replica to the device that has the fewest replicas already assigned. Note
|
||||
that assignment of multiple replicas to one device will only happen if the ring has
|
||||
fewer devices than it has replicas.</para>
|
||||
<para>When building a new ring based on an old ring, the desired number of partitions each
|
||||
device wants is recalculated. Next, the partitions to be reassigned are gathered up. Any
|
||||
removed devices have all their assigned partitions unassigned and added to the gathered
|
||||
list. Any partition replicas that (due to the addition of new devices) can be spread out
|
||||
for better durability are unassigned and added to the gathered list. Any devices that
|
||||
have more partitions than they now need have random partitions unassigned from them and
|
||||
added to the gathered list. Lastly, the gathered partitions are then reassigned to
|
||||
devices using a similar method as in the initial assignment described above.</para>
|
||||
<para>Whenever a partition has a replica reassigned, the time of the reassignment is
|
||||
recorded. This is taken into account when gathering partitions to reassign so that no
|
||||
partition is moved twice in a configurable amount of time. This configurable amount of
|
||||
time is known internally to the RingBuilder class as <literal>min_part_hours</literal>.
|
||||
This restriction is ignored for replicas of partitions on devices that have been removed
|
||||
since removing a device only happens on device failure and reassignment is the only
|
||||
choice.</para>
|
||||
<para>The above processes don’t always perfectly rebalance a ring due to the random nature
|
||||
of gathering partitions for reassignment. To help reach a more balanced ring, the
|
||||
rebalance process is repeated until near perfect (less than 1 percent off) or when the
|
||||
balance doesn’t improve by at least 1 percent (indicating we probably can’t get perfect
|
||||
balance due to wildly imbalanced zones or too many partitions recently moved).</para>
|
||||
</section>
|
||||
</section>
|
106
doc/common/section_objectstorage-troubleshoot.xml
Normal file
@ -0,0 +1,106 @@
|
||||
<?xml version="1.0" encoding="UTF-8"?>
|
||||
<section xmlns="http://docbook.org/ns/docbook"
|
||||
xmlns:xi="http://www.w3.org/2001/XInclude"
|
||||
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
|
||||
xml:id="troubleshooting-openstack-object-storage">
|
||||
<title>Troubleshoot Object Storage</title>
|
||||
<para>For Object Storage, everything is logged in <filename>/var/log/syslog</filename> (or messages on some distros).
|
||||
Several settings enable further customization of logging, such as <literal>log_name</literal>, <literal>log_facility</literal>,
|
||||
and <literal>log_level</literal>, within the object server configuration files.</para>
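<para>For example, a sketch of the logging options in an <filename>object-server.conf</filename>
file; the values shown are illustrative assumptions:</para>
<programlisting>[DEFAULT]
log_name = object-server
log_facility = LOG_LOCAL0
log_level = DEBUG</programlisting>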
|
||||
<section xml:id="drive-failure">
|
||||
<title>Drive failure</title>
|
||||
<para>In the event that a drive has failed, the first step is to make sure the drive is
|
||||
unmounted. This will make it easier for Object Storage to work around the failure until
|
||||
it has been resolved. If the drive is going to be replaced immediately, then it is just
|
||||
best to replace the drive, format it, remount it, and let replication fill it up.</para>
|
||||
<para>If the drive can’t be replaced immediately, then it is best to leave it
|
||||
unmounted, and remove the drive from the ring. This will allow all the replicas
|
||||
that were on that drive to be replicated elsewhere until the drive is replaced.
|
||||
Once the drive is replaced, it can be re-added to the ring.</para>
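<para>As a sketch, removing a failed device from the object ring and later adding it back
might look like the following; the builder file name, zone, IP address, port, device name,
and weight are all assumptions for illustration:</para>
<programlisting># Example commands only; adjust for your cluster, then redistribute the new ring files.
$ swift-ring-builder object.builder remove z2-10.0.0.3:6000/sdb1
$ swift-ring-builder object.builder rebalance
# ... replace and format the drive, then add it back ...
$ swift-ring-builder object.builder add z2-10.0.0.3:6000/sdb1 100
$ swift-ring-builder object.builder rebalance</programlisting>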
|
||||
<para>You can look at error messages in <filename>/var/log/kern.log</filename> for hints of drive failure.</para>
|
||||
</section>
|
||||
<section xml:id="server-failure">
|
||||
<title>Server failure</title>
|
||||
<para>If a server is having hardware issues, it is a good idea to make sure the
|
||||
Object Storage services are not running. This will allow Object Storage to
|
||||
work around the failure while you troubleshoot.</para>
|
||||
<para>If the server just needs a reboot, or a small amount of work that should only
|
||||
last a couple of hours, then it is probably best to let Object Storage work
|
||||
around the failure and get the machine fixed and back online. When the machine
|
||||
comes back online, replication will make sure that anything that is missing
|
||||
during the downtime will get updated.</para>
|
||||
<para>If the server has more serious issues, then it is probably best to remove all
|
||||
of the server’s devices from the ring. Once the server has been repaired and is
|
||||
back online, the server’s devices can be added back into the ring. It is
|
||||
important that the devices are reformatted before putting them back into the
|
||||
ring as they are likely to be responsible for a different set of partitions than
|
||||
before.</para>
|
||||
</section>
|
||||
<section xml:id="detect-failed-drives">
|
||||
<title>Detect failed drives</title>
|
||||
<para>In our experience, when a drive is about to fail, error messages appear in
<filename>/var/log/kern.log</filename>. You can run the <command>swift-drive-audit</command>
script through <command>cron</command> to watch for bad drives. If errors are detected, it
unmounts the bad drive so that Object Storage can work around it. The script takes a
configuration file with the following settings:</para>
|
||||
<xi:include href="tables/swift-drive-audit-drive-audit.xml"/>
|
||||
<para>This script has only been tested on Ubuntu 10.04, so if you are using a
different distribution or operating system, take care before using it in production.
</para>
|
||||
</section>
|
||||
<section xml:id="recover-ring-builder-file">
|
||||
<title>Emergency recovery of ring builder files</title>
|
||||
<para>You should always keep a backup of Swift ring builder files. However, if an
|
||||
emergency occurs, this procedure may assist in returning your cluster to an
|
||||
operational state.</para>
|
||||
<para>Using existing Swift tools, there is no way to recover a builder file from a
|
||||
<filename>ring.gz</filename> file. However, if you have some knowledge of Python, it is possible to
|
||||
construct a builder file that is pretty close to the one you have lost. The
|
||||
following is what you will need to do.</para>
|
||||
<warning>
|
||||
<para>This procedure is a last-resort for emergency circumstances—it
|
||||
requires knowledge of the swift python code and may not succeed.</para>
|
||||
</warning>
|
||||
<para>First, load the ring and a new ringbuilder object in a Python REPL:</para>
|
||||
<programlisting language="python">>>> from swift.common.ring import RingData, RingBuilder
|
||||
>>> ring = RingData.load('/path/to/account.ring.gz')</programlisting>
|
||||
<para>Now, start copying the data we have in the ring into the builder.</para>
|
||||
<programlisting language="python">
|
||||
>>> import math
|
||||
>>> partitions = len(ring._replica2part2dev_id[0])
|
||||
>>> replicas = len(ring._replica2part2dev_id)
|
||||
|
||||
>>> builder = RingBuilder(int(Math.log(partitions, 2)), replicas, 1)
|
||||
>>> builder.devs = ring.devs
|
||||
>>> builder._replica2part2dev = ring.replica2part2dev_id
|
||||
>>> builder._last_part_moves_epoch = 0
|
||||
>>> builder._last_part_moves = array('B', (0 for _ in xrange(self.parts)))
|
||||
>>> builder._set_parts_wanted()
|
||||
>>> for d in builder._iter_devs():
|
||||
d['parts'] = 0
|
||||
>>> for p2d in builder._replica2part2dev:
|
||||
for dev_id in p2d:
|
||||
builder.devs[dev_id]['parts'] += 1</programlisting>
|
||||
<para>This is the extent of the recoverable fields. For
|
||||
<literal>min_part_hours</literal> you'll either have to remember what the
|
||||
value you used was, or just make up a new one.</para>
|
||||
<programlisting language="python">
|
||||
>>> builder.change_min_part_hours(24) # or whatever you want it to be</programlisting>
|
||||
<para>Try some validation: if this doesn't raise an exception, you may feel some
|
||||
hope. Not too much, though.</para>
|
||||
<programlisting language="python">>>> builder.validate()</programlisting>
|
||||
<para>Save the builder.</para>
|
||||
<programlisting language="python">
|
||||
>>> import pickle
|
||||
>>> pickle.dump(builder.to_dict(), open('account.builder', 'wb'), protocol=2)</programlisting>
|
||||
<para>You should now have a file called <filename>account.builder</filename> in the current working
directory. Next, run <literal>swift-ring-builder account.builder write_ring</literal>
and compare the new <filename>account.ring.gz</filename> to the <filename>account.ring.gz</filename> that you started
from. They probably will not be byte-for-byte identical, but if you load them up
in a REPL and their <literal>_replica2part2dev_id</literal> and
<literal>devs</literal> attributes are the same (or nearly so), then you are
in good shape.</para>
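<para>A quick comparison in a REPL might look like the following sketch. The paths are placeholders: point one at the ring you started from and the other at the ring written from the recovered builder.</para>
<programlisting language="python">>>> from swift.common.ring import RingData
>>> original = RingData.load('/path/to/original/account.ring.gz')
>>> rebuilt = RingData.load('account.ring.gz')
>>> # Both comparisons should print True; a False result means the builder needs closer inspection
>>> original.devs == rebuilt.devs
>>> original._replica2part2dev_id == rebuilt._replica2part2dev_id</programlisting>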
<para>Next, repeat the procedure for <filename>container.ring.gz</filename>
and <filename>object.ring.gz</filename>, and you might get usable builder files.</para>
</section>
</section>
@ -1,144 +0,0 @@
<?xml version="1.0" encoding="UTF-8"?>
<chapter xmlns="http://docbook.org/ns/docbook"
xmlns:xi="http://www.w3.org/2001/XInclude"
xmlns:xlink="http://www.w3.org/1999/xlink" version="5.0"
xml:id="troubleshooting-openstack-object-storage">
<title>Troubleshoot Object Storage</title>
<para>For OpenStack Object Storage, everything is logged in
<filename>/var/log/syslog</filename> (or <filename>messages</filename> on some
distributions). Several settings enable further customization of
logging, such as <option>log_name</option>,
<option>log_facility</option>, and
<option>log_level</option>, within the object server
configuration files.</para>
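<para>For example, the <literal>[DEFAULT]</literal> section of an object server configuration file might contain settings such as the following. The values are illustrative only; choose the syslog facility and log level that fit your environment.</para>
<programlisting language="ini">[DEFAULT]
log_name = object-server
log_facility = LOG_LOCAL2
log_level = INFO</programlisting>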
<section xml:id="handling-drive-failure">
<title>Recover drive failures</title>
<para>If a drive fails, make sure the
drive is unmounted to make it easier for Object
Storage to work around the failure while you resolve
it. If you plan to replace the drive immediately, replace
the drive, format it, remount it, and let replication fill
it.</para>
<para>If you cannot replace the drive immediately, leave it
unmounted and remove the drive from the ring. This enables
the replicas that were on that drive to be rebuilt elsewhere
until you can replace the drive. After you replace the
drive, you can add it to the ring again.</para>
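<para>A hypothetical removal might look like the following commands. The builder file, zone, IP address, port, and device name are examples only; list the ring first to find the correct search value, and repeat the change for the container and object rings.</para>
<programlisting language="bash"># Show the devices in the ring to identify the failed drive
swift-ring-builder account.builder
# Remove the failed device and rebalance so its partitions are reassigned
swift-ring-builder account.builder remove z1-10.0.0.3:6002/sdb
swift-ring-builder account.builder rebalance</programlisting>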
<note>
<para>In Rackspace's experience, error messages in
<filename>/var/log/kern.log</filename> often hint at
impending drive failures. Consider including this
file in your monitoring.</para>
</note>
</section>
<section xml:id="handling-server-failure">
<title>Recover server failures</title>
<para>If a server has hardware issues, make sure that the
Object Storage services are not running. This enables
Object Storage to work around the failure while you
troubleshoot.</para>
<para>If the server needs a reboot or a minimal amount of
work, let Object Storage work around the failure while you
fix the machine and bring it back online. When the machine
comes back online, replication updates anything that was
missing during the downtime.</para>
<para>If the server has more serious issues, remove all server
devices from the ring. After you repair the server and put it
back online, you can add its devices back to the
ring. You must reformat the devices before you add them to
the ring because they might be responsible for a different
set of partitions than before.</para>
</section>
<section xml:id="detecting-failed-drives">
<title>Detect failed drives</title>
<para>When a drive is about to fail, many error messages
appear in the <filename>/var/log/kern.log</filename> file.
You can run the <package>swift-drive-audit</package>
script through <command>cron</command> to watch for bad
drives. If errors are detected, it unmounts the bad drive
so that Object Storage can work around it. The script uses
a configuration file with these settings:</para>
<xi:include href="tables/swift-drive-audit-drive-audit.xml"/>
<para>This script has been tested on only Ubuntu 10.04. If you
use a different distribution or operating system, take
care before using the script in production.</para>
</section>
<section xml:id="recover-ring-builder-file">
<title>Recover ring builder files (emergency)</title>
<para>You should always keep a backup of Swift ring builder
files. However, if an emergency occurs, use this procedure
to return your cluster to an operational state.</para>
<para>Existing Swift tools do not enable you to recover a
builder file from a <filename>ring.gz</filename> file.
However, if you have Python knowledge, you can construct a
builder file similar to the one you have lost.</para>
<warning>
<para>This procedure is a last resort in an emergency. It
requires knowledge of the Swift Python code and might
not succeed.</para>
</warning>
<procedure>
<step>
<para>Load the ring and a new ringbuilder object in a
Python REPL:</para>
<programlisting language="python">>>> from swift.common.ring import RingData, RingBuilder
>>> ring = RingData.load('/path/to/account.ring.gz')</programlisting>
</step>
<step>
<para>Copy the data in the ring into the
builder.</para>
<programlisting language="python">>>> import math
>>> from array import array
>>> partitions = len(ring._replica2part2dev_id[0])
>>> replicas = len(ring._replica2part2dev_id)

>>> builder = RingBuilder(int(math.log(partitions, 2)), replicas, 1)
>>> builder.devs = ring.devs
>>> builder._replica2part2dev = ring._replica2part2dev_id
>>> builder._last_part_moves_epoch = 0
>>> builder._last_part_moves = array('B', (0 for _ in xrange(partitions)))
>>> builder._set_parts_wanted()
>>> for d in builder._iter_devs():
        d['parts'] = 0
>>> for p2d in builder._replica2part2dev:
        for dev_id in p2d:
            builder.devs[dev_id]['parts'] += 1</programlisting>
<para>This is the extent of the recoverable
fields.</para>
</step>
<step>
<para>For <option>min_part_hours</option>, you must
remember the value that you used previously or
choose a new value.</para>
<programlisting language="python">>>> builder.change_min_part_hours(24) # or whatever you want it to be</programlisting>
<para>Validate the builder. If validation does not raise an
exception, the recovery has probably succeeded.</para>
<programlisting language="python">>>> builder.validate()</programlisting>
</step>
<step>
<para>Save the builder.</para>
<programlisting language="python">>>> import pickle
>>> pickle.dump(builder.to_dict(), open('account.builder', 'wb'), protocol=2)</programlisting>
<para>The <filename>account.builder</filename> file
appears in the current working directory.</para>
</step>
<step>
<para>Run <literal>swift-ring-builder account.builder
write_ring</literal>.</para>
<para>Compare the new
<filename>account.ring.gz</filename> to the
original <filename>account.ring.gz</filename>
file. They might not be byte-for-byte identical,
but if you load them in a REPL and their
<option>_replica2part2dev_id</option> and
<option>devs</option> attributes are the same
(or nearly so), you have succeeded.</para>
</step>
<step>
<para>Repeat this procedure for the
<filename>container.ring.gz</filename> and
<filename>object.ring.gz</filename> files, and
you might get usable builder files.</para>
</step>
</procedure>
</section>
</chapter>
@ -69,7 +69,8 @@ format="PNG" />
</imageobject>
</mediaobject>
</informalfigure>
<para>There will be three hosts in the setup.<table rules="all">
<para>There will be three hosts in the setup.</para>
<table rules="all">
<caption>Hosts for Demo</caption>
<thead>
<tr>
@ -103,7 +104,7 @@ format="PNG" />
<td>Same as HostA</td>
</tr>
</tbody>
</table></para>
</table>
<section xml:id="multi_agent_demo_configuration">
<title>Configuration</title>
<itemizedlist>