Made spacing after period consistent throughout, removed a couple of spurious commas

Brian K. Jones 2010-07-23 12:05:14 -04:00
parent f5a2c5374c
commit 4a855fa21e


@@ -13,9 +13,9 @@ architecture. For each request, it will look up the location of the account,
container, or object in the ring (see below) and route the request accordingly.
The public API is also exposed through the Proxy Server.

A large number of failures are also handled in the Proxy Server. For
example, if a server is unavailable for an object PUT, it will ask the
-ring for a handoff server, and route there instead.
+ring for a handoff server and route there instead.

When objects are streamed to or from an object server, they are streamed
directly through the proxy server to or from the user -- the proxy server
@@ -25,7 +25,7 @@ does not spool them.
The Ring
--------

-A ring represents a mapping between the names of entities stored on disk, and
+A ring represents a mapping between the names of entities stored on disk and
their physical location. There are separate rings for accounts, containers, and
objects. When other components need to perform any operation on an object,
container, or account, they need to interact with the appropriate ring to
@@ -37,18 +37,18 @@ cluster, and the locations for a partition are stored in the mapping maintained
by the ring. The ring is also responsible for determining which devices are
used for handoff in failure scenarios.

Data can be isolated with the concept of zones in the ring. Each replica
of a partition is guaranteed to reside in a different zone. A zone could
represent a drive, a server, a cabinet, a switch, or even a datacenter.

The partitions of the ring are equally divided among all the devices in the
-Swift installation. When an event occurs that requires partitions to be
-moved around (for example if a device is added to the cluster), the ring
-ensures that a minimum number of partitions are moved at a time, and only
-one replica of a partition is moved at a time.
+Swift installation. When partitions need to be moved around (for example if a
+device is added to the cluster), the ring ensures that a minimum number of
+partitions are moved at a time, and only one replica of a partition is moved at
+a time.

Weights can be used to balance the distribution of partitions on drives
-across the cluster. This can be useful, for example, if different sized
+across the cluster. This can be useful, for example, when different sized
drives are used in a cluster.

The ring is used by the Proxy server and several background processes
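Concretely, a lookup hashes the entity name down to a partition number and then reads off the devices assigned to that partition. A minimal sketch of the idea, with an invented device table and placement rule (the real ring builder and its data structures are considerably more involved)::

    import hashlib

    PART_POWER = 16  # 2**16 partitions across the whole cluster (illustrative)

    # Invented device table: one device per zone, all equally weighted.
    DEVICES = [
        {'id': 0, 'zone': 1, 'ip': '10.0.0.1', 'device': 'sdb1'},
        {'id': 1, 'zone': 2, 'ip': '10.0.0.2', 'device': 'sdb1'},
        {'id': 2, 'zone': 3, 'ip': '10.0.0.3', 'device': 'sdb1'},
        {'id': 3, 'zone': 4, 'ip': '10.0.0.4', 'device': 'sdb1'},
    ]

    def get_partition(account, container=None, obj=None):
        """Hash the entity name and keep the top PART_POWER bits."""
        path = '/'.join(p for p in (account, container, obj) if p)
        digest = hashlib.md5(path.encode('utf-8')).hexdigest()
        return int(digest, 16) >> (128 - PART_POWER)

    def get_nodes(partition, replicas=3):
        """Toy partition-to-device mapping. The real ring stores this
        mapping explicitly, balances it by device weight, and guarantees
        that each replica lands in a different zone."""
        primaries = [DEVICES[(partition + i) % len(DEVICES)]
                     for i in range(replicas)]
        handoffs = [d for d in DEVICES if d not in primaries]
        return primaries, handoffs

    part = get_partition('AUTH_account', 'photos', 'cat.jpg')
    primaries, handoffs = get_nodes(part)
    # The proxy writes to the primaries; if one is unavailable for a PUT,
    # it sends that replica to the first handoff device instead.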
@@ -62,24 +62,24 @@ The Object Server is a very simple blob storage server that can store,
retrieve and delete objects stored on local devices. Objects are stored
as binary files on the filesystem with metadata stored in the file's
extended attributes (xattrs). This requires that the underlying filesystem
-choice for object servers must support xattrs on files. Some filesystems,
+choice for object servers support xattrs on files. Some filesystems,
like ext3, have xattrs turned off by default.

Each object is stored using a path derived from the object name's hash and
the operation's timestamp. Last write always wins, and ensures that the
latest object version will be served. A deletion is also treated as a
version of the file (a 0 byte file ending with ".ts", which stands for
tombstone). This ensures that deleted files are replicated correctly and
older versions don't magically reappear due to failure scenarios.
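As a rough illustration of that on-disk scheme (the directory layout, timestamp format, and xattr call below are simplified stand-ins rather than the object server's actual code)::

    import hashlib
    import os
    import time

    DATADIR = '/srv/node/sdb1/objects'  # illustrative mount point

    def object_path(account, container, obj, timestamp, deleted=False):
        """Place the file under a directory named for the object name's
        hash; the filename is the write timestamp, so the newest file
        (or tombstone) always wins."""
        name_hash = hashlib.md5(
            ('/%s/%s/%s' % (account, container, obj)).encode('utf-8')
        ).hexdigest()
        ext = '.ts' if deleted else '.data'  # ".ts" marks a deletion
        return os.path.join(DATADIR, name_hash, '%.5f%s' % (timestamp, ext))

    print(object_path('AUTH_account', 'photos', 'cat.jpg', time.time()))
    # -> /srv/node/sdb1/objects/<md5 of name>/<timestamp>.data

    # Metadata travels with the file in its extended attributes, along
    # the lines of:
    #     os.setxattr(path, 'user.swift.metadata', serialized_headers)
    # which is why the filesystem under the object servers needs xattr
    # support.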
----------------
Container Server
----------------

The Container Server's primary job is to handle listings of objects. It
doesn't know where those objects are, just what objects are in a specific
container. The listings are stored as sqlite database files, and replicated
across the cluster similar to how objects are. Statistics are also tracked
that include the total number of objects, and total storage usage for that
container.
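In spirit, a container database is a small table plus a couple of aggregate queries; a simplified sketch (the real schema and listing parameters differ in detail)::

    import sqlite3

    # A container database records object names and sizes, not where the
    # object data actually lives; locating the data is the ring's job.
    db = sqlite3.connect(':memory:')
    db.execute('''CREATE TABLE object (
                      name TEXT PRIMARY KEY,
                      created_at TEXT,
                      size INTEGER,
                      content_type TEXT,
                      etag TEXT,
                      deleted INTEGER DEFAULT 0)''')
    db.execute('INSERT INTO object VALUES (?, ?, ?, ?, ?, 0)',
               ('cat.jpg', '1279900000.00000', 48211, 'image/jpeg',
                '0f343b0931126a20f133d67c2b018a3b'))

    # A container GET is essentially a listing query ...
    listing = db.execute('SELECT name FROM object WHERE deleted = 0 '
                         'ORDER BY name LIMIT 10000').fetchall()
    # ... and the tracked statistics are simple aggregates.
    count, used = db.execute('SELECT COUNT(*), SUM(size) FROM object '
                             'WHERE deleted = 0').fetchone()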
@@ -95,15 +95,15 @@ Replication
-----------

Replication is designed to keep the system in a consistent state in the face
-of temporary error conditions like network partitions or drive failures.
+of temporary error conditions like network outages or drive failures.

The replication processes compare local data with each remote copy to ensure
they all contain the latest version. Object replication uses a hash list to
quickly compare subsections of each partition, and container and account
replication use a combination of hashes and shared high water marks.

Replication updates are push based. For object replication, updating is
just a matter of rsyncing files to the peer. Account and container
replication push missing records over HTTP or rsync whole database files.
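A stripped-down sketch of that compare-and-push loop for objects (the suffix hashing and rsync invocation are simplified, and in practice the remote hashes come back from a request to the peer node)::

    import hashlib
    import os
    import subprocess

    def partition_hashes(partition_dir):
        """One hash per suffix directory of a partition, so two nodes can
        compare a short list instead of walking every file."""
        hashes = {}
        for suffix in sorted(os.listdir(partition_dir)):
            md5 = hashlib.md5()
            suffix_dir = os.path.join(partition_dir, suffix)
            for root, _dirs, files in os.walk(suffix_dir):
                for name in sorted(files):
                    md5.update(name.encode('utf-8'))
            hashes[suffix] = md5.hexdigest()
        return hashes

    def replicate_partition(local_dir, remote_hashes, remote_host):
        """Push-based: only suffixes whose hashes differ get rsynced out;
        matching suffixes are skipped entirely."""
        for suffix, local_hash in partition_hashes(local_dir).items():
            if remote_hashes.get(suffix) != local_hash:
                subprocess.call(['rsync', '-a',
                                 os.path.join(local_dir, suffix),
                                 '%s::object/%s/' % (remote_host,
                                                     os.path.basename(local_dir))])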
The replicator also ensures that data is removed from the system. When an
@@ -116,9 +116,9 @@ Updaters
--------

There are times when container or account data can not be immediately
updated. This usually occurs during failure scenarios or periods of high
load. If an update fails, the update is queued locally on the filesystem,
and the updater will process the failed updates. This is where an eventual
consistency window will most likely come into play. For example, suppose a
container server is under load and a new object is put into the system. The
object will be immediately available for reads as soon as the proxy server
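In outline, that queue is nothing more than files dropped on the local disk and swept later; an illustrative sketch with invented paths and record format::

    import os
    import pickle
    import time

    ASYNC_DIR = '/srv/node/sdb1/async_pending'  # illustrative queue location

    def queue_container_update(account, container, obj, headers):
        """Called when a container server can't be reached: park the
        update on local disk for the updater to retry later."""
        os.makedirs(ASYNC_DIR, exist_ok=True)
        record = {'account': account, 'container': container, 'obj': obj,
                  'headers': headers, 'timestamp': time.time()}
        path = os.path.join(ASYNC_DIR, '%.5f.pending' % record['timestamp'])
        with open(path, 'wb') as fp:
            pickle.dump(record, fp)

    def run_updater(send_update):
        """Sweep the queue and retry each record; success removes the
        pending file, failure leaves it for the next pass."""
        for name in sorted(os.listdir(ASYNC_DIR)):
            path = os.path.join(ASYNC_DIR, name)
            with open(path, 'rb') as fp:
                record = pickle.load(fp)
            if send_update(record):
                os.unlink(path)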
@@ -137,9 +137,9 @@ Auditors
--------

Auditors crawl the local server checking the integrity of the objects,
containers, and accounts. If corruption is found (in the case of bit rot,
for example), the file is quarantined, and replication will replace the bad
file from another replica. If other errors are found they are logged (for
example, an object's listing can't be found on any container server it
should be).
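At its core that check is a checksum walk; a bare-bones sketch of the object case (the quarantine location and the source of the expected checksum are placeholders)::

    import hashlib
    import os
    import shutil

    QUARANTINE_DIR = '/srv/node/sdb1/quarantined'  # illustrative

    def audit_object(path, expected_etag):
        """Re-hash the file and compare against the checksum recorded at
        write time; on a mismatch (bit rot), move the file aside so
        replication can restore a good copy from another replica."""
        md5 = hashlib.md5()
        with open(path, 'rb') as fp:
            for chunk in iter(lambda: fp.read(65536), b''):
                md5.update(chunk)
        if md5.hexdigest() != expected_etag:
            os.makedirs(QUARANTINE_DIR, exist_ok=True)
            shutil.move(path, os.path.join(QUARANTINE_DIR,
                                           os.path.basename(path)))
            return False
        return True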