Made spacing after period consistent throughout, removed a couple of spurious commas
@@ -13,9 +13,9 @@ architecture. For each request, it will look up the location of the account,
 container, or object in the ring (see below) and route the request accordingly.
 The public API is also exposed through the Proxy Server.
 
-A large number of failures are also handled in the Proxy Server.  For
+A large number of failures are also handled in the Proxy Server. For
 example, if a server is unavailable for an object PUT, it will ask the
-ring for a handoff server, and route there instead.
+ring for a handoff server and route there instead.
 
 When objects are streamed to or from an object server, they are streamed
 directly through the proxy server to of from the user -- the proxy server
@@ -25,7 +25,7 @@ does not spool them.
 The Ring
 --------
 
-A ring represents a mapping between the names of entities stored on disk, and
+A ring represents a mapping between the names of entities stored on disk and
 their physical location. There are separate rings for accounts, containers, and
 objects. When other components need to perform any operation on an object,
 container, or account, they need to interact with the appropriate ring to
@@ -37,18 +37,18 @@ cluster, and the locations for a partition are stored in the mapping maintained
 by the ring. The ring is also responsible for determining which devices are
 used for handoff in failure scenarios.
 
-Data can be isolated with the concept of zones in the ring.  Each replica
+Data can be isolated with the concept of zones in the ring. Each replica
 of a partition is guaranteed to reside in a different zone. A zone could
 represent a drive, a server, a cabinet, a switch, or even a datacenter.
 
 The partitions of the ring are equally divided among all the devices in the
-Swift installation.  When an event occurs that requires partitions to be
-moved around (for example if a device is added to the cluster), the ring
-ensures that a minimum number of partitions are moved at a time, and only
-one replica of a partition is moved at a time.
+Swift installation. When partitions need to be moved around (for example if a
+device is added to the cluster), the ring ensures that a minimum number of
+partitions are moved at a time, and only one replica of a partition is moved at
+a time.
 
 Weights can be used to balance the distribution of partitions on drives
-across the cluster.  This can be useful, for example, when different sized
+across the cluster. This can be useful, for example, if different sized
 drives are used in a cluster.
 
 The ring is used by the Proxy server and several background processes
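Note on the hunk above: the ring text describes mapping names to partitions and placing each replica in a different zone. A minimal sketch of those two ideas follows. It is an illustration only, not the ring implementation in this repository: the function names, the `part_power` parameter, the device dictionaries, and the greedy weight ordering are all assumptions made for the example.

```python
import hashlib


def partition_for(name, part_power=4):
    """Map an entity name to a partition number (hypothetical scheme).

    Uses the top `part_power` bits of the first four bytes of the MD5 digest,
    so the result is always in range(2 ** part_power).
    """
    digest = hashlib.md5(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - part_power)


def assign_replicas(devices, replicas=3):
    """Pick up to `replicas` devices, at most one per zone.

    Devices with a higher weight are considered first (an assumed policy).
    """
    chosen, seen_zones = [], set()
    for dev in sorted(devices, key=lambda d: -d["weight"]):
        if dev["zone"] not in seen_zones:
            chosen.append(dev)
            seen_zones.add(dev["zone"])
        if len(chosen) == replicas:
            break
    return chosen
```

With four devices spread over three zones, `assign_replicas` returns three devices whose zones are all distinct, which is the isolation property the documentation text describes.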
@@ -62,24 +62,24 @@ The Object Server is a very simple blob storage server that can store,
 retrieve and delete objects stored on local devices. Objects are stored
 as binary files on the filesystem with metadata stored in the file's
 extended attributes (xattrs). This requires that the underlying filesystem
-choice for object servers must support xattrs on files. Some filesystems,
+choice for object servers support xattrs on files. Some filesystems,
 like ext3, have xattrs turned off by default.
 
 Each object is stored using a path derived from the object name's hash and
-the operation's timestamp.  Last write always wins, and ensures that the
-latest object version will be served.  A deletion is also treated as a
+the operation's timestamp. Last write always wins, and ensures that the
+latest object version will be served. A deletion is also treated as a
 version of the file (a 0 byte file ending with ".ts", which stands for
-tombstone).  This ensures that deleted files are replicated correctly and
+tombstone). This ensures that deleted files are replicated correctly and
 older versions don't magically reappear due to failure scenarios.
 
 ----------------
 Container Server
 ----------------
 
-The Container Server's primary job is to handle listings of objects.  It
+The Container Server's primary job is to handle listings of objects. It
 doesn't know where those object's are, just what objects are in a specific
-container.  The listings are stored as sqlite database files, and replicated
-across the cluster similar to how objects are.  Statistics are also tracked
+container. The listings are stored as sqlite database files, and replicated
+across the cluster similar to how objects are. Statistics are also tracked
 that include the total number of objects, and total storage usage for that
 container.
 
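Note on the hunk above: the "last write always wins" rule with ".ts" tombstones can be sketched in a few lines. This is a hedged illustration rather than the object server's code. It assumes file names of the form `<timestamp>.data` for object versions and `<timestamp>.ts` for deletions; the `.data` suffix and the function name `latest_version` are assumptions made for the example.

```python
def latest_version(filenames):
    """Return the newest object file name, or None if the object is deleted.

    Each name is assumed to be "<timestamp>.data" or "<timestamp>.ts".
    The entry with the highest timestamp wins; if that entry is a
    tombstone (".ts"), the object counts as deleted.
    """
    if not filenames:
        return None
    newest = max(filenames, key=lambda n: float(n.rsplit(".", 1)[0]))
    return None if newest.endswith(".ts") else newest
```

Because the tombstone carries its own timestamp, an older `.data` file that turns up later (for example after a failed drive comes back) still loses to the newer deletion.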
@@ -95,15 +95,15 @@ Replication
 -----------
 
 Replication is designed to keep the system in a consistent state in the face
-of temporary error conditions like network partitions or drive failures.
+of temporary error conditions like network outages or drive failures.
 
 The replication processes compare local data with each remote copy to ensure
 they all contain the latest version. Object replication uses a hash list to
 quickly compare subsections of each partition, and container and account
 replication use a combination of hashes and shared high water marks.
 
-Replication updates are push based.  For object replication, updating is
-just a matter of rsyncing files to the peer.  Account and container
+Replication updates are push based. For object replication, updating is
+just a matter of rsyncing files to the peer. Account and container
 replication push missing records over HTTP or rsync whole database files.
 
 The replicator also ensures that data is removed from the system. When an
@@ -116,9 +116,9 @@ Updaters
 --------
 
 There are times when container or account data can not be immediately
-updated.  This usually occurs during failure scenarios or periods of high
-load.  If an update fails, the update is queued locally on the filesystem,
-and the updater will process the failed updates.  This is where an eventual
+updated. This usually occurs during failure scenarios or periods of high
+load. If an update fails, the update is queued locally on the filesystem,
+and the updater will process the failed updates. This is where an eventual
 consistency window will most likely come in to play. For example, suppose a
 container server is under load and a new object is put in to the system. The
 object will be immediately available for reads as soon as the proxy server
@@ -137,9 +137,9 @@ Auditors
 --------
 
 Auditors crawl the local server checking the integrity of the objects,
-containers, and accounts.  If corruption is found (in the case of bit rot,
+containers, and accounts. If corruption is found (in the case of bit rot,
 for example), the file is quarantined, and replication will replace the bad
-file from another replica.  If other errors are found they are logged (for
+file from another replica. If other errors are found they are logged (for
 example, an object's listing can't be found on any container server it
 should be).
 
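Note on the Auditors hunk: the check-then-quarantine behaviour described there can be sketched as follows. This is an assumption-laden illustration, not the auditor in this repository. The function name `audit_file`, the use of an MD5 checksum as the integrity check, and the flat quarantine directory are all hypothetical choices for the example.

```python
import hashlib
import os
import shutil


def audit_file(path, expected_md5, quarantine_dir):
    """Quarantine `path` if its content no longer matches `expected_md5`.

    Returns True if the file is intact, False if it was moved to
    `quarantine_dir` (after which replication would be expected to
    restore a good copy from another replica).
    """
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        # Read in chunks so large objects are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b""):
            md5.update(chunk)
    if md5.hexdigest() == expected_md5:
        return True
    os.makedirs(quarantine_dir, exist_ok=True)
    shutil.move(path, os.path.join(quarantine_dir, os.path.basename(path)))
    return False
```

Moving the file instead of deleting it keeps the corrupted copy available for inspection while taking it out of the path that serves reads.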