From 89a871d42f1226c2dd292ea739dfda01d6f4b3f2 Mon Sep 17 00:00:00 2001
From: Samuel Merritt
Date: Wed, 21 Nov 2012 14:57:21 -0800
Subject: [PATCH] Improve container-sync docs.

Two improvements: first, document that the container-sync process
connects to the remote cluster's proxy server, so outbound
connectivity is required.

Second, rewrite the behind-the-scenes container-sync example and add
some ASCII-art diagrams.

Fixes bug 1068430.

Bonus fix of docstring in wsgi.py to squelch a sphinx warning.

Change-Id: I85bd56c2bd14431e13f7c57a43852777f14014fb
---
 doc/source/overview_container_sync.rst | 109 ++++++++++++++++++-------
 swift/common/wsgi.py                   |   2 +-
 2 files changed, 80 insertions(+), 31 deletions(-)

diff --git a/doc/source/overview_container_sync.rst b/doc/source/overview_container_sync.rst
index af0168791a..b62136d258 100644
--- a/doc/source/overview_container_sync.rst
+++ b/doc/source/overview_container_sync.rst
@@ -172,6 +172,13 @@ checking for x-container-sync-to and x-container-sync-key metadata values. If
 they exist, newer rows since the last sync will trigger PUTs or DELETEs to
 the other container.
 
+.. note::
+
+    The swift-container-sync process runs on each container server in
+    the cluster and talks to the proxy servers in the remote cluster.
+    Therefore, the container servers must be permitted to initiate
+    outbound connections to the remote proxy servers.
+
 .. note::
 
     Container sync will sync object POSTs only if the proxy server is set to
@@ -184,39 +191,81 @@ The actual syncing is slightly more complicated to make use of the three
 do the exact same work but also without missing work if one node happens to
 be down.
 
-Two sync points are kept per container database. All rows between the two
-sync points trigger updates. Any rows newer than both sync points cause
-updates depending on the node's position for the container (primary nodes
-do one third, etc. depending on the replica count of course). After a sync
-run, the first sync point is set to the newest ROWID known and the second
-sync point is set to newest ROWID for which all updates have been sent.
+Two sync points are kept in each container database. When syncing a
+container, the container-sync process figures out which replica of the
+container it has. In a standard 3-replica scenario, the process will
+have either replica number 0, 1, or 2. This is used to figure out
+which rows belong to this sync process and which ones don't.
 
-An example may help. Assume replica count is 3 and perfectly matching
-ROWIDs starting at 1.
+An example may help. Assume a replica count of 3 and database row IDs
+are 1..6. Also, assume that container-sync is running on this
+container for the first time, hence SP1 = SP2 = -1. ::
 
-::
+   SP1
+   SP2
+    |
+    v
+   -1 0 1 2 3 4 5 6
 
-    First sync run, database has 6 rows:
+First, the container-sync process looks for rows with id between SP1
+and SP2. Since this is the first run, SP1 = SP2 = -1, and there aren't
+any such rows. ::
 
-        * SyncPoint1 starts as -1.
-        * SyncPoint2 starts as -1.
-        * No rows between points, so no "all updates" rows.
-        * Six rows newer than SyncPoint1, so a third of the rows are sent
-          by node 1, another third by node 2, remaining third by node 3.
-        * SyncPoint1 is set as 6 (the newest ROWID known).
-        * SyncPoint2 is left as -1 since no "all updates" rows were synced.
+   SP1
+   SP2
+    |
+    v
+   -1 0 1 2 3 4 5 6
 
-    Next sync run, database has 12 rows:
+Second, the container-sync process looks for rows with id greater than
+SP1, and syncs those rows which it owns. Ownership is based on the
+hash of the object name, so it's not always guaranteed to be exactly
+one out of every three rows, but it usually gets close. For the sake
+of example, let's say that this process ends up owning rows 2 and 5.
 
-        * SyncPoint1 starts as 6.
-        * SyncPoint2 starts as -1.
-        * The rows between -1 and 6 all trigger updates (most of which
-          should short-circuit on the remote end as having already been
-          done).
-        * Six more rows newer than SyncPoint1, so a third of the rows are
-          sent by node 1, another third by node 2, remaining third by node
-          3.
-        * SyncPoint1 is set as 12 (the newest ROWID known).
-        * SyncPoint2 is set as 6 (the newest "all updates" ROWID).
+Once it's finished syncing those rows, it updates SP1 to be the
+biggest row-id that it's seen, which is 6 in this example. ::
 
-In this way, under normal circumstances each node sends its share of
-updates each run and just sends a batch of older updates to ensure nothing
-was missed.
+   SP2           SP1
+    |             |
+    v             v
+   -1 0 1 2 3 4 5 6
+
+While all that was going on, clients uploaded new objects into the
+container, creating new rows in the database. ::
+
+   SP2           SP1
+    |             |
+    v             v
+   -1 0 1 2 3 4 5 6 7 8 9 10 11 12
+
+On the next run, the container-sync process starts off looking at rows
+with ids between SP1 and SP2. This time, there are a bunch of them. The
+sync process takes the ones it *does not* own and syncs them. Again,
+this is based on the hashes, so this will be everything it didn't sync
+before. In this example, that's rows 1, 3, 4, and 6.
+
+Under normal circumstances, the container-sync processes for the other
+replicas will have already taken care of synchronizing those rows, so
+this is a set of quick checks. However, if one of those other sync
+processes failed for some reason, then this is a vital fallback to
+make sure all the objects in the container get synchronized. Without
+this seemingly-redundant work, any container-sync failure results in
+unsynchronized objects.
+
+Once it's done with the fallback rows, SP2 is advanced to SP1. ::
+
+                 SP2
+                 SP1
+                  |
+                  v
+   -1 0 1 2 3 4 5 6 7 8 9 10 11 12
+
+Then, rows with row ID greater than SP1 are synchronized (provided
+this container-sync process is responsible for them), and SP1 is moved
+up to the greatest row ID seen. ::
+
+                 SP2           SP1
+                  |             |
+                  v             v
+   -1 0 1 2 3 4 5 6 7 8 9 10 11 12
diff --git a/swift/common/wsgi.py b/swift/common/wsgi.py
index 150afeebe7..54e520a956 100644
--- a/swift/common/wsgi.py
+++ b/swift/common/wsgi.py
@@ -207,7 +207,7 @@ def init_request_processor(conf_file, app_section, *args, **kwargs):
 
     :param conf_file: Path to paste.deploy style configuration file
     :param app_section: App name from conf file to load config from
-    :returns the loaded application entry point
+    :returns: the loaded application entry point
    :raises ConfigFileError: Exception is raised for config file error
    """
    try:
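
As a rough companion to the documentation added above, here is a minimal sketch of
the two-sync-point bookkeeping it describes. This is an illustration only, not
Swift's actual container-sync code: the names SyncRow, owns_row, and one_sync_pass
are made up for this example, and replica ownership is approximated with an md5 of
the object name modulo the replica count. ::

    # Illustrative sketch only -- assumed names, not Swift's actual API.
    import hashlib
    from collections import namedtuple

    SyncRow = namedtuple('SyncRow', ['row_id', 'obj_name'])

    def owns_row(obj_name, replica_index, replica_count=3):
        """Guess whether this replica is the primary syncer for obj_name.

        Approximates the hash-based ownership rule with md5 % replica_count.
        """
        digest = hashlib.md5(obj_name.encode('utf-8')).hexdigest()
        return int(digest, 16) % replica_count == replica_index

    def one_sync_pass(rows, sp1, sp2, replica_index, replica_count=3):
        """Model one container-sync pass over rows sorted by row_id.

        Returns (rows_to_send, new_sp1, new_sp2).
        """
        to_send = []

        # Fallback work: rows in (sp2, sp1] that the *other* replicas own,
        # re-sent in case one of their sync processes failed.
        for row in rows:
            if sp2 < row.row_id <= sp1 and not owns_row(
                    row.obj_name, replica_index, replica_count):
                to_send.append(row)
        new_sp2 = sp1  # SP2 catches up to SP1 once the fallback rows are done.

        # Primary work: rows newer than sp1 that this replica owns.
        newer = [row for row in rows if row.row_id > sp1]
        for row in newer:
            if owns_row(row.obj_name, replica_index, replica_count):
                to_send.append(row)
        new_sp1 = max((row.row_id for row in newer), default=sp1)

        return to_send, new_sp1, new_sp2

    if __name__ == '__main__':
        # First run over rows 1..6 with SP1 = SP2 = -1, as in the example above.
        rows = [SyncRow(i, 'obj%d' % i) for i in range(1, 7)]
        sent, sp1, sp2 = one_sync_pass(rows, sp1=-1, sp2=-1, replica_index=0)
        print([r.row_id for r in sent], sp1, sp2)  # roughly a third of the rows; SP1 -> 6, SP2 -> -1

On the first pass only the rows this replica owns are sent, SP1 moves to 6, and SP2
stays at -1. A second pass with the returned sync points would first re-send the
rows owned by the other replicas (the fallback work), advance SP2 to the old SP1,
and then handle any newer rows it owns, mirroring the diagrams above.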