======================================
Container to Container Synchronization
======================================

--------
Overview
--------

Swift has a feature where all the contents of a container can be mirrored to
another container through background synchronization. Swift cluster operators
configure their cluster to allow/accept sync requests to/from other clusters,
and the user specifies where to sync their container to along with a secret
synchronization key.

.. note::

    Container sync will sync object POSTs only if the proxy server is set to
    use "object_post_as_copy = true", which is the default. So-called fast
    object posts, "object_post_as_copy = false", do not update the container
    listings and therefore can't be detected for synchronization.

.. note::

    If you are using the large objects feature, you will need to ensure both
    your manifest file and your segment files are synced if they happen to be
    in different containers.

--------------------------------------------
Configuring a Cluster's Allowable Sync Hosts
--------------------------------------------

The Swift cluster operator must allow synchronization with a set of hosts
before the user can enable container synchronization. First, the backend
container server needs to be given this list of hosts in the
container-server.conf file::

    [DEFAULT]
    # This is a comma separated list of hosts allowed in the
    # X-Container-Sync-To field for containers.
    # allowed_sync_hosts = 127.0.0.1
    allowed_sync_hosts = host1,host2,etc.
    ...

    [container-sync]
    # You can override the default log routing for this app here (don't
    # use set!):
    # log_name = container-sync
    # log_facility = LOG_LOCAL0
    # log_level = INFO
    # Will sync, at most, each container once per interval
    # interval = 300
    # Maximum amount of time to spend syncing each container
    # container_time = 60

Tracking sync progress, problems, and general activity can only be
achieved with log processing for this first release of container
synchronization. In that light, you may wish to set the above ``log_`` options
to direct the container-sync logs to a different file for easier monitoring.
Additionally, there is no way for an end user to detect sync progress or
problems other than HEADing both containers and comparing the overall
information.

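The HEAD comparison mentioned above can be sketched in Python. This is only an illustration, not part of Swift: the function names are hypothetical, and it assumes the ``X-Container-Object-Count`` and ``X-Container-Bytes-Used`` headers of a container HEAD response are the "overall information" worth comparing:

```python
import urllib.request

# Summary headers returned by a container HEAD request.
SUMMARY_HEADERS = ("X-Container-Object-Count", "X-Container-Bytes-Used")


def head_container(container_url, auth_token):
    """HEAD a container and return its summary headers as a dict."""
    request = urllib.request.Request(
        container_url, method="HEAD", headers={"X-Auth-Token": auth_token})
    with urllib.request.urlopen(request) as response:
        return {name: response.headers.get(name) for name in SUMMARY_HEADERS}


def summaries_match(source_summary, target_summary):
    """Return True when both containers report the same, known totals."""
    return all(
        source_summary.get(name) is not None
        and source_summary.get(name) == target_summary.get(name)
        for name in SUMMARY_HEADERS)
```

Matching totals only suggest that the containers have caught up; they do not prove that every object is identical.
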
The authentication system also needs to be configured to allow synchronization
requests. Here is an example with TempAuth::

    [filter:tempauth]
    # This is a comma separated list of hosts allowed to send
    # X-Container-Sync-Key requests.
    # allowed_sync_hosts = 127.0.0.1
    allowed_sync_hosts = host1,host2,etc.

The default of 127.0.0.1 exists only so that SAIO (Swift All In One) test
setups work without extra configuration.

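To make the meaning of ``allowed_sync_hosts`` concrete, here is a minimal sketch of the kind of check such a setting implies. It is not Swift's actual implementation; both function names are hypothetical, and it assumes the host is simply compared against the comma separated list:

```python
from urllib.parse import urlparse


def parse_allowed_hosts(config_value):
    """Split a comma separated allowed_sync_hosts value into a set of hosts."""
    return {host.strip() for host in config_value.split(",") if host.strip()}


def sync_target_allowed(sync_to_url, allowed_hosts):
    """Return True if the host part of sync_to_url is in allowed_hosts."""
    return urlparse(sync_to_url).hostname in allowed_hosts
```

With the default value of ``127.0.0.1``, a target such as ``http://cluster2/v1/...`` would not pass this check, which is why operators need to list the remote hosts explicitly.
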
----------------------------------------------------------
Using the ``swift`` tool to set up synchronized containers
----------------------------------------------------------

.. note::

    You must be the account admin on the account to set synchronization targets
    and keys.

You simply tell each container where to sync to and give it a secret
synchronization key. First, let's get the account details for our two cluster
accounts::

    $ swift -A http://cluster1/auth/v1.0 -U test:tester -K testing stat -v
    StorageURL: http://cluster1/v1/AUTH_208d1854-e475-4500-b315-81de645d060e
    Auth Token: AUTH_tkd5359e46ff9e419fa193dbd367f3cd19
       Account: AUTH_208d1854-e475-4500-b315-81de645d060e
    Containers: 0
       Objects: 0
         Bytes: 0

    $ swift -A http://cluster2/auth/v1.0 -U test2:tester2 -K testing2 stat -v
    StorageURL: http://cluster2/v1/AUTH_33cdcad8-09fb-4940-90da-0f00cbf21c7c
    Auth Token: AUTH_tk816a1aaf403c49adb92ecfca2f88e430
       Account: AUTH_33cdcad8-09fb-4940-90da-0f00cbf21c7c
    Containers: 0
       Objects: 0
         Bytes: 0

Now, let's make our first container and tell it to synchronize to a second
container that we'll make next::

    $ swift -A http://cluster1/auth/v1.0 -U test:tester -K testing post \
      -t 'http://cluster2/v1/AUTH_33cdcad8-09fb-4940-90da-0f00cbf21c7c/container2' \
      -k 'secret' container1

The ``-t`` indicates the URL to sync to, which is the ``StorageURL`` from
cluster2 we retrieved above plus the container name. The ``-k`` specifies the
secret key the two containers will share for synchronization. Now, we'll do
something similar for the second cluster's container::

    $ swift -A http://cluster2/auth/v1.0 -U test2:tester2 -K testing2 post \
      -t 'http://cluster1/v1/AUTH_208d1854-e475-4500-b315-81de645d060e/container1' \
      -k 'secret' container2

That's it. Now we can upload a bunch of stuff to the first container and watch
as it gets synchronized over to the second::

    $ swift -A http://cluster1/auth/v1.0 -U test:tester -K testing \
      upload container1 .
    photo002.png
    photo004.png
    photo001.png
    photo003.png

    $ swift -A http://cluster2/auth/v1.0 -U test2:tester2 -K testing2 \
      list container2

    [Nothing there yet, so we wait a bit...]
    [If you're an operator running SAIO and just testing, you may need to
     run 'swift-init container-sync once' to perform a sync scan.]

    $ swift -A http://cluster2/auth/v1.0 -U test2:tester2 -K testing2 \
      list container2
    photo001.png
    photo002.png
    photo003.png
    photo004.png

You can also set up a chain of synced containers if you want more than two.
You'd point 1 -> 2, then 2 -> 3, and finally 3 -> 1 for three containers.
They'd all need to share the same secret synchronization key.

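The chain layout described above (1 -> 2, 2 -> 3, 3 -> 1) can be worked out mechanically for any number of containers. The helper below is a hypothetical sketch, not part of the ``swift`` tool; it only computes which sync target each container should be given:

```python
def ring_sync_targets(container_urls):
    """Map each container URL to the next one, wrapping around at the end."""
    count = len(container_urls)
    return {container_urls[i]: container_urls[(i + 1) % count]
            for i in range(count)}
```

Each resulting pair corresponds to one ``swift ... post -t <target> -k <key> <container>`` call, using the same key for every container. With two URLs the result is the mutual setup shown earlier.
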
-----------------------------------
Using curl (or other tools) instead
-----------------------------------

So what's ``swift`` doing behind the scenes? Nothing overly complicated. It
translates the ``-t <value>`` option into an ``X-Container-Sync-To: <value>``
header and the ``-k <value>`` option into an ``X-Container-Sync-Key: <value>``
header.

For instance, when we created the first container above and told it to
synchronize to the second, we could have used this curl command::

    $ curl -i -X POST -H 'X-Auth-Token: AUTH_tkd5359e46ff9e419fa193dbd367f3cd19' \
      -H 'X-Container-Sync-To: http://cluster2/v1/AUTH_33cdcad8-09fb-4940-90da-0f00cbf21c7c/container2' \
      -H 'X-Container-Sync-Key: secret' \
      'http://cluster1/v1/AUTH_208d1854-e475-4500-b315-81de645d060e/container1'
    HTTP/1.1 204 No Content
    Content-Length: 0
    Content-Type: text/plain; charset=UTF-8
    Date: Thu, 24 Feb 2011 22:39:14 GMT

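As one example of "other tools", the same request can be built with Python's standard library. This is a minimal sketch; the function name ``build_sync_request`` is hypothetical, and the request is only constructed here, not sent:

```python
import urllib.request


def build_sync_request(container_url, auth_token, sync_to, sync_key):
    """Build (but do not send) the POST that sets a container's sync target."""
    return urllib.request.Request(
        container_url,
        method="POST",
        headers={
            "X-Auth-Token": auth_token,
            "X-Container-Sync-To": sync_to,
            "X-Container-Sync-Key": sync_key,
        },
    )
```

Passing the returned object to ``urllib.request.urlopen`` would send it. Building the request separately makes it easy to inspect the headers first.
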
--------------------------------------------------
What's going on behind the scenes, in the cluster?
--------------------------------------------------

The ``swift-container-sync`` daemon does the job of sending updates to the
remote container.

This is done by scanning the local devices for container databases and
checking for x-container-sync-to and x-container-sync-key metadata values.
If they exist, newer rows since the last sync will trigger PUTs or DELETEs
to the other container.

.. note::

    Container sync will sync object POSTs only if the proxy server is set to
    use "object_post_as_copy = true", which is the default. So-called fast
    object posts, "object_post_as_copy = false", do not update the container
    listings and therefore can't be detected for synchronization.

The actual syncing is slightly more complicated, because it makes use of the
three (or number-of-replicas) main nodes for a container without each trying
to do the exact same work, but also without missing work if one node happens
to be down.

Two sync points are kept per container database. All rows between the two
sync points trigger updates. Any rows newer than both sync points cause
updates depending on the node's position for the container (each primary node
does one third, or the corresponding share for other replica counts). After a
sync run, the first sync point is set to the newest ROWID known and the second
sync point is set to the newest ROWID for which all updates have been sent.

An example may help. Assume a replica count of 3 and perfectly matching
ROWIDs starting at 1.

    First sync run, database has 6 rows:

        * SyncPoint1 starts as -1.
        * SyncPoint2 starts as -1.
        * No rows between points, so no "all updates" rows.
        * Six rows newer than SyncPoint1, so a third of the rows are sent
          by node 1, another third by node 2, and the remaining third by
          node 3.
        * SyncPoint1 is set as 6 (the newest ROWID known).
        * SyncPoint2 is left as -1 since no "all updates" rows were synced.

    Next sync run, database has 12 rows:

        * SyncPoint1 starts as 6.
        * SyncPoint2 starts as -1.
        * The rows between -1 and 6 all trigger updates (most of which
          should short-circuit on the remote end as having already been
          done).
        * Six more rows newer than SyncPoint1, so a third of the rows are
          sent by node 1, another third by node 2, and the remaining third
          by node 3.
        * SyncPoint1 is set as 12 (the newest ROWID known).
        * SyncPoint2 is set as 6 (the newest "all updates" ROWID).

In this way, under normal circumstances each node sends its share of
updates each run and just sends a batch of older updates to ensure nothing
was missed.

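The two runs above can be reproduced with a small simulation. This is a sketch under stated assumptions, not Swift's code: ``sync_run`` and ``rows_after`` are hypothetical names, nodes are numbered from 0, and the ``rowid % replica_count`` rule is only a stand-in for however the daemon actually divides new rows among nodes:

```python
def rows_after(rowid, newest_rowid):
    """ROWIDs greater than rowid, up to newest_rowid (ROWIDs start at 1)."""
    return range(max(rowid, 0) + 1, newest_rowid + 1)


def sync_run(newest_rowid, sync_point1, sync_point2, node_index,
             replica_count=3):
    """Simulate one sync run on one node.

    Returns (rows_sent, new_sync_point1, new_sync_point2).
    """
    # Rows between the two sync points are sent by every node.
    all_updates = list(rows_after(sync_point2, sync_point1))
    # Rows newer than SyncPoint1 are shared out among the nodes.
    # Assumption for illustration: a simple modulo on the ROWID.
    own_share = [rowid for rowid in rows_after(sync_point1, newest_rowid)
                 if rowid % replica_count == node_index]
    # SyncPoint2 only moves when "all updates" rows were sent.
    new_sync_point2 = all_updates[-1] if all_updates else sync_point2
    return all_updates + own_share, newest_rowid, new_sync_point2
```

Calling ``sync_run(6, -1, -1, 0)`` models the first run on one node and ``sync_run(12, 6, -1, 0)`` the second; the returned sync points follow the values listed in the example.
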