Add ability to failback for replication V2.1

Initially we had set up replication V2.1 (Cheesecake) to NOT do fail-back, at least in the initial version. It turns out that fail-back in the Cinder code is rather easy: we just allow calling failover-host on a host that has already failed over, and use the *special* keyword "default" as the backend_id argument. That keyword signifies that we want to switch back to whatever is configured as the default in the cinder.conf file.

To do this we add some logic that checks the secondary_backend_id param in volume.manager:failover_host and sets the service fields appropriately. Note that we send the call to the driver first, giving it a chance to raise an exception if it can't satisfy the request at the current time.

We also needed to modify volume.api:failover_host to allow failed-over as a valid transition state, and again update the Service query to include disabled services.

It's up to drivers to decide whether they want to require some extra admin steps, and to document exactly how this works. It's also possible that during an initial failover a driver might want to return a status update for all volumes that are NOT replicated and set their volume status to "error".

Expected behavior is depicted in the service output here: http://paste.openstack.org/show/488294/

Change-Id: I4531ab65424a7a9600b2f93ee5b5e1a0dd47d63d
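The state handling described in the commit message can be sketched in isolation. This is only an illustrative model, not the Cinder code: `ServiceRecord`, `begin_failover`, and `finish_failover` are hypothetical names, the status strings are stand-ins for the `fields.ReplicationStatus` constants, and the driver call that would run between the two steps is omitted.

```python
from dataclasses import dataclass

# Stand-ins for fields.ReplicationStatus values (assumed for illustration).
ENABLED = "enabled"
FAILING_OVER = "failing-over"
FAILED_OVER = "failed-over"


@dataclass
class ServiceRecord:
    """Hypothetical stand-in for the fields of a Cinder Service object."""
    replication_status: str = ENABLED
    active_backend_id: str = ""
    disabled: bool = False
    disabled_reason: str = ""


def begin_failover(service: ServiceRecord) -> bool:
    """API-side step: accept the request from 'enabled' or 'failed-over'."""
    if service.replication_status not in (ENABLED, FAILED_OVER):
        return False
    service.replication_status = FAILING_OVER
    return True


def finish_failover(service: ServiceRecord, secondary_backend_id: str,
                    active_backend_id: str) -> ServiceRecord:
    """Manager-side step, assumed to run after the driver accepted the call."""
    if secondary_backend_id == "default":
        # Fail-back: return to the backend configured in cinder.conf.
        service.replication_status = ENABLED
        service.active_backend_id = ""
        service.disabled = False
        service.disabled_reason = ""
    else:
        # Ordinary failover to a secondary backend.
        service.replication_status = FAILED_OVER
        service.active_backend_id = active_backend_id
        service.disabled = True
        service.disabled_reason = "failed-over"
    return service
```

In this model a fail-back is the same two-step call as a failover, with "default" passed as the secondary backend id; a service that is already in the failing-over state is rejected by the first step.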
2016-02-26 01:09:21 +00:00 · 2016-02-26 01:09:21 +00:00 · 4eaf8f5909
parent faba1e5f20
commit 4eaf8f5909
2 changed files with 16 additions and 7 deletions
--- a/cinder/volume/api.py
+++ b/cinder/volume/api.py
@@ -1593,9 +1593,10 @@ class API(base.Base):
        ctxt = context.get_admin_context()
        svc_host = volume_utils.extract_host(host, 'backend')

-        service = objects.Service.get_by_host_and_topic(
+        service = objects.Service.get_by_args(
            ctxt, svc_host, CONF.volume_topic)
-        expected = {'replication_status': fields.ReplicationStatus.ENABLED}
+        expected = {'replication_status': [fields.ReplicationStatus.ENABLED,
+                    fields.ReplicationStatus.FAILED_OVER]}
        result = service.conditional_update(
            {'replication_status': fields.ReplicationStatus.FAILING_OVER},
            expected)
--- a/cinder/volume/manager.py
+++ b/cinder/volume/manager.py
@@ -3296,11 +3296,19 @@ class VolumeManager(manager.SchedulerDependentManager):
                 secondary_backend_id})
            return None

-        service.replication_status = fields.ReplicationStatus.FAILED_OVER
-        service.active_backend_id = active_backend_id
-        service.disabled = True
-        service.disabled_reason = "failed-over"
-        service.save()
+        if secondary_backend_id == "default":
+            service.replication_status = fields.ReplicationStatus.ENABLED
+            service.active_backend_id = ""
+            service.disabled = False
+            service.disabled_reason = ""
+            service.save()
+
+        else:
+            service.replication_status = fields.ReplicationStatus.FAILED_OVER
+            service.active_backend_id = active_backend_id
+            service.disabled = True
+            service.disabled_reason = "failed-over"
+            service.save()

        for update in volume_update_list:
            # Response must include an id key: {volume_id: <cinder-uuid>}