Add Cinder volume replication
Add new page cinder-volume-replication.rst and the accompanying
cinder-volume-replication-overlay.rst. Separate the disaster scenario from
the main body as per team consensus.

Related-Bug: #1925035
Change-Id: Id9e4c8fff27a678d78aa0b606ec9e8a00208a894

:orphan:

.. _cinder_volume_replication_dr:

=============================================
Cinder volume replication - Disaster recovery
=============================================

Overview
--------

This is the disaster recovery scenario of a Cinder volume replication
deployment. It should be read in conjunction with the :doc:`Cinder volume
replication <cinder-volume-replication>` page.

Scenario description
--------------------

Disaster recovery involves an uncontrolled failover to the secondary site.
Site-b takes over from a troubled site-a and becomes the de facto primary
site, which includes writes to its images. Control is passed back to site-a
once it is repaired.

.. warning::

   The charms support the underlying OpenStack services in their native
   ability to failover and failback. However, a significant degree of
   administrative care is still needed in order to ensure a successful
   recovery.

   For example:

   * primary volume images that are currently in use may experience
     difficulty during their demotion to secondary status

   * running VMs will lose connectivity to their volumes

   * subsequent image resyncs may not be straightforward

   Any work necessary to rectify data issues resulting from an uncontrolled
   failover is beyond the scope of the OpenStack charms and this document.

Simulation
----------

To illustrate some of the rudimentary aspects involved in disaster recovery,
a simulation is provided below.

Preparation
~~~~~~~~~~~

Create the replicated data volume and confirm it is available:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl vol-site-a-repl-data
   openstack volume list

Simulate a failure in site-a by turning off all of its Ceph MON daemons:

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo systemctl stop ceph-mon.target
   juju ssh site-a-ceph-mon/1 sudo systemctl stop ceph-mon.target
   juju ssh site-a-ceph-mon/2 sudo systemctl stop ceph-mon.target

Modify timeout and retry settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a Ceph cluster fails, communication between Cinder and the failed
cluster will be interrupted, and the RBD driver will accommodate with
retries and timeouts.

To accelerate the failover mechanism, timeout and retry settings on the
cinder-ceph unit in site-a can be modified:

.. code-block:: none

   juju ssh cinder-ceph-a/0
   > sudo apt install -y crudini
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout 1
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries 1
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval 0
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout 1
   > sudo systemctl restart cinder-volume
   > exit

These configuration changes are only intended to be in effect during the
failover transition period. They should be reverted afterwards since the
default values are fine for normal operations.
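For reference, after the above commands the backend section of
``/etc/cinder/cinder.conf`` will contain the following keys (a sketch showing
only the keys set here; the section's pre-existing keys are unaffected):

.. code-block:: ini

   [cinder-ceph-a]
   rados_connect_timeout = 1
   rados_connection_retries = 1
   rados_connection_interval = 0
   replication_connect_timeout = 1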

Failover
~~~~~~~~

Perform the failover of site-a, confirm its cinder-volume host is disabled,
and that the volume remains available:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a
   cinder service-list
   openstack volume list

Confirm that the Cinder log file (``/var/log/cinder/cinder-volume.log``) on
unit ``cinder/0`` contains the successful failover message: ``Failed over to
replication target successfully.``.

Revert timeout and retry settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Revert the configuration changes made to the cinder-ceph backend:

.. code-block:: none

   juju ssh cinder-ceph-a/0
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout
   > sudo systemctl restart cinder-volume
   > exit

Write to the volume
~~~~~~~~~~~~~~~~~~~

Create a VM (named 'vm-with-data-volume'):

.. code-block:: none

   openstack server create --image focal-amd64 --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-data-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-data-volume $FLOATING_IP

Attach the volume to the VM, write some data to it, and detach it:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mkfs.ext4 /dev/vdc
   > mkdir data
   > sudo mount /dev/vdc data
   > sudo chown ubuntu: data
   > echo "This is a test." > data/test.txt
   > sync
   > sudo umount /dev/vdc
   > exit

   openstack server remove volume vm-with-data-volume vol-site-a-repl-data

Repair site-a
~~~~~~~~~~~~~

In the current example, site-a is repaired by starting the Ceph MON daemons:

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo systemctl start ceph-mon.target
   juju ssh site-a-ceph-mon/1 sudo systemctl start ceph-mon.target
   juju ssh site-a-ceph-mon/2 sudo systemctl start ceph-mon.target

Confirm that the MON cluster is now healthy (it may take a while):

.. code-block:: none

   juju status site-a-ceph-mon

   Unit                Workload  Agent  Machine  Public address  Ports  Message
   site-a-ceph-mon/0   active    idle   14       10.5.0.15              Unit is ready and clustered
   site-a-ceph-mon/1*  active    idle   15       10.5.0.31              Unit is ready and clustered
   site-a-ceph-mon/2   active    idle   16       10.5.0.11              Unit is ready and clustered

Image resync
~~~~~~~~~~~~

Putting site-a back online at this point will lead to two primary images for
each replicated volume. This is a split-brain condition that cannot be
resolved by the RBD mirror daemon. Hence, before failback is invoked each
replicated volume will need a resync of its images (site-b images are more
recent than the site-a images).

The image resync is a two-step process that is initiated on the
ceph-rbd-mirror unit in site-a.

Demote the site-a images with the ``demote`` action:

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 demote pools=cinder-ceph-a

Flag the site-a images for a resync with the ``resync-pools`` action. The
``pools`` argument should point to the corresponding site's pool, which by
default is the name of the cinder-ceph application for the site (here
'cinder-ceph-a'):

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 resync-pools i-really-mean-it=true pools=cinder-ceph-a

The Ceph RBD mirror daemon will perform the resync in the background.

Failback
~~~~~~~~

Prior to failback, confirm that the images of all replicated volumes in
site-a are fully synchronised. Perform a check with the ceph-rbd-mirror
charm's ``status`` action as per :ref:`RBD image status <rbd_image_status>`:

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

This will take a while.

The state and description for site-a images will transition to:

.. code-block:: console

   state: up+syncing
   description: bootstrapping, IMAGE_SYNC/CREATE_SYNC_POINT

The intermediate values will look like:

.. code-block:: console

   state: up+replaying
   description: replaying, {"bytes_per_second":110318.93,"entries_behind_primary":4712.....

The final values, as expected, will become:

.. code-block:: console

   state: up+replaying
   description: replaying, {"bytes_per_second":0.0,"entries_behind_primary":0.....

The failback of site-a can now proceed:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default

Confirm the original health of Cinder services (as per :ref:`Cinder service
list <cinder_service_list>`):

.. code-block:: none

   cinder service-list

Verification
~~~~~~~~~~~~

Re-attach the volume to the VM and verify that the secondary device contains
the expected data:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data
   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mount /dev/vdc data
   > cat data/test.txt
   This is a test.

We can also check the status of the image as per :ref:`RBD image status
<rbd_image_status>` to verify that the primary indeed resides in site-a
again:

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

   volume-c44d4d20-6ede-422a-903d-588d1b0d51b0:
     global_id: 3a4aa755-c9ee-4319-8ba4-fc494d20d783
     state: up+stopped
     description: local image is primary

:orphan:

.. _cinder_volume_replication_custom_overlay:

========================================
Cinder volume replication custom overlay
========================================

The below bundle overlay is used in the instructions given on the
:doc:`Cinder volume replication <cinder-volume-replication>` page.

.. code-block:: yaml

   series: focal

   # Change these variables according to the local environment, 'osd-devices'
   # and 'data-port' in particular.
   variables:
     openstack-origin: &openstack-origin cloud:focal-victoria
     osd-devices: &osd-devices /dev/sdb /dev/vdb
     expected-osd-count: &expected-osd-count 3
     expected-mon-count: &expected-mon-count 3
     data-port: &data-port br-ex:ens7

   relations:
     - - cinder-ceph-a:storage-backend
       - cinder:storage-backend
     - - cinder-ceph-b:storage-backend
       - cinder:storage-backend

     - - site-a-ceph-osd:mon
       - site-a-ceph-mon:osd
     - - site-b-ceph-osd:mon
       - site-b-ceph-mon:osd

     - - site-a-ceph-mon:client
       - nova-compute:ceph
     - - site-b-ceph-mon:client
       - nova-compute:ceph

     - - site-a-ceph-mon:client
       - cinder-ceph-a:ceph
     - - site-b-ceph-mon:client
       - cinder-ceph-b:ceph

     - - nova-compute:ceph-access
       - cinder-ceph-a:ceph-access
     - - nova-compute:ceph-access
       - cinder-ceph-b:ceph-access

     - - site-a-ceph-mon:client
       - glance:ceph

     - - site-a-ceph-mon:rbd-mirror
       - site-a-ceph-rbd-mirror:ceph-local
     - - site-b-ceph-mon:rbd-mirror
       - site-b-ceph-rbd-mirror:ceph-local

     - - site-a-ceph-mon
       - site-b-ceph-rbd-mirror:ceph-remote
     - - site-b-ceph-mon
       - site-a-ceph-rbd-mirror:ceph-remote

     - - site-a-ceph-mon:client
       - cinder-ceph-b:ceph-replication-device
     - - site-b-ceph-mon:client
       - cinder-ceph-a:ceph-replication-device

   applications:

     # Prevent some applications in the main bundle from being deployed.
     ceph-radosgw:
     ceph-osd:
     ceph-mon:
     cinder-ceph:

     # Deploy ceph-osd applications with the appropriate names.
     site-a-ceph-osd:
       charm: cs:ceph-osd
       num_units: 3
       options:
         osd-devices: *osd-devices
         source: *openstack-origin

     site-b-ceph-osd:
       charm: cs:ceph-osd
       num_units: 3
       options:
         osd-devices: *osd-devices
         source: *openstack-origin

     # Deploy ceph-mon applications with the appropriate names.
     site-a-ceph-mon:
       charm: cs:ceph-mon
       num_units: 3
       options:
         expected-osd-count: *expected-osd-count
         monitor-count: *expected-mon-count
         source: *openstack-origin

     site-b-ceph-mon:
       charm: cs:ceph-mon
       num_units: 3
       options:
         expected-osd-count: *expected-osd-count
         monitor-count: *expected-mon-count
         source: *openstack-origin

     # Deploy cinder-ceph applications with the appropriate names.
     cinder-ceph-a:
       charm: cs:cinder-ceph
       num_units: 0
       options:
         rbd-mirroring-mode: image

     cinder-ceph-b:
       charm: cs:cinder-ceph
       num_units: 0
       options:
         rbd-mirroring-mode: image

     # Deploy ceph-rbd-mirror applications with the appropriate names.
     site-a-ceph-rbd-mirror:
       charm: cs:ceph-rbd-mirror
       num_units: 1
       options:
         source: *openstack-origin

     site-b-ceph-rbd-mirror:
       charm: cs:ceph-rbd-mirror
       num_units: 1
       options:
         source: *openstack-origin

     # Configure for the local environment.
     ovn-chassis:
       options:
         bridge-interface-mappings: *data-port

=========================
Cinder volume replication
=========================

Overview
--------

Cinder volume replication is a primary/secondary failover solution based on
two-way `Ceph RBD mirroring`_.

Deployment
----------

The cloud deployment in this document is based on the stable
`openstack-base`_ bundle in the `openstack-bundles`_ repository. The
necessary documentation is found in the `bundle README`_.

A custom overlay bundle (`cinder-volume-replication-overlay`_) is used to
extend the base cloud in order to implement volume replication.

.. note::

   The key elements for adding volume replication to Ceph RBD mirroring are
   the relation between cinder-ceph in one site and ceph-mon in the other
   (using the ``ceph-replication-device`` endpoint) and the cinder-ceph
   charm configuration option ``rbd-mirroring-mode=image``.
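These elements appear in the `cinder-volume-replication-overlay`_ bundle
roughly as follows (an abridged excerpt; see the overlay itself for the full
context):

.. code-block:: yaml

   relations:
     - - site-a-ceph-mon:client
       - cinder-ceph-b:ceph-replication-device
     - - site-b-ceph-mon:client
       - cinder-ceph-a:ceph-replication-device

   applications:
     cinder-ceph-a:
       options:
         rbd-mirroring-mode: image
     cinder-ceph-b:
       options:
         rbd-mirroring-mode: image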

Cloud notes:

* The cloud used in these instructions is based on Ubuntu 20.04 LTS (Focal)
  and OpenStack Victoria. The openstack-base bundle may have been updated
  since.

* The two Ceph clusters are named 'site-a' and 'site-b' and are placed in
  the same Juju model.

* A site's pool is named after its corresponding cinder-ceph application
  (e.g. 'cinder-ceph-a' for site-a) and is mirrored to the other site. Each
  site will therefore have two pools: 'cinder-ceph-a' and 'cinder-ceph-b'.

* Glance is only backed by site-a.

To deploy:

.. code-block:: none

   juju deploy ./bundle.yaml --overlay ./cinder-volume-replication-overlay.yaml

Configuration and verification of the base cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Configure the base cloud as per the referenced documentation.

Before proceeding, verify the base cloud by creating a VM and connecting to
it over SSH. See the main bundle's README for guidance.

.. important::

   A known issue affecting the interaction of the ceph-rbd-mirror charm and
   Ceph itself gives the impression of a fatal error. The symptom is
   messaging that appears in :command:`juju status` command output: ``Pools
   WARNING (1) OK (1) Images unknown (1)``. This remains a cosmetic issue,
   however. See bug `LP #1892201`_ for details.

Cinder volume types
-------------------

For each site, create replicated and non-replicated Cinder volume types. A
type is referenced at volume-creation time in order to specify whether the
volume is replicated (or not) and what pool it will reside in.

Type 'site-a-repl' denotes replication in site-a:

.. code-block:: none

   openstack volume type create site-a-repl \
      --property volume_backend_name=cinder-ceph-a \
      --property replication_enabled='<is> True'

Type 'site-a-local' denotes non-replication in site-a:

.. code-block:: none

   openstack volume type create site-a-local \
      --property volume_backend_name=cinder-ceph-a

Type 'site-b-repl' denotes replication in site-b:

.. code-block:: none

   openstack volume type create site-b-repl \
      --property volume_backend_name=cinder-ceph-b \
      --property replication_enabled='<is> True'

Type 'site-b-local' denotes non-replication in site-b:

.. code-block:: none

   openstack volume type create site-b-local \
      --property volume_backend_name=cinder-ceph-b

List the volume types:

.. code-block:: none

   openstack volume type list
   +--------------------------------------+--------------+-----------+
   | ID                                   | Name         | Is Public |
   +--------------------------------------+--------------+-----------+
   | ee70dfd9-7b97-407d-a860-868e0209b93b | site-b-local | True      |
   | b0f6d6b5-9c76-4967-9eb4-d488a6690712 | site-b-repl  | True      |
   | fc89ca9b-d75a-443e-9025-6710afdbfd5c | site-a-local | True      |
   | 780980dc-1357-4fbd-9714-e16a79df252a | site-a-repl  | True      |
   | d57df78d-ff27-4cf0-9959-0ada21ce86ad | __DEFAULT__  | True      |
   +--------------------------------------+--------------+-----------+

.. note::

   In this document, site-b volume types will not be used. They are created
   here for the more generalised case where new volumes may be needed while
   site-a is in a failover state. In such a circumstance, any volumes
   created in site-b will naturally not be replicated (in site-a).

.. _rbd_image_status:

RBD image status
----------------

The status of the two RBD images associated with a replicated volume can be
queried using the ``status`` action of the ceph-rbd-mirror unit for each
site.

A state of ``up+replaying`` in combination with the presence of
``"entries_behind_primary":0`` in the image description means the image in
one site is in sync with its counterpart in the other site.

A state of ``up+syncing`` indicates that the sync process is still underway.

A description of ``local image is primary`` means that the image is the
primary.
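These rules can be captured in a small helper. Below is a minimal sketch
(not part of any charm; the function name is our own) that interprets a
state/description pair, assuming the description carries a complete JSON
payload rather than the truncated form shown in the excerpts:

.. code-block:: python

   import json

   def image_in_sync(state, description):
       """Return True when a non-primary image is fully caught up."""
       if state != "up+replaying":
           # "up+syncing" (sync underway) and "up+stopped" (primary)
           # do not indicate a caught-up secondary.
           return False
       # The description is "replaying, " followed by a JSON object.
       _, _, payload = description.partition(", ")
       try:
           stats = json.loads(payload)
       except ValueError:
           return False
       return stats.get("entries_behind_primary") == 0

   print(image_in_sync(
       "up+replaying",
       'replaying, {"bytes_per_second":0.0,"entries_behind_primary":0}'))
   # → True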

Consider the volume below that is created and given the volume type of
'site-a-repl'. Its primary will be in site-a and its non-primary (secondary)
will be in site-b:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl vol-site-a-repl

Their statuses can be queried in each site as shown below.

Site-a (primary):

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

   volume-c44d4d20-6ede-422a-903d-588d1b0d51b0:
     global_id: f66140a6-0c09-478c-9431-4eb1eb16ca86
     state: up+stopped
     description: local image is primary

Site-b (secondary is in sync with the primary):

.. code-block:: none

   juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

   volume-c44d4d20-6ede-422a-903d-588d1b0d51b0:
     global_id: f66140a6-0c09-478c-9431-4eb1eb16ca86
     state: up+replaying
     description: replaying, {"bytes_per_second":0.0,"entries_behind_primary":0,.....

.. _cinder_service_list:

Cinder service list
-------------------

To verify the state of Cinder services, the ``cinder service-list`` command
is used:

.. code-block:: none

   cinder service-list
   +------------------+----------------------+------+---------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status  | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+---------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled | up    | 2021-04-08T15:59:25.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | enabled | up    | 2021-04-08T15:59:24.000000 | -       | -               | up            |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled | up    | 2021-04-08T15:59:25.000000 | -       | -               | up            |
   +------------------+----------------------+------+---------+-------+----------------------------+---------+-----------------+---------------+

Each of the below examples ends with a failback to site-a. The above output
is the desired result.

The failover of a particular site entails the referencing of its
corresponding cinder-volume service host (e.g. ``cinder@cinder-ceph-a`` for
site-a). We'll see how to do this later on.

.. note::

   'cinder-ceph-a' and 'cinder-ceph-b' correspond to the two applications
   deployed via the `cinder-ceph`_ charm. The express purpose of this charm
   is to connect Cinder to a Ceph cluster. See the
   `cinder-volume-replication-overlay`_ bundle for details.

Failover, volumes, images, and pools
------------------------------------

This section will show the basics of failover/failback, non-replicated vs
replicated volumes, and what pools are used for the volume images.

In site-a, create one non-replicated and one replicated data volume and list
them:

.. code-block:: none

   openstack volume create --size 5 --type site-a-local vol-site-a-local
   openstack volume create --size 5 --type site-a-repl vol-site-a-repl

   openstack volume list
   +--------------------------------------+------------------+-----------+------+-------------+
   | ID                                   | Name             | Status    | Size | Attached to |
   +--------------------------------------+------------------+-----------+------+-------------+
   | fba13395-62d1-468e-9b9a-40bebd0373e8 | vol-site-a-local | available | 5    |             |
   | c21a539e-d524-4f4d-991b-9b9476d4f930 | vol-site-a-repl  | available | 5    |             |
   +--------------------------------------+------------------+-----------+------+-------------+

Pools and images
~~~~~~~~~~~~~~~~

For 'vol-site-a-local' there should be one image in the 'cinder-ceph-a' pool
of site-a.

For 'vol-site-a-repl' there should be two images: one in the 'cinder-ceph-a'
pool of site-a and one in the 'cinder-ceph-a' pool of site-b.

This can all be confirmed by querying a Ceph MON in each site:

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo rbd ls -p cinder-ceph-a

   volume-fba13395-62d1-468e-9b9a-40bebd0373e8
   volume-c21a539e-d524-4f4d-991b-9b9476d4f930

   juju ssh site-b-ceph-mon/0 sudo rbd ls -p cinder-ceph-a

   volume-c21a539e-d524-4f4d-991b-9b9476d4f930

Failover
~~~~~~~~

Perform the failover of site-a:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a

Wait until the failover is complete:

.. code-block:: none

   cinder service-list
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status   | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled  | up    | 2021-04-08T17:11:56.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | disabled | up    | 2021-04-08T17:11:56.000000 | -       | failed-over     | -             |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled  | up    | 2021-04-08T17:11:56.000000 | -       | -               | up            |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+

A failover triggers the promotion of one site and the demotion of the other
(site-b and site-a respectively in this example). Ideally, therefore, Cinder
is able to communicate with both Ceph clusters during the failover, as it
can in this example.

Inspection
~~~~~~~~~~

By consulting the volume list we see that the replicated volume is still
available but that the non-replicated volume has errored:

.. code-block:: none

   openstack volume list
   +--------------------------------------+------------------+-----------+------+-------------+
   | ID                                   | Name             | Status    | Size | Attached to |
   +--------------------------------------+------------------+-----------+------+-------------+
   | fba13395-62d1-468e-9b9a-40bebd0373e8 | vol-site-a-local | error     | 5    |             |
   | c21a539e-d524-4f4d-991b-9b9476d4f930 | vol-site-a-repl  | available | 5    |             |
   +--------------------------------------+------------------+-----------+------+-------------+

Generally, a failover indicates a significant degree of non-confidence in
the primary site, site-a in this case. Once a **local** volume goes into an
error state due to a failover, it is expected not to recover after failback.
The errored local volumes should normally be discarded (deleted).

Failback
~~~~~~~~

Fail back site-a and confirm the original health of Cinder services (as per
`Cinder service list`_):

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default
   cinder service-list

Examples
--------

The following two examples will be considered. They will both use
replication and involve the failing over of site-a to site-b:

#. `Data volume used by a VM`_
#. `Bootable volume used by a VM`_

Data volume used by a VM
~~~~~~~~~~~~~~~~~~~~~~~~

In this example, a replicated data volume will be created in site-a and
attached to a VM. The volume's block device will then have some test data
written to it. This will allow for verification of the replicated data once
failover has occurred and the volume is re-attached to the VM.

Preparation
^^^^^^^^^^^

Create the replicated data volume:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl vol-site-a-repl-data
   openstack volume list
   +--------------------------------------+---------------------------+-----------+------+-------------+
   | ID                                   | Name                      | Status    | Size | Attached to |
   +--------------------------------------+---------------------------+-----------+------+-------------+
   | f23732c1-3257-4e58-a214-085c460abf56 | vol-site-a-repl-data      | available | 5    |             |
   +--------------------------------------+---------------------------+-----------+------+-------------+
|
||||||
|
|
||||||
|
Create the VM (named 'vm-with-data-volume'):

.. code-block:: none

   openstack server create --image focal-amd64 --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-data-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-data-volume $FLOATING_IP

   openstack server list
   +--------------------------------------+----------------------+--------+---------------------------------+--------------------------+---------+
   | ID                                   | Name                 | Status | Networks                        | Image                    | Flavor  |
   +--------------------------------------+----------------------+--------+---------------------------------+--------------------------+---------+
   | fbe07fea-731e-4973-8455-c8466be72293 | vm-with-data-volume  | ACTIVE | int_net=192.168.0.38, 10.5.1.28 | focal-amd64              | m1.tiny |
   +--------------------------------------+----------------------+--------+---------------------------------+--------------------------+---------+

Attach the data volume to the VM:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data

Prepare the block device and write the test data to it:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mkfs.ext4 /dev/vdc
   > mkdir data
   > sudo mount /dev/vdc data
   > sudo chown ubuntu: data
   > echo "This is a test." > data/test.txt
   > sync
   > exit

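For repeat runs, the interactive session above can be collapsed into one
non-interactive SSH call. This is a sketch only; the key path, user, and
device name are carried over from the example:

```shell
# Sketch: run the preparation steps above in a single SSH invocation.
# Key path, user, and device name follow the example; adjust as needed.
prepare_data_volume() {
    local ip="$1" dev="${2:-/dev/vdc}"
    ssh -i ~/cloud-keys/mykey "ubuntu@${ip}" bash -s <<EOF
set -e
sudo mkfs.ext4 ${dev}
mkdir -p data
sudo mount ${dev} data
sudo chown ubuntu: data
echo "This is a test." > data/test.txt
sync
EOF
}

# Usage:
#   prepare_data_volume "$FLOATING_IP"
```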
Failover
^^^^^^^^

When both sites are online, as is the case here, it is not recommended to
perform a failover while volumes are in use. This is because Cinder will try
to demote the Ceph image from the primary site, and if there is an active
connection to it the operation may fail (i.e. the volume will transition to an
error state).

Here we ensure the volume is not in use by unmounting the block device and
removing it from the VM:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP sudo umount /dev/vdc
   openstack server remove volume vm-with-data-volume vol-site-a-repl-data

Prior to failover, the images of all replicated volumes must be fully
synchronised. Perform a check with the ceph-rbd-mirror charm's ``status``
action as per `RBD image status`_. If the volumes were created in site-a then
the ceph-rbd-mirror unit in site-b is the target:

.. code-block:: none

   juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

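Eyeballing each image can also be scripted. The helper below is an
assumption-laden sketch that greps the action output for the healthy
``up+replaying`` mirror state; the exact output format of the ``status``
action may differ between charm releases, so adjust the patterns accordingly:

```shell
# Sketch: fail if any image state line in the status action output is not
# the healthy 'up+replaying' mirror state. The output format is an
# assumption; verify against the charm release in use.
images_synchronised() {
    local out
    out=$(juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true)
    # Every 'state:' line should read up+replaying; any other state fails.
    ! printf '%s\n' "$out" | grep 'state:' | grep -v 'up+replaying' | grep -q .
}
```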
If all images look good, perform the failover of site-a:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a
   cinder service-list
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status   | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled  | up    | 2021-04-08T19:30:29.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | disabled | up    | 2021-04-08T19:30:28.000000 | -       | failed-over     | -             |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled  | up    | 2021-04-08T19:30:28.000000 | -       | -               | up            |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+

Verification
^^^^^^^^^^^^

Re-attach the volume to the VM:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data

Verify that the secondary device contains the expected data (the ``data``
mount point created earlier still exists in the VM):

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mount /dev/vdc data
   > cat data/test.txt
   This is a test.

Failback
^^^^^^^^

Failback site-a and confirm the original health of Cinder services (as per
`Cinder service list`_):

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default
   cinder service-list

Bootable volume used by a VM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this example, a bootable volume will be created in site-a and a
newly-created VM will use that volume as its root device. As in the previous
example, the volume's block device will have test data written to it for
verification purposes.

Preparation
^^^^^^^^^^^

Create the replicated bootable volume:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl --image focal-amd64 --bootable vol-site-a-repl-boot

Wait for the volume to become available (it may take a while):

.. code-block:: none

   openstack volume list
   +--------------------------------------+----------------------+-----------+------+-------------+
   | ID                                   | Name                 | Status    | Size | Attached to |
   +--------------------------------------+----------------------+-----------+------+-------------+
   | c44d4d20-6ede-422a-903d-588d1b0d51b0 | vol-site-a-repl-boot | available | 5    |             |
   +--------------------------------------+----------------------+-----------+------+-------------+

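Rather than re-running ``openstack volume list`` by hand, the wait can be
scripted; a minimal polling sketch (the helper name and timeout are ours, not
an OpenStack client feature):

```shell
# Sketch: poll until a volume reaches the wanted status, or give up.
# Helper name, interval, and retry count are illustrative only.
wait_for_volume_status() {
    local volume="$1" want="${2:-available}" tries="${3:-60}"
    local status i
    for i in $(seq "$tries"); do
        status=$(openstack volume show "$volume" -f value -c status)
        [ "$status" = "$want" ] && return 0
        # Bail out early if the volume has gone into an error state.
        [ "$status" = "error" ] && return 1
        sleep 5
    done
    return 1
}

# Usage:
#   wait_for_volume_status vol-site-a-repl-boot available
```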
Create a VM (named 'vm-with-boot-volume') by specifying the newly-created
bootable volume:

.. code-block:: none

   openstack server create --volume vol-site-a-repl-boot --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-boot-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-boot-volume $FLOATING_IP

   openstack server list
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+
   | ID                                   | Name                | Status | Networks                        | Image                    | Flavor  |
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+
   | c0a152d7-376b-4500-95d4-7c768a3ff280 | vm-with-boot-volume | ACTIVE | int_net=192.168.0.75, 10.5.1.53 | N/A (booted from volume) | m1.tiny |
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+

Write the test data to the block device:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > echo "This is a test." > test.txt
   > sync
   > exit

Failover
^^^^^^^^

As explained previously, when both sites are functional the replicated volume
should not be in use at the time of failover. Since testing the replicated
boot volume requires the VM to be rebuilt anyway (Cinder needs to pass the
updated Ceph connection credentials to Nova), the easiest way forward is to
simply delete the VM:

.. code-block:: none

   openstack server delete vm-with-boot-volume

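Server deletion is asynchronous, so it is worth confirming the VM is actually
gone before failing over. A small sketch (the helper name and timeout are
ours):

```shell
# Sketch: wait until a server no longer appears in `openstack server list`.
# Helper name, interval, and retry count are illustrative only.
server_gone() {
    local name="$1" tries="${2:-30}" i
    for i in $(seq "$tries"); do
        openstack server list -f value -c Name | grep -qx "$name" || return 0
        sleep 5
    done
    return 1
}

# Usage:
#   server_gone vm-with-boot-volume
```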
Like before, prior to failover, confirm that the images of all replicated
volumes in site-b are fully synchronised. Perform a check with the
ceph-rbd-mirror charm's ``status`` action as per `RBD image status`_:

.. code-block:: none

   juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

If all images look good, perform the failover of site-a:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a
   cinder service-list
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status   | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled  | up    | 2021-04-08T21:29:12.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | disabled | up    | 2021-04-08T21:29:12.000000 | -       | failed-over     | -             |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled  | up    | 2021-04-08T21:29:11.000000 | -       | -               | up            |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+

Verification
^^^^^^^^^^^^

Re-create the VM:

.. code-block:: none

   openstack server create --volume vol-site-a-repl-boot --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-boot-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-boot-volume $FLOATING_IP

Verify that the root device contains the expected data:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > cat test.txt
   This is a test.
   > exit

Failback
^^^^^^^^

Failback site-a and confirm the original health of Cinder services (as per
`Cinder service list`_):

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default
   cinder service-list

Disaster recovery
-----------------

An uncontrolled failover is known as the disaster recovery scenario. It is
characterised by the sudden failure of the primary Ceph cluster. See the
:ref:`Cinder volume replication - Disaster recovery
<cinder_volume_replication_dr>` page for more information.

.. LINKS
.. _Ceph RBD mirroring: app-ceph-rbd-mirror.html
.. _openstack-base: https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-base/bundle.yaml
.. _openstack-bundles: https://github.com/openstack-charmers/openstack-bundles/
.. _bundle README: https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-base/README.md
.. _cinder-volume-replication-overlay: cinder-volume-replication-overlay.html
.. _cinder-ceph: https://jaas.ai/cinder-ceph
.. _LP #1892201: https://bugs.launchpad.net/charm-ceph-rbd-mirror/+bug/1892201