Add Cinder volume replication

Add new page cinder-volume-replication.rst and the accompanying
cinder-volume-replication-overlay.rst. Separate the disaster scenario
from the main body as per team consensus.

Related-Bug: #1925035
Change-Id: Id9e4c8fff27a678d78aa0b606ec9e8a00208a894

@ -0,0 +1,275 @@
:orphan:

.. _cinder_volume_replication_dr:

=============================================
Cinder volume replication - Disaster recovery
=============================================

Overview
--------

This is the disaster recovery scenario of a Cinder volume replication
deployment. It should be read in conjunction with the :doc:`Cinder volume
replication <cinder-volume-replication>` page.

Scenario description
--------------------

Disaster recovery involves an uncontrolled failover to the secondary site.
Site-b takes over from a troubled site-a and becomes the de facto primary site,
which includes writes to its images. Control is passed back to site-a once it
is repaired.

.. warning::

   The charms support the underlying OpenStack services in their native
   ability to fail over and fail back. However, a significant degree of
   administrative care is still needed in order to ensure a successful
   recovery.

   For example:

   * primary volume images that are currently in use may experience difficulty
     during their demotion to secondary status

   * running VMs will lose connectivity to their volumes

   * subsequent image resyncs may not be straightforward

   Any work necessary to rectify data issues resulting from an uncontrolled
   failover is beyond the scope of the OpenStack charms and this document.

Simulation
----------

For the sake of understanding some of the rudimentary aspects involved in
disaster recovery, a simulation is provided.

Preparation
~~~~~~~~~~~

Create the replicated data volume and confirm it is available:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl vol-site-a-repl-data
   openstack volume list

Simulate a failure in site-a by turning off all of its Ceph MON daemons:

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo systemctl stop ceph-mon.target
   juju ssh site-a-ceph-mon/1 sudo systemctl stop ceph-mon.target
   juju ssh site-a-ceph-mon/2 sudo systemctl stop ceph-mon.target

Modify timeout and retry settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

When a Ceph cluster fails, communication between Cinder and the failed cluster
will be interrupted, and the RBD driver will accommodate this with retries and
timeouts.

To accelerate the failover mechanism, timeout and retry settings on the
cinder-ceph unit in site-a can be modified:

.. code-block:: none

   juju ssh cinder-ceph-a/0
   > sudo apt install -y crudini
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout 1
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries 1
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval 0
   > sudo crudini --set /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout 1
   > sudo systemctl restart cinder-volume
   > exit

These configuration changes are only intended to be in effect during the
failover transition period. They should be reverted afterwards since the
default values are fine for normal operations.

Failover
~~~~~~~~

Perform the failover of site-a, confirm its cinder-volume host is disabled, and
that the volume remains available:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a
   cinder service-list
   openstack volume list

Confirm that the Cinder log file (``/var/log/cinder/cinder-volume.log``) on
unit ``cinder/0`` contains the successful failover message: ``Failed over to
replication target successfully.``.
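
One way to check for this message (a sketch, assuming ``cinder/0`` is the unit
in question) is to grep the log over SSH:

.. code-block:: none

   juju ssh cinder/0 "sudo grep 'Failed over to replication target' /var/log/cinder/cinder-volume.log"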

Revert timeout and retry settings
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Revert the configuration changes made to the cinder-ceph backend:

.. code-block:: none

   juju ssh cinder-ceph-a/0
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connect_timeout
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_retries
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a rados_connection_interval
   > sudo crudini --del /etc/cinder/cinder.conf cinder-ceph-a replication_connect_timeout
   > sudo systemctl restart cinder-volume
   > exit

Write to the volume
~~~~~~~~~~~~~~~~~~~

Create a VM (named 'vm-with-data-volume'):

.. code-block:: none

   openstack server create --image focal-amd64 --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-data-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-data-volume $FLOATING_IP

Attach the volume to the VM, write some data to it, and detach it:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mkfs.ext4 /dev/vdc
   > mkdir data
   > sudo mount /dev/vdc data
   > sudo chown ubuntu: data
   > echo "This is a test." > data/test.txt
   > sync
   > sudo umount /dev/vdc
   > exit

   openstack server remove volume vm-with-data-volume vol-site-a-repl-data

Repair site-a
~~~~~~~~~~~~~

In the current example, site-a is repaired by starting the Ceph MON daemons:

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo systemctl start ceph-mon.target
   juju ssh site-a-ceph-mon/1 sudo systemctl start ceph-mon.target
   juju ssh site-a-ceph-mon/2 sudo systemctl start ceph-mon.target

Confirm that the MON cluster is now healthy (it may take a while):

.. code-block:: none

   juju status site-a-ceph-mon

   Unit                Workload  Agent  Machine  Public address  Ports  Message
   site-a-ceph-mon/0   active    idle   14       10.5.0.15              Unit is ready and clustered
   site-a-ceph-mon/1*  active    idle   15       10.5.0.31              Unit is ready and clustered
   site-a-ceph-mon/2   active    idle   16       10.5.0.11              Unit is ready and clustered
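
The health of the Ceph cluster itself can also be inspected directly (a
sketch; the exact output will vary):

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo ceph status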

Image resync
~~~~~~~~~~~~

Putting site-a back online at this point will lead to two primary images for
each replicated volume. This is a split-brain condition that cannot be resolved
by the RBD mirror daemon. Hence, before failback is invoked each replicated
volume will need a resync of its images (the site-b images are more recent than
the site-a images).

The image resync is a two-step process that is initiated on the ceph-rbd-mirror
unit in site-a.

Demote the site-a images with the ``demote`` action:

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 demote pools=cinder-ceph-a

Flag the site-a images for a resync with the ``resync-pools`` action. The
``pools`` argument should point to the corresponding site's pool, which by
default is the name of the cinder-ceph application for the site (here
'cinder-ceph-a'):

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 resync-pools i-really-mean-it=true pools=cinder-ceph-a

The Ceph RBD mirror daemon will perform the resync in the background.

Failback
~~~~~~~~

Prior to failback, confirm that the images of all replicated volumes in site-a
are fully synchronised. Perform a check with the ceph-rbd-mirror charm's
``status`` action as per :ref:`RBD image status <rbd_image_status>`:

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

This will take a while.

The state and description for the site-a images will first transition to:

.. code-block:: console

   state: up+syncing
   description: bootstrapping, IMAGE_SYNC/CREATE_SYNC_POINT

The intermediate values will look like:

.. code-block:: console

   state: up+replaying
   description: replaying, {"bytes_per_second":110318.93,"entries_behind_primary":4712.....

The final values, as expected, will become:

.. code-block:: console

   state: up+replaying
   description: replaying, {"bytes_per_second":0.0,"entries_behind_primary":0.....

The failback of site-a can now proceed:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default

Confirm the original health of Cinder services (as per :ref:`Cinder service
list <cinder_service_list>`):

.. code-block:: none

   cinder service-list

Verification
~~~~~~~~~~~~

Re-attach the volume to the VM and verify that the secondary device contains
the expected data:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data
   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mount /dev/vdc data
   > cat data/test.txt
   This is a test.

We can also check the status of the image as per :ref:`RBD image status
<rbd_image_status>` to verify that the primary indeed resides in site-a again:

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

   volume-c44d4d20-6ede-422a-903d-588d1b0d51b0:
     global_id: 3a4aa755-c9ee-4319-8ba4-fc494d20d783
     state: up+stopped
     description: local image is primary

@ -0,0 +1,138 @@

:orphan:

.. _cinder_volume_replication_custom_overlay:

========================================
Cinder volume replication custom overlay
========================================

The below bundle overlay is used in the instructions given on the :doc:`Cinder
volume replication <cinder-volume-replication>` page.

.. code-block:: yaml

   series: focal

   # Change these variables according to the local environment, 'osd-devices'
   # and 'data-port' in particular.
   variables:
     openstack-origin: &openstack-origin cloud:focal-victoria
     osd-devices: &osd-devices /dev/sdb /dev/vdb
     expected-osd-count: &expected-osd-count 3
     expected-mon-count: &expected-mon-count 3
     data-port: &data-port br-ex:ens7

   relations:
     - - cinder-ceph-a:storage-backend
       - cinder:storage-backend
     - - cinder-ceph-b:storage-backend
       - cinder:storage-backend

     - - site-a-ceph-osd:mon
       - site-a-ceph-mon:osd
     - - site-b-ceph-osd:mon
       - site-b-ceph-mon:osd

     - - site-a-ceph-mon:client
       - nova-compute:ceph
     - - site-b-ceph-mon:client
       - nova-compute:ceph

     - - site-a-ceph-mon:client
       - cinder-ceph-a:ceph
     - - site-b-ceph-mon:client
       - cinder-ceph-b:ceph

     - - nova-compute:ceph-access
       - cinder-ceph-a:ceph-access
     - - nova-compute:ceph-access
       - cinder-ceph-b:ceph-access

     - - site-a-ceph-mon:client
       - glance:ceph

     - - site-a-ceph-mon:rbd-mirror
       - site-a-ceph-rbd-mirror:ceph-local
     - - site-b-ceph-mon:rbd-mirror
       - site-b-ceph-rbd-mirror:ceph-local

     - - site-a-ceph-mon
       - site-b-ceph-rbd-mirror:ceph-remote
     - - site-b-ceph-mon
       - site-a-ceph-rbd-mirror:ceph-remote

     - - site-a-ceph-mon:client
       - cinder-ceph-b:ceph-replication-device
     - - site-b-ceph-mon:client
       - cinder-ceph-a:ceph-replication-device

   applications:

     # Prevent some applications in the main bundle from being deployed.
     ceph-radosgw:
     ceph-osd:
     ceph-mon:
     cinder-ceph:

     # Deploy ceph-osd applications with the appropriate names.
     site-a-ceph-osd:
       charm: cs:ceph-osd
       num_units: 3
       options:
         osd-devices: *osd-devices
         source: *openstack-origin

     site-b-ceph-osd:
       charm: cs:ceph-osd
       num_units: 3
       options:
         osd-devices: *osd-devices
         source: *openstack-origin

     # Deploy ceph-mon applications with the appropriate names.
     site-a-ceph-mon:
       charm: cs:ceph-mon
       num_units: 3
       options:
         expected-osd-count: *expected-osd-count
         monitor-count: *expected-mon-count
         source: *openstack-origin

     site-b-ceph-mon:
       charm: cs:ceph-mon
       num_units: 3
       options:
         expected-osd-count: *expected-osd-count
         monitor-count: *expected-mon-count
         source: *openstack-origin

     # Deploy cinder-ceph applications with the appropriate names.
     cinder-ceph-a:
       charm: cs:cinder-ceph
       num_units: 0
       options:
         rbd-mirroring-mode: image

     cinder-ceph-b:
       charm: cs:cinder-ceph
       num_units: 0
       options:
         rbd-mirroring-mode: image

     # Deploy ceph-rbd-mirror applications with the appropriate names.
     site-a-ceph-rbd-mirror:
       charm: cs:ceph-rbd-mirror
       num_units: 1
       options:
         source: *openstack-origin

     site-b-ceph-rbd-mirror:
       charm: cs:ceph-rbd-mirror
       num_units: 1
       options:
         source: *openstack-origin

     # Configure for the local environment.
     ovn-chassis:
       options:
         bridge-interface-mappings: *data-port
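
As a usage note, the overlay is applied together with the base bundle at
deploy time, as shown on the main page:

.. code-block:: none

   juju deploy ./bundle.yaml --overlay ./cinder-volume-replication-overlay.yaml
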
@ -0,0 +1,576 @@

=========================
Cinder volume replication
=========================

Overview
--------

Cinder volume replication is a primary/secondary failover solution based on
two-way `Ceph RBD mirroring`_.

Deployment
----------

The cloud deployment in this document is based on the stable `openstack-base`_
bundle in the `openstack-bundles`_ repository. The necessary documentation is
found in the `bundle README`_.

A custom overlay bundle (`cinder-volume-replication-overlay`_) is used to
extend the base cloud in order to implement volume replication.

.. note::

   The key elements for adding volume replication to Ceph RBD mirroring are
   the relation between cinder-ceph in one site and ceph-mon in the other
   (using the ``ceph-replication-device`` endpoint) and the cinder-ceph charm
   configuration option ``rbd-mirroring-mode=image``.
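
For illustration only, if these elements were applied by hand rather than
through the overlay, the commands would look roughly like the following
(application names are those from the overlay; the symmetric relation for the
other site is analogous):

.. code-block:: none

   juju config cinder-ceph-a rbd-mirroring-mode=image
   juju add-relation cinder-ceph-a:ceph-replication-device site-b-ceph-mon:client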

Cloud notes:

* The cloud used in these instructions is based on Ubuntu 20.04 LTS (Focal) and
  OpenStack Victoria. The openstack-base bundle may have been updated since.
* The two Ceph clusters are named 'site-a' and 'site-b' and are placed in the
  same Juju model.
* A site's pool is named after its corresponding cinder-ceph application (e.g.
  'cinder-ceph-a' for site-a) and is mirrored to the other site. Each site will
  therefore have two pools: 'cinder-ceph-a' and 'cinder-ceph-b'.
* Glance is only backed by site-a.

To deploy:

.. code-block:: none

   juju deploy ./bundle.yaml --overlay ./cinder-volume-replication-overlay.yaml

Configuration and verification of the base cloud
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Configure the base cloud as per the referenced documentation.

Before proceeding, verify the base cloud by creating a VM and connecting to it
over SSH. See the main bundle's README for guidance.
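
A minimal smoke test might look like the following sketch (the VM name
'test-vm' is arbitrary; the image, flavor, key, and network names are the ones
assumed in the examples later on this page):

.. code-block:: none

   openstack server create --image focal-amd64 --flavor m1.tiny \
      --key-name mykey --network int_net test-vm
   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip test-vm $FLOATING_IP
   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP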

.. important::

   A known issue affecting the interaction of the ceph-rbd-mirror charm and
   Ceph itself gives the impression of a fatal error. The symptom is messaging
   that appears in :command:`juju status` command output: ``Pools WARNING (1)
   OK (1) Images unknown (1)``. This remains a cosmetic issue however. See bug
   `LP #1892201`_ for details.

Cinder volume types
-------------------

For each site, create replicated and non-replicated Cinder volume types. A
type is referenced at volume-creation time in order to specify whether the
volume is replicated (or not) and what pool it will reside in.

Type 'site-a-repl' denotes replication in site-a:

.. code-block:: none

   openstack volume type create site-a-repl \
      --property volume_backend_name=cinder-ceph-a \
      --property replication_enabled='<is> True'

Type 'site-a-local' denotes non-replication in site-a:

.. code-block:: none

   openstack volume type create site-a-local \
      --property volume_backend_name=cinder-ceph-a

Type 'site-b-repl' denotes replication in site-b:

.. code-block:: none

   openstack volume type create site-b-repl \
      --property volume_backend_name=cinder-ceph-b \
      --property replication_enabled='<is> True'

Type 'site-b-local' denotes non-replication in site-b:

.. code-block:: none

   openstack volume type create site-b-local \
      --property volume_backend_name=cinder-ceph-b

List the volume types:

.. code-block:: none

   openstack volume type list
   +--------------------------------------+--------------+-----------+
   | ID                                   | Name         | Is Public |
   +--------------------------------------+--------------+-----------+
   | ee70dfd9-7b97-407d-a860-868e0209b93b | site-b-local | True      |
   | b0f6d6b5-9c76-4967-9eb4-d488a6690712 | site-b-repl  | True      |
   | fc89ca9b-d75a-443e-9025-6710afdbfd5c | site-a-local | True      |
   | 780980dc-1357-4fbd-9714-e16a79df252a | site-a-repl  | True      |
   | d57df78d-ff27-4cf0-9959-0ada21ce86ad | __DEFAULT__  | True      |
   +--------------------------------------+--------------+-----------+

.. note::

   In this document, site-b volume types will not be used. They are created
   here for the more generalised case where new volumes may be needed while
   site-a is in a failover state. In such a circumstance, any volumes created
   in site-b will naturally not be replicated (in site-a).

.. _rbd_image_status:

RBD image status
----------------

The status of the two RBD images associated with a replicated volume can be
queried using the ``status`` action of the ceph-rbd-mirror unit for each site.

A state of ``up+replaying`` in combination with the presence of
``"entries_behind_primary":0`` in the image description means the image in one
site is in sync with its counterpart in the other site.

A state of ``up+syncing`` indicates that the sync process is still underway.

A description of ``local image is primary`` means that the image is the
primary.

Consider the volume below that is created and given the volume type of
'site-a-repl'. Its primary will be in site-a and its non-primary (secondary)
will be in site-b:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl vol-site-a-repl

Their statuses can be queried in each site as shown.

Site-a (primary):

.. code-block:: none

   juju run-action --wait site-a-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

   volume-c44d4d20-6ede-422a-903d-588d1b0d51b0:
     global_id: f66140a6-0c09-478c-9431-4eb1eb16ca86
     state: up+stopped
     description: local image is primary

Site-b (the secondary is in sync with the primary):

.. code-block:: none

   juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

   volume-c44d4d20-6ede-422a-903d-588d1b0d51b0:
     global_id: f66140a6-0c09-478c-9431-4eb1eb16ca86
     state: up+replaying
     description: replaying, {"bytes_per_second":0.0,"entries_behind_primary":0,.....

.. _cinder_service_list:

Cinder service list
-------------------

To verify the state of Cinder services, the ``cinder service-list`` command is
used:

.. code-block:: none

   cinder service-list
   +------------------+----------------------+------+---------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status  | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+---------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled | up    | 2021-04-08T15:59:25.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | enabled | up    | 2021-04-08T15:59:24.000000 | -       | -               | up            |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled | up    | 2021-04-08T15:59:25.000000 | -       | -               | up            |
   +------------------+----------------------+------+---------+-------+----------------------------+---------+-----------------+---------------+

Each of the below examples ends with a failback to site-a. The above output is
the desired result.

The failover of a particular site entails referencing its corresponding
cinder-volume service host (e.g. ``cinder@cinder-ceph-a`` for site-a). We'll
see how to do this later on.

.. note::

   'cinder-ceph-a' and 'cinder-ceph-b' correspond to the two applications
   deployed via the `cinder-ceph`_ charm. The express purpose of this charm is
   to connect Cinder to a Ceph cluster. See the
   `cinder-volume-replication-overlay`_ bundle for details.

Failover, volumes, images, and pools
------------------------------------

This section will show the basics of failover/failback, non-replicated vs
replicated volumes, and what pools are used for the volume images.

In site-a, create one non-replicated and one replicated data volume and list
them:

.. code-block:: none

   openstack volume create --size 5 --type site-a-local vol-site-a-local
   openstack volume create --size 5 --type site-a-repl vol-site-a-repl

   openstack volume list
   +--------------------------------------+------------------+-----------+------+-------------+
   | ID                                   | Name             | Status    | Size | Attached to |
   +--------------------------------------+------------------+-----------+------+-------------+
   | fba13395-62d1-468e-9b9a-40bebd0373e8 | vol-site-a-local | available | 5    |             |
   | c21a539e-d524-4f4d-991b-9b9476d4f930 | vol-site-a-repl  | available | 5    |             |
   +--------------------------------------+------------------+-----------+------+-------------+

Pools and images
~~~~~~~~~~~~~~~~

For 'vol-site-a-local' there should be one image in the 'cinder-ceph-a' pool of
site-a.

For 'vol-site-a-repl' there should be two images: one in the 'cinder-ceph-a'
pool of site-a and one in the 'cinder-ceph-a' pool of site-b.

This can all be confirmed by querying a Ceph MON in each site:

.. code-block:: none

   juju ssh site-a-ceph-mon/0 sudo rbd ls -p cinder-ceph-a

   volume-fba13395-62d1-468e-9b9a-40bebd0373e8
   volume-c21a539e-d524-4f4d-991b-9b9476d4f930

   juju ssh site-b-ceph-mon/0 sudo rbd ls -p cinder-ceph-a

   volume-c21a539e-d524-4f4d-991b-9b9476d4f930

Failover
~~~~~~~~

Perform the failover of site-a:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a

Wait until the failover is complete:

.. code-block:: none

   cinder service-list
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status   | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled  | up    | 2021-04-08T17:11:56.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | disabled | up    | 2021-04-08T17:11:56.000000 | -       | failed-over     | -             |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled  | up    | 2021-04-08T17:11:56.000000 | -       | -               | up            |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+

A failover triggers the promotion of one site and the demotion of the other
(site-b and site-a respectively in this example). It is therefore ideal for
Cinder to be able to communicate with both Ceph clusters, as is the case in
this example.

Inspection
~~~~~~~~~~

By consulting the volume list we see that the replicated volume is still
available but that the non-replicated volume has errored:

.. code-block:: none

   openstack volume list
   +--------------------------------------+------------------+-----------+------+-------------+
   | ID                                   | Name             | Status    | Size | Attached to |
   +--------------------------------------+------------------+-----------+------+-------------+
   | fba13395-62d1-468e-9b9a-40bebd0373e8 | vol-site-a-local | error     | 5    |             |
   | c21a539e-d524-4f4d-991b-9b9476d4f930 | vol-site-a-repl  | available | 5    |             |
   +--------------------------------------+------------------+-----------+------+-------------+

Generally a failover indicates a significant degree of non-confidence in the
primary site, site-a in this case. Once a **local** volume goes into an error
state due to a failover it is expected to not recover after failback. The
errored local volumes should normally be discarded (deleted).
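
For example, once the failback below is complete, such a volume could be
removed (a sketch; the ``--force`` flag may be needed if a normal delete is
refused for a volume in the ``error`` state):

.. code-block:: none

   openstack volume delete vol-site-a-local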

Failback
~~~~~~~~

Failback site-a and confirm the original health of Cinder services (as per
`Cinder service list`_):

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default
   cinder service-list

Examples
--------

The following two examples will be considered. They will both use replication
and involve the failing over of site-a to site-b:

#. `Data volume used by a VM`_
#. `Bootable volume used by a VM`_

Data volume used by a VM
~~~~~~~~~~~~~~~~~~~~~~~~

In this example, a replicated data volume will be created in site-a and
attached to a VM. The volume's block device will then have some test data
written to it. This will allow for verification of the replicated data once
failover has occurred and the volume is re-attached to the VM.

Preparation
^^^^^^^^^^^

Create the replicated data volume:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl vol-site-a-repl-data
   openstack volume list
   +--------------------------------------+----------------------+-----------+------+-------------+
   | ID                                   | Name                 | Status    | Size | Attached to |
   +--------------------------------------+----------------------+-----------+------+-------------+
   | f23732c1-3257-4e58-a214-085c460abf56 | vol-site-a-repl-data | available | 5    |             |
   +--------------------------------------+----------------------+-----------+------+-------------+

Create the VM (named 'vm-with-data-volume'):

.. code-block:: none

   openstack server create --image focal-amd64 --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-data-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-data-volume $FLOATING_IP

   openstack server list
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+
   | ID                                   | Name                | Status | Networks                        | Image                    | Flavor  |
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+
   | fbe07fea-731e-4973-8455-c8466be72293 | vm-with-data-volume | ACTIVE | int_net=192.168.0.38, 10.5.1.28 | focal-amd64              | m1.tiny |
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+

Attach the data volume to the VM:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data

Prepare the block device and write the test data to it:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mkfs.ext4 /dev/vdc
   > mkdir data
   > sudo mount /dev/vdc data
   > sudo chown ubuntu: data
   > echo "This is a test." > data/test.txt
   > sync
   > exit

Failover
^^^^^^^^

When both sites are online, as is the case here, it is not recommended to
perform a failover when volumes are in use. This is because Cinder will try to
demote the Ceph image from the primary site, and if there is an active
connection to it the operation may fail (i.e. the volume will transition to an
error state).

Here we ensure the volume is not in use by unmounting the block device and
removing it from the VM:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP sudo umount /dev/vdc
   openstack server remove volume vm-with-data-volume vol-site-a-repl-data

Prior to failover the images of all replicated volumes must be fully
synchronised. Perform a check with the ceph-rbd-mirror charm's ``status``
action as per `RBD image status`_. If the volumes were created in site-a then
the ceph-rbd-mirror unit in site-b is the target:

.. code-block:: none

   juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

If all images look good, perform the failover of site-a:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a
   cinder service-list
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status   | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled  | up    | 2021-04-08T19:30:29.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | disabled | up    | 2021-04-08T19:30:28.000000 | -       | failed-over     | -             |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled  | up    | 2021-04-08T19:30:28.000000 | -       | -               | up            |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+

Verification
^^^^^^^^^^^^

Re-attach the volume to the VM:

.. code-block:: none

   openstack server add volume vm-with-data-volume vol-site-a-repl-data

Verify that the secondary device contains the expected data:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > sudo mount /dev/vdc data
   > cat data/test.txt
   This is a test.

Failback
^^^^^^^^

Failback site-a and confirm the original health of Cinder services (as per
`Cinder service list`_):

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default
   cinder service-list

Bootable volume used by a VM
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In this example, a bootable volume will be created in site-a and have a
newly-created VM use that volume as its root device. Identically to the
previous example, the volume's block device will have test data written to it
to use for verification purposes.

Preparation
^^^^^^^^^^^

Create the replicated bootable volume:

.. code-block:: none

   openstack volume create --size 5 --type site-a-repl --image focal-amd64 --bootable vol-site-a-repl-boot

Wait for the volume to become available (it may take a while):

.. code-block:: none

   openstack volume list
   +--------------------------------------+----------------------+-----------+------+-------------+
   | ID                                   | Name                 | Status    | Size | Attached to |
   +--------------------------------------+----------------------+-----------+------+-------------+
   | c44d4d20-6ede-422a-903d-588d1b0d51b0 | vol-site-a-repl-boot | available | 5    |             |
   +--------------------------------------+----------------------+-----------+------+-------------+

Create a VM (named 'vm-with-boot-volume') by specifying the newly-created
bootable volume:

.. code-block:: none

   openstack server create --volume vol-site-a-repl-boot --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-boot-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-boot-volume $FLOATING_IP

   openstack server list
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+
   | ID                                   | Name                | Status | Networks                        | Image                    | Flavor  |
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+
   | c0a152d7-376b-4500-95d4-7c768a3ff280 | vm-with-boot-volume | ACTIVE | int_net=192.168.0.75, 10.5.1.53 | N/A (booted from volume) | m1.tiny |
   +--------------------------------------+---------------------+--------+---------------------------------+--------------------------+---------+

Write the test data to the block device:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > echo "This is a test." > test.txt
   > sync
   > exit

Failover
^^^^^^^^

As explained previously, when both sites are functional, prior to failover the
replicated volume should not be in use. Since the testing of the replicated
boot volume requires the VM to be rebuilt anyway (Cinder needs to give the
updated Ceph connection credentials to Nova) the easiest way forward is to
simply delete the VM:

.. code-block:: none

   openstack server delete vm-with-boot-volume

Like before, prior to failover, confirm that the images of all replicated
volumes in site-b are fully synchronised. Perform a check with the
ceph-rbd-mirror charm's ``status`` action as per `RBD image status`_:

.. code-block:: none

   juju run-action --wait site-b-ceph-rbd-mirror/0 status verbose=true | grep -A3 volume-

If all images look good, perform the failover of site-a:

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a
   cinder service-list
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | Binary           | Host                 | Zone | Status   | State | Updated_at                 | Cluster | Disabled Reason | Backend State |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+
   | cinder-scheduler | cinder               | nova | enabled  | up    | 2021-04-08T21:29:12.000000 | -       | -               |               |
   | cinder-volume    | cinder@cinder-ceph-a | nova | disabled | up    | 2021-04-08T21:29:12.000000 | -       | failed-over     | -             |
   | cinder-volume    | cinder@cinder-ceph-b | nova | enabled  | up    | 2021-04-08T21:29:11.000000 | -       | -               | up            |
   +------------------+----------------------+------+----------+-------+----------------------------+---------+-----------------+---------------+

Verification
^^^^^^^^^^^^

Re-create the VM:

.. code-block:: none

   openstack server create --volume vol-site-a-repl-boot --flavor m1.tiny \
      --key-name mykey --network int_net vm-with-boot-volume

   FLOATING_IP=$(openstack floating ip create -f value -c floating_ip_address ext_net)
   openstack server add floating ip vm-with-boot-volume $FLOATING_IP

Verify that the root device contains the expected data:

.. code-block:: none

   ssh -i ~/cloud-keys/mykey ubuntu@$FLOATING_IP
   > cat test.txt
   This is a test.
   > exit

Failback
^^^^^^^^

Failback site-a and confirm the original health of Cinder services (as per
`Cinder service list`_):

.. code-block:: none

   cinder failover-host cinder@cinder-ceph-a --backend_id default
   cinder service-list

Disaster recovery
-----------------

An uncontrolled failover is known as the disaster recovery scenario. It is
characterised by the sudden failure of the primary Ceph cluster. See the
:ref:`Cinder volume replication - Disaster recovery
<cinder_volume_replication_dr>` page for more information.

.. LINKS
.. _Ceph RBD mirroring: app-ceph-rbd-mirror.html
.. _openstack-base: https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-base/bundle.yaml
.. _openstack-bundles: https://github.com/openstack-charmers/openstack-bundles/
.. _bundle README: https://github.com/openstack-charmers/openstack-bundles/blob/master/stable/openstack-base/README.md
.. _cinder-volume-replication-overlay: cinder-volume-replication-overlay.html
.. _cinder-ceph: https://jaas.ai/cinder-ceph
.. _LP #1892201: https://bugs.launchpad.net/charm-ceph-rbd-mirror/+bug/1892201

@ -85,6 +85,7 @@ OpenStack Charms usage. To help improve it you can `file an issue`_ or

   app-erasure-coding
   app-rgw-multisite
   app-ceph-rbd-mirror
   cinder-volume-replication
   app-manila-ganesha
   app-swift