Remote Redfish Subcloud Restore
Fixed Merge conflicts
Fixed review comments for patchset 8
Fixed review comments for patchset 7
Fixed review comments for Patchset 4
Moved restoring-subclouds-from-backupdata-using-dcmanager to the Distributed Cloud Guide

Story: 2008573
Task: 42332

Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ife0319125df38c54fb0baa79ac32070446a0d605
Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
doc/source/backup/.vscode/settings.json (vendored, new file, 3 lines)
@@ -0,0 +1,3 @@
+{
+    "restructuredtext.confPath": ""
+}
@@ -28,24 +28,34 @@ specific applications must be re-applied once a storage cluster is configured.
 To restore the data, use the same version of the boot image \(ISO\) that
 was used at the time of the original installation.
 
-The |prod| restore supports two modes:
+The |prod| restore supports the following optional modes:
 
 .. _restoring-starlingx-system-data-and-storage-ol-tw4-kvc-4jb:
 
-#. To keep the Ceph cluster data intact \(false - default option\), use the
-   following syntax, when passing the extra arguments to the Ansible Restore
+- To keep the Ceph cluster data intact \(false - default option\), use the
+  following parameter, when passing the extra arguments to the Ansible Restore
   playbook command:
 
   .. code-block:: none
 
     wipe_ceph_osds=false
 
-#. To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
-   need to be recreated, use the following syntax:
+- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
+  need to be recreated, use the following parameter:
 
  .. code-block:: none
 
     wipe_ceph_osds=true
 
+- To indicate that the backup data file is under the /opt/platform-backup
+  directory on the local machine, use the following parameter:
+
+  .. code-block:: none
+
+     on_box_data=true
+
+  If this parameter is set to **false**, the Ansible Restore playbook expects
+  both the **initial_backup_dir** and **backup_filename** to be specified.
+
 Restoring a |prod| cluster from a backup file is done by re-installing the
 ISO on controller-0, running the Ansible Restore Playbook, applying updates
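For illustration only (an editor's sketch, not part of the change above): the optional parameters can be combined in a single invocation of the restore playbook. All angle-bracket values are placeholders, and the omission of **initial_backup_dir** assumes that ``on_box_data=true`` makes it unnecessary, as the added text above implies.

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> on_box_data=true wipe_ceph_osds=false"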
@@ -18,22 +18,20 @@ following command to run the Ansible Restore playbook:
 
    ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"
 
-The |prod| restore supports two optional modes, keeping the Ceph cluster data
-intact or wiping the Ceph cluster.
+The |prod| restore supports the following optional modes, keeping the Ceph
+cluster data intact or wiping the Ceph cluster.
 
-.. rubric:: |proc|
-
 .. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:
 
-#. To keep the Ceph cluster data intact \(false - default option\), use the
-   following command:
+- To keep the Ceph cluster data intact \(false - default option\), use the
+  following parameter:
 
  .. code-block:: none
 
     wipe_ceph_osds=false
 
-#. To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
-   need to be recreated, use the following command:
+- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
+  need to be recreated, use the following parameter:
 
  .. code-block:: none
 

@@ -50,12 +48,23 @@ intact or wiping the Ceph cluster.
    the patches and prompt you to reboot the system. Then you will need to
    re-run the Ansible Restore playbook.
 
+- To indicate that the backup data file is under the /opt/platform-backup
+  directory on the local machine, use the following parameter:
+
+  .. code-block:: none
+
+     on_box_data=true
+
+  If this parameter is set to **false**, the Ansible Restore playbook expects
+  both the **initial_backup_dir** and **backup_filename** to be specified.
+
 .. rubric:: |postreq|
 
 After running the restore\_platform.yml playbook, you can restore the local
 registry images.
 
 .. note::
 
    The backup file of the local registry images may be large. Restore the
    backed up file on the controller, where there is sufficient space.
 
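A complementary sketch (editor's illustration, not part of the change): with ``on_box_data=false``, the added text above states that both **initial_backup_dir** and **backup_filename** must be supplied; all angle-bracket values are placeholders.

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> on_box_data=false"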
@@ -51,18 +51,27 @@ In this method you can run Ansible Restore playbook and point to controller-0.
 
 where optional-extra-vars can be:
 
-- **Optional**: You can select one of the two restore modes:
+- **Optional**: You can select one of the following restore modes:
 
   - To keep Ceph data intact \(false - default option\), use the
-    following syntax:
+    following parameter:
 
     :command:`wipe_ceph_osds=false`
 
-  - Start with an empty Ceph cluster \(true\), to recreate a new
-    Ceph cluster, use the following syntax:
+  - To start with an empty Ceph cluster \(true\), where the Ceph
+    cluster will need to be recreated, use the following parameter:
 
     :command:`wipe_ceph_osds=true`
 
+  - To indicate that the backup data file is under the /opt/platform-backup
+    directory on the local machine, use the following parameter:
+
+    :command:`on_box_data=true`
+
+    If this parameter is set to **false**, the Ansible Restore playbook
+    expects both the **initial_backup_dir** and **backup_filename**
+    to be specified.
+
 - The backup\_filename is the platform backup tar file. It must be
   provided using the ``-e`` option on the command line, for example:
 
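A minimal sketch of the remote variant (editor's illustration: the hunk does not show how the playbook is pointed at controller-0, so the inventory file and host name here are assumptions, and the backup filename is a placeholder):

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml --inventory <inventory_file> --limit <subcloud_controller_0> -e "backup_filename=<backup_filename> wipe_ceph_osds=false"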
@@ -49,6 +49,7 @@ Operation
    changing-the-admin-password-on-distributed-cloud
    updating-docker-registry-credentials-on-a-subcloud
    migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
+   restoring-subclouds-from-backupdata-using-dcmanager
 
 ----------------------------------------------------------
 Kubernetes Version Upgrade Distributed Cloud Orchestration
@@ -0,0 +1,113 @@
+.. _restoring-subclouds-from-backupdata-using-dcmanager:
+
+=========================================================
+Restoring a Subcloud From Backup Data Using DCManager CLI
+=========================================================
+
+For subclouds with servers that support Redfish Virtual Media Service
+(version 1.2 or higher), you can use the Central Cloud's CLI to restore the
+subcloud from data that was backed up previously.
+
+.. rubric:: |context|
+
+The CLI command :command:`dcmanager subcloud restore` can be used to restore a
+subcloud from available system data and bring it back to the operational state
+it was in when the backup procedure took place. The subcloud restore has three
+phases:
+
+- Re-install controller-0 of the subcloud with the current active load
+  running in the SystemController. For subcloud servers that support
+  Redfish Virtual Media Service, this phase can be carried out remotely
+  as part of the CLI.
+
+- Run the Ansible Platform Restore to restore |prod| from a previous backup
+  on controller-0 of the subcloud. This phase is also carried out as part
+  of the CLI.
+
+- Unlock controller-0 of the subcloud and continue with the steps to
+  restore the remaining nodes of the subcloud where applicable. This phase
+  is carried out by the system administrator, see :ref:`Restoring Platform
+  System Data and Storage <restoring-starlingx-system-data-and-storage>`.
+
+.. rubric:: |prereq|
+
+- The SystemController is healthy, and ready to accept **dcmanager** related
+  commands.
+
+- The subcloud is unmanaged, and not in the process of installation,
+  bootstrap, or deployment.
+
+- The platform backup tar file is already in the /opt/platform-backup
+  directory on the subcloud, or has been transferred to the
+  SystemController.
+
+- The subcloud install values have been saved in the **dcmanager** database,
+  that is, the subcloud has been installed remotely as part of
+  :command:`dcmanager subcloud add`.
+
+.. rubric:: |proc|
+
+#. Create the restore_values.yaml file that will be passed to the
+   :command:`dcmanager subcloud restore` command using the
+   ``--restore-values`` option. This file contains parameters that will be
+   used during the platform restore phase. Minimally, the **backup_filename**
+   parameter, indicating the file containing a previous backup of the
+   subcloud, must be specified in the yaml file. See :ref:`Run Ansible Restore Playbook Remotely <system-backup-running-ansible-restore-playbook-remotely>`
+   and :ref:`Run Restore Playbook Locally on the Controller <running-restore-playbook-locally-on-the-controller>`
+   for the supported restore parameters.
+
+#. Restore the subcloud using the dcmanager CLI command
+   :command:`subcloud restore`, and specify the restore values, the
+   ``--with-install`` option, and the subcloud's sysadmin password.
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ dcmanager subcloud restore --restore-values /home/sysadmin/subcloud1-restore.yaml --with-install --sysadmin-password <sysadmin_password> <subcloud-name-or-id>
+
+   where:
+
+   - ``--restore-values`` must reference the restore values yaml file
+     mentioned in Step 1 of this procedure.
+
+   - ``--with-install`` indicates that a re-install of controller-0 of the
+     subcloud should be done remotely using Redfish Virtual Media Service.
+
+   If the ``--sysadmin-password`` option is not specified, the system
+   administrator will be prompted for the password. The password is masked
+   when it is entered. Enter the sysadmin password for the subcloud.
+
+   The :command:`dcmanager subcloud restore` command can take up to 30
+   minutes to reinstall and restore the platform on controller-0 of the
+   subcloud.
+
+#. On the Central Cloud (SystemController), monitor the progress of the
+   subcloud reinstall and restore via the deploy status field of the
+   :command:`dcmanager subcloud list` command.
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ dcmanager subcloud list
+
+      +----+-----------+------------+--------------+---------------+---------+
+      | id | name      | management | availability | deploy status | sync    |
+      +----+-----------+------------+--------------+---------------+---------+
+      |  1 | subcloud1 | unmanaged  | online       | installing    | unknown |
+      +----+-----------+------------+--------------+---------------+---------+
+
+#. In case of a failure, check the Ansible log for the corresponding
+   subcloud under the /var/log/dcmanager/ansible directory.
+
+#. When the subcloud deploy status changes to "complete", controller-0 is
+   ready to be unlocked. Log in to controller-0 of the subcloud using its
+   bootstrap IP, and unlock the host using the following command:
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ system host-unlock controller-0
+
+#. For |AIO|-DX and Standard subclouds, follow the procedure in
+   :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`
+   to restore the rest of the subcloud nodes.
+
+#. To resume the subcloud audit, use the following command:
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ dcmanager subcloud manage <subcloud-name-or-id>
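An illustrative restore values file for Step 1 of the procedure above (editor's sketch, not part of the commit): per the text, **backup_filename** is the only mandatory parameter; the optional entries shown are assumptions drawn from the restore parameters documented in the earlier hunks, and the file name and values are placeholders.

.. code-block:: none

   ~(keystone_admin)]$ cat /home/sysadmin/subcloud1-restore.yaml
   backup_filename: <backup_filename>
   wipe_ceph_osds: false
   on_box_data: true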