Remote Redfish Subcloud Restore
Fixed Merge conflicts
Fixed review comments for patchset 8
Fixed review comments for patchset 7
Fixed review comments for Patchset 4
Moved restoring-subclouds-from-backupdata-using-dcmanager to the Distributed Cloud Guide
Added missing files.
Story: 2008573
Task: 42332
Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ife0319125df38c54fb0baa79ac32070446a0d605
(cherry picked from commit e2e42814e6)
Signed-off-by: Ron Stone <ronald.stone@windriver.com>
doc/source/backup/.vscode/settings.json (new file)
@@ -0,0 +1,3 @@
{
    "restructuredtext.confPath": ""
}
@@ -28,24 +28,34 @@ specific applications must be re-applied once a storage cluster is configured.

To restore the data, use the same version of the boot image \(ISO\) that
was used at the time of the original installation.

The |prod| restore supports the following optional modes:

.. _restoring-starlingx-system-data-and-storage-ol-tw4-kvc-4jb:

- To keep the Ceph cluster data intact \(false - default option\), use the
  following parameter, when passing the extra arguments to the Ansible Restore
  playbook command:

  .. code-block:: none

     wipe_ceph_osds=false

- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
  need to be recreated, use the following parameter:

  .. code-block:: none

     wipe_ceph_osds=true

- To indicate that the backup data file is under the /opt/platform-backup
  directory on the local machine, use the following parameter:

  .. code-block:: none

     on_box_data=true

  If this parameter is set to **false**, the Ansible Restore playbook expects
  both the **initial_backup_dir** and **backup_filename** to be specified.
  See the combined example after this list.
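
For illustration only, these extra arguments are appended to the Ansible
Restore playbook command described in the following topics; the backup
directory and file name below are hypothetical:

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=localhost_platform_backup.tgz wipe_ceph_osds=false"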

Restoring a |prod| cluster from a backup file is done by re-installing the
ISO on controller-0, running the Ansible Restore Playbook, applying updates
@@ -18,22 +18,20 @@ following command to run the Ansible Restore playbook:

~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"

The |prod| restore supports the following optional modes, keeping the Ceph
cluster data intact or wiping the Ceph cluster.

.. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:

- To keep the Ceph cluster data intact \(false - default option\), use the
  following parameter:

  .. code-block:: none

     wipe_ceph_osds=false

- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
  need to be recreated, use the following parameter:

  .. code-block:: none
@@ -50,12 +48,23 @@ intact or wiping the Ceph cluster.

the patches and prompt you to reboot the system. Then you will need to
re-run the Ansible Restore playbook.

- To indicate that the backup data file is under the /opt/platform-backup
  directory on the local machine, use the following parameter:

  .. code-block:: none

     on_box_data=true

  If this parameter is set to **false**, the Ansible Restore playbook expects
  both the **initial_backup_dir** and **backup_filename** to be specified.
  See the example after this list.

.. rubric:: |postreq|

After running the restore\_platform.yml playbook, you can restore the local
registry images.

.. note::

   The backup file of the local registry images may be large. Restore the
   backed up file on the controller, where there is sufficient space.
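
For illustration only, a local run that keeps the backup file outside the
/opt/platform-backup directory would set ``on_box_data=false`` and pass both
**initial_backup_dir** and **backup_filename**; the directory and file name
below are hypothetical:

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=localhost_platform_backup.tgz on_box_data=false"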
@@ -51,18 +51,27 @@ In this method you can run Ansible Restore playbook and point to controller-0.

where optional-extra-vars can be:

- **Optional**: You can select one of the following restore modes:

  - To keep Ceph data intact \(false - default option\), use the
    following parameter:

    :command:`wipe_ceph_osds=false`

  - To start with an empty Ceph cluster \(true\), where the Ceph
    cluster will need to be recreated, use the following parameter:

    :command:`wipe_ceph_osds=true`

  - To indicate that the backup data file is under the /opt/platform-backup
    directory on the local machine, use the following parameter:

    :command:`on_box_data=true`

    If this parameter is set to **false**, the Ansible Restore playbook
    expects both the **initial_backup_dir** and **backup_filename**
    to be specified.

- The backup\_filename is the platform backup tar file. It must be
  provided using the ``-e`` option on the command line, for example:
@@ -48,6 +48,9 @@ Operation

   managing-ldap-linux-user-accounts-on-the-system-controller
   changing-the-admin-password-on-distributed-cloud
   updating-docker-registry-credentials-on-a-subcloud
   migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
   restoring-subclouds-from-backupdata-using-dcmanager


----------------------
Manage Subcloud Groups
@@ -0,0 +1,196 @@

.. _migrate-an-aiosx-subcloud-to-an-aiodx-subcloud:

---------------------------------------
Migrate an AIO-SX to an AIO-DX Subcloud
---------------------------------------

|release-caveat|

.. rubric:: |context|

You can migrate an |AIO-SX| subcloud to an |AIO-DX| subcloud without
reinstallation. This operation involves updating the system mode, adding the
|OAM| unit IP addresses of each controller, and installing the second controller.

.. rubric:: |prereq|

A distributed cloud system is set up with at least a system controller and an
|AIO-SX| subcloud. The subcloud must be online and managed by dcmanager.
Both the management network and cluster-host network need to be configured and
cannot be on the loopback interface.

======================================
Reconfigure the Cluster-Host Interface
======================================

If the cluster-host interface is on the loopback interface, use the following
procedure to reconfigure the cluster-host interface onto a physical interface.

.. rubric:: |proc|

#. Lock the active controller.

   .. code-block:: none

      ~(keystone_admin)$ system host-lock controller-0

#. Change the class attribute to 'none' for the loopback interface.

   .. code-block:: none

      ~(keystone_admin)$ system host-if-modify controller-0 lo -c none

#. Delete the current cluster-host interface-network configuration.

   .. code-block:: none

      ~(keystone_admin)$ IFNET_UUID=$(system interface-network-list controller-0 | awk '{if ($8 =="cluster-host") print $4;}')
      ~(keystone_admin)$ system interface-network-remove $IFNET_UUID

#. Assign the cluster-host network to the new interface. This example assumes
   the interface name is mgmt0.

   .. code-block:: none

      ~(keystone_admin)$ system interface-network-assign controller-0 mgmt0 cluster-host
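
   Optionally, to confirm the new assignment before continuing, list the
   interface-network mappings again, reusing the listing command from the
   previous step:

   .. code-block:: none

      ~(keystone_admin)$ system interface-network-list controller-0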

.. rubric:: |postreq|

Continue with the |AIO-SX| to |AIO-DX| subcloud migration, using one of the
following procedures: Use Ansible Playbook to Migrate a Subcloud from AIO-SX
to AIO-DX, or Manually Migrate a Subcloud from AIO-SX to AIO-DX.


.. _use-ansible-playbook-to-migrate-a-subcloud-from-AIO-SX-to-AIO-DX:

================================================================
Use Ansible Playbook to Migrate a Subcloud from AIO-SX to AIO-DX
================================================================

Use the following procedure to migrate a subcloud from |AIO-SX| to |AIO-DX|
using the Ansible playbook.

.. rubric:: |prereq|

- The subcloud must be online and managed from the System Controller.

- The subcloud's controller-0 may be locked or unlocked; the Ansible playbook
  will lock the subcloud controller-0 as part of migrating the subcloud.

.. rubric:: |proc|

#. Create a configuration file and specify the |OAM| unit IP addresses and
   the Ansible ssh password in the **migrate-subcloud1-overrides-EXAMPLE.yml**
   file. The existing |OAM| IP address of the |AIO-SX| system will be used as
   the |OAM| floating IP address of the new |AIO-DX| system.

   In the following example, 10.10.10.13 and 10.10.10.14 are the new |OAM| unit
   IP addresses for controller-0 and controller-1 respectively.

   .. code-block:: none

      {
        "ansible_ssh_pass": "St8rlingX*",
        "external_oam_node_0_address": "10.10.10.13",
        "external_oam_node_1_address": "10.10.10.14"
      }
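
   Since Ansible also accepts YAML for ``-e @file`` overrides, the same
   settings could equivalently be written in YAML block style; this is an
   optional alternative form of the example above:

   .. code-block:: none

      ansible_ssh_pass: St8rlingX*
      external_oam_node_0_address: 10.10.10.13
      external_oam_node_1_address: 10.10.10.14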

#. On the system controller, run the Ansible playbook to migrate the |AIO-SX|
   subcloud to an |AIO-DX|.

   For example, if the subcloud name is 'subcloud1', enter:

   .. code-block:: none

      ~(keystone_admin)$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/migrate_sx_to_dx.yml -e @migrate-subcloud1-overrides-EXAMPLE.yml -i subcloud1, -v

   The Ansible playbook will lock the subcloud's controller-0, if it is not
   already locked, apply the configuration changes to convert the subcloud to
   an |AIO-DX| system with a single controller, and unlock controller-0.
   Wait for the controller to reset and come back up to an operational state.

#. Install software on and configure the second controller for the subcloud.

   From the System Controller, reconfigure the subcloud, using dcmanager.
   Specify the sysadmin password and the deployment configuration file, using
   the :command:`dcmanager subcloud reconfig` command.

   .. code-block:: none

      ~(keystone_admin)$ dcmanager subcloud reconfig --sysadmin-password <sysadmin_password> --deploy-config deployment-config-subcloud1-duplex.yaml <subcloud1>

   where *<sysadmin_password>* is assumed to be the login password and
   *<subcloud1>* is the name of the subcloud.

   .. note::

      ``--deploy-config`` must reference a deployment configuration file for
      an |AIO-DX| subcloud.

      For example, **deployment-config-subcloud1-duplex.yaml** should only
      include changes for controller-1, as changing fields for other
      nodes/resources may cause them to go out of sync.

.. only:: partner

   .. include:: /_includes/migrate-an-aiosx-subcloud-to-an-aiodx-subcloud.rest


.. _manually-migrate-a-subcloud-from-AIO-SX-to-AIO-DX:

=================================================
Manually Migrate a Subcloud from AIO-SX to AIO-DX
=================================================

As an alternative to using the Ansible playbook, use the following procedure
to manually migrate a subcloud from |AIO-SX| to |AIO-DX|. Perform the following
commands on the |AIO-SX| subcloud.

.. rubric:: |proc|

#. If not already locked, lock the active controller.

   .. code-block:: none

      ~(keystone_admin)$ system host-lock controller-0

#. Change the system mode to 'duplex'.

   .. code-block:: none

      ~(keystone_admin)$ system modify --system_mode=duplex

#. Add the |OAM| unit IP addresses of controller-0 and controller-1.

   For example, the |OAM| subnet is 10.10.10.0/24 and uses 10.10.10.13 and
   10.10.10.14 for the unit IP addresses of controller-0 and controller-1
   respectively. The existing |OAM| IP address of the |AIO-SX| system will be
   used as the |OAM| floating IP address of the new |AIO-DX| system.

   .. note::

      Only oam_c0_ip and oam_c1_ip need to be specified to configure the
      |OAM| unit IPs for the transition to duplex. However, oam_c0_ip and
      oam_c1_ip cannot equal the current or specified value for
      oam_floating_ip.

   .. code-block:: none

      ~(keystone_admin)$ system oam-modify oam_subnet=10.10.10.0/24 oam_gateway_ip=10.10.10.1 oam_floating_ip=10.10.10.12 oam_c0_ip=10.10.10.13 oam_c1_ip=10.10.10.14

#. Unlock the controller.

   .. code-block:: none

      ~(keystone_admin)$ system host-unlock controller-0

   Wait for the controller to reset and come back up to an operational state.

#. Install software on and configure the second controller for the subcloud.

   For instructions on installing and configuring controller-1 in an
   |AIO-DX| setup to continue with the migration, see |inst-doc|.
@@ -0,0 +1,113 @@

.. _restoring-subclouds-from-backupdata-using-dcmanager:

=========================================================
Restoring a Subcloud From Backup Data Using DCManager CLI
=========================================================

For subclouds with servers that support Redfish Virtual Media Service
(version 1.2 or higher), you can use the Central Cloud's CLI to restore the
subcloud from data that was backed up previously.

.. rubric:: |context|

The CLI command :command:`dcmanager subcloud restore` can be used to restore a
subcloud from available system data and bring it back to the operational state
it was in when the backup procedure took place. The subcloud restore has three
phases:

- Re-install controller-0 of the subcloud with the current active load
  running in the SystemController. For subcloud servers that support
  Redfish Virtual Media Service, this phase can be carried out remotely
  as part of the CLI.

- Run Ansible Platform Restore to restore |prod| from a previous backup on
  controller-0 of the subcloud. This phase is also carried out as part
  of the CLI.

- Unlock controller-0 of the subcloud and continue with the steps to
  restore the remaining nodes of the subcloud where applicable. This phase
  is carried out by the system administrator, see :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`.

.. rubric:: |prereq|

- The SystemController is healthy, and ready to accept **dcmanager** related
  commands.

- The subcloud is unmanaged, and not in the process of installation,
  bootstrap, or deployment.

- The platform backup tar file is already in the /opt/platform-backup
  directory on the subcloud, or has been transferred to the
  SystemController.

- The subcloud install values have been saved in the **dcmanager** database,
  i.e., the subcloud has been installed remotely as part of :command:`dcmanager subcloud add`.

.. rubric:: |proc|

#. Create the restore_values.yaml file, which will be passed to the
   :command:`dcmanager subcloud restore` command using the ``--restore-values``
   option. This file contains parameters that will be used during the platform
   restore phase. At a minimum, the **backup_filename** parameter, indicating the
   file containing a previous backup of the subcloud, must be specified in the
   yaml file. See :ref:`Run Ansible Restore Playbook Remotely <system-backup-running-ansible-restore-playbook-remotely>`
   and :ref:`Run Restore Playbook Locally on the Controller <running-restore-playbook-locally-on-the-controller>`
   for supported restore parameters.
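
   For illustration only, a minimal restore_values.yaml could contain just the
   backup file name; the file name below is hypothetical, and optional
   parameters such as **on_box_data** can be added in the same way:

   .. code-block:: none

      backup_filename: subcloud1_platform_backup.tgz
      on_box_data: true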

#. Restore the subcloud using the dcmanager CLI command :command:`subcloud restore`,
   specifying the restore values, the ``--with-install`` option, and the
   subcloud's sysadmin password.

   .. code-block:: none

      ~(keystone_admin)$ dcmanager subcloud restore --restore-values /home/sysadmin/subcloud1-restore.yaml --with-install --sysadmin-password <sysadmin_password> subcloud-name-or-id

   Where:

   - ``--restore-values`` must reference the restore values yaml file
     mentioned in Step 1 of this procedure.

   - ``--with-install`` indicates that a re-install of controller-0 of the
     subcloud should be done remotely using Redfish Virtual Media Service.

   If the ``--sysadmin-password`` option is not specified, the system
   administrator will be prompted for the password. The password is masked
   when it is entered. Enter the sysadmin password for the subcloud.
   The :command:`dcmanager subcloud restore` command can take up to 30 minutes
   to reinstall and restore the platform on controller-0 of the subcloud.

#. On the Central Cloud (SystemController), monitor the progress of the
   subcloud reinstall and restore via the deploy status field of the
   :command:`dcmanager subcloud list` command.

   .. code-block:: none

      ~(keystone_admin)]$ dcmanager subcloud list

      +----+-----------+------------+--------------+---------------+---------+
      | id | name      | management | availability | deploy status | sync    |
      +----+-----------+------------+--------------+---------------+---------+
      |  1 | subcloud1 | unmanaged  | online       | installing    | unknown |
      +----+-----------+------------+--------------+---------------+---------+

#. In case of a failure, check the Ansible log for the corresponding subcloud
   under the /var/log/dcmanager/ansible directory.

#. When the subcloud deploy status changes to "complete", controller-0
   is ready to be unlocked. Log into controller-0 of the subcloud using
   its bootstrap IP and unlock the host using the following command.

   .. code-block:: none

      ~(keystone_admin)]$ system host-unlock controller-0

#. For |AIO|-DX and Standard subclouds, to restore the rest of the subcloud
   nodes, follow the procedure :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`.

#. To resume the subcloud audit, use the following command.

   .. code-block:: none

      ~(keystone_admin)]$ dcmanager subcloud manage <subcloud-name-or-id>