Remote Redfish Subcloud Restore

Fixed Merge conflicts
Fixed review comments for patchset 8
Fixed review comments for patchset 7
Fixed review comments for Patchset 4
Moved restoring-subclouds-from-backupdata-using-dcmanager to the Distributed Cloud Guide
Added missing files.

Story: 2008573
Task: 42332

Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ife0319125df38c54fb0baa79ac32070446a0d605
Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
(cherry picked from commit e2e42814e6)
Signed-off-by: Ron Stone <ronald.stone@windriver.com>
Authored by Juanita-Balaraj on 2021-04-23 16:59:39 -04:00
Committed by Ron Stone
parent 56b33d2b65
commit 7fcec920d2
8 changed files with 361 additions and 18 deletions


@@ -0,0 +1,3 @@
{
"restructuredtext.confPath": ""
}


@@ -28,24 +28,34 @@ specific applications must be re-applied once a storage cluster is configured.
To restore the data, use the same version of the boot image \(ISO\) that
was used at the time of the original installation.
The |prod| restore supports the following optional modes:
.. _restoring-starlingx-system-data-and-storage-ol-tw4-kvc-4jb:
- To keep the Ceph cluster data intact \(false - default option\), use the
following parameter, when passing the extra arguments to the Ansible Restore
playbook command:
.. code-block:: none
wipe_ceph_osds=false
- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
need to be recreated, use the following parameter:
.. code-block:: none
wipe_ceph_osds=true
- To indicate that the backup data file is in the /opt/platform-backup
directory on the local machine, use the following parameter:
.. code-block:: none
on_box_data=true
If this parameter is set to **false**, the Ansible Restore playbook expects
both the **initial_backup_dir** and **backup_filename** to be specified.
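For example, to keep the Ceph data intact and restore from a backup file that
is not under /opt/platform-backup on the controller, you might pass the
following extra arguments (the directory and file name are illustrative):

.. code-block:: none

   wipe_ceph_osds=false on_box_data=false initial_backup_dir=/home/sysadmin backup_filename=<backup_filename>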
Restoring a |prod| cluster from a backup file is done by re-installing the
ISO on controller-0, running the Ansible Restore Playbook, applying updates


@@ -18,22 +18,20 @@ following command to run the Ansible Restore playbook:
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"
.. rubric:: |proc|
The |prod| restore supports the following optional modes, keeping the Ceph
cluster data intact or wiping the Ceph cluster.
.. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:
- To keep the Ceph cluster data intact \(false - default option\), use the
following parameter:
.. code-block:: none
wipe_ceph_osds=false
- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
need to be recreated, use the following parameter:
.. code-block:: none
@@ -50,12 +48,23 @@ intact or wiping the Ceph cluster.
the patches and prompt you to reboot the system. Then you will need to
re-run the Ansible Restore playbook.
- To indicate that the backup data file is in the /opt/platform-backup
directory on the local machine, use the following parameter:
.. code-block:: none
on_box_data=true
If this parameter is set to **false**, the Ansible Restore playbook expects
both the **initial_backup_dir** and **backup_filename** to be specified.
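For example, to keep the Ceph data intact and restore from a backup file
already placed under /opt/platform-backup on the controller, you might pass
the following extra arguments (the file name is illustrative):

.. code-block:: none

   wipe_ceph_osds=false on_box_data=true backup_filename=localhost_platform_backup_2021_04_23.tgz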
.. rubric:: |postreq|
After running restore\_platform.yml playbook, you can restore the local
registry images.
.. note::
The backup file of the local registry images may be large. Restore the
backed-up file on a controller where there is sufficient space.


@@ -51,18 +51,27 @@ In this method you can run Ansible Restore playbook and point to controller-0.
where optional-extra-vars can be:
- **Optional**: You can select one of the following restore modes:
- To keep Ceph data intact \(false - default option\), use the
following parameter:
:command:`wipe_ceph_osds=false`
- To start with an empty Ceph cluster \(true\), where the Ceph
cluster will need to be recreated, use the following parameter:
:command:`wipe_ceph_osds=true`
- To indicate that the backup data file is in the /opt/platform-backup
directory on the local machine, use the following parameter:
:command:`on_box_data=true`
If this parameter is set to **false**, the Ansible Restore playbook
expects both the **initial_backup_dir** and **backup_filename**
to be specified.
- The backup\_filename is the platform backup tar file. It must be
provided using the ``-e`` option on the command line, for example:


@@ -48,6 +48,9 @@ Operation
managing-ldap-linux-user-accounts-on-the-system-controller
changing-the-admin-password-on-distributed-cloud
updating-docker-registry-credentials-on-a-subcloud
migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
restoring-subclouds-from-backupdata-using-dcmanager
----------------------
Manage Subcloud Groups


@@ -0,0 +1,196 @@
.. _migrate-an-aiosx-subcloud-to-an-aiodx-subcloud:
---------------------------------------
Migrate an AIO-SX to an AIO-DX Subcloud
---------------------------------------
|release-caveat|
.. rubric:: |context|
You can migrate an |AIO-SX| subcloud to an |AIO-DX| subcloud without
reinstallation. This operation involves updating the system mode, adding the
|OAM| unit IP addresses of each controller, and installing the second controller.
.. rubric:: |prereq|
A distributed cloud system is set up with at least a system controller and an
|AIO-SX| subcloud. The subcloud must be online and managed by dcmanager.
Both the management network and cluster-host network need to be configured and
cannot be on the loopback interface.
======================================
Reconfigure the Cluster-Host Interface
======================================
If the cluster-host interface is on the loopback interface, use the following
procedure to reconfigure the cluster-host interface onto a physical interface.
.. rubric:: |proc|
#. Lock the active controller.
.. code-block:: none
~(keystone_admin)$ system host-lock controller-0
#. Change the class attribute to 'none' for the loopback interface.
.. code-block:: none
~(keystone_admin)$ system host-if-modify controller-0 lo -c none
#. Delete the current cluster-host interface-network configuration.
.. code-block:: none
~(keystone_admin)$ IFNET_UUID=$(system interface-network-list controller-0 | awk '{if ($8 =="cluster-host") print $4;}')
~(keystone_admin)$ system interface-network-remove $IFNET_UUID
#. Assign the cluster-host network to the new interface. This example assumes
the interface name is mgmt0.
.. code-block:: none
~(keystone_admin)$ system interface-network-assign controller-0 mgmt0 cluster-host
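#. Optionally, confirm that the cluster-host network is now assigned to the
   new interface (output varies by system):

   .. code-block:: none

      ~(keystone_admin)$ system interface-network-list controller-0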
.. rubric:: |postreq|
Continue with the |AIO-SX| to |AIO-DX| subcloud migration, using one of the
following procedures:
Use Ansible Playbook to Migrate a Subcloud from AIO-SX to AIO-DX, or
Manually Migrate a Subcloud from AIO-SX to AIO-DX.
.. _use-ansible-playbook-to-migrate-a-subcloud-from-AIO-SX-to-AIO-DX:
================================================================
Use Ansible Playbook to Migrate a Subcloud from AIO-SX to AIO-DX
================================================================
Use the following procedure to migrate a subcloud from |AIO-SX| to |AIO-DX|
using the Ansible playbook.
.. rubric:: |prereq|
- the subcloud must be online and managed from the System Controller
- the subcloud's controller-0 may be locked or unlocked; the Ansible playbook
will lock the subcloud's controller-0 as part of migrating the subcloud
.. rubric:: |proc|
#. Create a configuration file and specify the |OAM| unit IP addresses and
the Ansible SSH password in the **migrate-subcloud1-overrides-EXAMPLE.yml**
file. The existing |OAM| IP address of the |AIO-SX| system will be used as
the |OAM| floating IP address of the new |AIO-DX| system.
In the following example, 10.10.10.13 and 10.10.10.14 are the new |OAM| unit
IP addresses for controller-0 and controller-1 respectively.
.. code-block:: none
{
"ansible_ssh_pass": "St8rlingX*",
"external_oam_node_0_address": "10.10.10.13",
"external_oam_node_1_address": "10.10.10.14",
}
#. On the system controller, run the Ansible playbook to migrate the |AIO-SX|
subcloud to an |AIO-DX| subcloud.
For example, if the subcloud name is 'subcloud1', enter:
.. code-block:: none
~(keystone_admin)$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/migrate_sx_to_dx.yml -e @migrate-subcloud1-overrides-EXAMPLE.yml -i subcloud1, -v
The Ansible playbook will lock the subcloud's controller-0, if it is not
already locked, apply the configuration changes to convert the subcloud to
an |AIO-DX| system with a single controller, and unlock controller-0.
Wait for the controller to reset and come back up to an operational state.
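From the System Controller, you can confirm the subcloud's availability, for
example:

.. code-block:: none

   ~(keystone_admin)$ dcmanager subcloud list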
#. Install and configure the second controller for the subcloud.
From the System Controller, reconfigure the subcloud, using dcmanager.
Specify the sysadmin password and the deployment configuration file, using
the :command:`dcmanager subcloud reconfig` command.
.. code-block:: none
~(keystone_admin)$ dcmanager subcloud reconfig --sysadmin-password <sysadmin_password> --deploy-config deployment-config-subcloud1-duplex.yaml <subcloud1>
where *<sysadmin_password>* is assumed to be the login password and
*<subcloud1>* is the name of the subcloud.
.. note::
``--deploy-config`` must reference a deployment configuration file for
an |AIO-DX| subcloud.
For example, **deployment-config-subcloud1-duplex.yaml** should only
include changes for controller-1, as changing fields for other
nodes/resources may cause them to go out of sync.
.. only:: partner
.. include:: /_includes/migrate-an-aiosx-subcloud-to-an-aiodx-subcloud.rest
.. _manually-migrate-a-subcloud-from-AIO-SX-to-AIO-DX:
=================================================
Manually Migrate a Subcloud from AIO-SX to AIO-DX
=================================================
As an alternative to using the Ansible playbook, use the following procedure
to manually migrate a subcloud from |AIO-SX| to |AIO-DX|. Run the following
commands on the |AIO-SX| subcloud.
.. rubric:: |proc|
#. If not already locked, lock the active controller.
.. code-block:: none
~(keystone_admin)$ system host-lock controller-0
#. Change the system mode to 'duplex'.
.. code-block:: none
~(keystone_admin)$ system modify --system_mode=duplex
#. Add the |OAM| unit IP addresses of controller-0 and controller-1.
For example, the |OAM| subnet is 10.10.10.0/24 and uses 10.10.10.13 and
10.10.10.14 for the unit IP addresses of controller-0 and controller-1
respectively. The existing |OAM| IP address of the |AIO-SX| system will be
used as the OAM floating IP address of the new |AIO-DX| system.
.. note::
Specifying only oam_c0_ip and oam_c1_ip is sufficient to configure the
|OAM| unit IPs for the transition to duplex. However, neither oam_c0_ip
nor oam_c1_ip can equal the current or specified value of oam_floating_ip.
.. code-block:: none
~(keystone_admin)$ system oam-modify oam_subnet=10.10.10.0/24 oam_gateway_ip=10.10.10.1 oam_floating_ip=10.10.10.12 oam_c0_ip=10.10.10.13 oam_c1_ip=10.10.10.14
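Before unlocking, you can optionally verify the new |OAM| configuration
(a quick check; output varies by system):

.. code-block:: none

   ~(keystone_admin)$ system oam-show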
#. Unlock the controller.
.. code-block:: none
~(keystone_admin)$ system host-unlock controller-0
Wait for the controller to reset and come back up to an operational state.
#. Install and configure the second controller for the subcloud.
For instructions on installing and configuring controller-1 in an
|AIO-DX| setup to continue with the migration, see |inst-doc|.


@@ -0,0 +1,113 @@
.. _restoring-subclouds-from-backupdata-using-dcmanager:
=========================================================
Restoring a Subcloud From Backup Data Using DCManager CLI
=========================================================
For subclouds with servers that support Redfish Virtual Media Service
(version 1.2 or higher), you can use the Central Cloud's CLI to restore the
subcloud from data that was backed up previously.
.. rubric:: |context|
The CLI command :command:`dcmanager subcloud restore` can be used to restore a
subcloud from available system data and bring it back to the operational state
it was in when the backup procedure took place. The subcloud restore has three
phases:
- Re-install the controller-0 of the subcloud with the current active load
running in the SystemController. For subcloud servers that support
Redfish Virtual Media Service, this phase can be carried out remotely
as part of the CLI.
- Run Ansible Platform Restore to restore |prod|, from a previous backup on
the controller-0 of the subcloud. This phase is also carried out as part
of the CLI.
- Unlock the controller-0 of the subcloud and continue with the steps to
restore the remaining nodes of the subcloud where applicable. This phase
is carried out by the system administrator; see :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`.
.. rubric:: |prereq|
- The SystemController is healthy and ready to accept **dcmanager**-related
commands.
- The subcloud is unmanaged, and not in the process of installation,
bootstrap, or deployment.
- The platform backup tar file is already on the subcloud in the
/opt/platform-backup directory, or has been transferred to the
SystemController.
- The subcloud install values have been saved in the **dcmanager** database,
i.e., the subcloud was installed remotely as part of :command:`dcmanager subcloud add`.
.. rubric:: |proc|
#. Create the restore_values.yaml file that will be passed to the
:command:`dcmanager subcloud restore` command using the ``--restore-values``
option. This file contains parameters that will be used during the platform
restore phase. At a minimum, the **backup_filename** parameter, indicating the
file containing a previous backup of the subcloud, must be specified in the
yaml file. See :ref:`Run Ansible Restore Playbook Remotely <system-backup-running-ansible-restore-playbook-remotely>`
and :ref:`Run Restore Playbook Locally on the Controller <running-restore-playbook-locally-on-the-controller>`
for the supported restore parameters.
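A minimal sketch of the restore values file (the backup file name is
illustrative; other supported restore parameters are optional):

.. code-block:: none

   backup_filename: subcloud1_platform_backup_2021_04_23.tgz
   # optional, for example:
   # wipe_ceph_osds: false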
#. Restore the subcloud using the dcmanager CLI command :command:`subcloud restore`,
specifying the restore values, the ``--with-install`` option, and the
subcloud's sysadmin password.
.. code-block:: none
~(keystone_admin) $ dcmanager subcloud restore --restore-values /home/sysadmin/subcloud1-restore.yaml --with-install --sysadmin-password <sysadmin_password> subcloud-name-or-id
Where:
- ``--restore-values`` must reference the restore values yaml file
mentioned in Step 1 of this procedure.
- ``--with-install`` indicates that a re-install of controller-0 of the
subcloud should be done remotely using Redfish Virtual Media Service.
If the ``--sysadmin-password`` option is not specified, the system
administrator will be prompted for the password. The password is masked
when it is entered. Enter the sysadmin password for the subcloud.
The :command:`dcmanager subcloud restore` command can take up to 30 minutes
to reinstall and restore the platform on controller-0 of the subcloud.
#. On the Central Cloud (SystemController), monitor the progress of the
subcloud reinstall and restore via the deploy status field of the
:command:`dcmanager subcloud list` command.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud list
+----+-----------+------------+--------------+---------------+---------+
| id | name | management | availability | deploy status | sync |
+----+-----------+------------+--------------+---------------+---------+
| 1 | subcloud1 | unmanaged | online | installing | unknown |
+----+-----------+------------+--------------+---------------+---------+
#. In case of a failure, check the Ansible log for the corresponding subcloud
under the /var/log/dcmanager/ansible directory.
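For example, to locate and follow the most recent log for subcloud1 (the
log file name is illustrative):

.. code-block:: none

   ~(keystone_admin)]$ ls /var/log/dcmanager/ansible/
   ~(keystone_admin)]$ tail -f /var/log/dcmanager/ansible/<subcloud1_log_file>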
#. When the subcloud deploy status changes to "complete", controller-0
is ready to be unlocked. Log in to controller-0 of the subcloud using
its bootstrap IP and unlock the host using the following command.
.. code-block:: none
~(keystone_admin)]$ system host-unlock controller-0
#. For |AIO-DX| and Standard subclouds, follow the procedure in
:ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`
to restore the remaining nodes of the subcloud.
#. To resume subcloud audit, use the following command.
.. code-block:: none
~(keystone_admin)]$ dcmanager subcloud manage <subcloud-name-or-id>