Remote Redfish Subcloud Restore
Fixed Merge conflicts
Fixed review comments for patchset 8
Fixed review comments for patchset 7
Fixed review comments for Patchset 4
Moved restoring-subclouds-from-backupdata-using-dcmanager to the Distributed Cloud Guide

Story: 2008573
Task: 42332

Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
Change-Id: Ife0319125df38c54fb0baa79ac32070446a0d605
Signed-off-by: Juanita-Balaraj <juanita.balaraj@windriver.com>
doc/source/backup/.vscode/settings.json (vendored, new file, 3 lines)
@@ -0,0 +1,3 @@
+{
+    "restructuredtext.confPath": ""
+}
@@ -28,24 +28,34 @@ specific applications must be re-applied once a storage cluster is configured.
 To restore the data, use the same version of the boot image \(ISO\) that
 was used at the time of the original installation.
 
-The |prod| restore supports two modes:
+The |prod| restore supports the following optional modes:
 
 .. _restoring-starlingx-system-data-and-storage-ol-tw4-kvc-4jb:
 
-#. To keep the Ceph cluster data intact \(false - default option\), use the
-   following syntax, when passing the extra arguments to the Ansible Restore
+- To keep the Ceph cluster data intact \(false - default option\), use the
+  following parameter, when passing the extra arguments to the Ansible Restore
   playbook command:
 
   .. code-block:: none
 
     wipe_ceph_osds=false
 
-#. To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
-   need to be recreated, use the following syntax:
+- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
+  need to be recreated, use the following parameter:
 
  .. code-block:: none
 
     wipe_ceph_osds=true
 
+- To indicate that the backup data file is under the /opt/platform-backup
+  directory on the local machine, use the following parameter:
+
+  .. code-block:: none
+
+     on_box_data=true
+
+  If this parameter is set to **false**, the Ansible Restore playbook expects
+  both the **initial_backup_dir** and **backup_filename** to be specified.
+
 Restoring a |prod| cluster from a backup file is done by re-installing the
 ISO on controller-0, running the Ansible Restore Playbook, applying updates
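For illustration only (an editor's sketch, not part of the change above): the optional parameters can be combined in a single invocation of the restore playbook. All angle-bracket values are placeholders, and the omission of **initial_backup_dir** assumes that ``on_box_data=true`` makes it unnecessary, as the added text above implies.

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> on_box_data=true wipe_ceph_osds=false"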
@@ -18,22 +18,20 @@ following command to run the Ansible Restore playbook:
 
    ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"
 
-The |prod| restore supports two optional modes, keeping the Ceph cluster data
-intact or wiping the Ceph cluster.
+The |prod| restore supports the following optional modes, keeping the Ceph
+cluster data intact or wiping the Ceph cluster.
 
-.. rubric:: |proc|
-
 .. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:
 
-#. To keep the Ceph cluster data intact \(false - default option\), use the
-   following command:
+- To keep the Ceph cluster data intact \(false - default option\), use the
+  following parameter:
 
  .. code-block:: none
 
     wipe_ceph_osds=false
 
-#. To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
-   need to be recreated, use the following command:
+- To wipe the Ceph cluster entirely \(true\), where the Ceph cluster will
+  need to be recreated, use the following parameter:
 
  .. code-block:: none
 

@@ -50,12 +48,23 @@ intact or wiping the Ceph cluster.
    the patches and prompt you to reboot the system. Then you will need to
    re-run the Ansible Restore playbook.
 
+- To indicate that the backup data file is under the /opt/platform-backup
+  directory on the local machine, use the following parameter:
+
+  .. code-block:: none
+
+     on_box_data=true
+
+  If this parameter is set to **false**, the Ansible Restore playbook expects
+  both the **initial_backup_dir** and **backup_filename** to be specified.
+
 .. rubric:: |postreq|
 
 After running the restore\_platform.yml playbook, you can restore the local
 registry images.
 
 .. note::
 
    The backup file of the local registry images may be large. Restore the
    backed up file on the controller, where there is sufficient space.
 
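A complementary sketch (editor's illustration, not part of the change): with ``on_box_data=false``, the added text above states that both **initial_backup_dir** and **backup_filename** must be supplied; all angle-bracket values are placeholders.

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> on_box_data=false"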
@@ -51,18 +51,27 @@ In this method you can run Ansible Restore playbook and point to controller-0.
 
 where optional-extra-vars can be:
 
-- **Optional**: You can select one of the two restore modes:
+- **Optional**: You can select one of the following restore modes:
 
   - To keep Ceph data intact \(false - default option\), use the
-    following syntax:
+    following parameter:
 
     :command:`wipe_ceph_osds=false`
 
-  - Start with an empty Ceph cluster \(true\), to recreate a new
-    Ceph cluster, use the following syntax:
+  - To start with an empty Ceph cluster \(true\), where the Ceph
+    cluster will need to be recreated, use the following parameter:
 
     :command:`wipe_ceph_osds=true`
 
+  - To indicate that the backup data file is under the /opt/platform-backup
+    directory on the local machine, use the following parameter:
+
+    :command:`on_box_data=true`
+
+    If this parameter is set to **false**, the Ansible Restore playbook
+    expects both the **initial_backup_dir** and **backup_filename**
+    to be specified.
+
 - The backup\_filename is the platform backup tar file. It must be
   provided using the ``-e`` option on the command line, for example:
 
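A minimal sketch of the remote variant (editor's illustration: the hunk does not show how the playbook is pointed at controller-0, so the inventory file and host name here are assumptions, and the backup filename is a placeholder):

.. code-block:: none

   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml --inventory <inventory_file> --limit <subcloud_controller_0> -e "backup_filename=<backup_filename> wipe_ceph_osds=false"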
@@ -49,6 +49,7 @@ Operation
    changing-the-admin-password-on-distributed-cloud
    updating-docker-registry-credentials-on-a-subcloud
    migrate-an-aiosx-subcloud-to-an-aiodx-subcloud
+   restoring-subclouds-from-backupdata-using-dcmanager
 
 ----------------------------------------------------------
 Kubernetes Version Upgrade Distributed Cloud Orchestration
@@ -0,0 +1,113 @@
+.. _restoring-subclouds-from-backupdata-using-dcmanager:
+
+=========================================================
+Restoring a Subcloud From Backup Data Using DCManager CLI
+=========================================================
+
+For subclouds with servers that support Redfish Virtual Media Service
+(version 1.2 or higher), you can use the Central Cloud's CLI to restore the
+subcloud from data that was backed up previously.
+
+.. rubric:: |context|
+
+The CLI command :command:`dcmanager subcloud restore` can be used to restore a
+subcloud from available system data and bring it back to the operational state
+it was in when the backup procedure took place. The subcloud restore has three
+phases:
+
+- Re-install controller-0 of the subcloud with the current active load
+  running in the SystemController. For subcloud servers that support
+  Redfish Virtual Media Service, this phase can be carried out remotely
+  as part of the CLI.
+
+- Run the Ansible Platform Restore to restore |prod| from a previous backup
+  on controller-0 of the subcloud. This phase is also carried out as part
+  of the CLI.
+
+- Unlock controller-0 of the subcloud and continue with the steps to
+  restore the remaining nodes of the subcloud where applicable. This phase
+  is carried out by the system administrator, see :ref:`Restoring Platform
+  System Data and Storage <restoring-starlingx-system-data-and-storage>`.
+
+.. rubric:: |prereq|
+
+- The SystemController is healthy, and ready to accept **dcmanager** related
+  commands.
+
+- The subcloud is unmanaged, and not in the process of installation,
+  bootstrap, or deployment.
+
+- The platform backup tar file is already in the /opt/platform-backup
+  directory on the subcloud, or has been transferred to the
+  SystemController.
+
+- The subcloud install values have been saved in the **dcmanager** database,
+  that is, the subcloud has been installed remotely as part of
+  :command:`dcmanager subcloud add`.
+
+.. rubric:: |proc|
+
+#. Create the restore_values.yaml file that will be passed to the
+   :command:`dcmanager subcloud restore` command using the
+   ``--restore-values`` option. This file contains parameters that will be
+   used during the platform restore phase. Minimally, the **backup_filename**
+   parameter, indicating the file containing a previous backup of the
+   subcloud, must be specified in the yaml file. See :ref:`Run Ansible Restore Playbook Remotely <system-backup-running-ansible-restore-playbook-remotely>`
+   and :ref:`Run Restore Playbook Locally on the Controller <running-restore-playbook-locally-on-the-controller>`
+   for the supported restore parameters.
+
+#. Restore the subcloud using the dcmanager CLI command
+   :command:`subcloud restore`, and specify the restore values, the
+   ``--with-install`` option, and the subcloud's sysadmin password.
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ dcmanager subcloud restore --restore-values /home/sysadmin/subcloud1-restore.yaml --with-install --sysadmin-password <sysadmin_password> <subcloud-name-or-id>
+
+   where:
+
+   - ``--restore-values`` must reference the restore values yaml file
+     mentioned in Step 1 of this procedure.
+
+   - ``--with-install`` indicates that a re-install of controller-0 of the
+     subcloud should be done remotely using Redfish Virtual Media Service.
+
+   If the ``--sysadmin-password`` option is not specified, the system
+   administrator will be prompted for the password. The password is masked
+   when it is entered. Enter the sysadmin password for the subcloud.
+
+   The :command:`dcmanager subcloud restore` command can take up to 30
+   minutes to reinstall and restore the platform on controller-0 of the
+   subcloud.
+
+#. On the Central Cloud (SystemController), monitor the progress of the
+   subcloud reinstall and restore via the deploy status field of the
+   :command:`dcmanager subcloud list` command.
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ dcmanager subcloud list
+
+      +----+-----------+------------+--------------+---------------+---------+
+      | id | name      | management | availability | deploy status | sync    |
+      +----+-----------+------------+--------------+---------------+---------+
+      |  1 | subcloud1 | unmanaged  | online       | installing    | unknown |
+      +----+-----------+------------+--------------+---------------+---------+
+
+#. In case of a failure, check the Ansible log for the corresponding
+   subcloud under the /var/log/dcmanager/ansible directory.
+
+#. When the subcloud deploy status changes to "complete", controller-0 is
+   ready to be unlocked. Log in to controller-0 of the subcloud using its
+   bootstrap IP, and unlock the host using the following command:
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ system host-unlock controller-0
+
+#. For |AIO|-DX and Standard subclouds, follow the procedure in
+   :ref:`Restoring Platform System Data and Storage <restoring-starlingx-system-data-and-storage>`
+   to restore the rest of the subcloud nodes.
+
+#. To resume the subcloud audit, use the following command:
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ dcmanager subcloud manage <subcloud-name-or-id>
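An illustrative restore values file for Step 1 of the procedure above (editor's sketch, not part of the commit): per the text, **backup_filename** is the only mandatory parameter; the optional entries shown are assumptions drawn from the restore parameters documented in the earlier hunks, and the file name and values are placeholders.

.. code-block:: none

   ~(keystone_admin)]$ cat /home/sysadmin/subcloud1-restore.yaml
   backup_filename: <backup_filename>
   wipe_ceph_osds: false
   on_box_data: true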