From 4324ab7f3dee03c4c63227532da11ac6df595362 Mon Sep 17 00:00:00 2001
From: Ngairangbam Mili
Date: Wed, 15 Nov 2023 17:03:41 +0000
Subject: [PATCH] Improve legacy restore steps and note differences (ds8)

Change-Id: Iac330aec17a839197f251f4ec14676f21a7232ae
Signed-off-by: Ngairangbam Mili
---
 ...ring-starlingx-system-data-and-storage.rst |  36 +++-
 ...kup-playbook-locally-on-the-controller.rst |  52 ++++-
 ...ore-playbook-locally-on-the-controller.rst | 189 ++++++++++--------
 3 files changed, 171 insertions(+), 106 deletions(-)

diff --git a/doc/source/backup/kubernetes/restoring-starlingx-system-data-and-storage.rst b/doc/source/backup/kubernetes/restoring-starlingx-system-data-and-storage.rst
index 568b70a11..4c9572d17 100644
--- a/doc/source/backup/kubernetes/restoring-starlingx-system-data-and-storage.rst
+++ b/doc/source/backup/kubernetes/restoring-starlingx-system-data-and-storage.rst
@@ -11,6 +11,9 @@ You can perform a system restore (controllers, workers, including or excluding
 storage nodes) of a |prod| cluster from a previous system backup and bring it
 back to the operational state it was when the backup procedure took place.
 
+There are two restore modes: optimized restore and legacy restore. Optimized restore
+must be used on |AIO-SX|, and legacy restore must be used on systems that are not |AIO-SX|.
+
 .. rubric:: |context|
 
 Kubernetes configuration will be restored and pods that are started from
@@ -130,11 +133,12 @@ conditions are in place:
 
 #. Install network connectivity required for the subcloud.
 
-#. Any patches that were present at the time of the backup will need to be
-   manually applied. This may include doing a reboot if required.
-   See :ref:`Install Kubernetes Platform on All-in-one Simplex <aio_simplex_install_kubernetes_r7>`;
-   ``Install Software on Controller-0`` for steps on how to install patches
-   using the :command:`sw-patch install-local` command.
+#. Ensure that the system is at the same patch level as it was when the backup
+   was taken. On |AIO-SX| systems, you must manually reinstall any
+   previous patches. This may include doing a reboot if required.
+
+   For steps on how to install patches using the :command:`sw-patch install-local` command, see :ref:`aio_simplex_install_kubernetes_r7`;
+   ``Install Software on Controller-0``.
 
    After the reboot, you can verify that the updates were applied.
 
     :start-after: sw-patch-query-begin
     :end-before: sw-patch-query-end
 
+   .. note::
+
+      On systems that are not |AIO-SX|, you can skip this step as long as
+      ``skip_patching=true`` is not used; patches are automatically
+      reinstalled from the backup by default.
+
 #. Ensure that the backup files are available on the controller. Run both
    Ansible Restore playbooks, restore_platform.yml and restore_user_images.yml.
    For more information on restoring the back up file, see :ref:`Run Restore
@@ -156,15 +166,23 @@ conditions are in place:
 
    The backup files contain the system data and updates.
 
-   The restore operation will pull images from the Upstream registry, they
-   are not part of the backup.
+   The restore operation will pull missing images from the upstream registries.
 
 #. Restore the local registry using the file restore_user_images.yml.
 
+   Example:
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_user_images.yml -e "initial_backup_dir=/home/sysadmin backup_filename=localhost_user_images_backup_2023_07_15_21_24_22.tgz ansible_become_pass=St8rlingX*"
+
    .. note::
 
-      This step applies only if it was created during the backup operation.
+      - This step applies only if the ``user_images_backup*.tgz`` file was
+        created during the backup operation.
+
+      - The ``user_images_backup*.tgz`` file is created during backup only if
+        ``backup_user_images`` is true.
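+
+   As an optional sanity check after the playbook completes, you can list
+   the images now present in the local registry. This is a hedged example;
+   it assumes the :command:`system registry-image-list` command is
+   available in your release:
+
+   .. code-block:: none
+
+      ~(keystone_admin)]$ system registry-image-list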
 
    This must be done before unlocking controller-0.
 
@@ -230,7 +248,7 @@
 
       The software is installed on the host, and then the host is rebooted.
       Wait for the host to be reported as **locked**, **disabled**,
-      and **offline**.
+      and **online**.
 
 #. Unlock controller-1.
 
diff --git a/doc/source/backup/kubernetes/running-ansible-backup-playbook-locally-on-the-controller.rst b/doc/source/backup/kubernetes/running-ansible-backup-playbook-locally-on-the-controller.rst
index 5d41fc274..c158919d0 100644
--- a/doc/source/backup/kubernetes/running-ansible-backup-playbook-locally-on-the-controller.rst
+++ b/doc/source/backup/kubernetes/running-ansible-backup-playbook-locally-on-the-controller.rst
@@ -13,21 +13,55 @@ In this method the Ansible Backup playbook is run on the active controller.
 Use one of the following commands to run the Ansible Backup playbook and back
 up the |prod| configuration, data, and user container images in registry.local:
 
--
-  .. code-block:: none
-
-     ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<admin password>" -e "backup_registry_filesystem=true"
-
--
-  .. code-block:: none
-
-     ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml --ask-vault-pass -e "override_files_dir=$HOME/override_dir"
+**Optimized**: Optimized restore must be used on |AIO-SX|.
+
+.. code-block:: none
+
+   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<admin password>" -e "backup_registry_filesystem=true"
+
+**Legacy**: Legacy restore must be used on systems that are not |AIO-SX|.
+
+.. code-block:: none
+
+   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<admin password>" -e "backup_user_images=true"
+
+**Example using overrides encrypted with Ansible Vault**:
+
+.. code-block:: none
+
+   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml --ask-vault-pass -e "override_files_dir=$HOME/override_dir"
+
+The following are the ``-e`` command-line options:
+
+**Legacy**
+
+``-e backup_user_images=true``
+  Used in conjunction with legacy restore. This will create a backup of
+  custom user images. The file generated by this flag will be named according to the
+  pattern ``<hostname>_user_images_backup_<timestamp>.tgz``.
+
+**Optimized**
+
+``-e backup_registry_filesystem=true``
+  Used in conjunction with optimized restore. This will create a backup of
+  ``registry.local``. The file generated by this flag will be named according to the
+  pattern ``<hostname>_image_registry_backup_<timestamp>.tgz``.
+
+**Common**
+
+``-e backup_dir=/opt/backups``
+  Directory where the backups will be saved.
+
+``-e ignore_health=false``
+  Defaults to false. When set to true, the backup playbook is allowed to run
+  on an unhealthy system. This is needed in some extreme cases.
+
+  .. warning::
+
+     Restoring a backup taken from an unhealthy system will lead to undefined
+     behavior and/or failure.
 
 To exclude a directory and all the files in it like ``/var/home*`` you can use
-the optional parameter:
-
-:command:`-e "exclude_dirs=/var/home/**,/var/home"`
+the optional parameter ``-e "exclude_dirs=/var/home/**,/var/home"``.
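+
+A hedged, combined example that stores the backup under ``/opt/backups`` and
+excludes the ``/var/home`` content (the passwords are placeholders):
+
+.. code-block:: none
+
+   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<admin password> backup_dir=/opt/backups" -e "exclude_dirs=/var/home/**,/var/home"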
 
 .. note::
diff --git a/doc/source/backup/kubernetes/running-restore-playbook-locally-on-the-controller.rst b/doc/source/backup/kubernetes/running-restore-playbook-locally-on-the-controller.rst
index c79eb1c9f..2961f3d21 100644
--- a/doc/source/backup/kubernetes/running-restore-playbook-locally-on-the-controller.rst
+++ b/doc/source/backup/kubernetes/running-restore-playbook-locally-on-the-controller.rst
@@ -8,18 +8,27 @@
 Run Restore Playbook Locally on the Controller
 ==============================================
 
-To run restore on the controller, you need to download the backup to the
+To run restore on the controller, you need to upload the backup to the
 active controller.
 
 You can use an external storage device, for example, a USB drive. Use the
-following command to run the Ansible Restore playbook:
+following commands to run the Ansible Restore playbook:
+
+**Optimized**: Optimized restore must be used on |AIO-SX|.
 
 .. code-block:: none
 
-   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_backup> admin_password=<admin password> wipe_ceph_osds=<true/false>" -e "restore_registry_filesystem=true"
+   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "restore_mode=optimized initial_backup_dir=<location_of_backup> admin_password=<admin password> wipe_ceph_osds=<true/false> restore_registry_filesystem=true"
 
+**Legacy**: Legacy restore must be used on systems that are not |AIO-SX|.
 
-Other ``-e`` command line options:
+.. code-block:: none
+
+   ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_backup> admin_password=<admin password> wipe_ceph_osds=<true/false>"
+
+The following are other ``-e`` command-line options:
+
+**Common**
 
 ``-e restore_mode=optimized``
   Enable optimized restore mode
@@ -28,7 +37,6 @@ Other ``-e`` command line options:
 
    Optimized restore is currently supported only on |AIO-SX| systems.
 
-
 ``-e "initial_backup_dir=/home/sysadmin"``
   Where the backup tgz files are located on box.
 
@@ -36,6 +44,94 @@ Other ``-e`` command line options:
   The basename of the platform backup tgz. The full path will be a
  combination ``{initial_backup_dir}/{backup_filename}``
 
+.. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:
+
+(Optional): You can use the following restore options:
+
+- To keep the Ceph cluster data intact (false - default option), use the
+  ``wipe_ceph_osds=false`` parameter.
+
+- To wipe the Ceph cluster entirely (true), where the Ceph cluster will
+  need to be recreated, use the ``wipe_ceph_osds=true`` parameter.
+
+- To define a convenient place to store the backup files, defined by
+  ``initial_backup_dir``, on the system (such as the home folder for
+  sysadmin, or /tmp, or even a mounted USB device), use the
+  ``on_box_data=true/false`` parameter.
+
+  If this parameter is set to true, the Ansible Restore playbook will look
+  for the backup file on the target server. The parameter
+  ``initial_backup_dir`` can be omitted from the command line. In this
+  case, the backup file must be under the ``/opt/platform-backup`` directory.
+
+  If this parameter is set to false, the Ansible Restore playbook will
+  look for the backup file on the Ansible controller. In this
+  case, both the ``initial_backup_dir`` and ``backup_filename`` must be
+  specified in the command.
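+
+  A hedged example of a run with ``on_box_data=false``, where the backup
+  file sits on the Ansible controller (paths, filenames, and passwords are
+  illustrative):
+
+  .. code-block:: none
+
+     ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "on_box_data=false initial_backup_dir=/home/user/backups backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz ansible_become_pass=St8rlingX* admin_password=St8rlingX*"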
+
+  Example of a backup file in ``/home/sysadmin``:
+
+  .. code-block:: none
+
+     ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=St8rlingX* admin_password=St8rlingX* backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz wipe_ceph_osds=true"
+
+  .. note::
+
+     If the backup contains patches, the Ansible Restore playbook will apply
+     the patches and prompt you to reboot the system. You will then need
+     to re-run the Ansible Restore playbook.
+
+     The flag ``wipe_ceph_osds=true`` is required for a restore on new
+     hardware. For more details, see :ref:`node-replacement-for-aiominussx-using-optimized-backup-and-restore-6603c650c80d`.
+
+
+- ``ssl_ca_certificate_file`` defines a single certificate that
+  contains all the ssl_ca certificates that will be installed during the
+  restore. It will replace
+  ``/opt/platform/config/<version>/ca-cert.pem``, which is a
+  single certificate containing all the ssl_ca certificates installed in
+  the host when the backup was done. The certificate assigned to this
+  parameter must follow this same pattern.
+
+  The format is:
+
+  .. code-block:: none
+
+     ssl_ca_certificate_file=<path to certificate>/<certificate file>
+
+  For example:
+
+     -e "ssl_ca_certificate_file=/home/sysadmin/new_ca-cert.pem"
+
+  This parameter depends on the ``on_box_data`` value.
+
+  When ``on_box_data=true`` or not defined, ``ssl_ca_certificate_file``
+  will be the location of the ``ssl_ca`` certificate file on the target host.
+  This is the default case.
+
+  When ``on_box_data=false``, ``ssl_ca_certificate_file`` will be the
+  location of the ``ssl_ca`` certificate file where the Ansible controller is
+  running. This is useful for remote play.
+
+  .. note::
+
+     To use this option on local restore mode, you need to upload the
+     ``ssl_ca`` certificate file to the active controller.
+
+**Legacy**
+
+``-e skip_patching=true``
+  Patching will not be restored from the backup. With this option, you
+  will need to manually restore any patching before running the restore
+  playbook.
+
+``-e restore_user_images=true``
+  Restores the user images created during backup when ``backup_user_images`` was
+  true. If the user images are not restored, the images must be pulled from
+  upstream or ``registry.central``.
+
+**Optimized**
+
 ``-e restore_registry_filesystem=true``
   Restores the registry images created during backup when
   ``backup_registry_filesystem`` was true. If the registry filesystem is not
@@ -48,89 +144,6 @@
   restored, the images must be pulled from upstream or ``registry.central``.
 
 ``-e registry_backup_filename``
   The basename of the registry backup tgz file. The full
   path will be a combination
   ``{initial_backup_dir}/{registry_backup_filename}``
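+
+  As a hedged, end-to-end optimized example that names both backup files
+  explicitly (all filenames and passwords are illustrative):
+
+  .. code-block:: none
+
+     ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "restore_mode=optimized initial_backup_dir=/home/sysadmin backup_filename=localhost_platform_backup_2023_07_15_21_24_22.tgz registry_backup_filename=localhost_image_registry_backup_2023_07_15_21_24_22.tgz restore_registry_filesystem=true wipe_ceph_osds=false admin_password=St8rlingX* ansible_become_pass=St8rlingX*"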
 
-
-.. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:
-
-- **Optional**: You can select one of the following restore modes:
-
-  - To keep the Ceph cluster data intact (false - default option), use the
-    following parameter:
-
-    :command:`wipe_ceph_osds=false`
-
-  - To wipe the Ceph cluster entirely (true), where the Ceph cluster will
-    need to be recreated, use the following parameter:
-
-    :command:`wipe_ceph_osds=true`
-
-  - To define a convinient place to store the backup files, defined by
-    ``initial-backup_dir``, on the system (such as the home folder for
-    sysadmin, or /tmp, or even a mounted USB device), use the following
-    parameter:
-
-    :command:`on_box_data=true/false`
-
-    If this parameter is set to true, Ansible Restore playbook will look
-    for the backup file provided on the target server. The parameter
-    ``initial_backup_dir`` can be ommited from the command line. In this
-    case, the backup file will be under ``/opt/platform-backup`` directory.
-
-    If this parameter is set to false, the Ansible Restore playbook will
-    look for backup file provided where is the Ansible controller. In this
-    case, both the ``initial_backup_dir`` and ``backup_filename`` must be
-    specified in the command.
-
-    Example of a backup file in /home/sysadmin
-
-    .. code-block:: none
-
-       ~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=St8rlingX* admin_password=St8rlingX* backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz wipe_ceph_osds=true"
-
-    .. note::
-
-       If the backup contains patches, Ansible Restore playbook will apply
-       the patches and prompt you to reboot the system. Then you will need
-       to re-run Ansible Restore playbook.
-
-       The flag ``wipe_ceph_osds=true`` is required for a restore in a new
-       hardware, for more details see :ref:`AIO-SX - Restore on new
-       hardware
-       <node-replacement-for-aiominussx-using-optimized-backup-and-restore-6603c650c80d>`.
-
-- The ``ssl_ca_certificate_file`` defines a single certificate that
-  contains all the ssl_ca certificates that will be installed during the
-  restore. It will replace the
-  ``/opt/platform/config/<version>/ca-cert.pem``, which is a
-  single certificate containing all the ssl_ca certificates installed in
-  the host when backup was done. So, the certificate assigned to this
-  parameter must follow this same pattern.
-
-  For example:
-
-  .. code-block:: none
-
-     ssl_ca_certificate_file=<path to certificate>/<certificate file>
-
-  E.g.:
-
-     -e "ssl_ca_certificate_file=/home/sysadmin/new_ca-cert.pem"
-
-  This parameter depends on ``on_box_data`` value.
-
-  When ``on_box_data=true`` or not defined, the ``ssl_ca_certificate_file``
-  will be the location of ``ssl_ca`` certificate file in the target host.
-  This is the default case.
-
-  When ``on_box_data=false``, the ``ssl_ca_certificate_file`` will be the
-  location of ``ssl_ca`` certificate file where the Ansible controller is
-  running. This is useful for remote play.
-
-  .. note::
-
-     To use this option on local restore mode, you need to download the
-     ``ssl_ca`` certificate file to the active controller.
-
 .. note::
 
    After restore is completed it is not possible to restart (or rerun) the