Improve legacy restore steps and note differences (ds8)

Change-Id: Iac330aec17a839197f251f4ec14676f21a7232ae
Signed-off-by: Ngairangbam Mili <ngairangbam.mili@windriver.com>
This commit is contained in:
Ngairangbam Mili 2023-11-15 17:03:41 +00:00
parent 3976d48d64
commit 4324ab7f3d
3 changed files with 171 additions and 106 deletions


@ -11,6 +11,9 @@ You can perform a system restore (controllers, workers, including or excluding
storage nodes) of a |prod| cluster from a previous system backup and bring it
back to the operational state it was when the backup procedure took place.
There are two restore modes: optimized restore and legacy restore. Optimized restore
must be used on |AIO-SX| systems, and legacy restore must be used on systems that are not |AIO-SX|.
.. rubric:: |context|
Kubernetes configuration will be restored and pods that are started from
@ -130,11 +133,12 @@ conditions are in place:
#. Install network connectivity required for the subcloud.
#. Ensure that the system is at the same patch level as it was when the backup
was taken. On |AIO-SX| systems, you must manually reinstall any
previous patches. This may include doing a reboot if required.
For steps on how to install patches using the :command:`sw-patch install-local` command, see :ref:`aio_simplex_install_kubernetes_r7`;
``Install Software on Controller-0``.
After the reboot, you can verify that the updates were applied.
@ -144,6 +148,12 @@ conditions are in place:
:start-after: sw-patch-query-begin
:end-before: sw-patch-query-end
.. note::
On systems that are not |AIO-SX|, you can skip this step unless
``skip_patching=true`` is used. By default, patches are automatically
reinstalled from the backup.
#. Ensure that the backup files are available on the controller. Run both
Ansible Restore playbooks, ``restore_platform.yml`` and ``restore_user_images.yml``.
For more information on restoring the backup file, see :ref:`Run Restore
@ -156,15 +166,23 @@ conditions are in place:
The backup files contain the system data and updates.
The restore operation will pull missing images from the upstream
registries; they are not part of the backup.
#. Restore the local registry using the ``restore_user_images.yml`` playbook.
Example:
.. code-block:: none
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_user_images.yml -e "initial_backup_dir=/home/sysadmin backup_filename=localhost_user_images_backup_2023_07_15_21_24_22.tgz ansible_become_pass=St8rlingX*"
.. note::
- This step applies only if the user images backup file was created during
the backup operation.
- The ``user_images_backup*.tgz`` file is created during backup only if
``backup_user_images`` is true.
This must be done before unlocking controller-0.
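Before running the playbook, it can help to confirm that the archive is present and readable. The following is a minimal sketch only; the path and payload below are stand-ins created for illustration, not a real backup file:

```shell
# Sketch: sanity-check a backup tarball before passing it to the playbook.
# The path below is a stand-in created for illustration, not a real backup.
backup=/tmp/demo_user_images_backup.tgz
mkdir -p /tmp/demo_payload && echo data > /tmp/demo_payload/f
tar -czf "$backup" -C /tmp demo_payload

# tar -tzf lists the archive contents; a non-zero exit means it is
# missing or corrupt.
if tar -tzf "$backup" >/dev/null 2>&1; then
    echo "archive OK: $(basename "$backup")"
else
    echo "archive corrupt or missing"
fi
```

In practice you would point ``backup`` at the ``user_images_backup*.tgz`` file in your ``initial_backup_dir`` instead of creating one.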
@ -230,7 +248,7 @@ conditions are in place:
The software is installed on the host, and then the host is
rebooted. Wait for the host to be reported as **locked**, **disabled**,
and **online**.
#. Unlock controller-1.


@ -13,21 +13,55 @@ In this method the Ansible Backup playbook is run on the active controller.
Use one of the following commands to run the Ansible Backup playbook and back
up the |prod| configuration, data, and user container images in registry.local:
**Optimized**: Optimized restore must be used on |AIO-SX|.

.. code-block:: none

~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<sysadmin password>" -e "backup_registry_filesystem=true"

**Legacy**: Legacy restore must be used on systems that are not |AIO-SX|.

.. code-block:: none

~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml -e "ansible_become_pass=<sysadmin password> admin_password=<sysadmin password>" -e "backup_user_images=true"

**Example using overrides encrypted with Ansible Vault**:

.. code-block:: none

~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/backup.yml --ask-vault-pass -e "override_files_dir=$HOME/override_dir"
Following are the ``-e`` command line options:
**Legacy**
``-e backup_user_images=true``
Used in conjunction with legacy restore. This will create a backup of
custom user images. The file generated by this flag will be named according to the
pattern ``<inventory_hostname>_user_images_backup_<timestamp>.tgz``.
**Optimized**
``-e backup_registry_filesystem=true``
Used in conjunction with optimized restore. This will create a backup of
``registry.local``. The file generated by this flag will be named according to the
pattern ``<inventory_hostname>_image_registry_backup_<timestamp>.tgz``.
**Common**
``-e backup_dir=/opt/backups``
Directory where the backups will be saved.
``-e ignore_health=false``
When set to true, the backup playbook is allowed to be run on an unhealthy
system. This is needed in some extreme cases.
.. warning::
Restoring a backup from an unhealthy system will lead to undefined behavior and/or failure.
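The filename patterns produced by the two backup flags above can be sketched in shell; the hostname and timestamp values here are hypothetical examples, not values the playbook guarantees:

```shell
# Hypothetical values; real backups derive these from the Ansible
# inventory hostname and the time the backup was taken.
inventory_hostname=localhost
timestamp=2023_07_15_21_24_22

# Legacy: file produced by backup_user_images=true
echo "${inventory_hostname}_user_images_backup_${timestamp}.tgz"

# Optimized: file produced by backup_registry_filesystem=true
echo "${inventory_hostname}_image_registry_backup_${timestamp}.tgz"
```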
To exclude a directory and all the files in it, such as ``/var/home*``, you can use
the optional parameter ``-e "exclude_dirs=/var/home/**,/var/home"``.
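As an illustration of what those two patterns cover (the playbook's actual glob handling may differ in detail), a small shell sketch using sample paths:

```shell
# Illustrative only: show which sample paths the two exclude patterns cover.
# /var/home/** is meant to match everything under /var/home, while
# /var/home matches the directory entry itself.
for path in /var/home /var/home/user1/notes.txt /var/log/messages; do
    case "$path" in
        /var/home/*|/var/home) echo "excluded: $path" ;;
        *)                     echo "kept: $path" ;;
    esac
done
```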
.. note::


@ -8,18 +8,27 @@
Run Restore Playbook Locally on the Controller
==============================================
To run restore on the controller, you need to upload the backup to the
active controller.
You can use an external storage device, for example, a USB drive. Use the
following commands to run the Ansible Restore playbook:
**Optimized**: Optimized restore must be used on |AIO-SX|.
.. code-block:: none
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "restore_mode=optimized initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false> restore_registry_filesystem=true"
**Legacy**: Legacy restore must be used on systems that are not |AIO-SX|.
.. code-block:: none

~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=<location_of_tarball> ansible_become_pass=<admin_password> admin_password=<admin_password> backup_filename=<backup_filename> wipe_ceph_osds=<true/false>"

Other ``-e`` command line options:
**Common**
``-e restore_mode=optimized``
Enables optimized restore mode.
@ -28,7 +37,6 @@ Other ``-e`` command line options:
Optimized restore is currently supported only on |AIO-SX| systems.
``-e "initial_backup_dir=/home/sysadmin"``
Where the backup tgz files are located on the box.
@ -36,6 +44,94 @@ Other ``-e`` command line options:
The basename of the platform backup tgz. The full path will be a
combination ``{initial_backup_dir}/{backup_filename}``.
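That combination can be illustrated with hypothetical values (the directory and filename below are examples, not defaults):

```shell
# Hypothetical values for illustration.
initial_backup_dir=/home/sysadmin
backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz

# The playbook joins them as {initial_backup_dir}/{backup_filename}.
full_path="${initial_backup_dir}/${backup_filename}"
echo "$full_path"
```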
.. _running-restore-playbook-locally-on-the-controller-steps-usl-2c3-pmb:
(Optional): You can select one of the following restore modes:
- To keep the Ceph cluster data intact (false - default option), use the
``wipe_ceph_osds=false`` parameter.
- To wipe the Ceph cluster entirely (true), where the Ceph cluster will
need to be recreated, use the ``wipe_ceph_osds=true`` parameter.
- To define a convenient place to store the backup files, defined by
``initial_backup_dir``, on the system (such as the home folder for
sysadmin, or /tmp, or even a mounted USB device), use the
``on_box_data=true/false`` parameter.
If this parameter is set to true, the Ansible Restore playbook will look
for the backup file provided on the target server. The parameter
``initial_backup_dir`` can be omitted from the command line. In this
case, the backup file must be under the ``/opt/platform-backup`` directory.
If this parameter is set to false, the Ansible Restore playbook will
look for the backup file provided on the Ansible controller. In this
case, both the ``initial_backup_dir`` and ``backup_filename`` must be
specified in the command.
Example of a backup file in ``/home/sysadmin``:
.. code-block:: none
~(keystone_admin)]$ ansible-playbook /usr/share/ansible/stx-ansible/playbooks/restore_platform.yml -e "initial_backup_dir=/home/sysadmin ansible_become_pass=St8rlingX* admin_password=St8rlingX* backup_filename=localhost_platform_backup_2020_07_27_07_48_48.tgz wipe_ceph_osds=true"
.. note::
If the backup contains patches, Ansible Restore playbook will apply
the patches and prompt you to reboot the system. Then you will need
to re-run Ansible Restore playbook.
The flag ``wipe_ceph_osds=true`` is required for a restore on new
hardware. For more details, see :ref:`node-replacement-for-aiominussx-using-optimized-backup-and-restore-6603c650c80d`.
- ``ssl_ca_certificate_file`` defines a single certificate that
contains all the ssl_ca certificates that will be installed during the
restore. It will replace
``/opt/platform/config/<software-version>/ca-cert.pem``, which is a
single certificate containing all the ssl_ca certificates installed in
the host when the backup was done. The certificate assigned to this
parameter must follow this same pattern.
For example:
.. code-block:: none
ssl_ca_certificate_file=<complete path>/<ssl_ca certificates file>
E.g.:
-e "ssl_ca_certificate_file=/home/sysadmin/new_ca-cert.pem"
This parameter depends on ``on_box_data`` value.
When ``on_box_data=true`` or not defined, ``ssl_ca_certificate_file``
will be the location of the ``ssl_ca`` certificate file on the target host.
This is the default case.
When ``on_box_data=false``, ``ssl_ca_certificate_file`` will be the
location of the ``ssl_ca`` certificate file where the Ansible controller is
running. This is useful for remote play.
.. note::
To use this option on local restore mode, you need to download the
``ssl_ca`` certificate file to the active controller.
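Since the file must bundle every ssl_ca certificate in a single PEM, a quick way to check how many certificates a bundle contains is to count the BEGIN markers. The path and contents below are stand-ins created for illustration; a real bundle comes from your CA:

```shell
# Stand-in bundle created for illustration; real certificates would
# replace the "..." placeholder bodies.
ca_file=/tmp/new_ca-cert.pem
{
    printf -- '-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n'
    printf -- '-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n'
} > "$ca_file"

# Each certificate in the bundle contributes exactly one BEGIN marker.
grep -c -- 'BEGIN CERTIFICATE' "$ca_file"
```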
**Legacy**
``-e skip_patching=true``
Patching will not be restored from the backup. With this option, you
will need to manually restore any patching before running the restore
playbook.
``-e restore_user_images=true``
Restores the user images created during backup when ``backup_user_images`` was
true. If the user images are not restored, the images must be pulled from
upstream or ``registry.central``.
**Optimized**
``-e restore_registry_filesystem=true``
Restores the registry images created during backup when
``backup_registry_filesystem`` was true. If the registry filesystem is not
@ -48,89 +144,6 @@ Other ``-e`` command line options:
path will be a combination
``{initial_backup_dir}/{registry_backup_filename}``
.. note::
After restore is completed it is not possible to restart (or rerun) the