diff --git a/doc/source/updates/kubernetes/manual-rollback-host-software-deployment-9295ce1e6e29.rst b/doc/source/updates/kubernetes/manual-rollback-host-software-deployment-9295ce1e6e29.rst
index aba471d1c..d2d16a8f9 100644
--- a/doc/source/updates/kubernetes/manual-rollback-host-software-deployment-9295ce1e6e29.rst
+++ b/doc/source/updates/kubernetes/manual-rollback-host-software-deployment-9295ce1e6e29.rst
@@ -7,459 +7,266 @@ Manual Rollback Host Software Deployment
 ========================================
 
-For a major release software deployment, you can roll back the
+For a major release software deployment, you can abort and roll back the
 :ref:`manual-host-software-deployment-ee17ec6f71a4` procedure at any time
 between :command:`software deploy start` and :command:`software deploy delete`.
-After the software deploy deletion step, aborting and rolling back of the major
-release deployment is not possible.
+
+.. note::
+
+    After the software deploy deletion step, aborting and rolling back of the
+    major release deployment is not possible.
 
 .. note::
 
     This section also covers the abort and rollback of a new patched major release
     deployment.
 
-.. note::
+This section describes different rollback procedures based on where you
+currently are in the software deployment procedure.
 
-    Currently, software deployments cannot be rolled back after the :command:`software
-    deploy activate` step.
+.. contents:: |minitoc|
+   :local:
+   :depth: 1
 
-.. rubric:: |prereq|
 
-You are in the middle of the
-:ref:`manual-host-software-deployment-ee17ec6f71a4` procedure for a major
-release between :command:`software deploy start` and :command:`software deploy
-delete`.
+Rollback After Successful Software Deploy Activate +-------------------------------------------------- + +You can roll back the manual host software deployment after a successful +:command:`software deploy activate` in the following scenarios: + +- A Patch Release deployment + + Immediately after the successful :command:`software deploy activate` + +- A major release deployment + + - Immediately after the successful :command:`software deploy activate`, before + attempting the Kubernetes upgrade. + + - After attempting a Kubernetes upgrade which failed and was rolled back. + + .. note:: + + A rollback of a host major release software deployment can only be done + if Kubernetes has been successfully rolled back to its original + version. + +.. _rollbackafterdeployactive: .. rubric:: |proc| -#. Abort the current in-progress major release software deployment. +#. Verify the current deploy state and Kubernetes current version. .. code-block:: - ~(keystone_admin)]$ software deploy abort + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show + +--------------+------------+------+------------------+ + | From Release | To Release | RR | State | + +--------------+------------+------+------------------+ + | 24.09.400 | 25.09.0 | True | deploy-completed | + +--------------+------------+------+------------------+ + + [sysadmin@controller-0 ~(keystone_admin)]$ system kube-version-list + +---------+--------+-----------+ + | version | target | state | + +---------+--------+-----------+ + | v1.29.2 | True | active | + | v1.30.6 | False | available | + | v1.31.5 | False | available | + | v1.32.2 | False | available | + | v1.33.0 | False | available | + +---------+--------+-----------+ + + .. _aborthostsoftwaredeploy: + +#. Abort the current host software deployment. + + .. code-block:: none + + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy abort Deployment has been aborted -#. 
If the current deploy state is ``deploy-activate-rollback-pending``, then roll back the - activate of the aborted deployment, otherwise proceed to :ref:`3 `. +#. Verify the deploy state. - .. code-block:: + .. code-block:: none - ~(keystone_admin)]$ software deploy show - +------------------+------------+------+----------------------------------+ - | From Release | To Release | RR | State | - +------------------+------------+------+----------------------------------+ - | | 10.0.0 | True | deploy-activate-rollback-pending | - +------------------+------------+------+----------------------------------+ + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show + +--------------+------------+------+----------------------------------+ + | From Release | To Release | RR | State | + +--------------+------------+------+----------------------------------+ + | 25.09.0 | 24.09.400 | True | deploy-activate-rollback-pending | + +--------------+------------+------+----------------------------------+ - .. code-block:: +#. Perform :command:`deploy activate-rollback` and check the deploy and host deploy states. - ~(keystone_admin)]$ software deploy activate-rollback + .. code-block:: none + + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy activate-rollback Deploy activate-rollback has started - When running the :command:`software deploy activate-rollback` command, previous - configurations are applied to the controller. + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show + +--------------+------------+------+-------------------------------+ + | From Release | To Release | RR | State | + +--------------+------------+------+-------------------------------+ + | 25.09.0 | 24.09.400 | True | deploy-activate-rollback-done | + +--------------+------------+------+-------------------------------+ - Alarm 250.001 (Configuration is out-of-date) is raised and cleared as the - configurations are applied. 
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy host-list
+       +--------------+--------------+------------+------+------------------------------+
+       | Host         | From Release | To Release | RR   | State                        |
+       +--------------+--------------+------------+------+------------------------------+
+       | controller-0 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending |
+       | controller-1 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending |
+       +--------------+--------------+------------+------+------------------------------+
 
-   The software deployment state goes from ``activate-rollback-done`` to ``host-rollback``.
+   .. _rollbackhost:
 
-   This may take up to 30 mins to complete depending on system configuration
-   and hardware.
+#. Roll back the hosts.
 
-   .. code-block::
+   Run the following commands on each host, in this order: all worker nodes,
+   all storage nodes, controller-1, and then controller-0:
 
-       ~(keystone_admin)]$ software deploy show
-       +------------------+------------+------+---------------+
-       | From Release     | To Release | RR   | State         |
-       +------------------+------------+------+---------------+
-       |                  | 10.0.0     | True | host-rollback |
-       +------------------+------------+------+---------------+
+   .. code-block:: none
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ system host-lock 
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy host-rollback 
+       [sysadmin@controller-0 ~(keystone_admin)]$ system host-unlock 
 
    .. note::
 
-       If :command:`software deploy activate-rollback` fails, that is, if the state is
-       ``activate-rollback-failed``, review ``/var/log/software.log`` on the
-       active controller for failure details, address the issues, and
-       re-execute the :command:`software deploy activate-rollback` command.
+       Before locking and rolling back the active controller, run
+       :command:`system host-swact controller-#` to switch activity away from it.
 
-#. If the current deploy state is ``host-rollback``, then roll back the
-   deployment of all the hosts.
+   .. _verifydeploystate:
 
-   .. 
_manual-rollback-host-software-deployment-9295ce1e6e29-step: +#. Verify the deploy state and host deploy state. - .. code-block:: + All the hosts must be in the ``deploy-host-rollback-deployed`` state and the deploy state + should be ``deploy-host-rollback-done``. - ~(keystone_admin)]$ software deploy show - +------------------+------------+------+-------------------+ - | From Release | To Release | RR | State | - +------------------+------------+------+-------------------+ - | | 10.0.0 | True | host-rollback | - +------------------+------------+------+-------------------+ + .. code-block:: none - If the state is ``host-rollback``, then proceed with the rest of this step, - otherwise proceed to :ref:`4 `. + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy host-list + +--------------+--------------+------------+------+-------------------------------+ + | Host | From Release | To Release | RR | State | + +--------------+--------------+------------+------+-------------------------------+ + | controller-0 | 25.09.0 | 24.09.400 | True | deploy-host-rollback-deployed | + | controller-1 | 25.09.0 | 24.09.400 | True | deploy-host-rollback-deployed | + +--------------+--------------+------------+------+-------------------------------+ + [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show + +--------------+------------+------+---------------------------+ + | From Release | To Release | RR | State | + +--------------+------------+------+---------------------------+ + | 25.09.0 | 24.09.400 | True | deploy-host-rollback-done | + +--------------+------------+------+---------------------------+ - - For an |AIO-SX| system + .. _deletecurrentdeploy: - #. Roll back the software release on controller-0. +#. Delete the current deployment. - #. Lock controller-0. + .. code-block:: none - .. code-block:: - - ~(keystone_admin)]$ system host-lock controller-0 - - #. Roll back the software release on controller-0. - - .. 
code-block:: - - ~(keystone_admin)]$ software deploy host-rollback controller-0 - Host installation request sent to controller-0. - Host installation was successful on controller-0 - - The host is still running the new software release, however boot - parameters have been updated to boot into the previous software - release on the next host reboot, which will occur in the next step - which unlocks the host. - - #. Unlock controller-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-unlock controller-0 - - The host will now reboot into the previous software release. Wait for - the host to finish rebooting and become available. - - This may take 3-5 mins depending on hardware. - - #. Proceed to step :ref:`4 - ` of - the main procedure. - - - For an |AIO-DX| system or standard system - - #. If worker hosts are present, and one or more are in the ``pending-rollback`` - state, then roll back the software release on all worker hosts in the - ``pending-rollback`` state one at a time. Otherwise, proceed to step :ref:`b `. - - .. 
code-block:: - - ~(keystone_admin)]$ software deploy host-list - +--------------+------------------+------------+-------+------------------------------+ - | Host | From Release | To Release | RR | State | - +--------------+------------------+------------+-------+------------------------------+ - | controller-0 | | 10.0.0 | True | deploy-host-rollback-pending | - | controller-1 | | 10.0.0 | True | deploy-host-rollback-pending | - | storage-0 | | 10.0.0 | True | deploy-host-rollback-pending | - | storage-1 | | 10.0.0 | True | deploy-host-rollback-pending | - | storage-2 | | 10.0.0 | True | deploy-host-rollback-pending | - | storage-3 | | 10.0.0 | True | deploy-host-rollback-pending | - | worker-0 | | 10.0.0 | True | deploy-host-rollback-pending | - | worker-1 | | 10.0.0 | True | deploy-host-rollback-pending | - | worker-2 | | 10.0.0 | True | deploy-host-rollback-pending | - | worker-3 | | 10.0.0 | True | deploy-host-rollback-deployed| - +--------------+------------------+------------+-------+------------------------------+ - - #. Roll back the software release on worker-0. - - #. Lock worker-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-lock worker-0 - - #. Roll back the software release on worker-0. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-rollback worker-0 - Host installation request sent to worker-0 - Host installation was successful on worker-0. - - The host is still running the new software release, however boot parameters - have been updated to boot into the previous software release on the next - host reboot, which will occur in the next step which unlocks the host. - - #. Unlock worker-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-unlock worker-0 - - The host will now reboot into the previous software release. Wait - for the host to finish rebooting and become available. Wait - for all the alarms to clear after the unlock before proceeding to the - next worker host. 
- - This may take 3-5 mins depending on hardware. - - #. Display the state of software deployment. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-list - +--------------+------------------+------------+-------+------------------+ - | Host | From Release | To Release | RR | State | - +--------------+------------------+------------+-------+------------------+ - | controller-0 | | 10.0.0 | True | pending-rollback | - | controller-1 | | 10.0.0 | True | pending-rollback | - | storage-0 | | 10.0.0 | True | pending-rollback | - | storage-1 | | 10.0.0 | True | pending-rollback | - | storage-2 | | 10.0.0 | True | pending-rollback | - | storage-3 | | 10.0.0 | True | pending-rollback | - | worker-0 | | 10.0.0 | True | rolled back | - | worker-1 | | 10.0.0 | True | pending-rollback | - | worker-2 | | 10.0.0 | True | rolled back | - | worker-3 | | 10.0.0 | True | rolled back | - +--------------+------------------+------------+-------+------------------+ - - #. Repeat the above steps for any remaining worker hosts in the ``pending-rollback`` state. - - #. If storage hosts are present, and one or more are in the ``pending-rollback`` state, - then roll back the software release on all storage hosts in the ``pending-rollback`` state, - one at a time. Otherwise, proceed to step :ref:`c `. - - .. _manual-rollback-host-software-deployment-9295ce1e6e29-storagehost: - - .. 
code-block:: - - ~(keystone_admin)]$ software deploy host-list - +--------------+------------------+------------+-------+------------------+ - | Host | From Release | To Release | RR | State | - +--------------+------------------+------------+-------+------------------+ - | controller-0 | | 10.0.0 | True | pending-rollback | - | controller-1 | | 10.0.0 | True | pending-rollback | - | storage-0 | | 10.0.0 | True | pending-rollback | - | storage-1 | | 10.0.0 | True | pending-rollback | - | storage-2 | | 10.0.0 | True | pending-rollback | - | storage-3 | | 10.0.0 | True | pending-rollback | - | worker-0 | | 10.0.0 | True | rolled back | - | worker-1 | | 10.0.0 | True | rolled back | - | worker-2 | | 10.0.0 | True | rolled back | - | worker-3 | | 10.0.0 | True | rolled back | - +--------------+------------------+------------+-------+------------------+ - - #. Roll back the software release on storage-0. - - #. Lock storage-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-lock storage-0 - - #. Roll back the software release on storage-0. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-rollback storage-0 - Host installation request sent to storage-0 - Host installation was successful on storage-0. - - The host is still running the new software release, - however boot parameters have been updated to boot into - the previous software release on the next host reboot, which - will occur in the next step which unlocks the host. - - #. Unlock storage-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-unlock storage-0 - - The host will now reboot into the previous software release. Wait - for the host to finish rebooting and become available. Wait for - all the alarms to clear after the unlock before proceeding to the next - storage host. - - This may take 3-5 mins depending on hardware. - - #. Display the state of software deployment. - - .. 
code-block:: - - ~(keystone_admin)]$ software deploy host-list - +--------------+------------------+------------+-------+------------------+ - | Host | From Release | To Release | RR | State | - +--------------+------------------+------------+-------+------------------+ - | controller-0 | | 10.0.0 | True | pending-rollback | - | controller-1 | | 10.0.0 | True | pending-rollback | - | storage-0 | | 10.0.0 | True | rolled back | - | storage-1 | | 10.0.0 | True | pending-rollback | - | storage-2 | | 10.0.0 | True | pending-rollback | - | storage-3 | | 10.0.0 | True | pending-rollback | - | worker-0 | | 10.0.0 | True | rolled back | - | worker-1 | | 10.0.0 | True | rolled back | - | worker-2 | | 10.0.0 | True | rolled back | - | worker-3 | | 10.0.0 | True | rolled back | - +--------------+------------------+------------+-------+------------------+ - - #. Repeat the above steps for any remaining storage hosts in the - ``pending-rollback`` state. - - .. note:: - - After rolling back the first storage host, you can expect alarm - 800.003. The alarm is cleared after all the storage hosts are rolled - back. - - #. If both the controllers are in the ``pending-rollback`` state, then roll back - controller-0 first. - - .. _manual-rollback-host-software-deployment-9295ce1e6e29-bothcontrollers: - - #. Ensure that controller-1 is active by switching activity from - controller-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-swact controller-0 - - Wait for the activity to switch to controller-1. This may take up to a - minute depending on hardware. Reconnect to the system. - - #. Roll back the software release on controller-0 (the standby controller). - - #. Lock controller-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-lock controller-0 - - #. Rollback the software release on controller-0. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-rollback controller-0 - Host installation request sent to controller-0. 
- Host installation was successful on controller-0. - - The host is still running the new software release, - however boot parameters have been updated to boot into - the previous software release on the next host reboot, which - will occur in the next step which unlocks the host. - - #. Unlock controller-0. - - .. code-block:: - - ~(keystone_admin)]$ system host-unlock controller-0 - - The host will now reboot into the new software release. Wait for - the host to finish rebooting and become available. - - This may take 3-5 mins depending on hardware. - - #. Display the state of software deployment. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-list - +--------------+------------------+------------+-------+------------------+ - | Host | From Release | To Release | RR | State | - +--------------+------------------+------------+-------+------------------+ - | controller-0 | | 10.0.0 | True | rolled back | - | controller-1 | | 10.0.0 | True | pending-rollback | - | storage-0 | | 10.0.0 | True | rolled back | - | storage-1 | | 10.0.0 | True | rolled back | - | storage-2 | | 10.0.0 | True | rolled back | - | storage-3 | | 10.0.0 | True | rolled back | - | worker-0 | | 10.0.0 | True | rolled back | - | worker-1 | | 10.0.0 | True | rolled back | - | worker-2 | | 10.0.0 | True | rolled back | - | worker-3 | | 10.0.0 | True | rolled back | - +--------------+------------------+------------+-------+------------------+ - - #. If only controller-1 is in the ``pending-rollback`` state, then roll - back controller-1. - - #. Ensure that controller-0 is active by switching activity from - controller-1. - - .. code-block:: - - ~(keystone_admin)]$ system host-swact controller-1 - - Wait for the activity to switch to controller-0. - - This may take up to a minute depending on hardware. - - Reconnect to the system. - - #. Roll back the software release on controller-1 (the standby controller). - - #. Lock controller-1 - - .. 
code-block:: - - ~(keystone_admin)]$ system host-lock controller-1 - - #. Roll back the software release on controller-1. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-rollback controller-1 - Host installation request sent to controller-1. - Host installation was successful on controller-1. - - The host is still running the new software release, however boot - parameters have been updated to boot into the previous software - release on the next host reboot, which will occur in the next step - which unlocks the host. - - #. Unlock controller-1. - - .. code-block:: - - ~(keystone_admin)]$ system host-unlock controller-1 - - The host will now reboot into the new software release. Wait for - the host to finish rebooting and become available. - - This may take 3-5 mins depending on hardware. - - #. Display the state of software deployment. - - .. code-block:: - - ~(keystone_admin)]$ software deploy host-list - +--------------+------------------+------------+-------+------------------+ - | Host | From Release | To Release | RR | State | - +--------------+------------------+------------+-------+------------------+ - | controller-0 | | 10.0.0 | True | rolled back | - | controller-1 | | 10.0.0 | True | rolled back | - | storage-0 | | 10.0.0 | True | rolled back | - | storage-1 | | 10.0.0 | True | rolled back | - | storage-2 | | 10.0.0 | True | rolled back | - | storage-3 | | 10.0.0 | True | rolled back | - | worker-0 | | 10.0.0 | True | rolled back | - | worker-1 | | 10.0.0 | True | rolled back | - | worker-2 | | 10.0.0 | True | rolled back | - | worker-3 | | 10.0.0 | True | rolled back | - +--------------+------------------+------------+-------+------------------+ - -#. Delete the software deployment to complete the rollback. - - .. _manual-rollback-host-software-deployment-9295ce1e6e29-deletestep: - - .. code-block:: - - ~(keystone_admin)]$ software deploy delete - Deployment has been deleted - - .. 
code-block::
-
-       ~(keystone_admin)]$ software deploy show
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy delete
+       Deploy deleted with success
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show
        No deploy in progress
 
-#. Confirm that the previous software release is now deployed.
+The system is now ready for the next host software deployment.
+
+Rollback After Software Deploy Activate Fails
+---------------------------------------------
+
+When :command:`software deploy activate` fails, follow the steps below to abort
+and roll back the manual host software deployment.
+
+.. rubric:: |proc|
+
+#. Verify the current deploy state.
+
+   .. code-block:: none
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show
+       +--------------+------------+------+------------------------+
+       | From Release | To Release | RR   | State                  |
+       +--------------+------------+------+------------------------+
+       | 24.09.400    | 25.09.0    | True | deploy-activate-failed |
+       +--------------+------------+------+------------------------+
+
+#. Follow the steps from :ref:`step 2 ` to :ref:`step
+   7 ` in :ref:`Rollback After Successful Software Deploy
+   Activate `.
+
+
+Rollback After Software Deploy Host Fails
+-----------------------------------------
+
+When deploying software to a host (:command:`software deploy host `) fails,
+follow the steps below to abort and roll back the manual host software
+deployment.
+
+.. rubric:: |proc|
+
+#. Verify the current deploy state.
+
+   .. 
code-block:: none
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy show
+       +--------------+------------+------+--------------------+
+       | From Release | To Release | RR   | State              |
+       +--------------+------------+------+--------------------+
+       | 24.09.400    | 25.09.0    | True | deploy-host-failed |
+       +--------------+------------+------+--------------------+
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy host-list
+       +--------------+--------------+------------+------+---------------------+
+       | Host         | From Release | To Release | RR   | State               |
+       +--------------+--------------+------------+------+---------------------+
+       | controller-0 | 25.09.0      | 24.09.400  | True | deploy-host-failed  |
+       | controller-1 | 25.09.0      | 24.09.400  | True | deploy-host-done    |
+       | compute-0    | 25.09.0      | 24.09.400  | True | deploy-host-pending |
+       +--------------+--------------+------------+------+---------------------+
+
+#. Abort the deploy.
+
+   .. code-block:: none
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy abort
+       Deployment has been aborted
+
+#. Verify the deploy state.
+
+   .. 
code-block:: none
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy host-list
+       +--------------+--------------+------------+------+------------------------------+
+       | Host         | From Release | To Release | RR   | State                        |
+       +--------------+--------------+------------+------+------------------------------+
+       | controller-0 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending |
+       +--------------+--------------+------------+------+------------------------------+
+
+       [sysadmin@controller-0 ~(keystone_admin)]$ software deploy host-list
+       +--------------+--------------+------------+------+-------------------------------+
+       | Host         | From Release | To Release | RR   | State                         |
+       +--------------+--------------+------------+------+-------------------------------+
+       | controller-0 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending  |
+       | controller-1 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending  |
+       | compute-0    | 25.09.0      | 24.09.400  | True | deploy-host-rollback-deployed |
+       +--------------+--------------+------------+------+-------------------------------+
+
+Follow :ref:`step 5 ` in :ref:`Rollback After Successful Software
+Deploy Activate `. Roll back every host that is in the
+``deploy-host-rollback-pending`` state until all the hosts are in the
+``deploy-host-rollback-deployed`` state. Then, follow
+:ref:`step 6 ` and :ref:`step 7 ` to complete the
+rollback.
+
+
+Rollback Before Software Deploy Host to Any Host
+------------------------------------------------
+
+If a host software deployment needs to be cancelled before the software is
+deployed to any host, that is, if the deploy state is either
+``deploy-start-done`` or ``deploy-start-failed``, you can delete the pending
+deployment directly using :command:`software deploy delete`.
 
-   .. 
code-block::

-       ~(keystone_admin)]$ software list
-       +--------------------------+-------+-----------+
-       | Release                  | RR    | State     |
-       +--------------------------+-------+-----------+
-       | starlingx-10.0.0         | True  | deployed  |
-       |                          | True  | available |
-       +--------------------------+-------+-----------+
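
The host ordering in the new "Roll back host" step (all worker nodes, all storage nodes, controller-1, then controller-0) can be sketched in Python. This is a minimal, hypothetical helper and not part of the StarlingX tooling: the names `parse_host_list` and `rollback_plan` are illustrative, and classifying hosts by hostname prefix (`storage-`, `controller-`) is an assumption made for the example. It only builds the command strings used in this procedure from captured `software deploy host-list` output; it does not run them and does not handle the controller swact.

```python
# Hypothetical helper: function names and hostname-prefix rules are assumptions.
ROLLBACK_PENDING = "deploy-host-rollback-pending"


def parse_host_list(table_text):
    """Return (host, state) pairs from `software deploy host-list` output."""
    rows = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip prompts, blank lines, and +----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 5 or cells[0] == "Host":
            continue  # skip the header row and malformed rows
        rows.append((cells[0], cells[-1]))
    return rows


def _rank(host):
    """Order: workers (0), storage (1), controller-1 (2), controller-0 (3)."""
    if host == "controller-0":
        return 3
    if host == "controller-1":
        return 2
    if host.startswith("storage-"):
        return 1
    return 0  # assumption: any other hostname is a worker/compute node


def rollback_plan(table_text):
    """Build lock / host-rollback / unlock commands for hosts still pending."""
    pending = [h for h, s in parse_host_list(table_text) if s == ROLLBACK_PENDING]
    pending.sort(key=lambda h: (_rank(h), h))
    plan = []
    for host in pending:
        plan.append(f"system host-lock {host}")
        plan.append(f"software deploy host-rollback {host}")
        plan.append(f"system host-unlock {host}")
    return plan


if __name__ == "__main__":
    sample = """
    | Host         | From Release | To Release | RR   | State                         |
    | controller-0 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending  |
    | controller-1 | 25.09.0      | 24.09.400  | True | deploy-host-rollback-pending  |
    | compute-0    | 25.09.0      | 24.09.400  | True | deploy-host-rollback-deployed |
    """
    for command in rollback_plan(sample):
        print(command)
```

Hosts already in `deploy-host-rollback-deployed` are skipped, so the plan can be regenerated after each host to see what remains. Each host should still be rolled back one at a time, and the active controller needs a swact before it is locked.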