Merge "System Snapshot and Restore"

Zuul
2025-09-25 14:25:37 +00:00
committed by Gerrit Code Review
16 changed files with 267 additions and 88 deletions

View File

@@ -0,0 +1,11 @@
.. sw-deploy-strategy-create-begin
.. sw-deploy-strategy-create-end
.. sw-deploy-strategy-apply-begin
.. sw-deploy-strategy-apply-end
.. sw-deploy-strategy-delete-begin
.. sw-deploy-strategy-delete-end

View File

@@ -0,0 +1,4 @@
.. strategy-delete-ds-begin
.. strategy-delete-ds-end

View File

@@ -93,6 +93,11 @@ across the subclouds to be successful.
subcloud will need to be prestaged again before deploying any subsequent
releases.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. rubric:: |proc|
#. Review software sync status of the subclouds.
@@ -150,8 +155,6 @@ across the subclouds to be successful.
| identity_sync_status | unknown |
| kubernetes_sync_status | unknown |
| kube-rootca_sync_status | unknown |
| load_sync_status | unknown |
| patching_sync_status | unknown |
| platform_sync_status | unknown |
| software_sync_status | unknown |
| region_name | 0aa48e9b72bf48bbaad9a624cdd0cacb |
@@ -170,6 +173,11 @@ across the subclouds to be successful.
[-max-parallel-subclouds <i>] \
[-stop-on-failure <level>] \
[--group group] \
[--release-id RELEASE_ID] \
[--snapshot] \
[--rollback] \
[--with-delete] \
[--delete-only] \
[<subcloud>]
where:
@@ -202,11 +210,44 @@ across the subclouds to be successful.
The subcloud group values are used for subcloud apply type and max
parallel subclouds parameters.
**snapshot**
Creates a snapshot of the main logical volumes so that, in the case of
a rollback, the snapshots can be restored in order to speed up the
overall rollback procedure. By default, the rollback procedure does not
use snapshots. Instead, it rolls back the individual file changes on
the main logical volumes.
**rollback**
Specifies that :command:`sw-deploy-strategy` should run a rollback of the
existing (un-deleted) software deployment on the subclouds.
This is valid after a failed :command:`sw-deploy-strategy` on a
subcloud or after a successful :command:`sw-deploy-strategy` on a
subcloud that did not delete the software deployment using
**--with-delete** or **--delete-only**.
**with-delete**
Specifies that the software deployments created on the subclouds as
part of the strategy apply should be deleted at the end of a successful
subcloud software deployment. With this option, the subclouds' software
deployments cannot be rolled back.
By default, the software deployment is not deleted, so a subcloud
software deployment rollback remains possible if required.
**delete-only**
Specifies that :command:`sw-deploy-strategy` should only run the
software deployment delete operation on subclouds. This option is used
to clean up after a previous default :command:`sw-deploy-strategy` in
which the subclouds' software deployments were not deleted.
For example:
.. only:: starlingx
.. code-block:: none
~(keystone_admin)]$ dcmanager sw-deploy-strategy create
~(keystone_admin)]$ dcmanager sw-deploy-strategy create --release-id
+------------------------+----------------------------+
| Field | Value |
+------------------------+----------------------------+
@@ -214,12 +255,21 @@ across the subclouds to be successful.
| subcloud apply type | None |
| max parallel subclouds | 2 |
| stop on failure | False |
| release_id | WRCP-24.09.200 |
| release_id | starlingx-25.09.0 |
| snapshot | False |
| rollback | False |
| delete_option | None |
| state | initial |
| created_at | 2025-01-21T20:00:31.141872 |
| created_at | 2025-08-21T22:30:53.282940 |
| updated_at | None |
+------------------------+----------------------------+
.. only:: partner
.. include:: /_includes/deploy-software-releases-using-the-cli.rest
:start-after: sw-deploy-strategy-create-begin
:end-before: sw-deploy-strategy-create-end
#. To show the settings for the ``sw-deploy-strategy``, use the
:command:`dcmanager sw-deploy-strategy show` command.
@@ -268,20 +318,32 @@ across the subclouds to be successful.
#. To apply the software deploy strategy, use the :command:`dcmanager sw-deploy-strategy apply`
command.
.. only:: starlingx
.. code-block:: none
~(keystone_admin)]$ dcmanager sw-deploy-strategy apply
+------------------------+----------------------------+
| Field | Value |
+------------------------+----------------------------+
| subcloud apply type | parallel |
| subcloud apply type | None |
| max parallel subclouds | 2 |
| stop on failure | False |
| release_id | starlingx-25.09.0 |
| snapshot | False |
| rollback | False |
| delete_option | None |
| state | applying |
| created_at | 2020-02-02T14:42:13.822499 |
| updated_at | 2020-02-02T14:42:19.376688 |
| created_at | 2025-08-21T22:30:53.282940 |
| updated_at | None |
+------------------------+----------------------------+
.. only:: partner
.. include:: /_includes/deploy-software-releases-using-the-cli.rest
:start-after: sw-deploy-strategy-apply-begin
:end-before: sw-deploy-strategy-apply-end
#. To show the step currently being performed on each of the subclouds, use
the :command:`dcmanager strategy-step list` command.
@@ -311,20 +373,31 @@ across the subclouds to be successful.
sw-deploy strategy, using the :command:`dcmanager sw-deploy-strategy delete`
command.
.. only:: starlingx
.. code-block:: none
~(keystone_admin)]$ dcmanager sw-deploy-strategy delete
+------------------------+----------------------------+
| Field | Value |
+------------------------+----------------------------+
| subcloud apply type | parallel |
| subcloud apply type | None |
| max parallel subclouds | 2 |
| stop on failure | False |
| release_id | starlingx-25.09.0 |
| snapshot | False |
| rollback | False |
| delete_option | None |
| state | deleting |
| created_at | 2020-03-23T20:04:50.992444 |
| updated_at | 2020-03-23T20:05:14.157352 |
| created_at | 2025-08-21T22:30:53.282940 |
| updated_at | None |
+------------------------+----------------------------+
.. only:: partner
.. include:: /_includes/deploy-software-releases-using-the-cli.rest
:start-after: sw-deploy-strategy-delete-begin
:end-before: sw-deploy-strategy-delete-end
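
The create options described above can be combined programmatically. The following is a minimal sketch, not part of the product, that assembles a :command:`dcmanager sw-deploy-strategy create` argument list from the flags documented in this procedure. The function name ``build_create_command`` is hypothetical, and the check that ``--with-delete`` and ``--delete-only`` are not combined is an assumption based on the single ``delete_option`` field in the command output.

```python
from typing import List, Optional


def build_create_command(
    release_id: Optional[str] = None,
    snapshot: bool = False,
    rollback: bool = False,
    with_delete: bool = False,
    delete_only: bool = False,
    subcloud: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for 'dcmanager sw-deploy-strategy create'.

    Only flags that appear in the documented usage are emitted.
    """
    # Assumption: the two delete flags are alternatives, because the
    # command output reports a single 'delete_option' value.
    if with_delete and delete_only:
        raise ValueError("--with-delete and --delete-only are treated as alternatives")

    cmd = ["dcmanager", "sw-deploy-strategy", "create"]
    if release_id is not None:
        cmd += ["--release-id", release_id]
    if snapshot:
        cmd.append("--snapshot")
    if rollback:
        cmd.append("--rollback")
    if with_delete:
        cmd.append("--with-delete")
    if delete_only:
        cmd.append("--delete-only")
    if subcloud is not None:
        cmd.append(subcloud)
    return cmd


if __name__ == "__main__":
    # Print the command line for a snapshot-enabled deployment.
    print(" ".join(build_create_command(release_id="starlingx-25.09.0", snapshot=True)))
```

The resulting list can be passed to ``subprocess.run`` on a System Controller where :command:`dcmanager` is available and the ``keystone_admin`` credentials are loaded.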
.. rubric:: |postreq|

View File

@@ -32,7 +32,7 @@ Deploy a Software Release on the System Controller
#. Create a Software Deploy strategy by specifying the settings for the
parameters in the **Create Strategy** dialog box.
.. image:: figures/create-software-deploy-strategy-pop.png
.. image:: figures/create-software-deploy-strategy-pop1.PNG
.. note::
@@ -199,10 +199,18 @@ across the subclouds to be successful.
**complete**
The software release has been deployed successfully.
#. Once the strategy state changes to **complete**, click Delete Strategy to
#. Once the strategy state changes to **complete**, click **Delete Strategy** to
delete it.
.. image:: figures/strategy-delete.png
.. only:: starlingx
.. image:: figures/strategy-delete1.PNG
.. only:: partner
.. include:: /_includes/deploy-software-releases-using-the-horizon-dashboard.rest
:start-after: strategy-delete-ds-begin
:end-before: strategy-delete-ds-end
.. _customizing-the-update-configuration-for-distributed-cloud-update-orchestration:

Binary file not shown (size: 87 KiB).

Binary file not shown (size: 43 KiB).

View File

@@ -96,14 +96,12 @@ across the subclouds. If a web interface is preferred, see
| backup_status | None |
| backup_datetime | None |
| prestage_status | complete |
| prestage_version | 22.12,24.09 |
| prestage_version | 24.09,25.09 |
| dc-cert_sync_status | in-sync |
| firmware_sync_status | in-sync |
| identity_sync_status | in-sync |
| kubernetes_sync_status | out-of-sync |
| kube-rootca_sync_status | in-sync |
| load_sync_status | not-available |
| patching_sync_status | not-available |
| platform_sync_status | in-sync |
| software_sync_status | in-sync |
| region_name | subcloud11 |
@@ -111,17 +109,6 @@ across the subclouds. If a web interface is preferred, see
.. note::
Prior to software release version |prod-ver|, the major software
release status was presented by load_sync_status attribute and the patch
software release status by ``patching_sync_status`` attribute. Starting
with software release version |prod-ver| and above, both major and minor
software release statuses are presented by a single
Starting with software release version |prod-ver|, both major
and minor software release statuses are presented by a single
``software_sync_status`` attribute.
The ``patching_sync_status`` is only applicable to subclouds running
the previous release (i.e. below |prod-ver|). It indicates whether the
subcloud's deployed patches of the previous release match those of
the Central Cloud.
Both ``patching_sync_status`` and ``load_sync_status`` attributes will
be removed in a future release.

View File

@@ -116,6 +116,7 @@
.. |LSM| replace:: :abbr:`LSM (Linux Security Modules)`
.. |LUKS| replace:: :abbr:`LUKS (Linux Unified Key Setup)`
.. |LVG| replace:: :abbr:`LVG (Local Volume Groups)`
.. |LVM| replace:: :abbr:`LVM (Logical Volume Manager)`
.. |MAC| replace:: :abbr:`MAC (Media Access Control)`
.. |MDS| replace:: :abbr:`MDS (MetaData Server for cephfs)`
.. |MEC| replace:: :abbr:`MEC (Multi-access Edge Computing)`

View File

@@ -75,6 +75,11 @@ For example:
|kube-ver| and after updating |prod| to version |prod-ver|. For more
information, see :ref:`upgrade-the-netapp-trident-software-c5ec64d213d3`.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. only:: partner
.. include:: /_includes/configuring-kubernetes-update-orchestration.rest

View File

@@ -80,6 +80,11 @@ For example:
information, see :ref:`Upgrade the NetApp Trident Software
<upgrade-the-netapp-trident-software-c5ec64d213d3>`.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. only:: partner

View File

@@ -10,11 +10,6 @@ Introduction
|prod| software management enables you to upversion your |prod| software to a
new Patch Release or a new Major Release.
.. warning::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
**Major Releases** (versioned as 'X.Y.0', for example, starlingx-10.0.0)
- deliver new and enhanced feature content
@@ -80,7 +75,16 @@ active software deployment is supported. The deployment of a Release can be
aborted and rolled back at any step of the deployment process, as long as the
active deployment has not been both completed and deleted.
By default, both Patch Release and Major Release rollbacks, on all deployment
configurations, revert the changes made to individual files across the main
logical volumes during the deployment. For Major Release rollbacks on |AIO-SX|
configurations only, an |LVM| snapshot option can be used to take a snapshot of
the main logical volumes at the start of the deployment. In case of a rollback,
the |LVM| snapshots can then be restored to speed up the overall rollback
procedure.
For a Patch Release only, a **removal or un-deployment of a release** is
supported. One or more Patch Releases can be removed/un-deployed by deploying a
previous Patch Release.

View File

@@ -83,6 +83,11 @@ standard configuration.
:ref:`migrate-platform-certificates-to-use-cert-manager-c0b1727e4e5d` procedure
to reconfigure it with the RSA certificate/private key.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. rubric:: |proc|
#. For a duplex (dual controller) system, switch the activity from
@@ -169,20 +174,40 @@ standard configuration.
.. code-block::
~(keystone_admin)]$ software deploy precheck [ -f ] <new-release-id>
~(keystone_admin)]$ software deploy precheck [-f|--force] [-o|--options snapshot=true] <new-release-id>
System Health:
All hosts are provisioned: [OK]
All hosts are unlocked/enabled: [OK]
All hosts have current configurations: [OK]
All hosts are patch current: [OK]
Ceph Storage Healthy: [OK]
No alarms: [OK]
All kubernetes nodes are ready: [OK]
All kubernetes control plane pods are ready: [OK]
All PodSecurityPolicies are removed: [OK]
All kubernetes applications are in a valid state: [OK]
All hosts are patch current: [OK]
Active kubernetes version [v1.29.2] is a valid supported version: [OK]
Active controller is controller-0: [OK]
Installed license is valid: [OK]
Valid upgrade path from release 24.09 to 25.09: [OK]
Required patches are applied: [OK]
Docker filesystem in controllers satisfies the required size of 40GB: [OK]
(LVM snapshots) System is AIO-SX: [OK]
(LVM snapshots) Disk space available: [OK]
Where,
``--force|-f`` will ignore non-management affecting alarms.
``--options|-o snapshot=true`` will enable health checks for the |LVM| snapshot feature.
.. note::
The |LVM| snapshot option on software deploys will result in a snapshot
of the main logical volumes, so that in the case of a rollback, the
snapshots can be restored in order to speed up the overall rollback
procedure. By default, the rollback procedure does not use snapshots.
Instead, it rolls back the individual file changes on the main logical
volumes.
Resolve any checks that are not OK and re-run the :command:`software deploy precheck` command.
Use the ``-f`` option to ignore non-management affecting alarms.
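
When the precheck runs from a script, the health report can be scanned for checks that are not OK. The following is a minimal sketch that assumes each check is printed as ``<description>: [<status>]``, as in the sample output above. The helper names ``parse_precheck`` and ``failed_checks`` are hypothetical, and any status other than ``OK`` is treated as a failure because the exact failure text is not shown here.

```python
import re
from typing import Dict, List

# One check per line: "<description>: [<status>]".
# The greedy description allows brackets inside it, for example
# "Active kubernetes version [v1.29.2] is a valid supported version".
CHECK_LINE = re.compile(r"^(?P<name>.+):\s*\[(?P<status>[^\]]+)\]\s*$")


def parse_precheck(output: str) -> Dict[str, str]:
    """Map each check description to its reported status."""
    results: Dict[str, str] = {}
    for line in output.splitlines():
        match = CHECK_LINE.match(line.strip())
        if match:
            results[match.group("name").strip()] = match.group("status").strip()
    return results


def failed_checks(output: str) -> List[str]:
    """Return descriptions of checks whose status is anything other than OK."""
    return [name for name, status in parse_precheck(output).items() if status != "OK"]
```

Header lines such as ``System Health:`` carry no bracketed status and are skipped. An empty list from ``failed_checks`` means every reported check was OK.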
@@ -214,7 +239,7 @@ standard configuration.
.. code-block::
~(keystone_admin)]$ software deploy start [ -f ] <new-release-id>
~(keystone_admin)]$ software deploy start [-f|--force] [-o|--options snapshot=true] <new-release-id>
Deployment for <new-release-id> started
Then, monitor the progress of :command:`software deploy start` using the
@@ -241,12 +266,17 @@ standard configuration.
.. note::
If :command:`software deploy start` fails, that is, if the state is
- If :command:`software deploy start` fails, that is, if the state is
`deploy-start-failed`, review ``/var/log/software.log`` on the active
controller for failure details, address the issues, and run the
:command:`software deploy delete` command to delete the deploy and
re-execute the :command:`software deploy start` command.
- If the |LVM| snapshot feature is enabled on deploy start (supported on
|AIO-SX| only), the system takes snapshots of the main logical volumes.
If a rollback is needed, these snapshots can be restored to speed it
up.
#. Deploy the new software release to all hosts.
- For an |AIO-SX| system

View File

@@ -59,6 +59,11 @@ and upgrade various systems.
to upgrade the Trident drivers to 25.02.1 before upgrading Kubernetes to
version 1.32.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. rubric:: |proc|
#. Upload, apply and install the platform update.

View File

@@ -42,6 +42,11 @@ system. This feature is not supported in the system which is not |AIO-SX|.
:ref:`upgrade-the-netapp-trident-software-c5ec64d213d3` to upgrade Trident
to 25.02.1 before upgrading Kubernetes to version 1.32.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. rubric:: |proc|
#. List the available Kubernetes versions, for example:

View File

@@ -160,6 +160,11 @@ to control and monitor their progress manually.
:ref:`migrate-platform-certificates-to-use-cert-manager-c0b1727e4e5d` procedure
to reconfigure it with RSA certificate/private key.
.. note::
The sysadmin and admin passwords must be set to the same value prior to
starting an upgrade from |prod-long| Release |v_r9| to |prod| Release |v_r10|.
.. rubric:: |proc|
#. Create a software deployment orchestration strategy for a specified software
@@ -169,17 +174,20 @@ to control and monitor their progress manually.
.. code-block::
~(keystone_admin)]$ sw-manager sw-deploy-strategy create [--controller-apply-type {serial,ignore}]
~(keystone_admin)]$ sw-manager sw-deploy-strategy create --help
usage: sw-manager sw-deploy-strategy create [--controller-apply-type {serial,ignore}]
[--storage-apply-type {serial,parallel,ignore}]
[--worker-apply-type {serial,parallel,ignore}]
[--max-parallel-worker-hosts {2,3,4,5,6,7,8,9,10}]
[--instance-action {stop-start,migrate}]
[--alarm-restrictions {strict,relaxed}]
[--delete]
<software-release-id>
[--max-parallel-worker-hosts {2,3,4,5,6,7,8,9,10}]
[--rollback | --no-rollback]
[--delete | --no-delete]
[--snapshot | --no-snapshot]
[release]
strategy-uuid: 5435e049-7002-4403-acfb-7886f6da14af
release-id: <software-release-id>
release-id: <release>
controller-apply-type: serial
storage-apply-type: serial
worker-apply-type: serial
@@ -192,7 +200,7 @@ to control and monitor their progress manually.
where,
``<software-release-id>``
``<release>``
Specifies the specific software release to deploy. This can be a patch
release or a major release.
@@ -275,8 +283,40 @@ to control and monitor their progress manually.
to or greater than its management affecting severity. That is, it will use
the ``-f`` (force) option on the precheck or start of the deployment.
``[--rollback]``
(Optional) Creates a strategy to roll back the platform upgrade.
``[--no-rollback]``
(Optional) Ensures that the strategy being created is not a rollback strategy.
``[--delete]``
(Optional) Specifies if the software deployment needs to be deleted or not.
(Optional) Creates a strategy that deletes the deployment once it is
applied successfully. The system cannot be rolled back from that state.
``[--no-delete]``
(Optional) Does not delete the deployment when the strategy is applied. The
system can be rolled back from that state.
``[--snapshot]``
(Optional) Creates a snapshot of the main logical volumes so that, in
the case of a rollback, the snapshots can be restored in order to speed
up the overall rollback procedure. By default, the rollback procedure
does not use snapshots. Instead, it rolls back the individual file
changes on the main logical volumes.
``[--no-snapshot]``
(Optional) Ensures that snapshots are not created before the platform
upgrade starts.
.. note::
The |LVM| snapshot option on software deploys will result in a snapshot
of the main logical volumes, so that in the case of a rollback, the
snapshots can be restored in order to speed up the overall rollback
procedure. By default, the rollback procedure does not use snapshots.
Instead, it rolls back the individual file changes on the main logical
volumes.
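
The paired ``--rollback | --no-rollback``, ``--delete | --no-delete``, and ``--snapshot | --no-snapshot`` options follow a common on/off flag pattern. The following sketch illustrates that pattern with Python's ``argparse.BooleanOptionalAction`` (Python 3.9 or later). It is not the :command:`sw-manager` implementation. The parser name is hypothetical, and the ``False`` defaults for ``--rollback`` and ``--delete`` are assumptions. Only the snapshot default follows from the text above.

```python
import argparse


def make_parser() -> argparse.ArgumentParser:
    """Build an illustrative parser with paired on/off flags."""
    parser = argparse.ArgumentParser(prog="sw-deploy-strategy-sketch")
    for name in ("rollback", "delete", "snapshot"):
        # BooleanOptionalAction registers both --<name> and --no-<name>.
        # Default False is an assumption for rollback and delete.
        parser.add_argument(
            f"--{name}",
            action=argparse.BooleanOptionalAction,
            default=False,
        )
    # The release is shown as optional ("[release]") in the usage text.
    parser.add_argument("release", nargs="?")
    return parser


if __name__ == "__main__":
    args = make_parser().parse_args(["--snapshot", "--no-delete", "starlingx-25.09.0"])
    print(args)
```

With this pattern, passing neither form of a flag leaves the default in place, and passing the ``--no-`` form states the choice explicitly.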
#. Wait for the ``build`` phase of the software deployment orchestration
strategy create to be 100% complete and its state to be ``ready-to-apply``.

View File

@@ -26,9 +26,10 @@ abort and rollback of the deployment is not possible.
.. note::
LVM snapshot is only supported for |AIO-SX|. If you enable LVM snapshots on
deployment start, it will automatically be used during rollback. If any of
the snapshots are invalid, it will fallback to the standard rollback procedure.
The |LVM| snapshot option is supported only on |AIO-SX|. If you specify the
|LVM| snapshot option on deployment start, the snapshots are used
automatically during rollback. If any of the snapshots are invalid, the
rollback falls back to the standard rollback procedure.
.. rubric:: |prereq|