Extend scale back hacluster operation - retry

Extend the scale back hacluster cloud operation from
using Vault as an example to replacing a Vault cluster
node. This consists essentially of adding a section on
scaling out the application.

Improve the Unseal Vault operation by linking to the
Vault TLS cloud operation.

Change-Id: Id2e146e7e5dbad8f1df4acb85f60728257ba526d
Peter Matulis 2022-04-21 13:45:59 -04:00
parent 1fa01acd73
commit ab54e97433
4 changed files with 183 additions and 160 deletions

@@ -15,7 +15,7 @@ General cloud operations:
ops-unseal-vault
ops-config-tls-vault-api
ops-live-migrate-vms
ops-scale-back-with-hacluster
ops-replace-vault-node
ops-scale-out-nova-compute
ops-start-innodb-from-outage
ops-auto-glance-image-updates

@@ -0,0 +1,175 @@
:orphan:

==================================================
Scale back an application with the hacluster charm
==================================================

Introduction
------------

This article shows how to replace a Vault node in a cluster made highly
available by means of the subordinate hacluster charm. It entails the removal
and then the addition of a vault unit. This is done with generic Juju commands
and actions available to the hacluster charm.

.. important::

   This procedure will not result in cloud downtime provided that there is at
   least one functional Vault node present at all times.

.. warning::

   This procedure involves a sealed Vault instance. Please ensure that the
   requisite number of unseal keys is available before continuing.
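
If in doubt, the number of keys needed (the unseal key threshold) can be
confirmed by querying an existing unit with the vault client. The following is
a minimal sketch, assuming the client is installed locally and the API is not
TLS-encrypted:

.. code-block:: none

   # Address of any existing vault unit; the 'Threshold' field in the output
   # shows how many unseal keys are required
   export VAULT_ADDR="http://10.246.114.76:8200"
   vault status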

Procedure
---------

If the unit being removed is in a 'lost' state (as seen in :command:`juju
status`) please first see the `Notes`_ section.

List the application units
~~~~~~~~~~~~~~~~~~~~~~~~~~

Display the units, in this case for the vault application:

.. code-block:: none

   juju status vault

This article will be based on the following (partial) output:

.. code-block:: console

   Unit Workload Agent Machine Public address Ports Message
   vault/0* active idle 1/lxd/4 10.246.114.76 8200/tcp Unit is ready (active: false, mlock: disabled)
     vault-hacluster/1 active idle 10.246.114.76 Unit is ready and clustered
     vault-mysql-router/0* active idle 10.246.114.76 Unit is ready
   vault/3 active idle 0/lxd/8 10.246.114.83 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/2 active idle 10.246.114.83 Unit is ready and clustered
     vault-mysql-router/25 active idle 10.246.114.83 Unit is ready
   vault/4 active idle 2/lxd/9 10.246.114.84 8200/tcp Unit is ready (active: false, mlock: disabled)
     vault-hacluster/0* active idle 10.246.114.84 Unit is ready and clustered
     vault-mysql-router/24 active idle 10.246.114.84 Unit is ready

In this example, unit ``vault/3`` will be removed.

Pause the subordinate hacluster unit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Pause the hacluster unit that corresponds to the principal application unit
being removed. Here, unit ``vault-hacluster/2`` corresponds to unit
``vault/3``:

.. code-block:: none

   juju run-action --wait vault-hacluster/2 pause
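
Note that subordinate unit numbers do not necessarily match those of their
principal units. One way to confirm the pairing (a sketch using standard Juju
commands) is to filter the status output by the principal unit, which lists its
subordinates nested beneath it:

.. code-block:: none

   # Show only vault/3 and its subordinate units
   juju status vault/3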

Remove the principal application unit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Remove the principal application unit:

.. code-block:: none

   juju remove-unit vault/3

This will also remove the hacluster subordinate unit (and any other subordinate
units).
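
Removal is not instantaneous. A quick way to watch for the unit (and its
subordinates) disappearing from the model, shown as an illustration only, is to
poll the status output:

.. code-block:: none

   # Re-run until vault/3 no longer appears in the output
   watch -n 5 juju status vault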

Add a principal application unit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Add a principal application unit. We accomplish this by scaling out the
existing vault application and placing the new (containerised) unit on the same
host that the removed unit was on (machine 0):

.. code-block:: none

   juju add-unit --to lxd:0 vault

.. caution::

   If network spaces are in use, the above command will not succeed. See Juju
   issue `LP #1969523`_ for a workaround.
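
Whether the model uses network spaces can be checked beforehand with a standard
Juju command (a minimal sketch; the output will vary by deployment):

.. code-block:: none

   # Lists any network spaces defined in the current model
   juju spaces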

The new :command:`juju status` output now contains:

.. code-block:: console

   Unit Workload Agent Machine Public address Ports Message
   vault/0* active idle 1/lxd/4 10.246.114.76 8200/tcp Unit is ready (active: false, mlock: disabled)
     vault-hacluster/1 active idle 10.246.114.76 Unit is ready and clustered
     vault-mysql-router/0* active idle 10.246.114.76 Unit is ready
   vault/4 active idle 2/lxd/9 10.246.114.84 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/0* active idle 10.246.114.84 Unit is ready and clustered
     vault-mysql-router/24 active idle 10.246.114.84 Unit is ready
   vault/6 blocked idle 0/lxd/9 10.246.114.83 8200/tcp Unit is sealed
     vault-hacluster/28 active idle 10.246.114.83 Unit is ready and clustered
     vault-mysql-router/40 active idle 10.246.114.83 Unit is ready

Notice that the new vault unit (``vault/6``) is sealed.
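
The seal state can also be confirmed directly with the vault client (a sketch,
assuming the client is installed on the machine issuing the commands):

.. code-block:: none

   # 'Sealed: true' is expected for the new instance at this point
   export VAULT_ADDR="http://10.246.114.83:8200"
   vault status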

Unseal the new Vault instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Here we will assume that the original Vault deployment was initialised with a
requirement of three unseal keys.

Set an environment variable based on the address of the newly-introduced unit,
and unseal the instance:

.. code-block:: none

   export VAULT_ADDR="http://10.246.114.83:8200"
   vault operator unseal
   vault operator unseal
   vault operator unseal

For more information on unsealing Vault, see cloud operation :doc:`Unseal Vault
<ops-unseal-vault>`.

Verify cloud services
~~~~~~~~~~~~~~~~~~~~~

The final :command:`juju status vault` (partial) output is:

.. code-block:: console

   Unit Workload Agent Machine Public address Ports Message
   vault/0* active idle 1/lxd/4 10.246.114.76 8200/tcp Unit is ready (active: false, mlock: disabled)
     vault-hacluster/1 active idle 10.246.114.76 Unit is ready and clustered
     vault-mysql-router/0* active idle 10.246.114.76 Unit is ready
   vault/4 active idle 2/lxd/9 10.246.114.84 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/0* active idle 10.246.114.84 Unit is ready and clustered
     vault-mysql-router/24 active idle 10.246.114.84 Unit is ready
   vault/6 active idle 0/lxd/9 10.246.114.83 8200/tcp Unit is ready (active: false, mlock: disabled)
     vault-hacluster/28 active idle 10.246.114.83 Unit is ready and clustered
     vault-mysql-router/40 active idle 10.246.114.83 Unit is ready

Ensure that all cloud services are working as expected.
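
A basic model-wide spot-check, shown for illustration only and no substitute
for exercising the cloud's actual workloads, is to scan the status output for
units that are not in the expected state:

.. code-block:: none

   # Flag any units reporting an error or blocked workload state
   juju status | grep -E 'error|blocked' || echo "No units in an error or blocked state"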

Notes
-----

Pre-removal, in the case where the principal application unit has transitioned
to a 'lost' state (e.g. dropped off the network due to a hardware failure),

#. the first step (pause the hacluster unit) can be skipped
#. the second step (remove the principal unit) can be replaced by:

   .. code-block:: none

      juju remove-machine N --force

   N is the Juju machine ID (see the :command:`juju status` command) where the
   unit to be removed is running.
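
For the example used in this article, where unit ``vault/3`` resides on machine
``0/lxd/8``, the command would look like the following (illustrative only;
substitute your own machine ID):

.. code-block:: none

   juju remove-machine 0/lxd/8 --force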

.. warning::

   Removing the machine by force will naturally remove any other units that
   may be present, including those from an entirely different application.

.. LINKS
.. _LP #1969523: https://bugs.launchpad.net/juju/+bug/1969523

@@ -1,159 +0,0 @@
:orphan:

==================================================
Scale back an application with the hacluster charm
==================================================

Introduction
------------

This article shows how to scale back an application that is made highly
available by means of the subordinate hacluster charm. It implies the removal
of one or more of the principal application's units. This is easily done with
generic Juju commands and actions available to the hacluster charm.

.. note::

   Since the application being scaled back is already in HA mode the removal of
   one of its cluster members should not cause any immediate interruption of
   cloud services.

   Scaling back an application will also remove its associated hacluster unit.
   It is best practice to have at least three hacluster units per application
   at all times. An odd number is also recommended.

Procedure
---------

If the unit being removed is in a 'lost' state (as seen in :command:`juju
status`) please first see the `Notes`_ section.

List the application units
~~~~~~~~~~~~~~~~~~~~~~~~~~

Display the units, in this case for the vault application:

.. code-block:: none

   juju status vault

This article will be based on the following output:

.. code-block:: console

   Unit Workload Agent Machine Public address Ports Message
   vault/0* active idle 0/lxd/5 10.0.0.227 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/0* active idle 10.0.0.227 Unit is ready and clustered
     vault-mysql-router/0* active idle 10.0.0.227 Unit is ready
   vault/1 active idle 1/lxd/5 10.0.0.234 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/1 active idle 10.0.0.234 Unit is ready and clustered
     vault-mysql-router/1 active idle 10.0.0.234 Unit is ready
   vault/2 active idle 2/lxd/6 10.0.0.233 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/2 active idle 10.0.0.233 Unit is ready and clustered
     vault-mysql-router/2 active idle 10.0.0.233 Unit is ready

In the below example, unit ``vault/1`` will be removed.

Pause the subordinate hacluster unit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Pause the hacluster unit that corresponds to the principle application unit
being removed. Here, unit ``vault-hacluster/1`` corresponds to unit
``vault/1``:

.. code-block:: none

   juju run-action --wait vault-hacluster/1 pause

.. caution::

   Unit numbers for a subordinate unit and its corresponding principal unit are
   not necessarily the same (e.g. it is possible to have ``vault-hacluster/2``
   correspond to ``vault/1``).

Remove the principal application unit
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Remove the principal application unit:

.. code-block:: none

   juju remove-unit vault/1

This will also remove the hacluster subordinate unit (and any other subordinate
units).

Update the ``cluster_count`` value
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Inform the hacluster charm about the new number of hacluster units, two here:

.. code-block:: none

   juju config vault-hacluster cluster_count=2

In this example a count of two (less than three) removes quorum functionality
and enables a two-node cluster. This is a sub-optimal state and is shown as an
example only.

Update Corosync
~~~~~~~~~~~~~~~

Remove Corosync nodes from its ring and update ``corosync.conf`` to reflect the
new number of nodes (``min_quorum`` is recalculated):

.. code-block:: none

   juju run-action --wait vault-hacluster/leader update-ring i-really-mean-it=true

Check the status of the Corosync cluster by querying a remaining hacluster
unit:

.. code-block:: none

   juju ssh vault-hacluster/leader sudo crm status

There should not be any node listed as OFFLINE.

.. note::

   With Juju client < 2.9 a subordinate leader unit must be referenced via its
   machine ID (e.g. 0/lxd/5) when using the :command:`juju ssh` command.

Verify cloud services
~~~~~~~~~~~~~~~~~~~~~

For this example, the final :command:`juju status vault` output is:

.. code-block:: console

   Unit Workload Agent Machine Public address Ports Message
   vault/0* active idle 0/lxd/5 10.0.0.227 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/0* active idle 10.0.0.227 Unit is ready and clustered
     vault-mysql-router/0* active idle 10.0.0.227 Unit is ready
   vault/2 active idle 2/lxd/6 10.0.0.233 8200/tcp Unit is ready (active: true, mlock: disabled)
     vault-hacluster/2 active idle 10.0.0.233 Unit is ready and clustered
     vault-mysql-router/2 active idle 10.0.0.233 Unit is ready

Ensure that all cloud services are working as expected.

Notes
-----

Pre-removal, in the case where the principal application unit has transitioned
to a 'lost' state (e.g. dropped off the network due to a hardware failure),

#. the first step (pause the hacluster unit) can be skipped
#. the second step (remove the principal unit) can be replaced by:

   .. code-block:: none

      juju remove-machine N --force

   N is the Juju machine ID (see the :command:`juju status` command) where the
   unit to be removed is running.

.. warning::

   Removing the machine by force will naturally remove any other units that
   may be present, including those from an entirely different application.

@@ -51,6 +51,13 @@ For a single unit requiring three keys (``vault/0`` with IP address
   vault operator unseal
   vault operator unseal

.. note::

   If the Vault API is encrypted, you will need to inform your Vault client of
   the associated CA certificate via an additional environment variable
   (VAULT_CACERT). See cloud operation :doc:`Configure TLS for the Vault API
   <ops-config-tls-vault-api>`.

You will be prompted for the unseal keys. The information will not be echoed
back to the screen nor captured in the shell's history.
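
As an illustration only, pointing the client at the CA certificate might look
as follows; the certificate path and unit address are hypothetical and will
differ per deployment:

.. code-block:: none

   # Hypothetical path; use the CA certificate that signed the Vault API certificate
   export VAULT_CACERT="/path/to/vault-ca.pem"
   export VAULT_ADDR="https://<vault-unit-address>:8200"
   vault operator unseal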