tripleo-heat-templates/deployment/pacemaker
Damien Ciabrini 6733d14f11 Serialize shutdown of pacemaker nodes
When running a minor update in a composable HA
environment, different roles can run ansible tasks
concurrently. However, there is currently a race when
pacemaker nodes are stopped in parallel [1,2], which can
cause nodes to incorrectly stop themselves once they
reconnect to the cluster.

To prevent concurrent shutdown, use a cluster-wide lock
to signal that one node is about to shut down, and block
the others until the node disconnects from the cluster.
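
The locking idea above can be illustrated with a minimal sketch. This is
not the locking script shipped by this change: ClusterStore,
compare_and_swap, serialized_shutdown and the node names are all
hypothetical, and an in-process object stands in for whatever
cluster-wide state holds the real lock. The sketch only shows the
ordering guarantee: a node takes the lock before stopping and gives it
up only after stop_node() returns, which here models the node having
disconnected from the cluster.

```python
import threading
import time


class ClusterStore:
    """Hypothetical stand-in for cluster-wide state with an atomic compare-and-swap."""

    def __init__(self):
        self._mutex = threading.Lock()
        self._holder = None  # name of the node holding the shutdown lock, or None

    def compare_and_swap(self, expected, new):
        # Atomically set the holder to `new` only if it currently equals `expected`.
        with self._mutex:
            if self._holder == expected:
                self._holder = new
                return True
            return False


def serialized_shutdown(node, store, stop_node, poll_interval=0.01):
    # Block until no other node holds the shutdown lock, then take it.
    while not store.compare_and_swap(None, node):
        time.sleep(poll_interval)
    try:
        # Assumption: stop_node() returns only once the node has left the cluster.
        stop_node(node)
    finally:
        # Release the lock so the next waiting node can proceed.
        store.compare_and_swap(node, None)


def demo(nodes=("controller-0", "database-0", "messaging-0")):
    """Run shutdowns from parallel threads and return the recorded event order."""
    store = ClusterStore()
    events = []
    events_lock = threading.Lock()

    def stop_node(node):
        with events_lock:
            events.append(("begin", node))
        time.sleep(0.02)  # simulated time needed to stop the node
        with events_lock:
            events.append(("end", node))

    threads = [
        threading.Thread(target=serialized_shutdown, args=(n, store, stop_node))
        for n in nodes
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for kind, node in demo():
        print(kind, node)
```

Although the three shutdowns are started in parallel, each "begin" event
is immediately followed by the "end" event of the same node, so no two
nodes are ever stopping at the same time. The order in which nodes win
the lock is not deterministic.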

Tested the minor update in a composable HA environment:
  . when run with "openstack update run", every role
    is updated sequentially, and the shutdown lock
    doesn't interfere.
  . when running multiple ansible tasks in parallel
    "openstack update run --limit role<X>", pacemaker
    nodes are correctly stopped sequentially thanks
    to the shutdown lock.
  . when updating an existing overcloud, the new
    locking script used in the review is correctly
    injected on the overcloud, thanks to [3].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1791841
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1872404
[3] I2ac6bb98e1d4183327e888240fc8d5a70e0d6fcb

Closes-Bug: #1904193
Change-Id: I0e041c6a95a7f53019967f9263df2326b1408c6f
(cherry picked from commit cb55cc8ce5)
(cherry picked from commit 8e9798caf6)
(cherry picked from commit 83ba65e3ae)
2021-02-04 11:14:56 +01:00
clustercheck-container-puppet.yaml Remove unnecessary slash volume maps 2020-02-10 12:01:02 -05:00
compute-instanceha-baremetal-puppet.yaml Remove corosync.conf if it's a dir from remote. 2020-11-20 14:24:41 +01:00
ovn-dbs-baremetal-puppet.yaml Move compute-instanceha, neutron-ovn-dvr-ha to deployments 2019-05-30 20:37:36 +00:00
pacemaker-baremetal-puppet.yaml Serialize shutdown of pacemaker nodes 2021-02-04 11:14:56 +01:00
pacemaker-remote-baremetal-puppet.yaml [Train Only] Renamve tripleo_upgrade_hiera into tripleo-upgrade-hiera. 2020-08-31 22:49:13 +00:00