tripleo-heat-templates/container_config_scripts
Damien Ciabrini 6733d14f11 Serialize shutdown of pacemaker nodes
When running a minor update in a composable HA
environment, different roles can run ansible tasks
concurrently. However, there is currently a race when
pacemaker nodes are stopped in parallel [1,2], which
can cause nodes to incorrectly stop themselves once
they reconnect to the cluster.

To prevent concurrent shutdown, use a cluster-wide lock
to signal that one node is about to shut down, and block
the others until that node disconnects from the cluster.
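
The locking idea can be illustrated with a minimal Python sketch.
This is not the code in pacemaker_resource_lock.sh: a threading.Lock
stands in for the cluster-wide lock, threads stand in for nodes, and
serialized_shutdown and stop_node are hypothetical names chosen for
the example only.

```python
import threading
import time


def serialized_shutdown(nodes, stop_node, cluster_lock=None):
    """Ask every node to stop concurrently, but let only one proceed at a time.

    cluster_lock is an in-process stand-in for a cluster-wide lock
    (assumption: the real lock lives in the cluster, not in a Python process).
    Returns the ordered log of ("start" | "end", node) events.
    """
    if cluster_lock is None:
        cluster_lock = threading.Lock()
    events = []

    def worker(node):
        # Block until no other node is in the middle of shutting down.
        with cluster_lock:
            events.append(("start", node))
            stop_node(node)  # returns once the node has left the cluster
            events.append(("end", node))

    threads = [threading.Thread(target=worker, args=(n,)) for n in nodes]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    # Each simulated shutdown takes a little time; the lock keeps them apart.
    for event in serialized_shutdown(["node-0", "node-1", "node-2"],
                                     lambda node: time.sleep(0.05)):
        print(event)
```

Every "start" entry in the returned log is immediately followed by
the "end" entry for the same node, so no two shutdowns overlap even
though all of them were requested at the same time.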

Tested the minor update in a composable HA environment:
  . when run with "openstack update run", every role
    is updated sequentially, and the shutdown lock
    doesn't interfere.
  . when running multiple ansible tasks in parallel
    with "openstack update run --limit role<X>",
    pacemaker nodes are correctly stopped sequentially
    thanks to the shutdown lock.
  . when updating an existing overcloud, the new
    locking script used in the review is correctly
    injected on the overcloud, thanks to [3].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1791841
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1872404
[3] I2ac6bb98e1d4183327e888240fc8d5a70e0d6fcb

Closes-Bug: #1904193
Change-Id: I0e041c6a95a7f53019967f9263df2326b1408c6f
(cherry picked from commit cb55cc8ce5)
(cherry picked from commit 8e9798caf6)
(cherry picked from commit 83ba65e3ae)
2021-02-04 11:14:56 +01:00
..
monitoring Failure status should be set on 0 rather than 1 2020-10-29 16:29:36 -04:00
tests Skip Trilio dirs when setting ownership in /var/lib/nova 2020-12-21 11:58:09 +01:00
__init__.py Rename docker_config_scripts to container_config_scripts 2019-03-06 09:05:50 -05:00
cinder_ffu_db_sync.sh [TRAIN ONLY] Ensure interim db migration containers work properly 2020-06-17 16:24:47 +02:00
glance_ffu_db_sync.sh [TRAIN ONLY] Ensure interim db migration containers work properly 2020-06-17 16:24:47 +02:00
keystone_ffu_db_sync.sh [TRAIN ONLY] Ensure interim db migration containers work properly 2020-06-17 16:24:47 +02:00
manila_ffu_db_sync.sh [Q->T] Add FFU steps for manila 2020-07-06 21:56:27 +00:00
mistral_ffu_db_sync.sh [TRAIN ONLY] Ensure interim db migration containers work properly 2020-06-17 16:24:47 +02:00
neutron_db_rename.sh [TRAIN ONLY] Wait until DB is ready for neutron DB rename 2020-07-07 11:33:53 +01:00
neutron_ffu_db_sync.sh [TRAIN ONLY] Ensure interim db migration containers work properly 2020-06-17 16:24:47 +02:00
nova_ffu_db_sync.sh [TRAIN ONLY] Ensure interim db migration containers work properly 2020-06-17 16:24:47 +02:00
nova_statedir_ownership.py Skip Trilio dirs when setting ownership in /var/lib/nova 2020-12-21 11:58:09 +01:00
nova_wait_for_api_service.py Ensure nova-api is running before starting nova-compute containers 2019-10-01 11:11:44 +01:00
nova_wait_for_compute_service.py Merge "Add multi region support in nova_wait_for_compute_service.py" 2019-10-04 03:43:01 +00:00
pacemaker_mutex_restart_bundle.sh Rolling certificate update for HA services 2021-01-28 13:04:07 +01:00
pacemaker_mutex_shutdown.sh Serialize shutdown of pacemaker nodes 2021-02-04 11:14:56 +01:00
pacemaker_resource_lock.sh Serialize shutdown of pacemaker nodes 2021-02-04 11:14:56 +01:00
pacemaker_restart_bundle.sh Fix pcs restart in composable HA 2020-08-19 17:10:28 +02:00
pacemaker_wait_bundle.sh HA: reorder init_bundle and restart_bundle for improved updates 2020-03-03 09:59:42 +00:00
placement_wait_for_service.py Fix placement_wait_for_service 2019-10-24 04:23:48 +00:00
pyshim.sh Rename docker_config_scripts to container_config_scripts 2019-03-06 09:05:50 -05:00
wait-port-and-run.sh Ensure redis_tls_proxy starts after all redis instances 2020-07-27 15:50:27 +00:00