8e9798caf6
When running a minor update in a composable HA
deployment, different roles can run ansible tasks
concurrently. However, there is currently a race when
pacemaker nodes are stopped in parallel [1,2], which can
cause nodes to incorrectly stop themselves once they
reconnect to the cluster.
To prevent concurrent shutdowns, use a cluster-wide lock
to signal that one node is about to shut down, and block
the others until that node disconnects from the cluster.
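The locking scheme can be sketched as a compare-and-swap
on a cluster-wide attribute: a node only proceeds with its
shutdown once it has won ownership of the lock. This is a
minimal sketch, not the actual pacemaker_mutex_shutdown.sh;
the attribute name and the crm_attribute defaults are
assumptions, and both commands are overridable.

```shell
#!/bin/bash
# Sketch of a cluster-wide shutdown mutex. The lock is a
# single cluster attribute holding the owner's node name.
# LOCK_NAME, LOCK_GET and LOCK_SET are hypothetical and
# overridable; defaults use pacemaker's crm_attribute.

LOCK_NAME=${LOCK_NAME:-shutdown-lock}
LOCK_GET=${LOCK_GET:-"crm_attribute --type crm_config --name $LOCK_NAME --query --quiet"}
LOCK_SET=${LOCK_SET:-"crm_attribute --type crm_config --name $LOCK_NAME --update"}

# Try once to take the lock for node "$1". Succeeds if the
# lock is free or already ours, and the post-write re-read
# confirms we won any race with a concurrent claimer.
try_acquire_lock() {
    local me="$1"
    local owner
    owner=$($LOCK_GET 2>/dev/null)
    if [ -z "$owner" ] || [ "$owner" = "$me" ]; then
        $LOCK_SET "$me"
        [ "$($LOCK_GET 2>/dev/null)" = "$me" ]
    else
        return 1
    fi
}

# Block until the lock is acquired, polling every 5s, and
# give up after "$2" attempts (default 60).
acquire_lock_blocking() {
    local me="$1" tries="${2:-60}"
    local i=0
    until try_acquire_lock "$me"; do
        i=$((i + 1))
        [ "$i" -ge "$tries" ] && return 1
        sleep 5
    done
}
```

A node would call acquire_lock_blocking "$(hostname)" before
stopping its pacemaker resources, and the attribute would be
cleared once the node has left the cluster, unblocking the
next node.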
Tested the minor update in a composable HA environment:
. when run with "openstack update run", every role
  is updated sequentially, and the shutdown lock
  doesn't interfere.
. when running multiple ansible tasks in parallel
  with "openstack update run --limit role<X>",
  pacemaker nodes are correctly stopped sequentially
  thanks to the shutdown lock.
. when updating an existing overcloud, the new
  locking script used in this review is correctly
  injected into the overcloud, thanks to [3].
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1791841
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1872404
[3] I2ac6bb98e1d4183327e888240fc8d5a70e0d6fcb
Closes-Bug: #1904193
Change-Id: I0e041c6a95a7f53019967f9263df2326b1408c6f
(cherry picked from commit
---
monitoring
tests
__init__.py
nova_statedir_ownership.py
nova_wait_for_api_service.py
nova_wait_for_compute_service.py
pacemaker_mutex_restart_bundle.sh
pacemaker_mutex_shutdown.sh
pacemaker_resource_lock.sh
pacemaker_restart_bundle.sh
pacemaker_wait_bundle.sh
placement_wait_for_service.py
pyshim.sh
wait-port-and-run.sh