Move stonith resource creation to step2
Since the pcs-on-host patchset merged for Train, we are seeing a
problem with FFUs on Instance HA (IHA) environments.
Preamble:
TripleO keeps the stonith-enabled cluster property set to false until
puppet step 5. With the pcs-on-host patchset the enablement still happens
at step 5, but it is now triggered during the tripleo_ha_wrapper
deployment task of cinder-volume, which tries to restart the
cinder-volume service (during the leapp of the first controller). That
restart hangs forever because pacemaker is in the following transition:
- stonith-fence_compute-fence-nova is configured
- pacemaker wants to call stonith "on" for controller-0 (which is probably
  dumb, but it is unlikely we'll be able to change that in the right
  timeframe, as it seems a potentially involved change in behaviour)
- Any other action, like the cinder-volume restart in this case, is stuck
  and the FFU fails.
If we simply move the stonith resource creation to step 2 (and leave the
stonith-enabled property untouched, so it is still set at step 5), we
fix this.
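As a rough illustration of the step gating described above, here is a
small Python model (not the Puppet code itself). The function names and
the simplified str2bool stand-in are assumptions made for this sketch;
only the two flag names, the enable_fencing hiera key and the step
thresholds (2 and 5) come from the patch.

```python
def str2bool(value):
    """Simplified stand-in for the Puppet stdlib str2bool (assumption:
    only the common truthy spellings are handled here)."""
    if isinstance(value, bool):
        return value
    return str(value).strip().lower() in ("1", "t", "y", "true", "yes")


def fencing_flags(step, enable_fencing_hiera=False):
    """Return the two gates for a given puppet step.

    enable_stonith_resources: stonith resources get created (step >= 2)
    enable_fencing: stonith-enabled property gets set (step >= 5)
    """
    wanted = str2bool(enable_fencing_hiera)
    return {
        "enable_stonith_resources": wanted and step >= 2,
        "enable_fencing": wanted and step >= 5,
    }


if __name__ == "__main__":
    # Walk through the deployment steps with fencing requested in hiera.
    for step in range(1, 6):
        print(step, fencing_flags(step, "true"))
```

With fencing requested, the resources exist from step 2 onwards while the
cluster property stays off until step 5, which is the ordering this
change relies on.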
Tested by injecting this puppet-tripleo review into the FFU
queens->train upgrade on an IHA system: the FFU now passes.
Also applied this patch to a Train-based IHA deployment and verified
that deployment, redeploy, minor update and scale-up all keep working.
Closes-Bug: #1923723
Change-Id: Ib3e2d9c93221dfc2e15974142f30e8c84e7afd63
(cherry picked from commit 6196157b54)
parent 4a23dc84d4
commit bd1807c48b
@@ -146,7 +146,14 @@ class tripleo::profile::base::pacemaker (
     $pacemaker_master = false
   }
 
+  # enable_fencing guides the enablement of the stonith-enabled cluster-wide property
+  # enable_stonith_resources drives the creation of the stonith resources themselves and happens at
+  # step2. The reason for step2 is the following:
+  # During step1 the cluster is created (and also the pcmk remote resources in case of IHA)
+  # Since stonith resources are created on each node separately we need to have the guarantee that
+  # all cluster nodes + remote exist before creating stonith resources for them
   $enable_fencing = str2bool(hiera('enable_fencing', false)) and $step >= 5
+  $enable_stonith_resources = str2bool(hiera('enable_fencing', false)) and $step >= 2
 
   if $step >= 1 {
     include ::pacemaker::params
@@ -233,7 +240,7 @@ class tripleo::profile::base::pacemaker (
       }
       Class['pacemaker::stonith'] -> Exec<|tag == 'pacemaker-scaleup'|>
     }
-  if $enable_fencing {
+  if $enable_stonith_resources {
     include ::tripleo::fencing
 
     # enable stonith after all Pacemaker resources have been created