Enable fence_watchdog configuration in stonith topology

This commit extends the fencing manifest to make use of a
"fence_watchdog" device and allows using the resulting "watchdog"
resource in a stonith topology.

In order for this to work the cluster must have been configured with
sbd, either manually or via 'pacemaker::corosync::enable_sbd: true'.
In addition, the fence_watchdog resource needs a supported watchdog
timer device to perform the self fencing.

The fence_watchdog configuration is very opinionated:
- it assumes the resource name to be 'watchdog' (hardcoded in pacemaker)
- it only supports an "all or nothing" scenario, in which all the
  cluster nodes must make use of it
- it is not supported on pacemaker_remote nodes

The fencing creation logic has been adjusted to use the pacemaker
bootstrap node to create the watchdog resource and the stonith topology
for all the nodes in the cluster (since this is a single shared
resource we couldn't reuse the old "every man for himself" logic).

A fence_watchdog device can be defined like any other fencing device
via fencing.yaml or equivalent:

  parameter_defaults:
    EnableFencing: true
    FencingConfig:
      devices:
      - agent: fence_watchdog
        host_mac: 52:54:00:74:f7:51
        ...

Ideally fence_watchdog should be used as a last resort, and so placed
at the bottom of a stonith topology where power-based fencing agents
are the primary choice for fencing.

The default value for stonith-watchdog-timeout (60s) can be
overridden via tripleo::fencing::watchdog_timeout.
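
As a rough illustration of the two hiera keys mentioned above, an
environment file could look like the sketch below. It assumes the keys
are passed to puppet through the ExtraConfig parameter, and the value
120 is purely illustrative; neither detail is stated in this commit.

```yaml
# Sketch only: assumes hiera overrides are delivered via ExtraConfig.
parameter_defaults:
  ExtraConfig:
    # sbd must be enabled for fence_watchdog to work (see above)
    pacemaker::corosync::enable_sbd: true
    # illustrative value in seconds; the default in this change is 60
    tripleo::fencing::watchdog_timeout: 120
```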

Depends-On: Id010a392df0047d53dfab1c21cc78021c8c1aabf
Change-Id: I89a6014ffb40bc0935a348af7687684f3a71a968
(cherry picked from commit 6fc7430c18)
parent 08e4898053
commit f5df16ab28

@@ -55,6 +55,11 @@
 #   after the resource update.
 #   Defaults to 600 (seconds)
 #
+# [*watchdog_timeout*]
+#   Only valid if sbd watchdog fencing is enabled.
+#   Pacemaker will assume unseen nodes self-fence within this much time.
+#   Defaults to 60 (seconds)
+#
 # [*enable_instanceha*]
 #   (Optional) Boolean driving the Instance HA controlplane configuration
 #   Defaults to false
@@ -65,6 +70,7 @@ class tripleo::fencing(
   $try_sleep = 3,
   $deep_compare = false,
   $update_settle_secs = 600,
+  $watchdog_timeout = 60,
   $enable_instanceha = hiera('tripleo::instanceha', false),
 ) {
   $common_params = {
@@ -185,6 +191,34 @@ class tripleo::fencing(
         Pcmk_stonith<||> -> Pcmk_stonith_level<||>
       }
     }
+    # we use the bootstrap node to create the watchdog resource and the stonith
+    # topology for all the nodes in the cluster, because the watchdog resource
+    # is not per-node but cluster-wide
+    $watchdog_devices = local_fence_devices('fence_watchdog', $all_devices)
+    if length($watchdog_devices) > 0 {
+      # check if this is the bootstrap node
+      if downcase($::hostname) == lookup('pacemaker_short_bootstrap_node_name') {
+        create_resources('pacemaker::stonith::fence_watchdog', $watchdog_devices, $common_params)
+        $stonith_resources = [ 'watchdog' ]
+        # if this is the bootstrap node we set watchdog as levelX for all
+        # the pacemaker nodes
+        lookup('pacemaker_short_node_names').each |$node| {
+          pacemaker::stonith::level{ "stonith-${level}-watchdog-${node}":
+            level => $level,
+            target => $node,
+            stonith_resources => [ 'watchdog' ],
+            tries => $tries,
+            try_sleep => $try_sleep,
+          }
+        }
+        pacemaker::property { 'stonith-watchdog-timeout':
+          property => 'stonith-watchdog-timeout',
+          value => $watchdog_timeout,
+          tries => $tries,
+        }
+        Pcmk_property<||> -> Pcmk_stonith<||> -> Pcmk_stonith_level<||>
+      }
+    }
   }
 }