Add configurable monitor timeouts for ovn dbs

Under load, the default monitor timeout of 20 seconds is not enough to
prevent unnecessary failovers of the ovn-dbs pacemaker resource. Spawning
a few VMs at the same time can cause unnecessary moves of the master DB,
followed by reconnections of the ovn-controllers (slaves are read-only)
and further load peaks on the DBs, which can end in a snowball effect.
The timeout is now configurable through dbs_timeout in
tripleo::profile::pacemaker::ovn_dbs_bundle and defaults to 60s.
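
As a minimal sketch of an override, the hiera key that the class looks up
for dbs_timeout can be set in hieradata. The value 120 is only an
illustrative assumption, not a recommendation from this change, and how
the hieradata reaches the node depends on the deployment tooling.

```yaml
# Hypothetical hieradata override; 120 is an example value, in seconds.
# When the key is absent, the class falls back to its default of 60.
tripleo::profile::pacemaker::ovn_dbs_bundle::dbs_timeout: 120
```

The same value is applied to the monitor operations of both the Master
and the Slave role, since the change interpolates dbs_timeout into both.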

Change-Id: Ib95c6b7614631eed264d42e6cf61672b705e7893
Signed-off-by: Kamil Sambor <ksambor@redhat.com>
Partial-Bug: #1853000
(cherry picked from commit 15e21010a8)
Kamil Sambor 2019-10-17 15:30:58 +02:00
parent 7c9561bd5b
commit 95a7f217e2
2 changed files with 17 additions and 1 deletions


@@ -77,6 +77,10 @@
 #   configuration. It's only used if internal TLS is enabled.
 #   Defaults to undef
 #
+# [*dbs_timeout*]
+#   (Optional) timeout for monitor of ovn dbs resource
+#   Defaults to 60
+#
 class tripleo::profile::pacemaker::ovn_dbs_bundle (
   $ovn_dbs_docker_image = hiera('tripleo::profile::pacemaker::ovn_dbs_bundle::ovn_dbs_docker_image', undef),
@@ -93,6 +97,7 @@ class tripleo::profile::pacemaker::ovn_dbs_bundle (
   $tls_priorities = hiera('tripleo::pacemaker::tls_priorities', undef),
   $enable_internal_tls = hiera('enable_internal_tls', false),
   $ca_file = undef,
+  $dbs_timeout = hiera('tripleo::profile::pacemaker::ovn_dbs_bundle::dbs_timeout', 60),
 ) {
   if $::hostname == downcase($bootstrap_node) {
@@ -203,7 +208,8 @@ nb_master_protocol=ssl sb_master_protocol=ssl"
 pacemaker::resource::ocf { "${ovndb_servers_resource_name}":
   ocf_agent_name => "${ovndb_servers_ocf_name}",
   master_params => '',
-  op_params => 'start timeout=200s stop timeout=200s',
+  op_params => "start timeout=200s stop timeout=200s monitor interval=10s role=Master timeout=${dbs_timeout}s \
+monitor interval=30s role=Slave timeout=${dbs_timeout}s",
   resource_params => $resource_map,
   tries => $pcs_tries,
   location_rule => $ovn_dbs_location_rule,


@@ -0,0 +1,10 @@
+---
+features:
+  - |
+    Under load, the default monitor timeout of 20 seconds is not enough to
+    prevent unnecessary failovers of the ovn-dbs pacemaker resource. Spawning
+    a few VMs at the same time can cause unnecessary moves of the master DB,
+    followed by reconnections of the ovn-controllers (slaves are read-only)
+    and further load peaks on the DBs, which can end in a snowball effect.
+    The timeout is now configurable through dbs_timeout in
+    tripleo::profile::pacemaker::ovn_dbs_bundle and defaults to 60s.