Fix detached keystone node epoch mismatch

Pacemaker stores its configuration in an internal database (the CIB).
Each update of this database increments a counter called "epoch", which
should have the same value cluster-wide.

If an update operation reaches a previously detached node, a conflict
occurs: Pacemaker does not allow updating this database on a node whose
epoch value is lower than that of the cluster leader.

Wait for the epoch counter to come into sync by periodically retrying
the update command.

Closes-Bug: 1494314
Change-Id: I1f242bcd90264ec45da2aaa6bc030f244511761b
Dmitry Bilunov 2015-12-15 12:29:07 +03:00
parent 9a8dae5941
commit 6134af0c6e

@@ -10,6 +10,9 @@ Puppet::Type.type(:cs_property).provide(:crm, :parent => Puppet::Provider::Crmsh
   commands :crm => 'crm'
   commands :cibadmin => 'cibadmin'
+  RETRY_COUNT = 100
+  RETRY_STEP = 6
   def self.instances
     block_until_ready
@@ -71,6 +74,25 @@ Puppet::Type.type(:cs_property).provide(:crm, :parent => Puppet::Provider::Crmsh
     @property_hash[:value] = should
   end
+  # retry the given command until it runs without errors
+  # or for RETRY_COUNT times with RETRY_STEP sec step
+  # print cluster status report on fail
+  # returns normal command output on success
+  # @return [String]
+  def retry_command
+    (0..RETRY_COUNT).each do
+      begin
+        out = yield
+      rescue Puppet::ExecutionFailure => e
+        Puppet.debug "Command failed: #{e.message}"
+        sleep RETRY_STEP
+      else
+        return out
+      end
+    end
+    fail "Execution timeout after #{RETRY_COUNT * RETRY_STEP} seconds!"
+  end
   # Flush is triggered on anything that has been detected as being
   # modified in the property_hash. It generates a temporary file with
   # the updates that need to be made. The temporary file is then used
@@ -82,7 +104,9 @@ Puppet::Type.type(:cs_property).provide(:crm, :parent => Puppet::Provider::Crmsh
       # clear this on properties, in case it's set from a previous
       # run of a different corosync type
       ENV['CIB_shadow'] = nil
-      crm('configure', 'property', '$id="cib-bootstrap-options"', "#{@property_hash[:name]}=#{@property_hash[:value]}")
+      retry_command {
+        crm('configure', 'property', '$id="cib-bootstrap-options"', "#{@property_hash[:name]}=#{@property_hash[:value]}")
+      }
     end
   end
 end
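
The retry pattern introduced by this change can be tried outside Puppet with the minimal sketch below. It is an illustration, not the provider's code: `CommandFailed` is a hypothetical stand-in for `Puppet::ExecutionFailure`, the `attempts`, `step`, and `sleeper` keyword arguments are assumptions added so the wait can be skipped, and the failing command is simulated rather than a real `crm` call.

```ruby
# Hypothetical stand-in for Puppet::ExecutionFailure so the sketch runs without Puppet.
class CommandFailed < StandardError; end

# Run the block until it succeeds or `attempts` tries are used up.
# `sleeper` is injectable so callers (and tests) can skip the real wait.
def retry_command(attempts: 100, step: 6, sleeper: ->(s) { sleep s })
  attempts.times do |i|
    begin
      # On success, hand the block's result straight back to the caller.
      return yield
    rescue CommandFailed => e
      warn "Attempt #{i + 1} failed: #{e.message}"
      sleeper.call(step)
    end
  end
  raise "Execution timeout after #{attempts * step} seconds!"
end

# Simulated command: fails twice (as if the local epoch were still behind),
# then succeeds on the third call.
calls = 0
result = retry_command(attempts: 5, step: 1, sleeper: ->(_s) {}) do
  calls += 1
  raise CommandFailed, 'epoch is behind the cluster leader' if calls < 3
  'property updated'
end
puts result
```

The sketch uses `attempts.times`, so the number of tries matches the figure in the timeout message. The inclusive range `(0..RETRY_COUNT)` in the diff iterates `RETRY_COUNT + 1` times, so the provider may wait one step longer than the message reports.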