pacemaker: run neutron-server-start-wait-stop only at step 4

neutron-server-start-wait-stop is a dangerous Exec that is exposed to
race conditions, because it does not have "onlyif" or "unless"
statements.

That means that during a deployment, this exec can run out of order,
during Step 5 and/or 6, although it is supposed to run at Step 4 only.
If that happens, the exec fails because Puppet tries to start
neutron-server while Pacemaker has already started the resource. In that
case, systemd returns 1 to Puppet, which returns 6 to the overcloud
deployment, and the deployment fails to finish correctly.

This patch aims to prevent this scenario by making sure we run the
exec only during step 4.

Also, to make it more robust, we add an 'unless' statement to this
exec, so that the Puppet run is idempotent and the Exec runs
successfully only once.
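
As a minimal sketch of how such a guard behaves (the resource title and
the search path below are illustrative assumptions, not the exact code
of this patch): Puppet runs the 'unless' command first and skips
'command' whenever that check exits with status 0.

```puppet
# Illustrative sketch only; the title and path list are hypothetical.
exec { 'example-start-neutron-server-once':
  command => 'systemctl start neutron-server',
  path    => ['/usr/bin', '/usr/sbin', '/bin', '/sbin'],
  # Puppet evaluates this check before the command. Exit status 0 means
  # Pacemaker already knows the resource, so the start is skipped.
  unless  => 'pcs resource show neutron-server',
}
```

On a node where Pacemaker already manages neutron-server, a repeated
Puppet run with this guard reports no change for the exec instead of
failing on a second start attempt.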

https://bugzilla.redhat.com/show_bug.cgi?id=1290582

Change-Id: I42813c5cff6c525c15c9c24baad4e355f88af672
Emilien Macchi 2015-12-10 16:23:50 -05:00
parent 99bd9970d6
commit 676ec6ea6d

@@ -1060,15 +1060,32 @@ if hiera('step') >= 4 {
Pacemaker::Resource::Service[$::glance::params::api_service_name]],
}
# Neutron
# NOTE(gfidente): Neutron will try to populate the database with some data
# as soon as neutron-server is started; to avoid races we want to make this
# happen only on one node, before normal Pacemaker initialization
# https://bugzilla.redhat.com/show_bug.cgi?id=1233061
exec { '/usr/bin/systemctl start neutron-server && /usr/bin/sleep 5' : } ->
pacemaker::resource::service { $::neutron::params::server_service:
clone_params => 'interleave=true',
require => Pacemaker::Resource::Service[$::keystone::params::service_name],
if hiera('step') == 4 {
# Neutron
# NOTE(gfidente): Neutron will try to populate the database with some data
# as soon as neutron-server is started; to avoid races we want to make this
# happen only on one node, before normal Pacemaker initialization
# https://bugzilla.redhat.com/show_bug.cgi?id=1233061
# NOTE(emilien): we need to run this Exec only at Step 4; otherwise this exec
# will try to start the service while it's already started by Pacemaker.
# That would result in a deployment failure, since systemd would return 1 to Puppet
# and the overcloud would fail to deploy (6 would be returned).
# This conditional prevents a race condition during the deployment.
# https://bugzilla.redhat.com/show_bug.cgi?id=1290582
exec { 'neutron-server-systemd-start-sleep' :
command => 'systemctl start neutron-server && /usr/bin/sleep 5',
path => '/usr/bin',
unless => '/sbin/pcs resource show neutron-server',
} ->
pacemaker::resource::service { $::neutron::params::server_service:
clone_params => 'interleave=true',
require => Pacemaker::Resource::Service[$::keystone::params::service_name]
}
} else {
pacemaker::resource::service { $::neutron::params::server_service:
clone_params => 'interleave=true',
require => Pacemaker::Resource::Service[$::keystone::params::service_name]
}
}
if hiera('neutron::enable_l3_agent', true) {
pacemaker::resource::service { $::neutron::params::l3_agent_service: