Sleep before stopping threads for delete

The logged error in bug #1328983 occurs in the following scenario:
* single heat-engine (so no multi-engine locking paths are invoked)
* multiple stack deletes on the same stack
* the threads of a previous delete call are stopped so that the
  new delete call can acquire the lock and perform the delete

This change does an eventlet sleep before forcefully stopping
the thread when taking over a delete from a different
thread. This gives the previous delete thread an opportunity to
finish naturally.
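
The pattern described above can be sketched as follows. This is only an
illustrative analogue, not the Heat code: it assumes standard-library
asyncio tasks in place of eventlet green threads, and the names
delete_worker, take_over_delete, run and GRACE_PERIOD are hypothetical.
The shape is the same as in the change: yield for a short grace period,
then stop the worker regardless.

```python
import asyncio

GRACE_PERIOD = 0.2  # seconds; the value reported in this commit message


async def delete_worker(duration, log):
    """Hypothetical stand-in for an in-flight stack delete."""
    try:
        await asyncio.sleep(duration)
        log.append("finished")
    except asyncio.CancelledError:
        # The worker was stopped before it could complete.
        log.append("force-stopped")
        raise


async def take_over_delete(duration, grace=GRACE_PERIOD):
    """Give the previous worker a grace period, then stop it."""
    log = []
    task = asyncio.ensure_future(delete_worker(duration, log))
    # Yield control so a nearly complete worker can finish naturally.
    await asyncio.sleep(grace)
    # Stop unconditionally; cancelling a finished task is a no-op.
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    return log


def run(duration, grace=GRACE_PERIOD):
    return asyncio.run(take_over_delete(duration, grace))
```

Under these assumptions, run(0.01) lets the worker finish inside the grace
period, while run(5.0, grace=0.05) ends with the worker being stopped.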

This may not prevent the error in bug #1328983 from ever
being logged in a production environment, but it is an internal error
which does not propagate to the user. This change should however
prevent this error from being logged during a tempest run, and the
original gate failure was due to ERROR log entry detection rather
than actual failing tests.

The sleep duration of 0.2s was chosen by experimentation. The error
in bug #1328983 started being logged again when the sleep was
reduced to 0.02s.

Closes-Bug: #1328983

Change-Id: I8f95f29bd238e097ed9f4b889afe12c88d193240
Steve Baker 2014-07-02 15:16:06 +12:00
parent 657b339967
commit cb966ff38a


@@ -746,6 +746,9 @@ class EngineService(service.Service):
# Current engine has the lock
if acquire_result == self.engine_id:
# give threads which are almost complete an opportunity to
# finish naturally before force stopping them
eventlet.sleep(0.2)
self.thread_group_mgr.stop(stack.id)
# Another active engine has the lock