Fix race condition deleting in-progress stack
If we call stack_delete on a stack with an operation in progress, we kill any existing delete thread that is running. However, we don't wait for that thread to die before starting a new thread to delete the stack again. If any part of the cleanup operation in the old thread (i.e. handling of the GreenthreadExit exception) causes a context switch (which is likely), other threads can start working while the cleanup is still in progress. This could create race conditions like the one in bug 1328983.

Avoid this problem by making sure we wait for all threads in a thread group to die before continuing. (Note that this means the user's API call is blocking on the cleanup of the old thread. This is sadly unavoidable for now, but should probably be fixed in the future by stopping the old thread from the new delete thread.)

This was suggested earlier, but removed without explanation between patchsets 11 and 12 of I188e43ad88b98da7d1a08269189aaefa57c36df2, which implemented deletion of in-progress stacks with locks:

https://review.openstack.org/#/c/63002/11..12/heat/engine/service.py

Also remove the call to stack_lock_release(), which was a hack around the fact that wait() does not wait for link()ed functions - eventlet sends the exit event (that wait() is waiting on) before resolving links. Instead, add another link to the end of the list to indicate that links have all been run. This should eliminate "Lock was already released" messages in the logs.

Change-Id: I2e4561cbe29ab10554da67859df8c2db0854dd38
changes/62/99562/4
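
The "extra link at the end of the list" idea from the message can be sketched roughly as follows. This is a standard-library analogy using threading, not the eventlet-based Heat code; the Worker class, its links_done event, and the run_demo helper are hypothetical names invented for illustration. It assumes, as the message describes, that the exit event fires before the linked callbacks run, so waiting on exit alone is not enough.

```python
import threading
import time


class Worker:
    """Hypothetical stand-in for a green thread with link()ed callbacks."""

    def __init__(self, target):
        self._target = target
        self._links = []
        self.exited = threading.Event()      # what a plain wait() would observe
        self.links_done = threading.Event()  # set by the final marker link
        self._thread = threading.Thread(target=self._run, daemon=True)

    def link(self, func):
        """Register a callback to run after the target exits."""
        self._links.append(func)

    def start(self):
        # Append the marker last so it fires only after every other link.
        self._links.append(self.links_done.set)
        self._thread.start()

    def _run(self):
        try:
            self._target()
        finally:
            # The exit event is sent BEFORE the links are resolved.
            self.exited.set()
            for func in self._links:
                func()


def run_demo():
    log = []
    stop = threading.Event()

    def release_lock():
        # Simulate cleanup that yields control for a while.
        time.sleep(0.05)
        log.append("old thread: lock released")

    old = Worker(target=stop.wait)
    old.link(release_lock)
    old.start()

    stop.set()               # ask the old thread to stop
    old.exited.wait()        # not sufficient: links may still be running
    old.links_done.wait()    # wait until all links have been run
    log.append("new thread: delete started")
    return log


if __name__ == "__main__":
    print(run_demo())
```

In this sketch, waiting only on exited would let the new delete start while release_lock is still sleeping; waiting on links_done keeps the new thread from starting until the old thread's cleanup has finished.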
parent 59be7efed1
commit ae2b47d8fd