heat/heat/tests/engine
Anant Patil 634c24ecfe Convergence: Concurrency subtle issues
To avoid certain concurrency related issues, the DB update API needs to
be given the traversal ID of the stack intended to be updated. By making
this change, we can avoid having the following check in all those places:

    if current_traversal != stack.current_traversal:
        return

The check for the current traversal should be implicit, as part of the
stack's store and state_set methods, which should pass
self.current_traversal as the expected traversal ID. All state changes
or updates to the stack object in the DB go through this implicit check
(using update...where).
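
A minimal sketch of such an update...where check, using the standard
library's sqlite3 module rather than Heat's actual DB layer. The table
layout and the names make_db and update_stack are hypothetical and chosen
only for illustration:

```python
import sqlite3

ALLOWED_COLUMNS = {"status", "current_traversal"}  # hypothetical schema


def make_db():
    """Create an in-memory table holding one stack with traversal 'X'."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE stack ("
        "id TEXT PRIMARY KEY, status TEXT, current_traversal TEXT)"
    )
    conn.execute("INSERT INTO stack VALUES ('s1', 'CREATE_COMPLETE', 'X')")
    conn.commit()
    return conn


def update_stack(conn, stack_id, values, expected_traversal):
    """Update the row only if it still carries expected_traversal.

    Returns True if a row was updated, False if the traversal had changed.
    """
    if not values or not set(values) <= ALLOWED_COLUMNS:
        raise ValueError("values must name at least one known column")
    columns = list(values)
    assignments = ", ".join(f"{col} = ?" for col in columns)
    params = [values[col] for col in columns] + [stack_id, expected_traversal]
    cursor = conn.execute(
        f"UPDATE stack SET {assignments} "
        "WHERE id = ? AND current_traversal = ?",
        params,
    )
    conn.commit()
    # The comparison and the write happen in one statement, so there is
    # no window between checking the traversal and updating the row.
    return cursor.rowcount == 1
```

A caller holding a stale traversal ID gets False back and can bail out,
with no separate comparison needed at the call site.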

When a stack update is triggered, the current traversal should be backed
up as the previous traversal, a new traversal should be generated, and
the stack should be stored in the DB with the previous traversal as the
expected traversal. This ensures that no two updates can simultaneously
succeed on the same stack with the same traversal ID. This was one of our
primary goals.
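
The update-trigger sequence can be sketched as follows. This is a toy
in-memory model, not Heat code: StackStore and trigger_update are
hypothetical names, and the lock merely stands in for the atomicity that
the DB's update...where provides.

```python
import threading
import uuid


class StackStore:
    """Toy stand-in for the stack table (hypothetical, not Heat's API)."""

    def __init__(self, traversal):
        self._lock = threading.Lock()
        self.row = {"current_traversal": traversal, "prev_traversal": None}

    def load(self):
        """Return a snapshot of the row, as a worker would see it."""
        with self._lock:
            return dict(self.row)

    def store(self, values, expected_traversal):
        """Apply values only if the row still has expected_traversal."""
        with self._lock:
            if self.row["current_traversal"] != expected_traversal:
                return False
            self.row.update(values)
            return True


def trigger_update(store, loaded):
    """Back up the loaded traversal, generate a new one, and store it.

    Returns the new traversal ID, or None if another update got in first.
    """
    previous = loaded["current_traversal"]
    new_traversal = str(uuid.uuid4())
    stored = store.store(
        {"prev_traversal": previous, "current_traversal": new_traversal},
        expected_traversal=previous,
    )
    return new_traversal if stored else None
```

If two updates load the same snapshot and both call trigger_update, only
the first store succeeds; the second sees a changed traversal and gets
None.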

The following example cases describe the issues we encountered:

1. When two updates, U1 and U2, try to update a stack concurrently:

    1. Current traversal(CT) is X
    2. U1 loads stack with CT=X
    3. U2 loads stack with CT=X
    4. U2 stores the stack and updates CT=Y
    5. U1 stores the stack and updates the CT=Z

    Both updates have succeeded, and both keep running until one of
    the workers finds that stack.current_traversal != current_traversal
    and bails out.

    Ideally, U1 should have failed: only one should be allowed in case
    of concurrent update. When both U1 and U2 pass X as the expected
    traversal ID of the stack, then this problem is solved.

2. A resource R is being provisioned for stack with current traversal
   CT=X:

    1. A new update U is issued; it loads the stack with CT=X.
    2. Resource R fails and loads the stack with CT=X to mark it as FAILED.
    3. Update U updates the stack with CT=Y, marks the stack as
       UPDATE_IN_PROGRESS and goes ahead with sync_point etc.
    4. Resource R marks the stack as UPDATE_FAILED, which to the user
       means that update U has failed, although it is actually still
       in progress.

    With this patch, when resource R fails, it will supply CT=X as the
    expected traversal to be updated, and its attempt to mark the stack
    will fail because update U with CT=Y has taken over.
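
Case 2 can be replayed step by step with a small single-threaded model.
The names store_if_current and replay_case_2 are hypothetical, the plain
dict stands in for the stack's DB row, and the initial status is an
assumption made for the example:

```python
def store_if_current(row, values, expected_traversal):
    """Apply values only if the row still carries expected_traversal."""
    if row["current_traversal"] != expected_traversal:
        return False
    row.update(values)
    return True


def replay_case_2():
    """Replay the four steps of case 2 with the expected-traversal check."""
    # Assumed starting point: R is being provisioned under traversal X.
    row = {"current_traversal": "X", "status": "UPDATE_IN_PROGRESS"}

    # Steps 1 and 2: update U and failing resource R both load CT=X.
    update_view = dict(row)
    resource_view = dict(row)

    # Step 3: U takes over with CT=Y, passing X as the expected traversal.
    update_stored = store_if_current(
        row,
        {"current_traversal": "Y", "status": "UPDATE_IN_PROGRESS"},
        update_view["current_traversal"],
    )

    # Step 4: R tries to mark the stack failed, still expecting CT=X.
    resource_stored = store_if_current(
        row,
        {"status": "UPDATE_FAILED"},
        resource_view["current_traversal"],
    )
    return update_stored, resource_stored, row
```

In this replay the resource's write is rejected, so the status the user
sees remains that of the running update U.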

Partial-Bug: #1512343
Change-Id: I6ca11bed1f353786bb05fec62c89708d98159050
2015-11-26 09:45:49 +00:00
service Convergence: Concurrency subtle issues 2015-11-26 09:45:49 +00:00
__init__.py Split engine service test case 2015-04-20 10:19:58 -04:00
test_dependencies.py Use assertIn and assertNotIn 2015-10-26 22:40:14 +01:00
test_engine_worker.py Convergence: Concurrency subtle issues 2015-11-26 09:45:49 +00:00
test_plugin_manager.py Move core engine related unit tests to tests/engine 2015-07-21 17:37:20 +05:30
test_resource_type.py Fix HTTP error codes due to invalid templates 2015-11-20 21:27:13 -05:00
test_scheduler.py py34: cleanup 2015-10-08 20:10:54 +05:30
test_sync_point.py Convergence: Fix failing integration tests 2015-09-12 08:30:04 +00:00
tools.py Change namespace for Nova resources and tests 2015-11-18 08:42:29 +08:00