The db.api module provides a useless indirection to the only
implementation we ever had, sqlalchemy. Let's use that directly instead
of the wrapper.
When partially removing eager loading of raw_template in stacks
(260b79ed28), we didn't take into account
that accessing the sqlalchemy field would create an additional query
whereas it was previously eager loaded. This removes it by only filling
the field when it's already fetched.
Always loading the raw template in situations where we didn't need it -
e.g. in identify_stack, where we just want the name + id (given one of
them), or when getting the summary stack list - uses up DB bandwidth and
This partially reverts commit 3ab0ede98c.
* The eager_load option to get_stack() is reinstated, but with the default
flipped to True. In places where we explicitly do not want to load the
template, we pass False.
* stack_get_by_name() no longer eagerly loads the template. There were no
instances of this where we subsequently use the template.
* stack_get_all() acquires an eager_load option, with the default set to
False. Direct users of objects.stack.Stack.get_all() will not eagerly
load by default, but users of engine.stack.Stack.load_all() will get the
template eagerly loaded. This practically always corresponds to what you
Run `heat-manage migrate-convergence-1 <stack_id>` to migrate a
legacy stack to the convergence engine.
The Heat engine performs the migration, i.e. the migration can't
be done offline.
We can use admin_context to have access to stacks
and software configs across projects. This removes
the tenant_safe flag from rpc and db api. This is
backward compatible with older rpc clients.
We still support use of global_tenant flag for listing
stacks and software configs. However, by default
an admin (a user with the admin role in admin_project)
would not need that.
The vast majority of stack fetches are immediately followed by a
raw_template fetch, so this change always eagerly fetches the
raw_template for every stack fetch.
During stack versioned object creation the stack's raw_template object
is used to construct the versioned raw template object.
In Python 3, __ne__ by default delegates to __eq__ and inverts the
result, but the Python 2 documentation urges you to define __ne__ when
you define __eq__ for it to work properly. There are no implied
relationships among the comparison operators. The truth of x==y
does not imply that x!=y is false. Accordingly, when defining
__eq__(), one should also define __ne__() so that the operators
will behave as expected.
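A minimal sketch of the pattern described above. The `Point` class is
hypothetical and not part of Heat; it only shows __ne__ being defined in
terms of __eq__ so that both operators agree on Python 2 and Python 3.

```python
class Point(object):
    """Hypothetical value class used only for illustration."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        if not isinstance(other, Point):
            return NotImplemented
        return (self.x, self.y) == (other.x, other.y)

    def __ne__(self, other):
        # Python 2 does not derive __ne__ from __eq__, so define it
        # explicitly and keep it consistent with __eq__.
        result = self.__eq__(other)
        if result is NotImplemented:
            return result
        return not result


a = Point(1, 2)
b = Point(1, 2)
c = Point(3, 4)
```

With both methods defined, `a == b` and `a != b` can never both be true.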
Tags for a stack need to be loaded in the following scenarios:
- To display during a stack show
- To store during a create or update
All other operations which require a loaded stack will likely not
reference the tags, so the extra SQL query which always loads them is
often not necessary.
This change removes the tags loading from stack_object.Stack and
stack.Stack.load and implements annotated properties functions for a
Stack.tags property so that they are loaded from the database on
Currently calls to get a collection of stacks via the stack object
have the following data access behaviour:
- One query to fetch the stack records
- One query per stack to fetch the raw template
- One query per stack to fetch the tags
This causes excessive database round trips when there are many stacks.
In addition, the list_stacks call creates a collection of full
stack objects, each of which builds a fully parsed stack. Most
of this information is then discarded by the RPC and middleware
formatters, so this overhead brings no benefit.
This change is the first step in changing the data access patterns by
querying the Stack versioned object directly instead of via the full
Stack object. A future change will apply an eager fetch and caching
approach to avoid unnecessary queries.
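To illustrate the round-trip cost described above, here is a rough sketch
using an in-memory SQLite database. The two-table schema and the helper
names (run, list_per_stack, list_joined) are simplified assumptions for
illustration, not Heat's actual models or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_template (id INTEGER PRIMARY KEY, template TEXT);
    CREATE TABLE stack (id INTEGER PRIMARY KEY, name TEXT,
                        raw_template_id INTEGER REFERENCES raw_template(id));
""")
for i in range(1, 4):
    conn.execute("INSERT INTO raw_template VALUES (?, ?)", (i, "tmpl-%d" % i))
    conn.execute("INSERT INTO stack VALUES (?, ?, ?)", (i, "stack-%d" % i, i))

query_count = 0


def run(sql, params=()):
    """Execute one query and count it as one database round trip."""
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()


def list_per_stack():
    # One query for the stacks, then one more query per stack.
    stacks = run("SELECT id, name, raw_template_id FROM stack ORDER BY id")
    result = []
    for _id, name, tmpl_id in stacks:
        rows = run("SELECT template FROM raw_template WHERE id = ?",
                   (tmpl_id,))
        result.append((name, rows[0][0]))
    return result


def list_joined():
    # A single query that fetches each stack together with its template.
    return run("SELECT s.name, t.template FROM stack s "
               "JOIN raw_template t ON t.id = s.raw_template_id "
               "ORDER BY s.id")


query_count = 0
per_stack = list_per_stack()
per_stack_queries = query_count

query_count = 0
joined = list_joined()
joined_queries = query_count
```

Both functions return the same rows, but the per-stack variant issues one
extra query for every stack in the list.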
This change does the following:
- Service list_stacks calls stack_object.Stack.get_all directly
- Service list_stacks formats with new function
api.format_stack_db_object with the exact fields required by the CFN
and REST API list stacks formatters
- Since the description field is the only one that requires full stack
parsing, it is now set to an empty string for stack listings via API
or calls to aws cloudformation list-stacks
This last point may be controversial, but my attempts to find uses of the
stack description in stack listings found none, and the following
API uses *don't* show the description:
- heat stack-list
- openstack stack list
- horizon stack list
- rackspace control panel stack list
If we want to add the description back later we could always add a
description field back to the Stack table to store the denormalised
Currently, StackResource loads the whole stack when checking for status
(in the check_*_complete methods), but only cares about the state of the
stack. This is a fairly expensive operation, as it retrieves the
template and reparses everything. This simplifies it with a new API that
simply queries the stack status from the database.
Retrieving the field from the db object triggers the query to
stack_tags, so we don't need to do it a second time to retrieve the list
of tags. Instead, use the data directly.
When refreshing a Stack object, we first retrieve the object from the
database (which handles NotFound errors) and then call refresh in a
different transaction. By that time, the db object can be gone and
refresh would fail. The call also doesn't bring any benefit, so let's
remove it.
As we don't have transactions and we're loading the templates
separately, they can be gone by the time we load the stack. Let's ignore
NotFound errors in that case.
If we get passed a non-string stack name, e.g. a map or list, we
fail with a DB error associated with looking up the existing stack.
So instead force all stack lookups to use string identifiers, and
make the name validation for new stacks robust enough to fail gracefully
when an invalid (non-string) argument is passed.
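A minimal sketch of the kind of check described above. The function name
validate_stack_name, the name pattern and the use of ValueError are all
assumptions made for illustration; they are not taken from the Heat source.

```python
import re

# Hypothetical pattern: a letter followed by letters, digits, '_', '.' or '-'.
_NAME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9_.-]*$")


def validate_stack_name(name):
    """Return name if it is a usable stack name, else raise ValueError."""
    # Reject maps, lists and other non-string values before they can
    # reach a database lookup.
    if not isinstance(name, str):
        raise ValueError("Stack name must be a string, got %s"
                         % type(name).__name__)
    if not _NAME_RE.match(name):
        raise ValueError("Invalid stack name %r" % name)
    return name
```

Failing in the validator turns a confusing DB error into a clear message
about the bad argument.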
Fix raw_template purge query on MySQL, and handle stack tags before
removing stacks. This also removes a bunch of race conditions where we
deleted incorrect data.
To avoid certain concurrency related issues, the DB update API needs to
be given the traversal ID of the stack intended to be updated. By making
this change, we can avoid having the following check in all the places:
if current_traversal != stack.current_traversal:
The check for current traversal should be implicit, as a part of stack's
store and state_set methods, where self.current_traversal should be used
as expected traversal to be updated. All the state changes or updates in
DB to the stack object go through this implicit check (using
When stack updates are triggered, the current traversal should be backed
up as previous traversal, a new traversal should be generated and the
stack should be stored in DB with expected traversal as the previous
traversal. This will ensure that no two updates can simultaneously
succeed on same stack with same traversal ID. This was one of our
The following example cases describe the issues we encounter:
1. When 2 updates, U1 and U2 try to update a stack concurrently:
1. Current traversal(CT) is X
2. U1 loads stack with CT=X
3. U2 loads stack with CT=X
4. U2 stores the stack and updates CT=Y
5. U1 stores the stack and updates the CT=Z
Both updates have succeeded, and both would keep running until
one of the workers checks stack.current_traversal == current_traversal
and bails out.
Ideally, U1 should have failed: only one should be allowed in case
of concurrent update. When both U1 and U2 pass X as the expected
traversal ID of the stack, then this problem is solved.
2. A resource R is being provisioned for stack with current traversal
1. A new update U is issued; it loads the stack with CT=X.
2. Resource R fails and loads the stack with CT=X to mark it as FAILED.
3. Update U updates the stack with CT=Y and goes ahead with sync_point
etc., marks stack as UPDATE_IN_PROGRESS
4. Resource R marks the stack as UPDATE_FAILED, which to the user means
that update U has failed, although it is actually still in progress.
With this patch, when Resource R fails, it will supply CT=X as
expected traversal to be updated and will eventually fail because
update U with CT=Y has taken over.
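The expected-traversal check can be sketched as a conditional UPDATE. This
is an illustration against an in-memory SQLite table; the schema and the
update_stack helper are assumptions, not Heat's real DB API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stack (id TEXT PRIMARY KEY, status TEXT, "
             "current_traversal TEXT)")
conn.execute("INSERT INTO stack VALUES ('s1', 'CREATE_COMPLETE', 'X')")
conn.commit()


def update_stack(stack_id, status, new_traversal, expected_traversal):
    """Update the stack only if its stored traversal is still the expected one.

    Returns True if the row was updated, False if another writer got
    there first.
    """
    cur = conn.execute(
        "UPDATE stack SET status = ?, current_traversal = ? "
        "WHERE id = ? AND current_traversal = ?",
        (status, new_traversal, stack_id, expected_traversal))
    conn.commit()
    return cur.rowcount == 1


# U1 and U2 both loaded the stack while the current traversal was X.
u2_ok = update_stack("s1", "UPDATE_IN_PROGRESS", "Y", "X")
u1_ok = update_stack("s1", "UPDATE_IN_PROGRESS", "Z", "X")
final = conn.execute(
    "SELECT current_traversal FROM stack WHERE id = 's1'").fetchone()[0]
```

Here U2 wins and U1's update matches no row, so only one of the two
concurrent updates can succeed.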
When StackResource 'update_with_template' is called and there
is no existing nested_stack, we call create_stack with an empty
template (TemplateResource CREATE_FAILED->UPDATE flow). We check
the db for CREATE_COMPLETE before calling update_stack. There is a
possibility that the create_stack green thread has not finished
and released the stack lock. By persisting the stack state for
COMPLETE/FAILED in the same db session as releasing the lock,
when the green thread finishes its run, we can avoid this race
The stack update API needs to know the expected traversal ID of the
stack (each update is associated with a unique traversal ID) it is
When two updates are triggered simultaneously - for example, in a case
where two resources at the same level try to roll back the stack when they
fail at the same time - then the updates need to give the traversal ID they
intend to update. If it happens that resource B triggers the
rollback immediately after the rollback of A has started, but before A
has persisted in the DB, then the system is in an erroneous state.
Steps to describe the problem:
1. Resource A and B of a Stack S fail.
2. Resource A loads the stack.
3. Resource B loads the stack.
4. Resource A triggers rollback - it takes the previous template as
current template, updates the DB.
5. Resource B now again triggers the rollback. It is not aware of update
triggered by resource A.
If the updates triggered by A and B give the stack traversal they
expect to be updated, then update from resource B will fail, since the
expected traversal ID will not be found in DB as a result of resource A
already updating it.
1. don't return DB models from the objects API
(only return Objects)
2. delete shouldn't return anything
3. update_and_save should return the refreshed object.
Note: there is still some inconsistency in what is returned by
update_by_id(): some return an object and some return a bool.
The count is performed in 2 parts:
- A function which builds a list of all nested stacks
by recursively calling stack_get_all_by_owner_id
- A query which counts resources which belong to the list
Considering this will be the basis for fixing bug #1455589, this
approach is appropriate for backporting to stable/kilo, but master would
ideally replace this soon with a new Stack attribute that stores the
calculated resource count.
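The two-part count can be sketched as follows. The in-memory tables are
hypothetical stand-ins for the database, and this
stack_get_all_by_owner_id is a simplified imitation of the DB API call
named above, not its real implementation.

```python
# Hypothetical stand-in for the stack table: stack id -> owner (parent) id.
STACK_OWNER = {
    "root": None,
    "child-a": "root",
    "child-b": "root",
    "grandchild": "child-a",
    "other": None,
}

# Hypothetical stand-in for the resource table: one entry per resource,
# holding the id of the stack that the resource belongs to.
RESOURCE_STACK = ["root", "root", "child-a", "child-b",
                  "grandchild", "grandchild", "other"]


def stack_get_all_by_owner_id(owner_id):
    """Return the ids of the stacks directly owned by owner_id."""
    return sorted(sid for sid, owner in STACK_OWNER.items()
                  if owner == owner_id)


def nested_stack_ids(stack_id):
    """Part 1: build the list of a stack and all of its nested stacks."""
    ids = [stack_id]
    for child_id in stack_get_all_by_owner_id(stack_id):
        ids.extend(nested_stack_ids(child_id))
    return ids


def count_resources(stack_id):
    """Part 2: count the resources that belong to any stack in the list."""
    ids = set(nested_stack_ids(stack_id))
    return sum(1 for sid in RESOURCE_STACK if sid in ids)
```

The recursion issues one lookup per stack in the tree, which is the cost
a stored resource count would avoid.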
This change adds a (currently unused) database function
stack_get_root_id to find the root stack ID for any stack.
Scheduler hints are moved to using Stack.root_stack_id() in this change.
Remaining uses of Stack.root_stack() will switch to using
Stack.root_stack_id() later in the series.
1. Add a configuration option to enable/disable template parameters
2. Encrypt hidden parameters before storing them in the database and
decrypt on stack load.
Implements: blueprint encrypt-hidden-parameters
Co-Authored-By: Jason Dunsmore <email@example.com>
We are persisting for a number of reasons:
- so we don't have to pass this through every rpc call
- the API exposes parent_resource (currently always None as
it is not persisted)
The current Stack object refresh doesn't actually do any refreshing,
which explains the functional test speed regression since the RPC
nested stack code relies on stack.refresh() to poll for state change.
With this change AutoScalingSignalTest.test_signal_with_policy_update
locally takes ~60s consistently. Without this change ~50% of test runs
have some form of execution delay, taking 80s -> 300s+.
Implementation of oslo.versionedobjects.
This commit consists of the basic mechanism and the first objects.
This should be the base for implementing versioning for other objects
Implements: blueprint versioned-objects
Co-Authored-By: ShaoHe Feng <firstname.lastname@example.org>
Co-Authored-By: Grzegorz Grasza <email@example.com>