332 lines
11 KiB
ReStructuredText
332 lines
11 KiB
ReStructuredText
..
|
|
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
|
License.
|
|
|
|
http://creativecommons.org/licenses/by/3.0/legalcode
|
|
|
|
=====================
|
|
Online Schema Changes
|
|
=====================
|
|
|
|
https://blueprints.launchpad.net/nova/+spec/online-schema-changes
|
|
|
|
Make schema changes execute online (ie while services are running) when
|
|
safely and semantically possible. This will allow operators to reduce the
|
|
amount of downtime currently required during deploys.
|
|
|
|
|
|
Problem description
|
|
===================
|
|
|
|
* All database migrations are currently required to be run offline.
|
|
|
|
* Database migrations have historically been a source of lengthy downtime
|
|
during deployments.
|
|
|
|
* Schema changes are required to be repeated in two places: database
|
|
model defined in Nova and writing a migration script.
|
|
|
|
|
|
Use Cases
|
|
---------
|
|
|
|
* Any deployer that would like to reduce the amount of downtime during
|
|
deploys.
|
|
* Developers that would like to spend less time writing migration
|
|
scripts.
|
|
* Deployers that would like to maintain a set of local schema changes.
|
|
|
|
|
|
Project Priority
|
|
----------------
|
|
|
|
This fits under the 'Live Upgrades' kilo priorities.
|
|
|
|
|
|
Proposed change
|
|
===============
|
|
|
|
A new alternative workflow for applying schema changes will be added
|
|
that expands the schema, then contracts the schema (expand/contract
|
|
workflow).
|
|
|
|
The new expand/contract workflow will not utilize any migration scripts,
|
|
instead it will dynamically compare the running schema against the
|
|
database model defined in Nova. DDL statements will be generated, and
|
|
optionally executed, to make the running schema match the model.
|
|
|
|
The existing schema management workflow is the 'db sync' command to
|
|
nova-manage. This is managed by sqlalchemy-migrate and uses individual
|
|
migration scripts. This workflow will remain for now, but is expected
|
|
to be removed at some future time, leaving the expand/contract
|
|
workflow as the way to manage the database schema.
|
|
|
|
Until the sqlalchemy-migrate workflow is removed, all schema changes
|
|
will still need sqlalchemy-migrate migration scripts to be written.
|
|
|
|
Three new nova-manage commands will be added:
|
|
|
|
#. expand. This would apply changes that are compatible with old running
|
|
code.
|
|
#. migrate. This would apply changes that are necessary to be run offline.
|
|
#. contract. This would apply changes that are compatible with new
|
|
running code.
|
|
|
|
Those schema changes that can be safely and semantically applied while
|
|
running online will be applied during the expand and contract phases.
|
|
Also, only those schema changes that will not acquire long running
|
|
locks in the database will be considered for the online phases (expand,
|
|
contract). All other schema changes will be applied during the migrate
|
|
phase.
|
|
|
|
The three new commands would be built by dynamically executing alembic's
|
|
autogenerate and DDL generating features. A list of differences would
|
|
be generated by alembic and then DDL statements would be generated using
|
|
a separate feature of alembic.
|
|
|
|
The set of DDL statements that can be run in each phase would be dictated
|
|
by the database software used (eg MySQL, PostgreSQL, etc), the version of
|
|
the database software (eg MySQL 5.5, 5.6, etc) and the storage engine used
|
|
(eg InnoDB, TokuDB, etc).
|
|
|
|
As an example, index additions can be executed online in MySQL 5.5, but
|
|
not 5.1. An index addition would be run during the expand phase for
|
|
MySQL 5.5 or higher, but during the migrate phase for MySQL 5.1.
|
|
|
|
It is intended that the initial set that will run online will be
|
|
conservative at first and a subset of what is possible to run safely.
|
|
This can be safely expanded at any time in the future.
|
|
|
|
Schema changes that will be potentially performed during expand:
|
|
- Table creates
|
|
- Column additions
|
|
- Non-Unique Index additions
|
|
|
|
Schema changes that will be potentially performed during migrate:
|
|
- Unique Index additions/drops
|
|
- Foreign Key additions/drops
|
|
|
|
Schema changes that will be potentially performed during contract:
|
|
- Table drops
|
|
- Column drops
|
|
- Non-Unique Index drops
|
|
|
|
Some schema changes that aren't currently used or are difficult to
|
|
automate will not be allowed initially. For instance, column type
|
|
changes will not be allowed initially. This is because not all column
|
|
type changes can be automated because of complexity and database
|
|
restrictions. A subset of column type changes may be implemented in
|
|
the future if they can be automated on all databases.
|
|
|
|
The migrate and contract phases would verify that the previous phases
|
|
(expand in the case of migrate, expand and migrate in the case of
|
|
contract) no longer need to be executed before continuing.
|
|
|
|
This would be performed by generating the list of needed changes for
|
|
the previous phases and verifying the list is empty. This indicates the
|
|
previous phases were either run or unnecessary.
|
|
|
|
A new '--dryrun' argument would print, instead of execute, each
|
|
generated DDL statement. This could be used by database administrators
|
|
to see what would be executed for a particular phase. These can be
|
|
optionally executed manually if desired. The schema synchronizer will
|
|
not generate that DDL statement since the running schema does not
|
|
have that difference anymore.
|
|
|
|
When 'db contract' is finally run and the running schema has been
|
|
verified to match the models, the version in the migrate_version
|
|
table would be updated to the latest shipped sqlalchemy-migrate
|
|
migration version. This would maintain compatibility with the
|
|
existing sqlalchemy-migrate workflow.
|
|
|
|
The fundamental difference between the two workflows is the
|
|
expand/contract workflow is declarative (by using the model) and the
|
|
sqlalchemy-migrate workflow is imperative (by using migration scripts).
|
|
|
|
By being declarative, it limits changes to one place (database model)
|
|
and allows for more intelligent decisions (by factoring in the database
|
|
software, engine, version, etc)
|
|
|
|
|
|
Alternatives
|
|
------------
|
|
|
|
Splitting the existing single stream of migrations into three separate
|
|
streams of migrations. This would allow some schema changes to be
|
|
executed online.
|
|
|
|
This limits the schema changes that can be safely executed online to
|
|
that of the lowest common denominator of databases supported by Nova.
|
|
|
|
This would also require changes to sqlalchemy-migrate to be able to
|
|
manage seperate streams of migrations.
|
|
|
|
Another option would be remove the use of sqlalchemy-migrate for schema
|
|
changes altogether. The 'db sync' command to nova-manage would be
|
|
implemented by effectively calling 'db expand', 'db migrate' and
|
|
'db contract'.
|
|
|
|
|
|
Data model impact
|
|
-----------------
|
|
|
|
None
|
|
|
|
|
|
REST API impact
|
|
---------------
|
|
|
|
None
|
|
|
|
|
|
Security impact
|
|
---------------
|
|
|
|
None
|
|
|
|
|
|
Notifications impact
|
|
--------------------
|
|
|
|
None
|
|
|
|
|
|
Other end user impact
|
|
---------------------
|
|
|
|
None
|
|
|
|
|
|
Performance Impact
|
|
------------------
|
|
|
|
Running online DDL changes can affect the performance of a running system.
|
|
This is optional and is only done when the deployer explicitly requests
|
|
it.
|
|
|
|
This can mitigated by the deployer by scheduling the expand and contract
|
|
phases to be run during periods of low activity. The expand phase can
|
|
be run an arbitrary amount of time before the migrate phase. Likewise,
|
|
the contract phase does not need to be run immediately after the
|
|
migrate phase is run.
|
|
|
|
|
|
Other deployer impact
|
|
---------------------
|
|
|
|
Using the new expand/contract workflow is optional. If the deployer
|
|
does not want to perform database schema changes online, then they
|
|
can continue using the 'db sync' command with nova-manage.
|
|
|
|
Those deployers that want to take advantage of the online schema changes
|
|
will need to run the 'db expand', 'db migrate' and 'db contract' commands
|
|
at the appropriate steps in their deployment process.
|
|
|
|
Switching from the sqlalchemy-migrate workflow to the expand/contract
|
|
workflow can happen at any time. The reverse can only happen after
|
|
a final 'db contract' is run (to ensure all schema changes are applied
|
|
and the migrate_version table is updated).
|
|
|
|
If the expand/contract workflow is used, then 'db contract' is required
|
|
to be execute once for each formal release of Nova. This is to ensure
|
|
that SQL namespaces (table, column, etc) can be reused in the future.
|
|
|
|
Deployers that have made local schema changes (extra indexes, columns,
|
|
tables, etc) will need to update the model to ensure those additions
|
|
aren't dropped during the contract phase.
|
|
|
|
If using the expand/contract workflow, then deployers can run 'db expand'
|
|
before stopping or restarting any services. 'db migrate' might acquire
|
|
locks in the database and may affect running services. 'db contract' can
|
|
be run after all Nova services are running the new code.
|
|
|
|
|
|
Developer impact
|
|
----------------
|
|
|
|
Eventually no more sqlalchemy-migrate migrations would need to be written
|
|
leading to less work for developers.
|
|
|
|
No more migration compaction. The initial creation of tables for a
|
|
database is handled completely by the schema synchronizer.
|
|
|
|
Some schema changes will no longer be allowed. This is generally
|
|
restricted to schema changes that cannot be reasonably automated but
|
|
those schema changes are generally the ones with the most downtime
|
|
anyway.
|
|
|
|
Namespaces (table, column, index, etc) are not reusable in a formal
|
|
release cycle. The contract phase is only required to be executed once
|
|
per formal release, pinning old names until the next formal release.
|
|
|
|
|
|
Implementation
|
|
==============
|
|
|
|
Assignee(s)
|
|
-----------
|
|
|
|
Primary assignee:
|
|
johannes.erdfelt
|
|
|
|
Other contributors:
|
|
None
|
|
|
|
|
|
Work Items
|
|
----------
|
|
|
|
- Implement schema synchronizer using alembic.autogenerate
|
|
- Implement new 'expand', 'migrate' and 'contract' commands to
|
|
'nova-manage db'
|
|
- Ensure grenade and turbo-hipster tests are update
|
|
|
|
|
|
Dependencies
|
|
============
|
|
|
|
This builds on top of the validate-migrations-and-model spec. The
|
|
existing use of alembic.autogenerate will now also be used to generate
|
|
the list of needed changes to make the schema match the model.
|
|
|
|
This also depends on dropping the use of sqlalchemy-migrate for data
|
|
migrations.
|
|
|
|
|
|
Testing
|
|
=======
|
|
|
|
No tempest tests will be added since tempest does not do any upgrade
|
|
testing.
|
|
|
|
A Nova unit test will be added to test starting from an empty database.
|
|
|
|
Grenade currently tests upgrades from older versions of Nova. A new
|
|
test to use the new 'db expand', 'db migrate' and 'db contract' commands
|
|
are necessary. This will be compared with the result of 'db sync' to
|
|
ensure that upgrades from past commits end up semantically identical.
|
|
|
|
turbo-hipster tests upgrades using production database snapshots. It
|
|
currently uses the 'db sync' command to nova-manage. The new
|
|
expand/contract workflow will be tested as well to ensure that both
|
|
workflows function correctly.
|
|
|
|
|
|
Documentation Impact
|
|
====================
|
|
|
|
Documentation will need to be updated to include the new 'expand',
|
|
'migrate' and 'contract' commands to 'nova-manage db'.
|
|
|
|
Release Notes will need to be updated to warn that the model will need
|
|
to be updated with local schema changes.
|
|
|
|
Instance Types would need to be manually created as the 216 migration
|
|
would not necessarily run anymore.
|
|
|
|
|
|
References
|
|
==========
|
|
|
|
https://etherpad.openstack.org/p/kilo-nova-zero-downtime-upgrades
|