Re-add Online Schema Changes
Re-adding Ibbab7cb29911d52b57c467c6bfbc5876d1102e79 for Liberty Change-Id: I4004ef58a099ac0d33d80f587e576d62ff85e343 Previously-approved: kilo
This commit is contained in:
331
specs/liberty/approved/online-schema-changes.rst
Normal file
331
specs/liberty/approved/online-schema-changes.rst
Normal file
@@ -0,0 +1,331 @@
|
||||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License.
|
||||
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
=====================
|
||||
Online Schema Changes
|
||||
=====================
|
||||
|
||||
https://blueprints.launchpad.net/nova/+spec/online-schema-changes
|
||||
|
||||
Make schema changes execute online (ie while services are running) when
|
||||
safely and semantically possible. This will allow operators to reduce the
|
||||
amount of downtime currently required during deploys.
|
||||
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
* All database migrations are currently required to be run offline.
|
||||
|
||||
* Database migrations have historically been a source of lengthy downtime
|
||||
during deployments.
|
||||
|
||||
* Schema changes are required to be repeated in two places: database
|
||||
model defined in Nova and writing a migration script.
|
||||
|
||||
|
||||
Use Cases
|
||||
---------
|
||||
|
||||
* Any deployer that would like to reduce the amount of downtime during
|
||||
deploys.
|
||||
* Developers that would like to spend less time writing migration
|
||||
scripts.
|
||||
* Deployers that would like to maintain a set of local schema changes.
|
||||
|
||||
|
||||
Project Priority
|
||||
----------------
|
||||
|
||||
This fits under the 'Live Upgrades' liberty priorities.
|
||||
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
A new alternative workflow for applying schema changes will be added
|
||||
that expands the schema, then contracts the schema (expand/contract
|
||||
workflow).
|
||||
|
||||
The new expand/contract workflow will not utilize any migration scripts,
|
||||
instead it will dynamically compare the running schema against the
|
||||
database model defined in Nova. DDL statements will be generated, and
|
||||
optionally executed, to make the running schema match the model.
|
||||
|
||||
The existing schema management workflow is the 'db sync' command to
|
||||
nova-manage. This is managed by sqlalchemy-migrate and uses individual
|
||||
migration scripts. This workflow will remain for now, but is expected
|
||||
to be be removed at some future time, leaving the expand/contract
|
||||
workflow as the way to manage the database schema.
|
||||
|
||||
Until the sqlalchemy-migrate workflow is removed, all schema changes
|
||||
will still need sqlalchemy-migrate migration scripts to be written.
|
||||
|
||||
Three new nova-manage commands will be added:
|
||||
|
||||
#. expand. This would apply changes that are compatible with old running
|
||||
code.
|
||||
#. migrate. This would apply changes that are necessary to be run offline.
|
||||
#. contract. This would apply changes that are compatible with new
|
||||
running code.
|
||||
|
||||
Those schema changes that can be safely and semantically applied while
|
||||
running online will be applied during the expand and contract phases.
|
||||
Also, only those schema changes that will not acquire long running
|
||||
locks in the database will be considered for the online phases (expand,
|
||||
contract). All other schema changes will be applied during the migrate
|
||||
phase.
|
||||
|
||||
The three new commands would be built by dynamically executing alembic's
|
||||
autogenerate and DDL generating features. A list of differences would
|
||||
be generated by alembic and then DDL statements would be generated using
|
||||
a separate feature of alembic.
|
||||
|
||||
The set of DDL statements that can be run in each phase would be dictated
|
||||
by the database software used (eg MySQL, PostgreSQL, etc), the version of
|
||||
the database software (eg MySQL 5.5, 5.6, etc) and the storage engine used
|
||||
(eg InnoDB, TokuDB, etc).
|
||||
|
||||
As an example, index additions can be executed online in MySQL 5.5, but
|
||||
not 5.1. An index addition would be run during the expand phase for
|
||||
MySQL 5.5 or higher, but during the migrate phase for MySQL 5.1.
|
||||
|
||||
It is intended that the initial set that will run online will be
|
||||
conservative at first and a subset of what is possible to run safely.
|
||||
This can be safely expanded at any time in the future.
|
||||
|
||||
Schema changes that will be potentially performed during expand:
|
||||
- Table creates
|
||||
- Column additions
|
||||
- Non-Unique Index additions
|
||||
|
||||
Schema changes that will be potentially performed during migrate:
|
||||
- Unique Index additions/drops
|
||||
- Foreign Key additions/drops
|
||||
|
||||
Schema changes that will be potentially performed during contract:
|
||||
- Table drops
|
||||
- Column drops
|
||||
- Non-Unique Index drops
|
||||
|
||||
Some schema changes that aren't currently used or are difficult to
|
||||
automate will not be allowed initially. For instance, column type
|
||||
changes will not be allowed initially. This is because not all column
|
||||
type changes can be automated because of complexity and database
|
||||
restrictions. A subset of column type changes may be implemented in
|
||||
the future if they can be automated on all databases.
|
||||
|
||||
The migrate and contract phases would verify that the previous phases
|
||||
(expand in the case of migrate, expand and migrate in the case of
|
||||
contract) no longer need to be executed before continuing.
|
||||
|
||||
This would be performed by generating the list of needed changes for
|
||||
the previous phases and verifying the list is empty. This indicates the
|
||||
previous phases were either run or unnecessary.
|
||||
|
||||
A new '--dryrun' argument would print, instead of execute, each
|
||||
generated DDL statement. This could be used by database administrators
|
||||
to see what would be executed for a particular phase. These can be
|
||||
optionally executed manually if desired. The schema synchronizer will
|
||||
not generate that DDL statement since the running schema does not
|
||||
have that difference anymore.
|
||||
|
||||
When 'db contract' is finally run and the running schema has been
|
||||
verified to match the models, the version in the migrate_version
|
||||
table would be updated to the latest shipped sqlalchemy-migrate
|
||||
migration version. This would maintain compatibility with the
|
||||
existing sqlalchemy-migrate workflow.
|
||||
|
||||
The fundamental difference between the two workflows is the
|
||||
expand/contract workflow is declarative (by using the model) and the
|
||||
sqlalchemy-migrate workflow is imperative (by using migration scripts).
|
||||
|
||||
By being declarative, it limits changes to one place (database model)
|
||||
and allows for more intelligent decisions (by factoring in the database
|
||||
software, engine, version, etc)
|
||||
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Splitting the existing single stream of migrations into three separate
|
||||
streams of migrations. This would allow some schema changes to be
|
||||
executed online.
|
||||
|
||||
This limits the schema changes that can be safely executed online to
|
||||
that of the lowest common denominator of databases supported by Nova.
|
||||
|
||||
This would also require changes to sqlalchemy-migrate to be able to
|
||||
manage seperate streams of migrations.
|
||||
|
||||
Another option would be remove the use of sqlalchemy-migrate for schema
|
||||
changes altogether. The 'db sync' command to nova-manage would be
|
||||
implemented by effectively calling 'db expand', 'db migrate' and
|
||||
'db contract'.
|
||||
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
None
|
||||
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
None
|
||||
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
None
|
||||
|
||||
|
||||
Notifications impact
|
||||
--------------------
|
||||
|
||||
None
|
||||
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
None
|
||||
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
Running online DDL changes can affect the performance of a running system.
|
||||
This is optional and is only done when the deployer explicitly requests
|
||||
it.
|
||||
|
||||
This can mitigated by the deployer by scheduling the expand and contract
|
||||
phases to be run during periods of low activity. The expand phase can
|
||||
be run an arbitrary amount of time before the migrate phase. Likewise,
|
||||
the contract phase does not need to be run immediately after the
|
||||
migrate phase is run.
|
||||
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
Using the new expand/contract workflow is optional. If the deployer
|
||||
does not want to perform database schema changes online, then they
|
||||
can continue using the 'db sync' command with nova-manage.
|
||||
|
||||
Those deployers that want to take advantage of the online schema changes
|
||||
will need to run the 'db expand', 'db migrate' and 'db contract' commands
|
||||
at the appropriate steps in their deployment process.
|
||||
|
||||
Switching from the sqlalchemy-migrate workflow to the expand/contract
|
||||
workflow can happen at any time. The reverse can only happen after
|
||||
a final 'db contract' is run (to ensure all schema changes are applied
|
||||
and the migrate_version table is updated).
|
||||
|
||||
If the expand/contract workflow is used, then 'db contract' is required
|
||||
to be execute once for each formal release of Nova. This is to ensure
|
||||
that SQL namespaces (table, column, etc) can be reused in the future.
|
||||
|
||||
Deployers that have made local schema changes (extra indexes, columns,
|
||||
tables, etc) will need to update the model to ensure those additions
|
||||
aren't dropped during the contract phase.
|
||||
|
||||
If using the expand/contract workflow, then deployers can run 'db expand'
|
||||
before stopping or restarting any services. 'db migrate' might acquire
|
||||
locks in the database and may affect running services. 'db contract' can
|
||||
be run after all Nova services are running the new code.
|
||||
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
Eventually no more sqlalchemy-migrate migrations would need to be written
|
||||
leading to less work for developers.
|
||||
|
||||
No more migration compaction. The initial creation of tables for a
|
||||
database is handled completely by the schema synchronizer.
|
||||
|
||||
Some schema changes will no longer be allowed. This is generally
|
||||
restricted to schema changes that cannot be reasonably automated but
|
||||
those schema changes are generally the ones with the most downtime
|
||||
anyway.
|
||||
|
||||
Namespaces (table, column, index, etc) are not reusable in a formal
|
||||
release cycle. The contract phase is only required to be executed once
|
||||
per formal release, pinning old names until the next formal release.
|
||||
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
johannes.erdfelt
|
||||
|
||||
Other contributors:
|
||||
None
|
||||
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
- Implement schema synchronizer using alembic.autogenerate
|
||||
- Implement new 'expand', 'migrate' and 'contract' commands to
|
||||
'nova-manage db'
|
||||
- Ensure grenade and turbo-hipster tests are update
|
||||
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
This builds on top of the validate-migrations-and-model spec. The
|
||||
existing use of alembic.autogenerate will now also be used to generate
|
||||
the list of needed changes to make the schema match the model.
|
||||
|
||||
This also depends on dropping the use of sqlalchemy-migrate for data
|
||||
migrations.
|
||||
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
No tempest tests will be added since tempest does not do any upgrade
|
||||
testing.
|
||||
|
||||
A Nova unit test will be added to test starting from an empty database.
|
||||
|
||||
Grenade currently tests upgrades from older versions of Nova. A new
|
||||
test to use the new 'db expand', 'db migrate' and 'db contract' commands
|
||||
are necessary. This will be compared with the result of 'db sync' to
|
||||
ensure that upgrades from past commits end up semantically identical.
|
||||
|
||||
turbo-hipster tests upgrades using production database snapshots. It
|
||||
currently uses the 'db sync' command to nova-manage. The new
|
||||
expand/contract workflow will be tested as well to ensure that both
|
||||
workflows function correctly.
|
||||
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
Documentation will need to be updated to include the new 'expand',
|
||||
'migrate' and 'contract' commands to 'nova-manage db'.
|
||||
|
||||
Release Notes will need to be updated to warn that the model will need
|
||||
to be updated with local schema changes.
|
||||
|
||||
Instance Types would need to be manually created as the 216 migration
|
||||
would not necessarily run anymore.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
https://etherpad.openstack.org/p/kilo-nova-zero-downtime-upgrades
|
||||
Reference in New Issue
Block a user