Re-add Online Schema Changes

Re-adding Ibbab7cb29911d52b57c467c6bfbc5876d1102e79 for Liberty

Change-Id: I4004ef58a099ac0d33d80f587e576d62ff85e343
Previously-approved: kilo
This commit is contained in:
Joe Gordon
2015-05-22 11:33:16 -07:00
committed by Dan Smith
parent 5a8cda8986
commit 72e9a96018

View File

@@ -0,0 +1,331 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
=====================
Online Schema Changes
=====================
https://blueprints.launchpad.net/nova/+spec/online-schema-changes
Make schema changes execute online (ie while services are running) when
safely and semantically possible. This will allow operators to reduce the
amount of downtime currently required during deploys.
Problem description
===================
* All database migrations are currently required to be run offline.
* Database migrations have historically been a source of lengthy downtime
during deployments.
* Schema changes are required to be repeated in two places: database
model defined in Nova and writing a migration script.
Use Cases
---------
* Any deployer that would like to reduce the amount of downtime during
deploys.
* Developers that would like to spend less time writing migration
scripts.
* Deployers that would like to maintain a set of local schema changes.
Project Priority
----------------
This fits under the 'Live Upgrades' liberty priorities.
Proposed change
===============
A new alternative workflow for applying schema changes will be added
that expands the schema, then contracts the schema (expand/contract
workflow).
The new expand/contract workflow will not utilize any migration scripts,
instead it will dynamically compare the running schema against the
database model defined in Nova. DDL statements will be generated, and
optionally executed, to make the running schema match the model.
The existing schema management workflow is the 'db sync' command to
nova-manage. This is managed by sqlalchemy-migrate and uses individual
migration scripts. This workflow will remain for now, but is expected
to be be removed at some future time, leaving the expand/contract
workflow as the way to manage the database schema.
Until the sqlalchemy-migrate workflow is removed, all schema changes
will still need sqlalchemy-migrate migration scripts to be written.
Three new nova-manage commands will be added:
#. expand. This would apply changes that are compatible with old running
code.
#. migrate. This would apply changes that are necessary to be run offline.
#. contract. This would apply changes that are compatible with new
running code.
Those schema changes that can be safely and semantically applied while
running online will be applied during the expand and contract phases.
Also, only those schema changes that will not acquire long running
locks in the database will be considered for the online phases (expand,
contract). All other schema changes will be applied during the migrate
phase.
The three new commands would be built by dynamically executing alembic's
autogenerate and DDL generating features. A list of differences would
be generated by alembic and then DDL statements would be generated using
a separate feature of alembic.
The set of DDL statements that can be run in each phase would be dictated
by the database software used (eg MySQL, PostgreSQL, etc), the version of
the database software (eg MySQL 5.5, 5.6, etc) and the storage engine used
(eg InnoDB, TokuDB, etc).
As an example, index additions can be executed online in MySQL 5.5, but
not 5.1. An index addition would be run during the expand phase for
MySQL 5.5 or higher, but during the migrate phase for MySQL 5.1.
It is intended that the initial set that will run online will be
conservative at first and a subset of what is possible to run safely.
This can be safely expanded at any time in the future.
Schema changes that will be potentially performed during expand:
- Table creates
- Column additions
- Non-Unique Index additions
Schema changes that will be potentially performed during migrate:
- Unique Index additions/drops
- Foreign Key additions/drops
Schema changes that will be potentially performed during contract:
- Table drops
- Column drops
- Non-Unique Index drops
Some schema changes that aren't currently used or are difficult to
automate will not be allowed initially. For instance, column type
changes will not be allowed initially. This is because not all column
type changes can be automated because of complexity and database
restrictions. A subset of column type changes may be implemented in
the future if they can be automated on all databases.
The migrate and contract phases would verify that the previous phases
(expand in the case of migrate, expand and migrate in the case of
contract) no longer need to be executed before continuing.
This would be performed by generating the list of needed changes for
the previous phases and verifying the list is empty. This indicates the
previous phases were either run or unnecessary.
A new '--dryrun' argument would print, instead of execute, each
generated DDL statement. This could be used by database administrators
to see what would be executed for a particular phase. These can be
optionally executed manually if desired. The schema synchronizer will
not generate that DDL statement since the running schema does not
have that difference anymore.
When 'db contract' is finally run and the running schema has been
verified to match the models, the version in the migrate_version
table would be updated to the latest shipped sqlalchemy-migrate
migration version. This would maintain compatibility with the
existing sqlalchemy-migrate workflow.
The fundamental difference between the two workflows is the
expand/contract workflow is declarative (by using the model) and the
sqlalchemy-migrate workflow is imperative (by using migration scripts).
By being declarative, it limits changes to one place (database model)
and allows for more intelligent decisions (by factoring in the database
software, engine, version, etc)
Alternatives
------------
Splitting the existing single stream of migrations into three separate
streams of migrations. This would allow some schema changes to be
executed online.
This limits the schema changes that can be safely executed online to
that of the lowest common denominator of databases supported by Nova.
This would also require changes to sqlalchemy-migrate to be able to
manage seperate streams of migrations.
Another option would be remove the use of sqlalchemy-migrate for schema
changes altogether. The 'db sync' command to nova-manage would be
implemented by effectively calling 'db expand', 'db migrate' and
'db contract'.
Data model impact
-----------------
None
REST API impact
---------------
None
Security impact
---------------
None
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
Running online DDL changes can affect the performance of a running system.
This is optional and is only done when the deployer explicitly requests
it.
This can mitigated by the deployer by scheduling the expand and contract
phases to be run during periods of low activity. The expand phase can
be run an arbitrary amount of time before the migrate phase. Likewise,
the contract phase does not need to be run immediately after the
migrate phase is run.
Other deployer impact
---------------------
Using the new expand/contract workflow is optional. If the deployer
does not want to perform database schema changes online, then they
can continue using the 'db sync' command with nova-manage.
Those deployers that want to take advantage of the online schema changes
will need to run the 'db expand', 'db migrate' and 'db contract' commands
at the appropriate steps in their deployment process.
Switching from the sqlalchemy-migrate workflow to the expand/contract
workflow can happen at any time. The reverse can only happen after
a final 'db contract' is run (to ensure all schema changes are applied
and the migrate_version table is updated).
If the expand/contract workflow is used, then 'db contract' is required
to be execute once for each formal release of Nova. This is to ensure
that SQL namespaces (table, column, etc) can be reused in the future.
Deployers that have made local schema changes (extra indexes, columns,
tables, etc) will need to update the model to ensure those additions
aren't dropped during the contract phase.
If using the expand/contract workflow, then deployers can run 'db expand'
before stopping or restarting any services. 'db migrate' might acquire
locks in the database and may affect running services. 'db contract' can
be run after all Nova services are running the new code.
Developer impact
----------------
Eventually no more sqlalchemy-migrate migrations would need to be written
leading to less work for developers.
No more migration compaction. The initial creation of tables for a
database is handled completely by the schema synchronizer.
Some schema changes will no longer be allowed. This is generally
restricted to schema changes that cannot be reasonably automated but
those schema changes are generally the ones with the most downtime
anyway.
Namespaces (table, column, index, etc) are not reusable in a formal
release cycle. The contract phase is only required to be executed once
per formal release, pinning old names until the next formal release.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
johannes.erdfelt
Other contributors:
None
Work Items
----------
- Implement schema synchronizer using alembic.autogenerate
- Implement new 'expand', 'migrate' and 'contract' commands to
'nova-manage db'
- Ensure grenade and turbo-hipster tests are update
Dependencies
============
This builds on top of the validate-migrations-and-model spec. The
existing use of alembic.autogenerate will now also be used to generate
the list of needed changes to make the schema match the model.
This also depends on dropping the use of sqlalchemy-migrate for data
migrations.
Testing
=======
No tempest tests will be added since tempest does not do any upgrade
testing.
A Nova unit test will be added to test starting from an empty database.
Grenade currently tests upgrades from older versions of Nova. A new
test to use the new 'db expand', 'db migrate' and 'db contract' commands
are necessary. This will be compared with the result of 'db sync' to
ensure that upgrades from past commits end up semantically identical.
turbo-hipster tests upgrades using production database snapshots. It
currently uses the 'db sync' command to nova-manage. The new
expand/contract workflow will be tested as well to ensure that both
workflows function correctly.
Documentation Impact
====================
Documentation will need to be updated to include the new 'expand',
'migrate' and 'contract' commands to 'nova-manage db'.
Release Notes will need to be updated to warn that the model will need
to be updated with local schema changes.
Instance Types would need to be manually created as the 216 migration
would not necessarily run anymore.
References
==========
https://etherpad.openstack.org/p/kilo-nova-zero-downtime-upgrades