Dev Docs for Writing E-M-C Migrations

Add dev docs to describe how Glance database migrations must be written for zero-downtime upgrades using Alembic. Change-Id: Ic8b21d2cc1e42f3e1478973df0f80792e5098f90
2017-03-08 14:48:25 -06:00
commit 15cc9a71c0
parent b702bce146
2 changed files with 348 additions and 0 deletions
--- a/doc/source/database_migrations.rst
+++ b/doc/source/database_migrations.rst
@@ -0,0 +1,347 @@
+..
+      Licensed under the Apache License, Version 2.0 (the "License"); you may
+      not use this file except in compliance with the License. You may obtain
+      a copy of the License at
+
+          http://www.apache.org/licenses/LICENSE-2.0
+
+      Unless required by applicable law or agreed to in writing, software
+      distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+      WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+      License for the specific language governing permissions and limitations
+      under the License.
+
+======================================================
+Writing Database Migrations for Zero-Downtime Upgrades
+======================================================
+
+Beginning in Ocata, OpenStack Glance uses Alembic, which replaced SQLAlchemy
+Migrate as the database migration engine. The move to Alembic was motivated
+primarily by the zero-downtime upgrade work. Refer to [GSPEC1]_ and [GSPEC2]_
+for more information on zero-downtime upgrades in Glance and why a move to
+Alembic was deemed necessary.
+
+Stop right now and go read [GSPEC1]_ and [GSPEC2]_ if you haven't done so
+already. Those documents explain the strategy Glance has approved for database
+migrations, and we expect you to be familiar with them in what follows.  This
+document focuses on the "how", but unless you understand the "what" and "why",
+you'll be wasting your time reading this document.
+
+Prior to Ocata, database migrations were written as monoliths, so they did
+not need to distinguish between database schema expansions, data migrations,
+and database schema contractions. Modern migrations must take into account
+the kind of change being made, and so we clearly identify three phases of a
+database migration: (1) expand, (2) migrate, and (3) contract.  A developer
+modifying the Glance database must supply a script for each of these phases.
+
+Here's a quick reminder of what each phase entails.
+For more information, see [GSPEC1]_.
+
+Expand
+  Expand migrations MUST be additive in nature. Expand migrations
+  should be seen as the minimal set of schema changes required by the new
+  services that can be applied while the old services are still running.
+  Expand migrations may include temporary database triggers that keep the old
+  and new columns in sync. If a database change needs data to be migrated
+  between columns, then such triggers are required to keep the columns in sync
+  while the data migrations are in flight.
+
+  .. note::
+      Occasionally an exception to the additive-only rule for the expand
+      phase is warranted. [GSPEC1]_ describes this in more detail. Again,
+      consider this a last reminder to read [GSPEC1]_, if you haven't
+      already done so.
+
+Migrate
+  Data migrations MUST NOT attempt any schema changes. They only move existing
+  data between old and new columns so that new services can start consuming
+  the new tables and/or columns introduced by the expand migrations.
+
+Contract
+  Contract migrations usually include the remaining schema changes required by
+  the new services that couldn't be applied during the expand phase because
+  they are incompatible with the old services. Any temporary database
+  triggers added during the expand migrations MUST be dropped by the contract
+  migrations.
+
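The role of a temporary trigger in the expand phase can be illustrated with a
minimal sketch. It uses Python's built-in ``sqlite3`` module rather than
Alembic or a production database, and it is not Glance's actual migration. The
``images`` table and the ``is_public`` and ``visibility`` columns are modelled
on the Community Images change discussed later in this document; the trigger
name and the SQL are assumptions made purely for illustration.

```python
import sqlite3


def expand(conn):
    """Additive-only change: add the new column plus a temporary sync trigger."""
    conn.execute("ALTER TABLE images ADD COLUMN visibility TEXT")
    # Hypothetical trigger: rows written by old services (which only know
    # about is_public) receive a matching visibility value.
    conn.execute("""
        CREATE TRIGGER sync_visibility_on_insert
        AFTER INSERT ON images
        WHEN NEW.visibility IS NULL
        BEGIN
            UPDATE images
               SET visibility = CASE NEW.is_public
                                    WHEN 1 THEN 'public' ELSE 'private' END
             WHERE id = NEW.id;
        END
    """)


def contract(conn):
    """Drop the temporary trigger once the old services are gone."""
    conn.execute("DROP TRIGGER sync_visibility_on_insert")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, is_public INTEGER)")
# A row that exists before the upgrade starts.
conn.execute("INSERT INTO images (id, is_public) VALUES (1, 1)")

expand(conn)
# An "old" service inserts a row without knowing about visibility.
conn.execute("INSERT INTO images (id, is_public) VALUES (2, 0)")
rows = conn.execute("SELECT id, visibility FROM images ORDER BY id").fetchall()
contract(conn)
```

In this sketch the row written after the expand step is kept in sync by the
trigger, while the pre-existing row still has no ``visibility`` value and is
left for the migrate phase. A real contract migration would typically also
remove the old column; that step is omitted here.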
+
+Alembic Migrations
+==================
+As mentioned earlier, starting in Ocata, Glance database migrations must be
+written for Alembic. All existing Glance migrations have been ported to
+Alembic and can be found at [GMIGS1]_.
+
+
+Schema Migrations (Expand/Contract)
+-----------------------------------
+
+* All Glance schema migrations must reside in the
+  ``glance.db.sqlalchemy.alembic_migrations.versions`` package.
+
+* Every Glance schema migration must be a Python module with the following
+  structure:
+
+  .. code::
+
+    """<docstring describing the migration>
+
+    Revision ID: <unique revision id>
+    Revises: <parent revision id>
+    """
+
+    <your imports here>
+
+    revision = <unique revision id>
+    down_revision = <parent revision id>
+    depends_on = <id of dependent revision or None>
+
+    def upgrade():
+        <your schema changes here>
+
+
+  The identifiers ``revision``, ``down_revision`` and ``depends_on`` are
+  explained below.
+
+* The ``revision`` identifier is a unique revision id for every migration.
+  It must conform to one of the following naming schemes.
+
+  All monolith migrations must conform to:
+
+  .. code::
+
+    <release name><two-digit sequence number per release>
+
+
+  And, all expand/contract migrations must conform to:
+
+  .. code::
+
+    <release name>_[expand|contract]<two-digit sequence number per release>
+
+
+  Example:
+
+  .. code::
+
+    Monolith migration: ocata01
+    Expand migration: ocata_expand01
+    Contract migration: ocata_contract01
+
+  This naming convention is intended to make the migration sequence easy to
+  understand. While the ``<release name>`` identifies the release a
+  migration belongs to, the ``<two-digit sequence number per release>`` helps
+  identify the order of migrations within each release. For modern migrations,
+  the ``[expand|contract]`` part of the revision id helps identify the
+  revision branch a migration belongs to.
+
+* The ``down_revision`` identifier MUST be specified for all Alembic migration
+  scripts. It points to the previous migration (or ``revision`` in Alembic
+  lingo) on which the current migration is based. This establishes a
+  migration sequence much like a singly linked list would (except that we use
+  a ``previous`` link here instead of the more traditional ``next`` link).
+
+  The very first migration, ``liberty`` in our case, would have
+  ``down_revision`` set to ``None``. All other migrations must point to the
+  last migration in the sequence at the time of writing the migration.
+
+  For example, Glance has two migrations in Mitaka, namely, ``mitaka01``
+  and ``mitaka02``. The migration sequence for Mitaka should look like:
+
+  .. code::
+
+                 liberty
+                    ^
+                    |
+                    |
+                 mitaka01
+                    ^
+                    |
+                    |
+                 mitaka02
+
+* The ``depends_on`` identifier helps establish dependencies between two
+  migrations. If a migration ``X`` depends on running migration ``Y`` first,
+  then ``X`` is said to depend on ``Y``. This can be specified in the
+  migration as shown below:
+
+  .. code::
+
+    revision = 'X'
+    down_revision = 'W'
+    depends_on = 'Y'
+
+  Naturally, every migration depends on the migrations preceding it in the
+  migration sequence. Hence, in a typical branch-less migration sequence,
+  ``depends_on`` is of limited use. However, this could be useful for
+  migration sequences with branches. We'll see more about this in the next
+  section.
+
+* All schema migration scripts must adhere to the naming convention
+  mentioned below:
+
+  .. code::
+
+    <unique revision id>_<very brief description>.py
+
+  Example:
+
+  .. code::
+
+    Monolith migration: ocata01_add_visibility_remove_is_public.py
+    Expand migration: ocata_expand01_add_visibility.py
+    Contract migration: ocata_contract01_remove_is_public.py
+
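The naming rules above can be checked mechanically. The following is a small
sketch, not part of Glance: the helper names ``parse_revision`` and
``script_name`` and the regular expression are assumptions introduced here for
illustration. The pattern deliberately covers only the two schemes described
above, so the initial ``liberty`` revision, which carries no sequence number,
does not match it.

```python
import re

# Release name, optional "_expand"/"_contract" branch marker, two-digit sequence.
REVISION_RE = re.compile(
    r"^(?P<release>[a-z]+)(?:_(?P<branch>expand|contract))?(?P<seq>\d{2})$"
)


def parse_revision(revision):
    """Split a revision id into (release, branch, sequence number).

    ``branch`` is None for a monolith migration.
    """
    match = REVISION_RE.match(revision)
    if match is None:
        raise ValueError("malformed revision id: %r" % revision)
    return match.group("release"), match.group("branch"), int(match.group("seq"))


def script_name(revision, description):
    """Build '<unique revision id>_<very brief description>.py'."""
    parse_revision(revision)  # reject malformed ids early
    return "%s_%s.py" % (revision, description)
```

For example, ``parse_revision("ocata_expand01")`` yields the release, the
branch and the sequence number as separate values, and
``script_name("ocata_contract01", "remove_is_public")`` reproduces the contract
file name shown in the example above.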
+
+Dependency Between Contract and Expand Migrations
+-------------------------------------------------
+
+* To achieve zero-downtime upgrades, the Glance migration sequence has been
+  branched into ``expand`` and ``contract`` branches. As the names suggest,
+  the ``expand`` branch contains only the expand migrations and the
+  ``contract`` branch contains only the contract migrations. As per the
+  zero-downtime migration strategy, the expand migrations are run first,
+  followed by the contract migrations. To establish this dependency, we make
+  the contract migrations explicitly depend on their corresponding expand
+  migrations. Thus, running contract migrations without first running the
+  corresponding expand migrations is not possible.
+
+  For example, the Community Images migration in Ocata includes the
+  experimental E-M-C migrations. The expand migration is ``ocata_expand01``
+  and the contract migration is ``ocata_contract01``. The dependency is
+  established as follows:
+
+  .. code::
+
+    revision = 'ocata_contract01'
+    down_revision = 'mitaka02'
+    depends_on = 'ocata_expand01'
+
+
+  Every contract migration in Glance MUST depend on its corresponding expand
+  migration. Thus, the current Glance migration sequence looks as shown below:
+
+  .. code::
+
+                              liberty
+                                 ^
+                                 |
+                                 |
+                             mitaka01
+                                 ^
+                                 |
+                                 |
+                             mitaka02
+                                 ^
+                                 |
+                    +------------+------------+
+                    |                         |
+                    |                         |
+             ocata_expand01 <------  ocata_contract01
+                    ^                         ^
+                    |                         |
+                    |                         |
+              pike_expand01 <------   pike_contract01
+
+
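To see how ``down_revision`` and ``depends_on`` together determine what must
run before a given revision, consider the following sketch. It is a plain
depth-first walk over the graph drawn above and is not Alembic's own
resolution algorithm. The ``REVISIONS`` mapping and the ``upgrade_order``
helper are hypothetical names introduced only for this illustration.

```python
# Revision graph taken from the diagram above:
# revision -> (down_revision, depends_on)
REVISIONS = {
    "liberty": (None, None),
    "mitaka01": ("liberty", None),
    "mitaka02": ("mitaka01", None),
    "ocata_expand01": ("mitaka02", None),
    "ocata_contract01": ("mitaka02", "ocata_expand01"),
    "pike_expand01": ("ocata_expand01", None),
    "pike_contract01": ("ocata_contract01", "pike_expand01"),
}


def upgrade_order(target, revisions=REVISIONS):
    """Return every revision needed to reach ``target``, prerequisites first."""
    ordered = []

    def visit(revision):
        if revision is None or revision in ordered:
            return
        down_revision, depends_on = revisions[revision]
        visit(down_revision)  # the parent on the same branch
        visit(depends_on)     # the explicit cross-branch dependency
        ordered.append(revision)

    visit(target)
    return ordered
```

Asking for ``pike_expand01`` pulls in only the expand branch, whereas asking
for ``ocata_contract01`` also pulls in ``ocata_expand01`` because of the
``depends_on`` link. This is the property that prevents a contract migration
from running before its expand counterpart.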
+Data Migrations
+---------------
+
+* All Glance data migrations must reside in
+  ``glance.db.sqlalchemy.alembic_migrations.data_migrations`` package.
+
+* The data migrations themselves are not Alembic migration scripts, and hence
+  they don't require a unique revision id. However, they must adhere to a
+  naming convention similar to the one discussed above. That is:
+
+  .. code::
+
+    <release name>_migrate<two-digit sequence number per release>_<very brief description>.py
+
+  Example:
+
+  .. code::
+
+    Data Migration: ocata_migrate01_community_images.py
+
+* All data migration modules must adhere to the following structure:
+
+  .. code::
+
+    def has_migrations(engine):
+        <your code to determine whether or not there are any pending rows to be
+        migrated>
+        return <boolean>
+
+
+    def migrate(engine):
+        <your code to migrate rows in the database.>
+        return <number of rows migrated>
+
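A filled-in version of this structure might look like the sketch below. It is
an illustration under stated assumptions, not the Ocata data migration itself:
a ``sqlite3`` connection stands in for the ``engine`` argument that real
Glance data migrations receive, and the ``images`` table, its columns and the
mapping from ``is_public`` to ``visibility`` are simplified for the example.

```python
import sqlite3


def has_migrations(conn):
    """Return True if any row still lacks a value in the new column."""
    row = conn.execute(
        "SELECT COUNT(*) FROM images WHERE visibility IS NULL").fetchone()
    return row[0] > 0


def migrate(conn):
    """Copy data from the old column to the new one; return rows migrated."""
    cursor = conn.execute(
        "UPDATE images"
        "   SET visibility = CASE is_public WHEN 1 THEN 'public'"
        "                                   ELSE 'private' END"
        " WHERE visibility IS NULL")
    conn.commit()
    return cursor.rowcount


# Stand-in database: two rows awaiting migration, one already migrated.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE images (id INTEGER PRIMARY KEY, is_public INTEGER,"
    " visibility TEXT)")
conn.executemany(
    "INSERT INTO images (is_public, visibility) VALUES (?, ?)",
    [(1, None), (0, None), (1, "public")])

pending_before = has_migrations(conn)
migrated = migrate(conn)
pending_after = has_migrations(conn)
```

Note that ``migrate`` only touches rows that still need work, so running it a
second time is harmless and reports zero rows migrated.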
+
+NOTES
+-----
+
+* Starting in Ocata, Glance needs every database migration to include both
+  monolithic and Expand-Migrate-Contract (E-M-C) style migrations. At some
+  point in Pike, E-M-C migrations will be made the default. At that point, it
+  will no longer be required to include a monolithic migration script.
+
+* Alembic is a database migration engine written for SQLAlchemy. So, any
+  migration script written for SQLAlchemy Migrate should work with Alembic as
+  well, provided the structural differences above (primarily adding
+  ``revision``, ``down_revision`` and ``depends_on``) are taken care of.
+  Moreover, it may be easier to do certain operations with Alembic.
+  Refer to [ALMBC]_ for information on Alembic operations.
+
+* A given database change may not require actions in each of the expand,
+  migrate, contract phases, but nonetheless, we require a script for *each*
+  phase for *every* change.  In the case where an action is not required, a
+  ``no-op`` script, described below, MUST be used.
+
+  For instance, if a database migration is completely contractive in nature,
+  say removing a column, there won't be a need for expand and migrate
+  operations. But including ``no-op`` expand and migrate scripts makes this
+  explicit and also preserves the one-to-one correspondence between expand,
+  migrate and contract scripts.
+
+  A no-op expand/contract Alembic migration:
+
+  .. code::
+
+
+    """An example empty Alembic migration script
+
+    Revision ID: foo02
+    Revises: foo01
+    """
+
+    revision = 'foo02'
+    down_revision = 'foo01'
+
+    def upgrade():
+        pass
+
+
+  A no-op migrate script:
+
+  .. code::
+
+    """An example empty data migration script"""
+
+    def has_migrations(engine):
+        return False
+
+
+    def migrate(engine):
+        return 0
+
+References
+==========
+.. [GSPEC1] `Database Strategy for Rolling Upgrades
+            <https://specs.openstack.org/openstack/glance-specs/specs/ocata/implemented/glance/database-strategy-for-rolling-upgrades.html>`_
+.. [GSPEC2] `Glance Alembic Migrations
+            <https://specs.openstack.org/openstack/glance-specs/specs/ocata/implemented/glance/alembic-migrations.html>`_
+.. [GMIGS1] `Glance Alembic Migrations
+            <http://git.openstack.org/cgit/openstack/glance/tree/glance/db/sqlalchemy/alembic_migrations/versions>`_
+.. [ALMBC] `Alembic Operations <http://alembic.zzzcomputing.com/en/latest/ops.html>`_
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -65,6 +65,7 @@ Developer reference

   architecture
   database_architecture
+   database_migrations
   domain_model
   domain_implementation