Port the Database Migrations doc

Moves the Database Migrations page from https://github.com/cloudkeep/barbican/wiki/Database-Migrations to the Sphinx documentation, with added notes on automatically generating version files.

Change-Id: Ie6b0a63af90c27439e82889b0f1cd0c0fc0a6bea
2015-02-18 12:42:22 -05:00
commit 888b68a531
parent bfae7fc508
2 changed files with 232 additions and 0 deletions
--- a/doc/source/contribute/database_migrations.rst
+++ b/doc/source/contribute/database_migrations.rst
@@ -0,0 +1,231 @@
 Database Migrations
 ====================
 Database migrations are managed using the Alembic_ library. The consensus for
 `OpenStack and SQLAlchemy`_ is that this library is preferred over
 sqlalchemy-migrate.
 Database migrations can be performed two ways: (1) via the API startup
 process, and (2) via a separate script.
 Database migrations can be optionally enabled during the API startup process.
 A corollary is that a new deployment should begin with only one node
 to avoid migration race conditions.
 Alternatively, the automatic update startup behavior can be disabled, forcing
 the use of the migration script. This latter mode is probably safer to use in
 production environments.
 Policy
 -------
 A Barbican deployment goal is to update application and schema versions with
 zero downtime. The challenge is that at all times the database schema must be
 able to support two deployed application versions, so that a single migration
 does not break existing nodes running the previous deployment. For example,
 when deleting a column we would first deploy a new version that ignores the
 column. Once all nodes are ignoring the column, a second deployment would be
 made to remove the column from the database.
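The column-removal example above can be modeled with plain sets. This is an illustrative sketch only: the column names and the ``supports`` helper are hypothetical and are not part of Barbican.

```python
# Hypothetical model of the two-step column removal described above.
# All column names are made up for illustration.


def supports(schema_columns, app_columns):
    """Return True if the schema has every column the application uses."""
    return app_columns <= schema_columns


schema_v1 = {"id", "name", "legacy_flag"}  # schema before any change
schema_v2 = {"id", "name"}                 # schema after the column is dropped

app_old = {"id", "name", "legacy_flag"}    # previous deployment, still uses the column
app_new = {"id", "name"}                   # new deployment, ignores the column

# Dropping the column while old nodes are still running would break them.
assert not supports(schema_v2, app_old)

# Step 1: roll out app_new on the unchanged schema; both versions keep working.
assert supports(schema_v1, app_old) and supports(schema_v1, app_new)

# Step 2: once only app_new is running, dropping the column is safe.
assert supports(schema_v2, app_new)
```

The point of the model is that every intermediate state must satisfy ``supports`` for each application version that is still deployed.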
 To achieve this goal, the following rules will be observed for schema changes:
 1. Do not remove columns or tables directly, but rather:
   a. Create a version of the application not dependent on the removed
      column/table
   b. Replace all nodes with this new application version
   c. Create an Alembic version file to remove the column/table
   d. Apply this change in production manually, or automatically with a future
      version of the application
 2. Changing column attributes (types, names or widths) should be handled as
   follows:
   a. TODO: This Stack Overflow `Need to alter column types in production
      database`_ page and many others summarize the grief involved in doing
      these sorts of migrations
   b. TODO: What about old and new application versions happening
      simultaneously?
      i. Maybe have the new code perform migration to new column on each read
         ...similar to how a no-sql db migration would occur?
 3. Transforming column attributes (ex: splitting one ``name`` column into a
   ``first`` and ``last`` name):
   a. TODO: An `Alembic example`_, but not robust for large datasets.
 Overview
 ---------
 *Prior to invoking any migration steps below, change to your* ``barbican`` *project's
 folder and activate your virtual environment per the* `Developer Guide`_.
 **If you are using PostgreSQL, please ensure you are using SQLAlchemy version
 0.9.3 or higher, otherwise the generated version files will not be correct.**
 **You cannot use these migration tools and techniques with SQLite databases.**
 Consider taking a look at the `Alembic tutorial`_. As a brief summary: Alembic
 keeps track of a linked list of version files, each one applying a set of
 changes to the database schema that a previous version file in the linked list
 modified. Each version file has a unique Alembic-generated ID associated with
 it. Alembic generates a table in the project table space called
 ``alembic_version`` that keeps track of the unique ID of the last version file
 applied to the schema. During an update, Alembic uses this stored version ID
 to determine which, if any, follow-on version files to process.
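As a rough illustration of the bookkeeping described above, the following sketch walks a chain of revisions from the stored version ID to the newest one. It is a simplified model, not Alembic's implementation: the ``pending_revisions`` helper and the revision IDs are hypothetical, and it assumes a single linear chain with no branches.

```python
def pending_revisions(down_revisions, current, head):
    """Return the revision IDs to apply, oldest first, to move from current to head.

    down_revisions maps each revision ID to the ID of the revision it follows
    (None for the first one), mirroring the ``down_revision`` attribute that
    each Alembic version file declares. current is the ID stored in the
    ``alembic_version`` table, or None if no version has been applied yet.
    """
    chain = []
    rev = head
    # Walk backwards from the newest revision until the stored one is reached.
    while rev != current:
        if rev is None:
            raise ValueError(
                "revision %r is not an ancestor of %r" % (current, head)
            )
        chain.append(rev)
        rev = down_revisions[rev]
    chain.reverse()
    return chain


# Hypothetical three-file history: aaa111 <- bbb222 <- ccc333
history = {"aaa111": None, "bbb222": "aaa111", "ccc333": "bbb222"}
print(pending_revisions(history, "aaa111", "ccc333"))  # prints ['bbb222', 'ccc333']
```

When the stored ID already equals the newest revision, the returned list is empty and nothing needs to be applied.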
 Generating Change Versions
 ---------------------------
 To make schema changes, new version files need to be added to the
 ``barbican/model/migration/alembic_migrations/versions/`` folder. This section
 discusses two ways to add these files.
 Automatically
 ''''''''''''''
 Alembic autogenerates a new script by comparing a clean database (i.e., one
 without your recent changes) with any modifications you make to ``models.py``
 or other files. That said, automatic generation may miss changes; it is best
 treated as an automatic assist that still needs expert review. See `What does
 Autogenerate Detect`_ in the Alembic documentation for more details.
 First, you must start Barbican using a version of the code that does not
 include your changes, so that it creates a clean database. This example uses
 Barbican launched with DevStack (see `Barbican DevStack`_ wiki page for
 instructions).
 1. Make changes to the ``barbican/model/models.py`` SQLAlchemy models or
   check out your branch that includes your changes using git.
 2. Execute ``bin/barbican-db-manage.py -d <Full URL to database, including
   user/pw> revision -m '<your-summary-of-changes>' --autogenerate``
   a. For example: ``bin/barbican-db-manage.py -d
      mysql://root:password@127.0.0.1/barbican?charset=utf8
      revision -m 'Make unneeded verification columns nullable' --autogenerate``
 3. Examine the generated version file, found in
   ``barbican/model/migration/alembic_migrations/versions/``:
   a. **Verify generated update/rollback steps, especially for modifications
      to existing columns/tables**
   b. **If you added new tables, follow this guidance**:
      1. Make sure you added your new table to the ``MODELS`` element of the
         ``barbican/model/models.py`` module.
      2. Note that when Barbican boots up, it will add the new table to the
         database. It will also try to apply the database version (that also
         tries to add this table) via alembic. Therefore, please edit the
         generated script file to add these lines:
         a. ``ctx = op.get_context()`` (to get the alembic migration context in
            current transaction)
         b. ``con = op.get_bind()`` (get the database connection)
         c. ``table_exists = ctx.dialect.has_table(con.engine,
            'your-new-table-name-here')``
         d. ``if not table_exists:``
         e. ``...remaining create table logic here...``
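Assembled into one function, the guard lines listed above might look like the sketch below. The helper name ``create_table_if_missing`` and the ``create_table`` callable are hypothetical. So that the sketch can be exercised without a database, ``op`` is passed in as a parameter; in a real version file it would be the ``alembic.op`` module imported at the top of the file. The ``has_table`` call is copied from the steps above; newer SQLAlchemy releases may expect a connection rather than an engine there, so check it against the version you are running.

```python
def create_table_if_missing(op, table_name, create_table):
    """Run create_table() only when table_name is not already in the database.

    op is expected to behave like the ``alembic.op`` module inside a running
    migration. Returns True if the table was created, False if it was
    already present.
    """
    ctx = op.get_context()  # Alembic migration context for the current transaction
    con = op.get_bind()     # database connection used by the migration
    table_exists = ctx.dialect.has_table(con.engine, table_name)
    if not table_exists:
        # ...remaining create table logic goes here...
        create_table()
    return not table_exists
```

Inside ``upgrade()``, the generated ``op.create_table(...)`` call would be wrapped in a function or lambda and passed as ``create_table``, with ``'your-new-table-name-here'`` as the table name.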
 *Note: For anything but trivial or brand new columns/tables, database backups
 and maintenance-window downtimes might be called for.*
 Manually
 '''''''''
 1. Execute: ``bin/barbican-db-manage.py revision -m "<insert your change
   description here>"``
 2. This will generate a new file in the
   ``barbican/model/migration/alembic_migrations/versions/`` folder, with this
   sort of file format:
   ``<unique-Alembic-ID>_<your-change-description-from-above-but-truncated>.py``.
   Note that only the first 20 characters of the description are used.
 3. You can then edit this file per the tutorial and the `Alembic Operation
   Reference`_ page for available operations you may make from the version
   files. **You must properly fill in both the** ``upgrade()`` **and**
   ``downgrade()`` **methods.**
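One way to catch a forgotten ``downgrade()`` during review is a small static check. The sketch below is a hypothetical helper, not part of Barbican or Alembic: it parses the text of a version file and reports which of the two functions are missing or contain only ``pass`` or a docstring. The table and column names in the sample are made up.

```python
import ast


def missing_migration_steps(source):
    """Return the names among upgrade/downgrade that are absent or empty.

    A function counts as empty when its body holds only ``pass`` statements
    and bare constants such as a docstring or ``...``.
    """
    tree = ast.parse(source)
    filled = set()
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name in ("upgrade", "downgrade"):
            real = [
                stmt for stmt in node.body
                if not isinstance(stmt, ast.Pass)
                and not (
                    isinstance(stmt, ast.Expr)
                    and isinstance(stmt.value, ast.Constant)
                )
            ]
            if real:
                filled.add(node.name)
    return sorted({"upgrade", "downgrade"} - filled)


# Hypothetical version file whose downgrade() was left as a stub.
example = '''
def upgrade():
    op.add_column("secrets", sa.Column("note", sa.String(255)))

def downgrade():
    pass
'''
print(missing_migration_steps(example))  # prints ['downgrade']
```

The check only looks at the shape of the file; it cannot tell whether the ``downgrade()`` body actually reverses the ``upgrade()`` body, so manual review is still needed.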
 Applying Changes
 -----------------
 Barbican uses the Alembic version files to manage delta changes to the
 database. Therefore the first Alembic version file does **not** contain all
 time-zero database tables.
 To create the initial Barbican tables in the database, execute the Barbican
 application per the 'Via Application' section.
 Thereafter, it is suggested that only the ``barbican-db-manage.py`` script
 above be used to update the database schema per the 'Manually' section. Also,
 automatic database updates from the Barbican application should be disabled by
 adding/updating ``db_auto_create = False`` in the ``barbican-api.conf``
 configuration file.
 Via Application
 ''''''''''''''''
 The last section of the `Alembic tutorial`_ describes the process used by the
 Barbican application to create and update the database table space
 automatically.
 By default, when the Barbican API boots up it will try to create the Barbican
 database tables (using SQLAlchemy), and then try to apply the latest version
 files (using Alembic). In this mode, the latest version of the Barbican
 application can create a new database table space updated to the latest schema
 version, or else it can update an existing database table space to the latest
 schema revision (called ``head`` in the docs).
 *To bypass this automatic behavior, add* ``db_auto_create = False`` *to the*
 ``barbican-api.conf`` *file*.
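The effect of the setting can be sketched with the standard library. This only illustrates the option's meaning and is not how Barbican itself loads its configuration. Placing the option in a ``[DEFAULT]`` section is an assumption, and the ``auto_create_enabled`` helper is hypothetical; a missing option is treated as enabled, matching the default behavior described above.

```python
import configparser

# Hypothetical file contents; the [DEFAULT] section is an assumption.
SAMPLE_CONF = """
[DEFAULT]
db_auto_create = False
"""


def auto_create_enabled(conf_text):
    """Return the db_auto_create setting, treating a missing option as enabled."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean("DEFAULT", "db_auto_create", fallback=True)


print(auto_create_enabled(SAMPLE_CONF))  # prints False
```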
 Manually
 '''''''''
 Run ``bin/barbican-db-manage.py -d <Full URL to database, including user/pw>
 upgrade -v head``, which will cause Alembic to apply the changes found in all
 version files after the version currently written in the target database, up
 until the latest version file in the linked chain of files.
 To upgrade to a specific version, run this command:
 ``bin/barbican-db-manage.py -d <Full URL to database, including user/pw>
 upgrade -v <Alembic-ID-of-version>``. The ``Alembic-ID-of-version`` is a
 unique ID assigned to the change, such as ``1a0c2cdafb38``.
 To downgrade to a specific version, run this command:
 ``bin/barbican-db-manage.py -d <Full URL to database, including user/pw>
 downgrade -v <Alembic-ID-of-version>``.
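When these commands are driven from a deployment script, building the argument list in one place helps avoid quoting mistakes in the database URL. The sketch below uses only the options shown above; the ``migrate_command`` helper is hypothetical, and the URL is a placeholder reused from the earlier example.

```python
import shlex

SCRIPT = "bin/barbican-db-manage.py"


def migrate_command(db_url, action, version):
    """Build the argument list for an upgrade or downgrade run."""
    if action not in ("upgrade", "downgrade"):
        raise ValueError("action must be 'upgrade' or 'downgrade'")
    return [SCRIPT, "-d", db_url, action, "-v", version]


# Placeholder URL and the sample revision ID from the text above.
url = "mysql://root:password@127.0.0.1/barbican?charset=utf8"

# shlex.join quotes each argument so the line can be pasted into a shell.
print(shlex.join(migrate_command(url, "upgrade", "head")))
print(shlex.join(migrate_command(url, "downgrade", "1a0c2cdafb38")))
```

The returned list can also be passed directly to ``subprocess.run``, which avoids shell quoting altogether.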
 TODO Items
 -----------
 1. *[Done - It works!]* Verify alembic works with the current SQLAlchemy model
   configuration in Barbican (which was borrowed from Glance).
 2. *[Done - It works, I was able to add/remove columns while app was running]*
   Verify that SQLAlchemy is tolerant of schema mismatches. For example, if
   a column is added to a table schema, will this break existing deployments
   that aren't expecting this column?
 3. *[Done - It works]* Add auto-migrate code to the boot up of models (see the
   ``barbican/model/repositories.py`` file).
 4. *[Done - It works]* Add guard in Barbican model logic to guard against
   running migrations with SQLite databases.
 5. Add detailed deployment steps for production, covering how new nodes are
   rolled in and old ones rolled out to complete the move to new versions.
 6. *[In Progress]* Add a best-practices checklist section to this page.
   a. This would provide guidance on safely migrating schemas, do's and
      don'ts, etc.
   b. This could also provide code guidance, such as ensuring that new schema
      changes (e.g., that new column) aren't required for proper functionality
      of the previous version of the code.
   c. If a server bounce is needed, notification guidelines for the DevOps team
      would be spelled out here.
 .. _Alembic: https://alembic.readthedocs.org/en/latest/
 .. _Alembic Example: https://julo.ch/blog/migrating-content-with-alembic/
 .. _Alembic Operation Reference: https://alembic.readthedocs.org/en/latest/ops.html
 .. _Alembic tutorial: https://alembic.readthedocs.org/en/latest/tutorial.html
 .. _Barbican DevStack: https://wiki.openstack.org/wiki/BarbicanDevStack
 .. _Developer Guide: https://github.com/cloudkeep/barbican/wiki/Developer-Guide
 .. _Need to alter column types in production database: http://stackoverflow.com/questions/5329255/need-to-alter-column-types-in-production-database-sql-server-2005
 .. _OpenStack and SQLAlchemy: https://wiki.openstack.org/wiki/OpenStack_and_SQLAlchemy#Migrations
 .. _What does Autogenerate Detect: http://alembic.readthedocs.org/en/latest/autogenerate.html#what-does-autogenerate-detect-and-what-does-it-not-detect
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -18,6 +18,7 @@ Getting Started
   contribute/getting_involved
   contribute/dependencies
   contribute/database_migrations
   setup/index
   testing
   plugin/index