Merge "Rolling upgrade procedure documentation"
commit
62fd4d1a6e
@@ -7,37 +7,270 @@ Bare Metal Service Upgrade Guide
This document outlines various steps and notes for operators to consider when
upgrading their ironic-driven clouds from previous versions of OpenStack.

The ironic service is tightly coupled with the ironic driver that is shipped
with nova. Some special considerations must be taken into account
when upgrading your cloud.
The Bare Metal (ironic) service is tightly coupled with the ironic driver that
is shipped with the Compute (nova) service. Some special considerations must be
taken into account when upgrading your cloud.

Plan your Upgrade
Both offline and rolling upgrades are supported.

Plan your upgrade
=================

* Rolling upgrades are available starting with the Pike release; that is, when
  upgrading from Ocata. This means that it is possible to do an upgrade with
  minimal to no downtime of the Bare Metal API.

* Upgrades are only supported between two consecutive named releases.
  This means that you cannot upgrade Ocata directly into Queens; you need to
  upgrade into Pike first.

* The `release notes <http://docs.openstack.org/releasenotes/ironic/>`_
  should always be read carefully when upgrading the ironic service. Starting
  with the Mitaka release, specific upgrade steps and considerations are
  well-documented in the release notes.
  should always be read carefully when upgrading the Bare Metal service.
  Specific upgrade steps and considerations are documented there.

* Upgrades are only supported one series at a time, or within a series.

* Starting with the Liberty release, the ironic service should always be
  upgraded before the nova service.
* The Bare Metal service should always be upgraded before the Compute service.

  .. note::
     The ironic virt driver in nova always uses a specific version of the
     ironic REST API. This API version may be one that was introduced in the
     same development cycle, so upgrading nova first may result in nova being
     unable to use ironic's API.
     unable to use the Bare Metal API.

* When upgrading ironic, the following steps should always be taken:
* Make a backup of your database. Ironic does not support downgrading of the
  database. Hence, in case of upgrade failure, restoring the database from
  a backup is the only choice.

  #. Update ironic code, without restarting services.

  #. Run database migrations.
Offline upgrades
================

  #. Restart ironic-conductor and ironic-api services.
In an offline (or cold) upgrade, the Bare Metal service is not available
during the upgrade, because all the services have to be taken down.

When upgrading the Bare Metal service, the following steps should always be
taken in this order:

#. upgrade the ironic-python-agent image

#. update ironic code, without restarting services

#. run database schema migrations via ``ironic-dbsync upgrade``

#. restart ironic-conductor and ironic-api services

Once the above is done, do the following:

* update any applicable configuration options to stop using any deprecated
  features or options, and perform any required work to transition to
  alternatives. All the deprecated features and options will be supported for
  one release cycle, so they should be removed before your next upgrade is
  performed.

* upgrade python-ironicclient along with any other services connecting
  to the Bare Metal service as a client, such as nova-compute

* run the ``ironic-dbsync online_data_migrations`` command to make sure
  that data migrations are applied. The command lets you limit
  the impact of the data migrations with the ``--max-count`` option, which
  limits the number of migrations executed in one run. You should complete
  all of the migrations as soon as possible after the upgrade.

  .. warning:: You will not be able to start an upgrade to the next release
     after this one, until this has been completed for the current
     release. For example, as part of upgrading from Ocata to Pike,
     you need to complete Pike's data migrations. If this is not done,
     you will not be able to upgrade to Queens -- it will not be
     possible to execute Queens' database schema updates.


Rolling upgrades
================

Rolling upgrades are available starting with the Pike release; that is, when
upgrading from Ocata. This means that it is possible to do an upgrade with
minimal to no downtime of the Bare Metal API.

Concepts
--------

There are four aspects of the rolling upgrade process to keep in mind:

* RPC version pinning and versioned object backports
* online data migrations
* graceful service shutdown
* API load balancer draining

RPC version pinning and versioned object backports
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Through careful RPC versioning, newer services are able to talk to older
services (and vice-versa). The ``[DEFAULT]/pin_release_version`` configuration
option is used for this. It should be set (pinned) to the release version
that the older services are using. The newer services will backport RPC calls
and objects to their appropriate versions from the pinned release. If the
``IncompatibleObjectVersion`` exception occurs, it is most likely due to an
incorrect or unspecified ``[DEFAULT]/pin_release_version`` configuration value.
For example, when it is not set to the older release version, no conversion
will happen during the upgrade.

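As a rough illustration of the pinning described above, the following sketch
sets and later removes ``pin_release_version`` in the ``[DEFAULT]`` section of
configuration text, using only Python's standard ``configparser`` module. The
helper names (``load``, ``pin``, ``unpin``, ``dump``) are hypothetical and not
part of ironic. Note that ``configparser`` drops comments and collapses
repeated options, so a real deployment would more likely make this change
through its configuration management tooling.

```python
import configparser
import io

# Option name taken from the guide; it lives in the [DEFAULT] section.
OPTION = "pin_release_version"


def load(text):
    """Parse ironic.conf-style text (no interpolation, duplicates tolerated)."""
    cfg = configparser.ConfigParser(interpolation=None, strict=False)
    cfg.read_string(text)
    return cfg


def pin(cfg, old_release):
    """Pin to the release being upgraded from, for example "ocata"."""
    cfg["DEFAULT"][OPTION] = old_release


def unpin(cfg):
    """Remove the pin; returns True if the option was present."""
    return cfg.remove_option("DEFAULT", OPTION)


def dump(cfg):
    """Serialize the configuration back to text."""
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()
```

For example, ``pin(cfg, "ocata")`` would be applied while upgrading from Ocata
to Pike, and ``unpin(cfg)`` once every service is running the new code.
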
Online data migrations
~~~~~~~~~~~~~~~~~~~~~~

To make database schema migrations less painful to execute, all data migrations
are banned from schema migration scripts. The schema migration scripts only
update the database schema. Data migrations must be done at the end of the
rolling upgrade process, after the schema migration and after the services
have been upgraded to the latest release. The data migration is performed
using the ``ironic-dbsync online_data_migrations`` command. It can be run in
a background process so that it does not interrupt running services.
(You would also execute the same command with services turned off if
you are doing a cold upgrade).

This data migration must be completed. If not, you will not be able to
upgrade to future releases. For example, if you had upgraded from Ocata to
Pike but did not do the data migrations, you will not be able to upgrade from
Pike to Queens. (More precisely, you will not be able to apply Queens' schema
migrations.)

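One way to spread the data migrations out is to call the command repeatedly
with a small ``--max-count`` until nothing is left. The sketch below does that
from Python. The function name, the batch size and the exit-code convention
(``0`` when finished, ``1`` when more work remains, anything else an error)
are assumptions made for illustration; check the ``ironic-dbsync``
documentation for your release before relying on them.

```python
import subprocess


def migrate_in_batches(max_count=50, max_runs=1000, run=subprocess.call):
    """Run online data migrations in bounded batches.

    Returns the number of runs needed to finish. ``run`` is injectable so
    the loop can be exercised without a real ironic installation.
    """
    cmd = ["ironic-dbsync", "online_data_migrations",
           "--max-count", str(max_count)]
    for attempt in range(1, max_runs + 1):
        rc = run(cmd)
        if rc == 0:
            # ASSUMPTION: exit code 0 means all migrations are complete.
            return attempt
        if rc != 1:
            # ASSUMPTION: exit code 1 means "more remain"; others are errors.
            raise RuntimeError("online_data_migrations failed with %d" % rc)
    raise RuntimeError("migrations still pending after %d runs" % max_runs)
```

A small ``max_count`` keeps each run short at the cost of more runs; the
migrations should still be finished as soon as possible after the upgrade.
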
Graceful service shutdown
~~~~~~~~~~~~~~~~~~~~~~~~~

The ironic-conductor service is a Python process listening for messages on a
message queue. When the operator sends the SIGTERM signal to the process, the
service stops consuming messages from the queue, so that no additional work is
picked up. It completes any outstanding work and then terminates. During this
process, messages can be left on the queue and will be processed after the
Python process starts back up. This gives us a way to shut down a service using
older code, and start up a service using newer code with minimal impact.

.. note::
   This was tested with the RabbitMQ messaging backend and may vary with other
   backends.

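The behaviour described above can be pictured with a toy worker. This is not
ironic's implementation: the ``Worker`` class and ``install_sigterm_handler``
are hypothetical names, and an in-memory ``deque`` stands in for the message
queue. The sketch only shows the pattern of finishing the message in hand,
picking up nothing new, and leaving the rest queued.

```python
import signal
from collections import deque


class Worker:
    """Toy consumer: an in-memory deque stands in for the message queue."""

    def __init__(self, messages):
        self.pending = deque(messages)
        self.handled = []
        self.stopping = False

    def request_stop(self, signum=None, frame=None):
        """Signal-handler compatible: stop picking up new messages."""
        self.stopping = True

    def run(self, handle):
        """Process messages until the queue is empty or a stop is requested.

        The message being handled is always completed; whatever is left
        stays queued for the next process to pick up.
        """
        while self.pending and not self.stopping:
            message = self.pending.popleft()
            handle(message)
            self.handled.append(message)
        return list(self.pending)


def install_sigterm_handler(worker):
    """Route SIGTERM to the worker (must be called from the main thread)."""
    signal.signal(signal.SIGTERM, worker.request_stop)
```

After ``install_sigterm_handler(worker)``, sending SIGTERM to the process lets
the current ``handle`` call finish before ``run`` returns the messages that
were left unprocessed.
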
API load balancer draining
~~~~~~~~~~~~~~~~~~~~~~~~~~

If you are using a load balancer for the ironic-api services, we recommend that
you redirect requests to the new API services and drain the ironic-api
services that have not yet been upgraded.

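How draining is done depends entirely on the load balancer. As one possible
illustration, assuming HAProxy with its admin-level stats socket enabled, the
sketch below builds the Runtime API ``set server ... state`` command and sends
it over a Unix socket. The backend name, server name and socket path are
hypothetical and must match your own HAProxy configuration.

```python
import socket

# States accepted by HAProxy's "set server <backend>/<server> state" command.
VALID_STATES = ("ready", "drain", "maint")


def set_server_state(backend, server, state):
    """Build the Runtime API command that changes a server's state."""
    if state not in VALID_STATES:
        raise ValueError("unknown state: %r" % (state,))
    return "set server %s/%s state %s\n" % (backend, server, state)


def send_command(command, socket_path="/var/run/haproxy.sock"):
    """Send one command to the HAProxy stats socket and return the reply.

    ASSUMPTION: the socket path is illustrative; HAProxy must be configured
    with an admin-level stats socket at that location.
    """
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(command.encode("ascii"))
        return sock.recv(4096).decode("ascii", "replace")
```

For instance, ``send_command(set_server_state("ironic_api", "api-1", "drain"))``
would take a hypothetical ``api-1`` server out of rotation before it is
upgraded, and the ``ready`` state would put it back afterwards.
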
Rolling upgrade process
-----------------------

To reduce downtime, the services can be upgraded in a rolling fashion. This
means upgrading one or a few services at a time. To minimise downtime, you need
an HA ironic deployment (at least two ironic-api and two ironic-conductor
services) so that when a service instance is being upgraded, the other
instances are still running.

**New features should not be used until after the upgrade has been completed.**

Before maintenance window
~~~~~~~~~~~~~~~~~~~~~~~~~

* Upgrade the ironic-python-agent image

* Using the new release (ironic code), execute the required database schema
  updates by running the database upgrade command: ``ironic-dbsync upgrade``.
  These schema change operations should have minimal or no effect on
  performance, and should not cause any operations to fail (but please check
  the release notes). You can:

  * install the new release on an existing system
  * install the new release in a new virtualenv or a container

  At this point, new columns and tables may exist in the database. These
  database schema changes are done in a way that both the old and new (N and
  N+1) releases can perform operations against the same schema.

.. note:: Ironic bases its RPC and object storage format versions on the
   ``[DEFAULT]/pin_release_version`` configuration option. It is
   advisable to automate the deployment of changes in configuration
   files to make the process less error-prone and more repeatable.

During maintenance window
~~~~~~~~~~~~~~~~~~~~~~~~~

#. ironic-conductor services should be upgraded first. Ensure that at least
   one ironic-conductor service is running at all times. For every
   ironic-conductor, either one by one or a few at a time:

   * shut down the service. Conductors are load-balanced by the message queue,
     so the only thing you need to worry about is to shut the service down
     gracefully (using the ``SIGTERM`` signal) to make sure it will finish all
     the requests being processed before shutting down
   * upgrade the code and dependencies
   * set the ``[DEFAULT]/pin_release_version`` configuration option value to
     the version you are upgrading from (that is, the old version). Based on
     this setting, the new ironic-conductor services will downgrade any
     RPC communication and data objects to conform to the old service.
     For example, if you are upgrading from Ocata to Pike, set this value to
     ``ocata``.
   * start the service

#. The next service to upgrade is ironic-api. Ensure that at least one
   ironic-api service is running at all times. You may want to start another
   instance of the older ironic-api to handle the load while you are upgrading
   the original ironic-api services. For every ironic-api service, either one
   by one or a few at a time:

   * in an HA deployment you are typically running them behind a load balancer
     (for example HAProxy), so you need to take the service instance out of the
     balancer
   * shut it down
   * upgrade the code and dependencies
   * set the ``[DEFAULT]/pin_release_version`` configuration option value to
     the version you are upgrading from (that is, the old version). Based on
     this setting, the new ironic-api services will downgrade any RPC
     communication and data objects to conform to the old service.
     For example, if you are upgrading from Ocata to Pike, set this value to
     ``ocata``.
   * restart the service
   * add it back into the load balancer

   After upgrading all the ironic-api services, the Bare Metal service is
   running in the new version but with downgraded RPC communication and
   database object storage formats. New features can fail when objects are in
   the downgraded object formats and some internal RPC API functions may still
   not be available.

#. For all the ironic-conductor services, one at a time:

   * remove the ``[DEFAULT]/pin_release_version`` configuration option setting
   * restart the ironic-conductor service

#. For all the ironic-api services, one at a time:

   * remove the ``[DEFAULT]/pin_release_version`` configuration option setting
   * restart the ironic-api service

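The ordering of the four phases above is easy to get wrong when scripting it,
so it can help to write the plan down as data first. The sketch below only
generates an ordered list of ``(host, service, action)`` tuples, one host at a
time; it does not execute anything. The function name and the action strings
are hypothetical, and a real run may handle a few hosts at a time as the steps
allow.

```python
def rolling_upgrade_plan(conductors, apis, old_release):
    """Return the ordered (host, service, action) steps, one host at a time."""
    pin = "set pin_release_version=%s" % old_release
    unpin = "remove pin_release_version"
    plan = []

    # Phase 1: upgrade every conductor, pinned to the old release.
    for host in conductors:
        for action in ("stop gracefully (SIGTERM)",
                       "upgrade code and dependencies", pin, "start"):
            plan.append((host, "ironic-conductor", action))

    # Phase 2: upgrade every API service, pinned to the old release.
    for host in apis:
        for action in ("remove from load balancer", "stop",
                       "upgrade code and dependencies", pin,
                       "restart", "add back to load balancer"):
            plan.append((host, "ironic-api", action))

    # Phases 3 and 4: unpin conductors first, then API services.
    for service, hosts in (("ironic-conductor", conductors),
                           ("ironic-api", apis)):
        for host in hosts:
            for action in (unpin, "restart"):
                plan.append((host, service, action))
    return plan
```

Printing ``rolling_upgrade_plan(["c1", "c2"], ["a1", "a2"], "ocata")`` gives a
checklist in which no pin is removed until every service has been upgraded.
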
After maintenance window
~~~~~~~~~~~~~~~~~~~~~~~~

Now that all the services are upgraded, the system is able to use the latest
version of the RPC protocol and able to access all the features of the new
release.

* Update any applicable configuration options to stop using any deprecated
  features or options, and perform any required work to transition to
  alternatives. All the deprecated features and options will be supported for
  one release cycle, so they should be removed before your next upgrade is
  performed.

* Upgrade ``python-ironicclient`` along with other services connecting
  to the Bare Metal service as a client, such as nova-compute.

* Run the ``ironic-dbsync online_data_migrations`` command to make sure
  that data migrations are applied. The command lets you limit
  the impact of the data migrations with the ``--max-count`` option, which
  limits the number of migrations executed in one run. You should complete
  all of the migrations as soon as possible after the upgrade.

  Note that you will not be able to start an upgrade to the next release after
  this one, until this has been completed for the current release. For example,
  as part of upgrading from Ocata to Pike, you need to complete Pike's data
  migrations. If this is not done, you will not be able to upgrade to Queens --
  it will not be possible to execute Queens' database schema updates.

Upgrading from Ocata to Pike
============================

@@ -0,0 +1,4 @@
---
features:
  - Adds support for rolling upgrades, starting from upgrading Ocata to Pike.
    For details, see http://docs.openstack.org/ironic/admin/upgrade-guide.html.