diff --git a/doc/source/devref/index.rst b/doc/source/devref/index.rst index 0b5948c5810..0eaba47b9ba 100644 --- a/doc/source/devref/index.rst +++ b/doc/source/devref/index.rst @@ -36,6 +36,7 @@ Programming HowTos and Tutorials replication migration api.apache + rolling.upgrades Background Concepts for Cinder ------------------------------ diff --git a/doc/source/devref/rolling.upgrades.rst b/doc/source/devref/rolling.upgrades.rst new file mode 100644 index 00000000000..cc6aeca81d9 --- /dev/null +++ b/doc/source/devref/rolling.upgrades.rst @@ -0,0 +1,328 @@ +.. + Copyright (c) 2016 Intel Corporation + All Rights Reserved. + + Licensed under the Apache License, Version 2.0 (the "License"); you may + not use this file except in compliance with the License. You may obtain + a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, WITHOUT + WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the + License for the specific language governing permissions and limitations + under the License. + +Upgrades +======== + +Starting from Mitaka release Cinder gained the ability to be upgraded without +introducing downtime of control plane services. Operator can simply upgrade +Cinder services instances one-by-one. To achieve that, developers need to make +sure that any introduced change doesn't break older services running in the +same Cinder deployment. + +In general there is a requirement that release N will keep backward +compatibility with release N-1 and in a deployment N's and N-1's services can +safely coexist. This means that when performing a live upgrade you cannot skip +any release (e.g. you cannot upgrade N to N+2 without upgrading it to N+1 +first). Further in the document N will denote the current release, N-1 a +previous one, N+1 the next one, etc. + +Having in mind that we only support compatibility with N-1, most of the +compatibility code written in N needs to exist just for one release and can be +removed in the beginning of N+1. A good practice here is to mark them with +:code:`TODO` or :code:`FIXME` comments to make them easy to find in the future. + +Please note that proper upgrades solution should support both +release-to-release upgrades as well as upgrades of deployments following the +Cinder master more closely. We cannot just merge patches implementing +compatibility at the end of the release - we should keep things compatible +through the whole release. + +To achieve compatibility, discipline is required from the developers. There are +several planes on which incompatibility may occurr: + +* **REST API changes** - these are prohibited by definition and this document + will not describe the subject. For further information one may use `API + Working Group guidelines + `_ + for reference. + +* **Database schema migrations** - e.g. if N-1 was relying on some column in + the DB being present, N's migrations cannot remove it. N+1's however can + (assuming N has no notion of the column). + +* **Database data migrations** - if a migration requires big amount of data to + be transfered between columns or tables or converted, it will most likely + lock the tables. This may cause services to be unresponsive, causing the + downtime. + +* **RPC API changes** - adding or removing RPC method parameter, or the method + itself, may lead to incompatibilities. + +* **RPC payload changes** - adding, renaming or removing a field from the dict + passed over RPC may lead to incompatibilities. + +Next sections of this document will focus on explaining last four points and +provide means to tackle required changes in these matters while maintaining +backward compatibility. + + +Database schema and data migrations +----------------------------------- + +In general incompatible database schema migrations can be tracked to ALTER and +DROP SQL commands instruction issued either against a column or table. This is +why a unit test that blocks such migrations was introduced. We should try to +keep our DB modifications additive. Moreover we should aim not to introduce +migrations that cause the database tables to lock for a long period. Long lock +on whole table can block other queries and may make real requests to fail. + +Adding a column +............... + +This is the simplest case - we don't have any requirements when adding a new +column apart from the fact that it should be added as the last one in the +table. If that's covered, the DB engine will make sure the migraton won't be +disruptive. + +Dropping a column not referenced in SQLAlchemy code +................................................... + +When we want to remove a column that wasn't present in any SQLAlchemy model or +it was in the model, but model was not referenced in any SQLAlchemy API +function (this basically means that N-1 wasn't depending on the presence of +that column in the DB), then the situation is simple. We should be able to +safely drop the column in N release. + +Removal of unnecessary column +............................. + +When we want to remove a used column without migrating any data out of it (for +example because what's kept in the column is obsolete), then we just need to +remove it from the SQLAlchemy model and API in N release. In N+1 or as a +post-upgrade migration in N we can merge a migration issuing DROP for this +column (we cannot do that earlier because N-1 will depend on the presence of +that column). + +ALTER on a column +................. + +A rule of thumb to judge which ALTER or DROP migrations should be allowed is to +look in the `MySQL documentation +`_. +If operation has "yes" in all 4 columns besides "Copies Table?", then it +*probably* can be allowed. If operation doesn't allow concurrent DML it means +that table row modifications or additions will be blocked during the migration. +This sometimes isn't a problem - for example it's not the end of the world if a +service won't be able to report it's status one or two times (and +:code:`services` table is normally small). Please note that even if this does +apply to "rename a column" operation, we cannot simply do such ALTER, as N-1 +will depend on the older name. + +If an operation on column or table cannot be allowed, then it is required to +create a new column with desired properties and start moving the data (in a +live manner). In worst case old column can be removed in N+2. Whole procedure +is described in more details below. + +In aforementioned case we need to make more complicated steps streching through +3 releases - always keeping the backwards compatibility. In short when we want +to start to move data inside the DB, then in N we should: + +* Add a new column for the data. +* Write data in both places (N-1 needs to read it). +* Read data from the old place (N-1 writes there). +* Prepare online data migration cinder-manage command to be run before + upgrading to N+1 (because N+1 will read from new place, so we need to make + sure all the records have new place populated). + +In N+1 we should: + +* Write data to both places (N reads from old one). +* Read data from the new place (N saves there). + +In N+2 + +* Remove old place from SQLAlchemy. +* Read and write only to the new place. +* Remove the column as the post-upgrade migration (or as first migration in + N+3). + +Please note that this is the most complicated case. If data in the column +cannot actually change (for example :code:`host` in :code:`services` table), in +N we can read from new place and fallback to the old place if data is missing. +This way we can skip one release from the process. + +Of course real-world examples may be different. E.g. sometimes it may be +required to write some more compatibility code in the oslo.versionedobjects +layer to compensate for different versions of objects passed over RPC. This is +explained more in `RPC payload changes (oslo.versionedobjects)`_ section. + +More details about that can be found in the `online-schema-upgrades spec +`_. + + +RPC API changes +--------------- + +It can obviously break service communication if RPC interface changes. In +particular this applies to changes of the RPC method definitions. To avoid that +we assume N's RPC API compatibility with N-1 version (both ways - +:code:`rpcapi` module should be able to downgrade the message if needed and +:code:`manager` module should be able to tolerate receiving messages in older +version. + +Below is an example RPC compatibility shim from Mitaka's +:code:`cinder.volume.manager`. This code allows us to tolerate older versions +of the messages:: + + def create_volume(self, context, volume_id, request_spec=None, + filter_properties=None, allow_reschedule=True, + volume=None): + + """Creates the volume.""" + # FIXME(thangp): Remove this in v2.0 of RPC API. + if volume is None: + # For older clients, mimic the old behavior and look up the volume + # by its volume_id. + volume = objects.Volume.get_by_id(context, volume_id) + +And here's a contrary shim in cinder.volume.rpcapi (RPC client) that downgrades +the message to make sure it will be understood by older instances of the +service:: + + def create_volume(self, ctxt, volume, host, request_spec, + filter_properties, allow_reschedule=True): + request_spec_p = jsonutils.to_primitive(request_spec) + msg_args = {'volume_id': volume.id, 'request_spec': request_spec_p, + 'filter_properties': filter_properties, + 'allow_reschedule': allow_reschedule} + if self.client.can_send_version('1.32'): + version = '1.32' + msg_args['volume'] = volume + else: + version = '1.24' + + new_host = utils.extract_host(host) + cctxt = self.client.prepare(server=new_host, version=version) + request_spec_p = jsonutils.to_primitive(request_spec) + cctxt.cast(ctxt, 'create_volume', **msg_args) + +As can be seen there's this magic :code:`self.client.can_send_version()` method +which detects if we're running in a version-heterogenious environment and need +to downgrade the message. Detection is based on dynamic RPC version pinning. In +general all the services (managers) report supported RPC API version. RPC API +client gets all the versions from the DB, chooses the lowest one and starts to +downgrade messages to it. + +To limit impact on the DB the pinned version of certain RPC API is cached. +After all the services in the deployment are updated, operator should restart +all the services or send them a SIGHUP signal to force reload of version pins. + +As we need to support only N RPC API in N+1 release, we should be able to drop +all the compatibility shims in N+1. To be technically correct when doing so we +should also bump the major RPC API version. We do not need to do that in every +release (it may happen that through the release nothing will change in RPC API +or cost of technical debt of compatibility code is lower than the cost of +complicated procedure of increasing major version of RPC APIs). + +The process of increasing the major version is explained in details in `Nova's +documentation `_. +Please note that in case of Cinder we're accessing the DB from all of the +services, so we should follow the more complicated "Mixed version environments" +process for every of our services. + +In case of removing whole RPC method we need to leave it there in N's manager +and can remove it in N+1 (because N-1 will be talking with N). When adding a +new one we need to make sure that when the RPC client is pinned to a too low +version any attempt to send new message should fail (because client will not +know if manager receiving the message will understand it) or ensure the manager +will get updated before clients by stating the recommended order of upgrades +for that release. + +RPC payload changes (oslo.versionedobjects) +------------------------------------------ + +`oslo.versionedobjects +`_ is a library that +helps us to maintain compatibility of the payload sent over RPC. As during the +process of upgrades it is possible that a newer version of the service will +send an object to an older one, it may happen that newer object is incompatible +with older service. + +Version of an object should be bumped every time we make an incompatible change +inside it. Rule of thumb is that we should always do that, but well-thought +exceptions were also allowed in the past (for example releasing a NOT NULL +constraint). + +Imagine that we (finally!) decide that :code:`request_spec` sent in +:code:`create_volume` RPC cast is duplicating data and we want to start to +remove redundant occurrences. When running in version-mixed environment older +services will still expect this redundant data. We need a way to somehow +downgrade the :code:`request_spec` before sending it over RPC. And this is were +o.vo come in handy. o.vo provide us the infrastructure to keep the changes in +object versioned and to be able to downgrade them to a particular version. + +Let's take a step back - similarly to the RPC API situation we need a way to +tell if we need to send a backward-compatible version of the message. In this +case we need to know to what version to downgrade the object. We're using a +similar solution to the one used for RPC API for that. A problem here is that +we need a single identifier (that we will be reported to :code:`services` DB +table) to denote whole set of versions of all the objects. To do that we've +introduced a concept of :code:`CinderObjectVersionHistory` object, where we +keep sets of individual object versions aggregated into a single version +string. When making an incompatible change in a single object you need to bump +its version (we have a unit test enforcing that) *and* add a new version to +:code:`cinder.objects.base.CinderObjectVersionsHistory` (there's a unit test as +well). Example code doing that is below:: + + OBJ_VERSIONS.add('1.1', {'Service': '1.2', 'ServiceList': '1.1'}) + +This line adds a new 1.1 aggregated object version that is different from 1.0 +by two objects - :code:`Service` in 1.2 and :code:`ServiceList` in 1.1. This +means that the commit which added this line bumped versions of these two +objects. + +Now if we know that a service we're talking to is running 1.1 aggregated +version - we need to downgrade :code:`Service` and :code:`ServiceList` to 1.2 +and 1.1 respectively before sending. Please note that of course other objects +are included in the 1.1 aggregated version, but you just need to specify what +changed (all the other versions of individual objects will be taken from the +last version - 1.0 in this case). + +Getting back to :code:`request_spec` example. So let's assume we want to remove +:code:`volume_properties` from there (most of data in there is already +somewhere else inside the :code:`request_spec` object). We've made a change in +the object fields, we've bumped it's version (from 1.0 to 1.1), we've updated +hash in the :code:`cinder.tests.unit.test_objects` to synchronize it with the +current state of the object, making the unit test pass and we've added a new +aggregated object history version in :code:`cinder.objects.base`. + +What else is required? We need to provide code that actually downgrades +RequestSpec object from 1.1 to 1.0 - to be used when sending the object to +older services. This is done by implementing :code:`obj_make_compatible` method +in the object:: + + from oslo_utils import versionutils + + def obj_make_compatible(self, primitive, target_version): + super(RequestSpec, self).obj_make_compatible(primitive, target_version) + target_version = versionutils.convert_version_to_tuple(target_version) + if target_version < (1, 1) and not 'volume_properties' in primitive: + volume_properties = {} + # TODO: Aggregate all the required information from primitive. + primitive['volume_properties'] = volume_properties + +Please note that primitive is a dictionary representation of the object and not +an object itself. This is because o.vo are of course sent over RPC as dicts. + +With these pieces in place Cinder will take care of sending +:code:`request_spec` with :code:`volume_properties` when running in mixed +environment and without when all services are upgraded and will understand +:code:`request_spec` without :code:`volume_properties` element. + +Note that o.vo layer is able to recursively downgrade all of its fields, so +when `request_spec` will be used as a field in other object, it will be +correctly downgraded.