diff --git a/doc/source/project/index.rst b/doc/source/project/index.rst index 9b432fc5..6ffc3734 100644 --- a/doc/source/project/index.rst +++ b/doc/source/project/index.rst @@ -14,6 +14,7 @@ https://docs.openstack.org. charm-delivery backport-policy support-notes + issues-and-procedures release-policy release-schedule diff --git a/doc/source/project/issues-and-procedures.rst b/doc/source/project/issues-and-procedures.rst new file mode 100644 index 00000000..bf58627b --- /dev/null +++ b/doc/source/project/issues-and-procedures.rst @@ -0,0 +1,67 @@ +===================================================== +Issues, charm procedures, and OpenStack upgrade notes +===================================================== + +This page centralises all software issues, special charm procedures, and +OpenStack upgrade path notes. + +.. important:: + + The documentation of issues and procedures are not duplicated amid the + below-mentioned pages. For instance, an issue related to a specific + OpenStack upgrade path is not repeated in the + :ref:`upgrade_issues_openstack_upgrades` section of the Upgrade issues page + (an upgrade issue can potentially affect multiple OpenStack upgrade paths). + +Software issues +--------------- + +Software issues mentioned here can be related to the charms or upstream +OpenStack. Both upgrade and various non-upgrade (but noteworthy) issues are +documented. Note that an upgrade issue can affect either of the three types of +upgrades (charms, OpenStack, series). The following pages document the most +important unresolved software issues in Charmed OpenStack: + +.. toctree:: + :maxdepth: 1 + + issues/upgrade-issues + issues/various-issues + +Special charm procedures +------------------------ + +During the lifetime of a cloud, the evolution of the upstream software or that +of the charms will necessitate action on the part of the cloud administrator. +This often involves replacing existing charms with new charms. For example, a +charm may become deprecated and be superseded by a new charm, resulting in a +recommendation (or a requirement) to use the new charm. The following is the +list of documented special charm procedures: + +.. toctree:: + :maxdepth: 1 + + procedures/ceph-charm-migration + procedures/percona-series-upgrade-to-focal + procedures/placement-charm + procedures/cinder-lvm-migration + procedures/charmhub-migration + procedures/ovn-migration + +OpenStack upgrade path notes +---------------------------- + +Occasionally, a specific OpenStack upgrade path will require particular +attention. These upgrade notes typically contain software issues. If a special +charm procedure is pertinent however, it will be mentioned. It is important to +study the page that may apply to your upgrade scenario. The following is the +list of documented OpenStack upgrade path notes: + +.. toctree:: + :maxdepth: 1 + + issues/upgrade-ussuri-to-victoria + issues/upgrade-stein-to-train + issues/upgrade-queens-to-rocky + issues/upgrade-newton-to-ocata + issues/upgrade-mitaka-to-newton diff --git a/doc/source/project/issues/upgrade-issues.rst b/doc/source/project/issues/upgrade-issues.rst new file mode 100644 index 00000000..440749a6 --- /dev/null +++ b/doc/source/project/issues/upgrade-issues.rst @@ -0,0 +1,219 @@ +============== +Upgrade issues +============== + +This page documents upgrade issues and notes. These may apply to either of the +three upgrade types (charms, OpenStack, series). + +.. 
important:: + + It is recommended to read the :doc:`../issues-and-procedures` page before + continuing. + +The issues are organised by upgrade type: + +.. contents:: + :local: + :depth: 2 + :backlinks: top + +.. _upgrade_issues_charm_upgrades: + +Charm upgrades +-------------- + +rabbitmq-server charm +~~~~~~~~~~~~~~~~~~~~~ + +A timing issue has been observed during the upgrade of the rabbitmq-server +charm (see bug `LP #1912638`_ for tracking). If it occurs the resulting hook +error can be resolved with: + +.. code-block:: none + + juju resolved rabbitmq-server/N + +openstack-dashboard charm: upgrading to revision 294 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +When Horizon is configured with TLS (openstack-dashboard charm option +``ssl-cert``) revisions 294 and 295 of the charm have been reported to break +the dashboard (see bug `LP #1853173`_). The solution is to upgrade to a working +revision. A temporary workaround is to disable TLS without upgrading. + +.. note:: + + Most users will not be impacted by this issue as the recommended approach is + to always upgrade to the latest revision. + +To upgrade to revision 293: + +.. code-block:: none + + juju upgrade-charm openstack-dashboard --revision 293 + +To upgrade to revision 296: + +.. code-block:: none + + juju upgrade-charm openstack-dashboard --revision 296 + +To disable TLS: + +.. code-block:: none + + juju config enforce-ssl=false openstack-dashboard + +Multiple charms: option ``worker-multiplier`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Starting with OpenStack Charms 21.04 any charm that supports the +``worker-multiplier`` configuration option will, upon upgrade, modify the +active number of service workers according to the following: if the option is +not set explicitly the number of workers will be capped at four regardless of +whether the unit is containerised or not. Previously, the cap applied only to +containerised units. + +manila-ganesha charm: package updates +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +To fix long-standing issues in the manila-ganesha charm related to `Manila +exporting shares after restart`_, the nfs-ganesha Ubuntu package must be +updated on all affected units prior to the upgrading of the manila-ganesha +charm in OpenStack Charms 21.10. + +.. _charm_upgrade_issue-radosgw_gss: + +ceph-radosgw charm: upgrading to channel ``quincy/stable`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Due to a `ceph-radosgw charm change`_ in the ``quincy/stable`` channel, URLs +are processed differently by the RADOS Gateway. This will lead to breakage for +an existing ``product-streams`` endpoint, set up by the +glance-simplestreams-sync application, that includes a trailing slash in its +URL. + +The glance-simplestreams-sync charm has been fixed in the ``yoga/stable`` +channel, but it will not update a pre-existing endpoint. The URL must be +modified (remove the trailing slash) with native OpenStack tooling: + +.. code-block:: none + + openstack endpoint list --service product-streams + openstack endpoint set --url + +.. 
_upgrade_issues_openstack_upgrades: + +OpenStack upgrades +------------------ + +Nova RPC version mismatches: upgrading Neutron and Nova +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If it is not possible to upgrade Neutron and Nova within the same maintenance +window, be mindful that the RPC communication between nova-cloud-controller, +nova-compute, and nova-api-metadata is very likely to cause several errors +while those services are not running the same version. This is due to the fact +that currently those charms do not support RPC version pinning or +auto-negotiation. + +See bug `LP #1825999`_. + +Ceph BlueStore mistakenly enabled during OpenStack upgrade +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The Ceph BlueStore storage backend is enabled by default when Ceph Luminous is +detected. Therefore it is possible for a non-BlueStore cloud to acquire +BlueStore by default after an OpenStack upgrade (Luminous first appeared in +Queens). Problems will occur if storage is scaled out without first disabling +BlueStore (set ceph-osd charm option ``bluestore`` to 'False'). See bug `LP +#1885516`_ for details. + +.. _ceph-require-osd-release: + +Ceph: option ``require-osd-release`` +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Before upgrading Ceph its ``require-osd-release`` option should be set to the +current Ceph release (e.g. 'nautilus' if upgrading to Octopus). Failing to do +so may cause the upgrade to fail, rendering the cluster inoperable. + +On any ceph-mon unit, the current value of the option can be queried with: + +.. code-block:: none + + sudo ceph osd dump | grep require_osd_release + +If it needs changing, it can be done manually on any ceph-mon unit. Here the +current release is Nautilus: + +.. code-block:: none + + sudo ceph osd require-osd-release nautilus + +In addition, upon completion of the upgrade, the option should be set to the +new release. Here the new release is Octopus: + +.. code-block:: none + + sudo ceph osd require-osd-release octopus + +The charms should be able to respond intelligently to these two situations. Bug +`LP #1929254`_ is for tracking this effort. + +Octavia +~~~~~~~ + +An Octavia upgrade may entail an update of its load balancers (amphorae) as a +post-upgrade task. Reasons for doing this include: + +* API incompatibility between the amphora agent and the new Octavia service +* the desire to use features available in the new amphora agent or haproxy + +See the upstream documentation on `Rotating amphora images`_. + +.. _upgrade_issues_series_upgrades: + +Series upgrades +--------------- + +DNS HA: upgrade to focal +~~~~~~~~~~~~~~~~~~~~~~~~ + +DNS HA has been reported to not work on the focal series. See `LP #1882508`_ +for more information. + +Upgrading while Vault is sealed +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +If a series upgrade is attempted while Vault is sealed then manual intervention +will be required (see bugs `LP #1886083`_ and `LP #1890106`_). The vault leader +unit (which will be in error) will need to be unsealed and the hook error +resolved. The `vault charm`_ README has unsealing instructions, and the hook +error can be resolved with: + +.. code-block:: none + + juju resolved vault/N + +.. LINKS +.. _Release Notes: https://docs.openstack.org/charm-guide/latest/release-notes.html +.. _Ubuntu Cloud Archive: https://wiki.ubuntu.com/OpenStack/CloudArchive +.. _Upgrades: https://docs.openstack.org/operations-guide/ops-upgrades.html +.. 
_Update services: https://docs.openstack.org/operations-guide/ops-upgrades.html#update-services +.. _Various issues: various-issues.html +.. _Special charm procedures: upgrade-special.html +.. _vault charm: https://opendev.org/openstack/charm-vault/src/branch/master/src/README.md#unseal-vault +.. _manila exporting shares after restart: https://bugs.launchpad.net/charm-manila-ganesha/+bug/1889287 +.. _Rotating amphora images: https://docs.openstack.org/octavia/latest/admin/guides/operator-maintenance.html#rotating-the-amphora-images +.. _ceph-radosgw charm change: https://review.opendev.org/c/openstack/charm-ceph-radosgw/+/835827 + +.. BUGS +.. _LP #1825999: https://bugs.launchpad.net/charm-nova-compute/+bug/1825999 +.. _LP #1853173: https://bugs.launchpad.net/charm-openstack-dashboard/+bug/1853173 +.. _LP #1882508: https://bugs.launchpad.net/charm-deployment-guide/+bug/1882508 +.. _LP #1885516: https://bugs.launchpad.net/charm-deployment-guide/+bug/1885516 +.. _LP #1886083: https://bugs.launchpad.net/vault-charm/+bug/1886083 +.. _LP #1890106: https://bugs.launchpad.net/vault-charm/+bug/1890106 +.. _LP #1912638: https://bugs.launchpad.net/charm-rabbitmq-server/+bug/1912638 +.. _LP #1929254: https://bugs.launchpad.net/charm-ceph-osd/+bug/1929254 diff --git a/doc/source/project/issues/upgrade-mitaka-to-newton.rst b/doc/source/project/issues/upgrade-mitaka-to-newton.rst new file mode 100644 index 00000000..032498c5 --- /dev/null +++ b/doc/source/project/issues/upgrade-mitaka-to-newton.rst @@ -0,0 +1,21 @@ +========================= +Upgrade: Mitaka to Newton +========================= + +This page contains notes specific to the Mitaka to Newton upgrade path. See the +main :doc:`cdg:upgrade-openstack` page for full coverage. + +neutron-gateway charm options +----------------------------- + +Between the Mitaka and Newton OpenStack releases, the neutron-gateway charm +added two options, ``bridge-mappings`` and ``data-port``, which replaced the +(now) deprecated ``ext-port`` option. This was to provide for more control over +how ``neutron-gateway`` can configure external networking. Unfortunately, the +charm was only designed to work with either ``ext-port`` (no longer +recommended) *or* ``bridge-mappings`` and ``data-port``. + +See bug `LP #1809190`_. + +.. BUGS +.. _LP #1809190: https://bugs.launchpad.net/charm-neutron-gateway/+bug/1809190 diff --git a/doc/source/project/issues/upgrade-newton-to-ocata.rst b/doc/source/project/issues/upgrade-newton-to-ocata.rst new file mode 100644 index 00000000..f7cad9a9 --- /dev/null +++ b/doc/source/project/issues/upgrade-newton-to-ocata.rst @@ -0,0 +1,63 @@ +======================== +Upgrade: Newton to Ocata +======================== + +This page contains notes specific to the Newton to Ocata upgrade path. See the +main :doc:`cdg:upgrade-openstack` page for full coverage. + +cinder/ceph topology change +--------------------------- + +If cinder is directly related to ceph-mon rather than via cinder-ceph then +upgrading from Newton to Ocata will result in the loss of some block storage +functionality, specifically live migration and snapshotting. To remedy this +situation the deployment should migrate to using the cinder-ceph charm. This +can be done after the upgrade to Ocata. + +.. warning:: + + Do not attempt to migrate a deployment with existing volumes to use the + cinder-ceph charm prior to Ocata. + +The intervention is detailed in the below three steps. 
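
Before starting, it may be worth confirming that cinder is indeed related
directly to ceph-mon (a quick sanity check; the application names shown are
assumed to be the defaults for your deployment):

.. code-block:: none

   juju status cinder --relations | grep ceph-mon
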
+ +Step 0: Check existing configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Confirm existing volumes are in an RBD pool called 'cinder': + +.. code-block:: none + + juju run --unit cinder/0 "rbd --name client.cinder -p cinder ls" + +Sample output: + +.. code-block:: none + + volume-b45066d3-931d-406e-a43e-ad4eca12cf34 + volume-dd733b26-2c56-4355-a8fc-347a964d5d55 + +Step 1: Deploy new topology +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Deploy the ``cinder-ceph`` charm and set the 'rbd-pool-name' to match the pool +that any existing volumes are in (see above): + +.. code-block:: none + + juju deploy --config rbd-pool-name=cinder cinder-ceph + juju add-relation cinder cinder-ceph + juju add-relation cinder-ceph ceph-mon + juju remove-relation cinder ceph-mon + juju add-relation cinder-ceph nova-compute + +Step 2: Update volume configuration +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The existing volumes now need to be updated to associate them with the newly +defined cinder-ceph backend: + +.. code-block:: none + + juju run-action cinder/0 rename-volume-host currenthost='cinder' \ + newhost='cinder@cinder-ceph#cinder.volume.drivers.rbd.RBDDriver' diff --git a/doc/source/project/issues/upgrade-queens-to-rocky.rst b/doc/source/project/issues/upgrade-queens-to-rocky.rst new file mode 100644 index 00000000..a24f22ed --- /dev/null +++ b/doc/source/project/issues/upgrade-queens-to-rocky.rst @@ -0,0 +1,33 @@ +======================== +Upgrade: Queens to Rocky +======================== + +This page contains notes specific to the Queens to Rocky upgrade path. See the +main :doc:`cdg:upgrade-openstack` page for full coverage. + +Keystone and Fernet tokens +-------------------------- + +Starting with OpenStack Rocky only the Fernet format for authentication tokens +is supported. Therefore, prior to upgrading Keystone to Rocky a transition must +be made from the legacy format (of UUID) to Fernet. + +Fernet support is available upstream (and in the keystone charm) starting with +Ocata so the transition can be made on either Ocata, Pike, or Queens. + +A keystone charm upgrade will not alter the token format. The charm's +``token-provider`` option must be used to make the transition: + +.. code-block:: none + + juju config keystone token-provider=fernet + +This change may result in a minor control plane outage but any running +instances will remain unaffected. + +The ``token-provider`` option has no effect starting with Rocky, where the +charm defaults to Fernet and where upstream removes support for UUID. See +`Keystone Fernet Token Implementation`_ for more information. + +.. LINKS +.. _Keystone Fernet Token Implementation: https://specs.openstack.org/openstack/charm-specs/specs/rocky/implemented/keystone-fernet-tokens.html diff --git a/doc/source/project/issues/upgrade-stein-to-train.rst b/doc/source/project/issues/upgrade-stein-to-train.rst new file mode 100644 index 00000000..4006060d --- /dev/null +++ b/doc/source/project/issues/upgrade-stein-to-train.rst @@ -0,0 +1,43 @@ +======================= +Upgrade: Stein to Train +======================= + +This page contains notes specific to the Stein to Train upgrade path. See the +main :doc:`cdg:upgrade-openstack` page for full coverage. + +New placement charm +------------------- + +This upgrade path requires the inclusion of a new charm. Please see special +charm procedure :doc:`../procedures/placement-charm`. 
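
Once that procedure is complete, the service catalogue can be inspected to
confirm the state of the new placement endpoints (a simple check, not part of
the formal procedure; see also the next section):

.. code-block:: none

   openstack endpoint list --service placement
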
+ +Placement endpoints not updated in Keystone service catalogue +------------------------------------------------------------- + +When the placement charm is deployed during the upgrade to Train (as per the +above section) the Keystone service catalogue is not updated accordingly. This +issue is tracked in bug `LP #1928992`_, which also includes an explicit +workaround (comment #4). + +Neutron LBaaS retired +--------------------- + +As of Train, support for Neutron LBaaS has been retired. The load-balancing +services are now provided by Octavia LBaaS. There is no automatic migration +path, please review the :doc:`../../admin/networking/load-balancing` page in +the Charm Guide for more information. + +Designate encoding issue +------------------------ + +When upgrading Designate to Train, there is an encoding issue between the +designate-producer and memcached that causes the designate-producer to crash. +See bug `LP #1828534`_. This can be resolved by restarting the memcached service. + +.. code-block:: none + + juju run --application=memcached 'sudo systemctl restart memcached' + +.. BUGS +.. _LP #1828534: https://bugs.launchpad.net/charm-designate/+bug/1828534 +.. _LP #1928992: https://bugs.launchpad.net/charm-deployment-guide/+bug/1928992 diff --git a/doc/source/project/issues/upgrade-ussuri-to-victoria.rst b/doc/source/project/issues/upgrade-ussuri-to-victoria.rst new file mode 100644 index 00000000..f4e73351 --- /dev/null +++ b/doc/source/project/issues/upgrade-ussuri-to-victoria.rst @@ -0,0 +1,23 @@ +=========================== +Upgrade: Ussuri to Victoria +=========================== + +This page contains notes specific to the Ussuri to Victoria upgrade path. See +the main :doc:`cdg:upgrade-openstack` page for full coverage. + +FWaaS project retired +--------------------- + +The Firewall-as-a-Service (`FWaaS v2`_) OpenStack project is retired starting +with OpenStack Victoria. Consequently, the neutron-api charm will no longer +make this service available starting with that OpenStack release. See the +`21.10 release notes`_ on this topic. + +Prior to upgrading to Victoria users of FWaaS should remove any existing +firewall groups to avoid the possibility of orphaning active firewalls (see the +`FWaaS v2 CLI documentation`_). + +.. LINKS +.. _21.10 release notes: https://docs.openstack.org/charm-guide/latest/2110.html +.. _FWaaS v2: https://docs.openstack.org/neutron/ussuri/admin/fwaas.html +.. _FWaaS v2 CLI documentation: https://docs.openstack.org/python-neutronclient/ussuri/cli/osc/v2/firewall-group.html diff --git a/doc/source/project/issues/various-issues.rst b/doc/source/project/issues/various-issues.rst new file mode 100644 index 00000000..b9e11466 --- /dev/null +++ b/doc/source/project/issues/various-issues.rst @@ -0,0 +1,63 @@ +============== +Various issues +============== + +This page documents various issues (software limitations/bugs) that may apply +to a Charmed OpenStack cloud. These are still-valid issues that have arisen +during the development cycles of past OpenStack Charms releases. The most +recently discovered issues are documented in the +:doc:`../../release-notes/index` of the latest version of the OpenStack Charms. + +.. important:: + + It is recommended to read the :doc:`../issues-and-procedures` page before + continuing. 
+ +Lack of FQDN for containers on physical MAAS nodes may affect running services +------------------------------------------------------------------------------ + +When Juju deploys to a LXD container on a physical MAAS node, the container is +not informed of its FQDN. The services running in the container will therefore +be unable to determine the FQDN on initial deploy and on reboot. + +Adverse effects are service dependent. This issue is tracked in bug `LP +#1896630`_ in an OVN and Octavia context. Several workarounds are documented in +the bug. + +Adding Glance storage backends +------------------------------ + +When a storage backend is added to Glance a service restart may be necessary in +order for the new backend to be registered. This issue is tracked in bug `LP +#1914819`_. + +OVN and SR-IOV: servicing external DHCP and metadata requests +------------------------------------------------------------- + +When instances are deployed with SR-IOV networking in an OVN deployment a +change of configuration may be required to retain servicing of DHCP and +metadata requests. + +If your deployment has SR-IOV instances, make sure that at least one of the OVN +chassis named applications has the ``prefer-chassis-as-gw`` configuration +option set to 'true'. + +The root of the issue is in how Neutron handles scheduling of gateway chassis +for L3 routers and external services differently, and is tracked in bug `LP +#1946456`_. + +Ceph RBD Mirror and Ceph Octopus +-------------------------------- + +Due to an unresolved permission issue the ceph-rbd-mirror charm will stay in a +blocked state after configuring mirroring for pools when connected to a Ceph +Octopus cluster. See bug `LP #1879749`_ for details. + +.. LINKS +.. _Release notes: https://docs.openstack.org/charm-guide/latest/release-notes.html + +.. BUGS +.. _LP #1896630: https://bugs.launchpad.net/charm-layer-ovn/+bug/1896630 +.. _LP #1914819: https://bugs.launchpad.net/charm-glance/+bug/1914819 +.. _LP #1946456: https://bugs.launchpad.net/bugs/1946456 +.. _LP #1879749: https://bugs.launchpad.net/charm-ceph-rbd-mirror/+bug/1879749 diff --git a/doc/source/project/procedures/ceph-charm-migration.rst b/doc/source/project/procedures/ceph-charm-migration.rst new file mode 100644 index 00000000..1e948f58 --- /dev/null +++ b/doc/source/project/procedures/ceph-charm-migration.rst @@ -0,0 +1,137 @@ +============================================== +ceph charm: migration to ceph-mon and ceph-osd +============================================== + +.. note:: + + This page describes a procedure that may be required when performing an + upgrade of an OpenStack cloud. Please read the more general + :doc:`cdg:upgrade-overview` before attempting any of the instructions given + here. + +In order to continue to receive updates to newer Ceph versions, and for general +improvements and features in the charms to deploy Ceph, users of the ceph charm +should migrate existing services to using ceph-mon and ceph-osd. + +.. note:: + + This example migration assumes that the ceph charm is deployed to machines + 0, 1 and 2 with the ceph-osd charm deployed to other machines within the + model. + +Procedure +--------- + +Upgrade charms +~~~~~~~~~~~~~~ + +The entire suite of charms used to manage the cloud should be upgraded to the +latest stable charm revision before any major change is made to the cloud such +as the current migration to these new charms. See `Charms upgrade`_ for +guidance. + +Deploy ceph-mon +~~~~~~~~~~~~~~~ + +.. 
warning:: + + Every new ceph-mon unit introduced will result in a Ceph monitor receiving a + new IP address. However, due to an issue in Nova, this fact is not + propagated completely throughout the cloud under certain circumstances, + thereby affecting Ceph RBD volume reachability. + + Any instances previously deployed using Cinder to interface with Ceph, or + using Nova's ``libvirt-image-backend=rbd`` setting will require a manual + database update to change to the new addresses. For Cinder, its stale data + will also need to be updated in the 'block_device_mapping' table. + + Failure to do this can result in instances being unable to start as their + volumes cannot be reached. See bug `LP #1452641`_. + +First deploy the ceph-mon charm; if the existing ceph charm is deployed to machines +0, 1 and 2, you can place the ceph-mon units in LXD containers on these machines: + +.. code-block:: none + + juju deploy --to lxd:0 ceph-mon + juju config ceph-mon no-bootstrap=True + juju add-unit --to lxd:1 ceph-mon + juju add-unit --to lxd:2 ceph-mon + +These units will install ceph, but will not bootstrap into a running monitor cluster. + +Bootstrap ceph-mon from ceph +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Next, we'll use the existing ceph application to bootstrap the new ceph-mon units: + +.. code-block:: none + + juju add-relation ceph ceph-mon + +Once this process has completed, you should have a Ceph MON cluster of 6 units; +this can be verified on any of the ceph or ceph-mon units: + +.. code-block:: none + + sudo ceph -s + +Deploy ceph-osd to ceph units +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +In order to retain any running Ceph OSD processes on the ceph units, the ceph-osd +charm must be deployed to the existing machines running the ceph units: + +.. code-block:: none + + juju config ceph-osd osd-reformat=False + juju add-unit --to 0 ceph-osd + juju add-unit --to 1 ceph-osd + juju add-unit --to 2 ceph-osd + +As of the 18.05 charm release, the ``osd-reformat`` configuration option has +been completely removed. + +The charm installation and configuration will not impact any existing running +Ceph OSDs. + +Relate ceph-mon to all ceph clients +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The new ceph-mon units now need to be related to the ceph-osd application: + +.. code-block:: none + + juju add-relation ceph-mon ceph-osd + +Depending on your deployment you'll also need to add relations for other +applications, for example: + +.. code-block:: none + + juju add-relation ceph-mon cinder-ceph + juju add-relation ceph-mon glance + juju add-relation ceph-mon nova-compute + juju add-relation ceph-mon ceph-radosgw + juju add-relation ceph-mon gnocchi + +Once hook execution completes across all units, each client should be +configured with six MON addresses. + +Remove the ceph application +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Its now safe to remove the ceph application from your deployment: + +.. code-block:: none + + juju remove-application ceph + +As each unit of the ceph application is destroyed, its stop hook will remove +the MON process from the Ceph cluster monmap and disable Ceph MON and MGR +processes running on the machine; any Ceph OSD processes remain untouched and +are now owned by the ceph-osd units deployed alongside ceph. + +.. LINKS +.. _Charms upgrade: upgrade-charms.html +.. 
_LP #1452641: https://bugs.launchpad.net/nova/+bug/1452641 diff --git a/doc/source/project/procedures/charmhub-migration.rst b/doc/source/project/procedures/charmhub-migration.rst new file mode 100644 index 00000000..7eea36c8 --- /dev/null +++ b/doc/source/project/procedures/charmhub-migration.rst @@ -0,0 +1,167 @@ +================================= +All charms: migration to channels +================================= + +Charmed OpenStack deployments must eventually migrate from legacy non-channel +charms to charms that use a channel. + +.. important:: + + See `Charm delivery`_ for an overview of how OpenStack charms are + distributed. + +Background +---------- + +All charms are now served from the `Charmhub`_, regardless of which prefix +(``cs:`` or ``ch:``) is used to deploy charms. Furthermore, when a channel is +not requested at deploy time, the ``latest/stable`` channel in the Charmhub is +sourced, which points to the 21.10 stable release of OpenStack Charms (or the +21.06 stable release of Trilio Charms). + +All maintenance for stable charms occurs on the various explicitly-named +channels (i.e. not based on the ``latest`` track). These are the channels that +the charms must be migrated to. + +.. warning:: + + The OpenStack Charms project strongly advises against the use of the + ``latest`` track due to its implicit nature. In doing so, a future charm + upgrade may result in a charm version that does not support your current + OpenStack release. + +Determine current versions +-------------------------- + +A charm's channel is selected based on its corresponding service's software +version. Use the information in the below two sub-sections to determine the +version running for each application in your deployment. + +OpenStack service charms +~~~~~~~~~~~~~~~~~~~~~~~~ + +For OpenStack service charms, to get the running OpenStack version you can +inspect the value assigned to the ``openstack-origin`` charm configuration +option. + +For example, if the :command:`juju config keystone openstack-origin` command +outputs 'focal-xena' then the running OpenStack version is Xena. + +It is expected that all OpenStack service charms will report the same OpenStack +version. + +All other charms +~~~~~~~~~~~~~~~~ + +For all other charms, utilise the :command:`juju status` command. + +Examples: + +* rabbitmq-server: + + .. code-block:: console + + App Version Status Scale Charm Store Channel Rev OS Message + rabbitmq-server 3.8.2 active 1 rabbitmq-server charmstore stable 117 ubuntu Unit is ready + + RabbitMQ is running version ``3.8.2``. + +* ceph-osd: + + .. code-block:: none + + App Version Status Scale Charm Store Channel Rev OS Message + ceph-osd 16.2.6 active 3 ceph-osd charmstore stable 315 ubuntu Unit is ready (2 OSD) + + Ceph is running version ``16.2.6`` (Pacific). + + Since the Ceph channels are based on code names, as a convenience, a mapping + of versions to code names is provided: + + +---------+-----------+ + | Version | Code name | + +=========+===========+ + | 12.2.13 | Luminous | + +---------+-----------+ + | 13.2.9 | Mimic | + +---------+-----------+ + | 14.2.22 | Nautilus | + +---------+-----------+ + | 15.2.14 | Octopus | + +---------+-----------+ + | 16.2.6 | Pacific | + +---------+-----------+ + +Select the channels +------------------- + +`Charm delivery`_ includes a list of all the tracks available to the +OpenStack charms. 
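
The channels actually published for an individual charm can also be queried
directly from the Charmhub (a convenience check, assuming a Juju 2.9 client):

.. code-block:: none

   juju info ceph-osd
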
+ +Examples: + +* if Ceph Octopus is running, then any Ceph charm that supports the + ``octopus/stable`` channel should use that channel + +* if OVN 20.03 is running, then any OVN charm that supports the + ``20.03/stable`` channel should use that channel + +* if RabbitMQ 3.8.2 is running, then the rabbitmq-server charm should use the + ``3.8/stable`` channel + +Based on this information, select the appropriate channel for each charm in +your deployment. + +Upgrade Juju +------------ + +Upgrade every Juju component of the given deployment to Juju ``2.9``. This +includes the Juju client, the controller model, and the workload model. See the +`Juju documentation`_ for guidance. + +Perform the migration +--------------------- + +The migration consists of replacing all charms with new but software-equivalent +charms. Technically, this is not an upgrade but a form of crossgrade. + +.. note:: + + There is no need to upgrade the current charms to their latest stable + revision prior to the migration. + +The charm of a currently-deployed application is migrated according to the +following syntax: + +.. code-block:: none + + juju refresh --switch ch: --channel= + +For example, if the selected channel for the rabbitmq-server charm is +``3.8/stable`` then: + +.. code-block:: none + + juju refresh --switch ch:rabbitmq-server --channel=3.8/stable rabbitmq-server + +The application argument represents the application as it appears in the model. +That is, it may be a named application (e.g. 'mysql' and not +'mysql-innodb-cluster'). + +Change operator behaviour +------------------------- + +Once all of your deployment's charms have been migrated to channels it is +important to: + +* stop using the ``cs:`` prefix when referencing charms, whether in bundles or + on the command line. Use the ``ch:`` prefix instead. Note that Juju ``2.9`` + uses the ``ch:`` prefix by default on the command line. + +* always specify a channel when deploying a charm (e.g. :command:`juju deploy + --channel=pacific/stable ceph-radosgw`) + +.. LINKS +.. _Charmhub: https://charmhub.io +.. _Juju documentation: https://juju.is/docs/olm/upgrading +.. _Charm delivery: https://docs.openstack.org/charm-guide/latest/project/charm-delivery.html diff --git a/doc/source/project/procedures/cinder-lvm-migration.rst b/doc/source/project/procedures/cinder-lvm-migration.rst new file mode 100644 index 00000000..80dcff6b --- /dev/null +++ b/doc/source/project/procedures/cinder-lvm-migration.rst @@ -0,0 +1,71 @@ +==================================================== +LVM support in cinder charm: migration to cinder-lvm +==================================================== + +As of the 21.10 release of OpenStack Charms, support for local (LVM) Cinder +storage in the `cinder`_ charm is deprecated. This functionality has been +de-coupled and is now managed by the `cinder-lvm`_ subordinate charm. This page +shows how to migrate from the cinder charm to the cinder-lvm charm. + +.. warning:: + + The migration will necessitate a short cloud maintenance window, enough time + to deploy the subordinate charm onto the existing machine(s). Cloud + operators will be unable to create new Cinder volumes during this window. + +The LVM feature in the cinder charm is potentially in use if the +``block-device`` option is set to a value other than 'None'. Check this with +the following command: + +.. code-block:: none + + juju config cinder block-device + +Begin by disabling the LVM feature in the cinder charm: + +.. 
code-block:: none + + juju config cinder block-device=None + +Migrating LVM functionality amounts to setting the cinder-lvm charm +configuration options to the same values as those used for the identically +named options in the cinder charm: + +* ``block-device`` +* ``ephemeral-unmount`` +* ``remove-missing`` +* ``remove-missing-force`` +* ``overwrite`` +* ``volume-group`` + +Secondly, the ``volume-backend-name`` option specific to the cinder-lvm charm +needs to be set to 'cinder_lvm', the LVM backend driver. + +All of this configuration can be stated most easily with a configuration file, +say ``cinder-lvm.yaml``: + +.. code-block:: yaml + + cinder-lvm: + block-device: sdb + ephemeral-unmount: + remove-missing: false + remove-missing-force: false + overwrite: false + volume-group: cinder-volumes + volume-backend-name: cinder_lvm + +Now deploy cinder-lvm while referencing the configuration file and then add a +relation to the cinder application: + +.. code-block:: none + + juju deploy --config cinder-lvm.yaml cinder-lvm + juju add-relation cinder-lvm:storage-backend cinder:storage-backend + +Verify that Cinder volumes can be created as usual and that existing VMs +utilising Cinder volumes have not been adversely affected. + +.. LINKS +.. _cinder: https://jaas.ai/cinder +.. _cinder-lvm: https://jaas.ai/cinder-lvm diff --git a/doc/source/project/procedures/ovn-migration.rst b/doc/source/project/procedures/ovn-migration.rst new file mode 100644 index 00000000..07d923f6 --- /dev/null +++ b/doc/source/project/procedures/ovn-migration.rst @@ -0,0 +1,391 @@ +================ +Migration to OVN +================ + +Starting with OpenStack Ussuri, Charmed OpenStack recommends OVN as the cloud's +software defined networking framework (SDN). This page outlines the procedure +for migrating an existing non-OVN cloud to OVN. Technically, it describes how +to move from "Neutron ML2+OVS" to "Neutron ML2+OVN". + +On a charm level, the migration entails replacing these charms: + +* neutron-gateway +* neutron-openvswitch + +with these charms: + +* ovn-central +* ovn-chassis (or ovn-dedicated-chassis) +* neutron-api-plugin-ovn charms + +Post-migration, the :doc:`../../admin/networking/ovn/index` page in the Charm +Guide includes information on configuration and usage. + +MTU considerations +------------------ + +When migrating from ML2+OVS to ML2+OVN there will be a change of encapsulation +for the tunnels in the overlay network to ``geneve``. A side effect of the +change of encapsulation is that the packets transmitted on the physical network +get larger. + +You must examine the existing configuration of network equipment, physical +links on hypervisors and configuration of existing virtual project networks to +determine if there is room for this growth. + +Making room for the growth could be accomplished by increasing the MTU +configuration on the physical network equipment and hypervisor physical links. +If this can be done then steps #1 and #9 below can be skipped, where it is +shown how to **reduce** the MTU on all existing cloud instances. + +Remember to take any other encapsulation used in your physical network +equipment into account when calculating the MTU (VLAN tags, MPLS labels etc.). 
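
As a purely illustrative calculation (using the overhead figures from the
table below and assuming a physical network MTU of 1500 bytes with no other
encapsulation in play):

.. code-block:: none

   1500 (physical MTU) - 30 (VXLAN overhead)  = 1470 (pre-migration maximum instance MTU)
   1500 (physical MTU) - 38 (Geneve overhead) = 1462 (post-migration maximum instance MTU)
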
+ +Encapsulation types and their overhead: + ++---------------+----------+------------------------+ +| Encapsulation | Overhead | Difference from Geneve | ++===============+==========+========================+ +| Geneve | 38 Bytes | 0 Bytes | ++---------------+----------+------------------------+ +| VXLAN | 30 Bytes | 8 Bytes | ++---------------+----------+------------------------+ +| GRE | 22 Bytes | 16 bytes | ++---------------+----------+------------------------+ + +Confirmation of migration actions +--------------------------------- + +Many of the actions used for the migration require a confirmation from the +operator by way of the ``i-really-mean-it`` parameter. + +This parameter accepts the values 'true' or 'false'. If 'false' the requested +operation will either not be performed, or will be performed in dry-run mode, +if 'true' the requested operation will be performed. + +In the examples below the parameter will not be listed, this is deliberate to +avoid accidents caused by cutting and pasting the wrong command into a +terminal. + +Prepare for the migration +------------------------- + +This section contains the preparation steps that will ensure minimal instance +down time during the migration. Ensure that you have studied them in advance +of the actual migration. + +.. important:: + + Allow for at least 24 hours to pass between the completion of the + preparation steps and the commencement of the actual migration steps. + This is particularly necesseary because depending on your physical network + configuration, it may be required to reduce the MTU size on all cloud + instances as part of the migration. + +1. Reduce MTU on all instances in the cloud if required + + Please refer to the MTU considerations section above. + + * Instances using DHCP can be controlled centrally by the cloud operator + by overriding the MTU advertised by the DHCP server. + + .. code-block:: none + + juju config neutron-gateway instance-mtu=1300 + + juju config neutron-openvswitch instance-mtu=1300 + + * Instances using IPv6 RA or SLAAC will automatically adjust + their MTU as soon as OVN takes over announcing the RAs. + + * Any instances not using DHCP must be configured manually by the end user of + the instance. + +2. Confirm cloud subnet configuration + + * Confirm that all subnets have IP addresses available for allocation. + + During the migration OVN may create a new port in subnets and allocate an + IP address to it. Depending on the type of network, this port will be used + for either the OVN metadata service or for the SNAT address assigned to an + external router interface. + + .. warning:: + + If a subnet has no free IP addresses for allocation the migration will + fail. + + * Confirm that all subnets have a valid DNS server configuration. + + OVN handles instance access to DNS differently to how ML2+OVS does. Please + refer to the Internal DNS resolution paragraph in this document for + details. + + When the subnet ``dns_nameservers`` attribute is empty the OVN DHCP server + will provide instances with the DNS addresses specified in the + neutron-api-plugin-ovn ``dns-servers`` configuration option. If any of + your subnets have the ``dns_nameservers`` attribute set to the IP address + ML2+OVS used for instance DNS (usually the .2 address of the project + subnet) you will need to remove this configuration. + +3. Make a fresh backup copy of the Neutron database + +4. 
Deploy the OVN components and Vault + + In your Juju model you can have a charm deployed multiple times using + different application names. In the text below this will be referred to as + "named application". One example where this is common is for deployments + with Octavia where it is common to use a separate named application for + neutron-openvswtich for use with the Octavia units. + + In addition to the central components you should deploy an ovn-chassis + named application for every neutron-openvswitch named application in your + deployment. For every neutron-gateway named application you should deploy an + ovn-dedicated-chassis named application to the same set of machines. + + At this point in time each hypervisor or gateway will have a Neutron + Open vSwitch (OVS) agent managing the local OVS instance. Network loops + may occur if an ovn-chassis unit is started as it will also attempt to + manage OVS. To avoid this, deploy ovn-chassis (or ovn-dedicated-chassis) in + a paused state by setting the ``new-units-paused`` configuration option to + 'true': + + .. code-block:: none + + juju deploy ovn-central \ + --series focal \ + -n 3 \ + --to lxd:0,lxd:1,lxd:2 + + juju deploy ovn-chassis \ + --series focal \ + --config new-units-paused=true \ + --config bridge-interface-mappings='br-provider:00:00:5e:00:00:42' \ + --config ovn-bridge-mappings=physnet1:br-provider + + juju deploy ovn-dedicated-chassis \ + --series focal \ + --config new-units-paused=true \ + --config bridge-interface-mappings='br-provider:00:00:5e:00:00:51' \ + --config ovn-bridge-mappings=physnet1:br-provider \ + -n 2 \ + --to 3,4 + + juju deploy --series focal mysql-router vault-mysql-router + juju deploy --series focal vault + + juju add-relation vault-mysql-router:db-router \ + mysql-innodb-cluster:db-router + juju add-relation vault-mysql-router:shared-db vault:shared-db + + juju add-relation ovn-central:certificates vault:certificates + + juju add-relation ovn-chassis:certificates vault:certificates + juju add-relation ovn-chassis:ovsdb ovn-central:ovsdb + juju add-relation nova-compute:neutron-plugin ovn-chassis:nova-compute + + The values to use for the ``bridge-interface-mappings`` and + ``ovn-bridge-mappings`` configuration options can be found by looking at + what is set for the ``data-port`` and ``bridge-mappings`` configuration + options on the neutron-openvswitch and/or neutron-gateway applications. + + .. note:: + + In the above example the placement given with the ``--to`` parameter to + :command:`juju` is just an example. Your deployment may also have + multiple named applications of the neutron-openvswitch charm and/or + mutliple applications related to the neutron-openvswitch named + applications. You must tailor the commands to fit with your deployments + topology. + +5. Unseal Vault (see the `vault charm`_), set up TLS certificates (see + `Managing TLS certificates`_), and validate that the services on ovn-central + units are running as expected. Please refer to the OVN + :doc:`../../admin/networking/ovn/index` page in the Charm Guide for more + information. + +Perform the migration +--------------------- + +6. Change firewall driver to 'openvswitch' + + To be able to successfully clean up after the Neutron agents on hypervisors + we need to instruct the neutron-openvswitch charm to use the 'openvswitch' + firewall driver. This is accomplished by setting the ``firewall-driver`` + configuration option to 'openvswitch'. + + .. 
code-block:: none + + juju config neutron-openvswitch firewall-driver=openvswitch + +7. Pause neutron-openvswitch and/or neutron-gateway units. + + If your deployments have two neutron-gateway units and four + neutron-openvswitch units the sequence of commands would be: + + .. code-block:: none + + juju run-action neutron-gateway/0 pause + juju run-action neutron-gateway/1 pause + juju run-action neutron-openvswitch/0 pause + juju run-action neutron-openvswitch/1 pause + juju run-action neutron-openvswitch/2 pause + juju run-action neutron-openvswitch/3 pause + +8. Deploy the Neutron OVN plugin application + + .. code-block:: none + + juju deploy neutron-api-plugin-ovn \ + --series focal \ + --config dns-servers=="1.1.1.1 8.8.8.8" + + juju add-relation neutron-api-plugin-ovn:neutron-plugin \ + neutron-api:neutron-plugin-api-subordinate + juju add-relation neutron-api-plugin-ovn:certificates \ + vault:certificates + juju add-relation neutron-api-plugin-ovn:ovsdb-cms ovn-central:ovsdb-cms + + The values to use for the ``dns-servers`` configuration option can be + found by looking at what is set for the ``dns-servers`` configuration + option on the neutron-openvswitch and/or neutron-gateway applications. + + .. note:: + + The plugin will not be activated until the neutron-api + ``manage-neutron-plugin-legacy-mode`` configuration option is changed in + step 9. + +9. Adjust MTU on overlay networks (if required) + + Now that 24 hours have passed since we reduced the MTU on the instances + running in the cloud as described in step 1, we can update the MTU setting + for each individual Neutron network: + + .. code-block:: none + + juju run-action --wait neutron-api-plugin-ovn/0 migrate-mtu + +10. Enable the Neutron OVN plugin + + .. code-block:: none + + juju config neutron-api manage-neutron-plugin-legacy-mode=false + + Wait for the deployment to settle. + +11. Pause the Neutron API units + + .. code-block:: none + + juju run-action neutron-api/0 pause + juju run-action neutron-api/1 pause + juju run-action neutron-api/2 pause + + Wait for the deployment to settle. + +12. Perform initial synchronization of the Neutron and OVN databases + + .. code-block:: none + + juju run-action --wait neutron-api-plugin-ovn/0 migrate-ovn-db + +13. (Optional) Perform Neutron database surgery to update ``network_type`` of + overlay networks to 'geneve'. + + At the time of this writing the Neutron OVN ML2 driver will assume that all + chassis participating in a network are using the 'geneve' tunnel protocol + and it will ignore the value of the `network_type` field in any + non-physical network in the Neutron database. It will also ignore the + `segmentation_id` field and let OVN assign the VNIs. + + The Neutron API currently does not support changing the type of a network, + so when doing a migration the above described behaviour is actually a + welcome one. + + However, after the migration is done and all the primary functions are + working, i.e. packets are forwarded. The end user of the cloud will be left + with the false impression of their existing 'gre' or 'vxlan' typed networks + still being operational on said tunnel protocols, while in reality 'geneve' + is used under the hood. + + The end user will also run into issues with modifying any existing networks + with `openstack network set` throwing error messages about networks of type + 'gre' or 'vxlan' not being supported. 
+ + After running this action said networks will have their `network_type` + field changed to 'geneve' which will fix the above described problems. + + .. code-block:: none + + juju run-action --wait neutron-api-plugin-ovn/0 offline-neutron-morph-db + +14. Resume the Neutron API units + + .. code-block:: none + + juju run-action neutron-api/0 resume + juju run-action neutron-api/1 resume + juju run-action neutron-api/2 resume + + Wait for the deployment to settle. + +15. Migrate hypervisors and gateways + + The final step of the migration is to clean up after the Neutron agents + on the hypervisors/gateways and enable the OVN services so that they can + reprogram the local Open vSwitch. + + This can be done one gateway / hypervisor at a time or all at once to your + discretion. + + .. note:: + + During the migration instances running on a non-migrated hypervisor will + not be able to reach instances on the migrated hypervisors. + + .. caution:: + + When migrating a cloud with Neutron ML2+OVS+DVR+SNAT topology care should + be taken to take into account on which hypervisors essential agents are + running to minimize downtime for any instances on other hypervisors with + dependencies on them. + + .. code-block:: none + + juju run-action --wait neutron-openvswitch/0 cleanup + juju run-action --wait ovn-chassis/0 resume + + juju run-action --wait neutron-gateway/0 cleanup + juju run-action --wait ovn-dedicated-chassis/0 resume + +16. Post migration tasks + + Remove the now redundant Neutron ML2+OVS agents from hypervisors and + any dedicated gateways as well as the neutron-gateway and + neutron-openvswitch applications from the Juju model: + + .. code-block:: none + + juju run --application neutron-gateway '\ + apt remove -y neutron-dhcp-agent neutron-l3-agent \ + neutron-metadata-agent neutron-openvswitch-agent' + + juju remove-application neutron-gateway + + juju run --application neutron-openvswitch '\ + apt remove -y neutron-dhcp-agent neutron-l3-agent \ + neutron-metadata-agent neutron-openvswitch-agent' + + juju remove-application neutron-openvswitch + + Remove the now redundant Neutron ML2+OVS agents from the Neutron database: + + .. code-block:: none + + openstack network agent list + openstack network agent delete ... + +.. LINKS +.. _vault charm: https://charmhub.io/vault +.. _Managing TLS certificates: app-certificate-management.html diff --git a/doc/source/project/procedures/percona-series-upgrade-to-focal.rst b/doc/source/project/procedures/percona-series-upgrade-to-focal.rst new file mode 100644 index 00000000..a95d997a --- /dev/null +++ b/doc/source/project/procedures/percona-series-upgrade-to-focal.rst @@ -0,0 +1,258 @@ +============================================== +percona-cluster charm: series upgrade to focal +============================================== + +.. note:: + + This page describes a procedure that may be required when performing an + upgrade of an OpenStack cloud. Please read the more general + :doc:`cdg:upgrade-overview` before attempting any of the instructions given + here. + +In Ubuntu 20.04 LTS (Focal) the percona-xtradb-cluster-server package will no +longer be available. It has been replaced by mysql-server-8.0 and mysql-router +in Ubuntu main. Therefore, there is no way to series upgrade percona-cluster to +Focal. Instead the databases hosted by percona-cluster will need to be migrated +to mysql-innodb-cluster and mysql-router will need to be deployed as a +subordinate on the applications that use MySQL as a data store. + +.. 
warning:: + + Since the DB affects most OpenStack services it is important to have a + sufficient downtime window. The following procedure is written in an attempt + to migrate one service at a time (i.e. keystone, glance, cinder, etc). + However, it may be more practical to migrate all databases at the same time + during an extended downtime window, as there may be unexpected + interdependencies between services. + +.. note:: + + It is possible for percona-cluster to remain on Ubuntu 18.04 LTS while + the rest of the cloud migrates to Ubuntu 20.04 LTS. In fact, this state + will be one step of the migration process. + +.. caution:: + + It is recommended that all machines touched by the migration have their + system software packages updated prior to the migration. There are reports + indicating that MySQL authentication problems may arise later if this is not + done. + + +Procedure +^^^^^^^^^ + +* Leave all the percona-cluster machines on Bionic and upgrade the series of + the remaining machines in the cloud per the `Series upgrade OpenStack`_ page. + +* Deploy a mysql-innodb-cluster on Focal. + + .. warning:: + + If multiple network spaces are used in the deployment, due to the way + MySQL Innodb clustering promulgates cluster addresses via metadata, the + following space bindings must be bound to the same network space: + + * Primary application (i.e. keystone) + + * shared-db + + * Application's mysql-router (i.e. keystone-mysql-router) + + * shared-db + * db-rotuer + + * mysql-innodb-cluster + + * db-router + * cluster + + Space bindings are configured at application deploy time. Failure to + ensure this may lead to authentication errors if the client connection + uses the wrong interface to connect to the cluster. In the example below + we use space "db-space." + + .. code-block:: none + + juju deploy -n 3 mysql-innodb-cluster --series focal --bind "cluster=db-space db-router=db-space" + + .. note:: + + Any existing percona-cluster configuration related to performance tuning + should be configured on the mysql-innodb-cluster charm also. Although + there is not a one-to-one parity of options between the charms, there are + still several identical ones such as ``max-connections`` and + ``innodb-buffer-pool-size``. + + .. code-block:: none + + juju config mysql-innodb-cluster max-connections= innodb-buffer-pool-size= + +* Deploy (but do not yet relate) an instance of mysql-router for every + application that requires a data store (i.e. every application that was + related to percona-cluster). + + .. code-block:: none + + juju deploy mysql-router cinder-mysql-router --bind "shared-db=db-space db-router=db-space" + juju deploy mysql-router glance-mysql-router --bind "shared-db=db-space db-router=db-space" + juju deploy mysql-router keystone-mysql-router --bind "shared-db=db-space db-router=db-space" + ... + +* Add relations between the mysql-router instances and the + mysql-innodb-cluster. + + .. code-block:: none + + juju add-relation cinder-mysql-router:db-router mysql-innodb-cluster:db-router + juju add-relation glance-mysql-router:db-router mysql-innodb-cluster:db-router + juju add-relation keystone-mysql-router:db-router mysql-innodb-cluster:db-router + ... + +On a per-application basis: + +* Remove the relation between the application charm and the percona-cluster + charm. You can view existing relations with the :command:`juju status + percona-cluster --relations` command. + + .. 
code-block:: none + + juju remove-relation keystone:shared-db percona-cluster:shared-db + +* Dump the existing database(s) from percona-cluster. + + .. note:: + + In the following, the percona-cluster/0 and mysql-innodb-cluster/0 units + are used as examples. For percona, any unit of the application may be used, + though all the steps should use the same unit. For mysql-innodb-cluster, + the RW unit should be used. The RW unit of the mysql-innodb-cluster can be + determined from the :command:`juju status mysql-innodb-cluster` command. + + * Allow Percona to dump databases. See `Percona strict mode`_ to understand + the implications of this setting. + + .. code-block:: none + + juju run-action --wait percona-cluster/0 set-pxc-strict-mode mode=MASTER + + * Here is a non-exhaustive example that lists databases using the :command:`mysql` client: + + .. code-block:: none + + mysql> SHOW DATABASES; + +--------------------+ + | Database | + +--------------------+ + | information_schema | + | aodh | + | cinder | + | designate | + | dpm | + | glance | + | gnocchi | + | horizon | + | keystone | + | mysql | + | neutron | + | nova | + | nova_api | + | nova_cell0 | + | performance_schema | + | placement | + | sys | + +--------------------+ + 17 rows in set (0.10 sec) + + * Dump the specific application's database(s). + + .. note:: + + Depending on downtime restrictions it is possible to dump all OpenStack + databases at one time: run the ``mysqldump`` action and select them via + the ``databases`` parameter. For example: + ``databases=keystone,cinder,glance,nova,nova_api,nova_cell0,horizon`` + + Similarly, it is possible to import all the databases into + mysql-innodb-clulster from that single dump file. + + .. warning:: + + Do not (back up and) restore the Percona Cluster version of the 'mysql', + 'performance_schema', 'sys' or any other system specific databases into + the MySQL Innodb Cluster. Doing so will corrupt the DB and necessitate + the destruction and re-creation of the mysql-innodb-cluster application. + For more information see bug `LP #1936210`_. + + .. note:: + + The database name may or may not match the application name. For example, + while keystone has a DB named keystone, openstack-dashboard has a database + named horizon. Some applications have multiple databases. Notably, + nova-cloud-controller which has at least: nova,nova_api,nova_cell0 and a + nova_cellN for each additional cell. See upstream documentation for the + respective application to determine the database name. + + .. code-block:: none + + # Single DB + juju run-action --wait percona-cluster/0 mysqldump databases=keystone + + # Multiple DBs + juju run-action --wait percona-cluster/0 mysqldump \ + databases=aodh,cinder,designate,glance,gnochii,horizon,keystone,neutron,nova,nova_api,nova_cell0,placement + + * Return Percona enforcing strict mode. See `Percona strict mode`_ to + understand the implications of this setting. + + .. code-block:: none + + juju run-action --wait percona-cluster/0 set-pxc-strict-mode mode=ENFORCING + +* Transfer the mysqldump file from the percona-cluster unit to the + mysql-innodb-cluster RW unit. The RW unit of the mysql-innodb-cluster can be + determined with :command:`juju status mysql-innodb-cluster`. Bellow we use + mysql-innodb-cluster/0 as an example. + + .. code-block:: none + + juju scp percona-cluster/0:/var/backups/mysql/mysqldump-keystone-.gz . + juju scp mysqldump-keystone-.gz mysql-innodb-cluster/0:/home/ubuntu + +* Import the database(s) into mysql-innodb-cluster. + + .. 
code-block:: none + + juju run-action --wait mysql-innodb-cluster/0 restore-mysqldump dump-file=/home/ubuntu/mysqldump-keystone-.gz + +* Relate an instance of mysql-router for every application that requires a data + store (i.e. every application that needed percona-cluster): + + .. code-block:: none + + juju add-relation keystone:shared-db keystone-mysql-router:shared-db + +* Repeat for remaining applications. + +An overview of this process can be seen in the OpenStack charmer's team CI +`Zaza migration code`_. + +Post-migration +^^^^^^^^^^^^^^ + +As noted above, it is possible to run the cloud with percona-cluster remaining +on Bionic indefinitely. Once all databases have been migrated to +mysql-innodb-cluster, all the databases have been backed up, and the cloud has +been verified to be in good working order the percona-cluster application (and +its probable hacluster subordinates) may be removed. + +.. code-block:: none + + juju remove-application percona-cluster-hacluster + juju remove-application percona-cluster + +.. LINKS +.. _Zaza migration code: https://github.com/openstack-charmers/zaza-openstack-tests/blob/master/zaza/openstack/charm_tests/mysql/tests.py#L556 +.. _Percona strict mode: https://www.percona.com/doc/percona-xtradb-cluster/LATEST/features/pxc-strict-mode.html +.. _Series upgrade OpenStack: upgrade-series-openstack.html +.. _`LP #1936210`: https://bugs.launchpad.net/charm-deployment-guide/+bug/1936210 diff --git a/doc/source/project/procedures/placement-charm.rst b/doc/source/project/procedures/placement-charm.rst new file mode 100644 index 00000000..4bd0ccfa --- /dev/null +++ b/doc/source/project/procedures/placement-charm.rst @@ -0,0 +1,54 @@ +=========================================== +placement charm: OpenStack upgrade to Train +=========================================== + +.. note:: + + This page describes a procedure that is required when performing an upgrade + of an OpenStack cloud. Please read the more general + :doc:`cdg:upgrade-overview` before attempting any of the instructions given + here. + +As of OpenStack Train, the Placement API is managed by the new `placement`_ +charm and is no longer managed by the nova-cloud-controller charm. The upgrade +to Train therefore involves some coordination to transition to the new API +endpoints. + +Prior to upgrading nova-cloud-controller services to Train, the placement charm +must be deployed for Train and related to the Stein-based nova-cloud-controller +application. It is important that the nova-cloud-controller unit leader is +paused while the API transition occurs (paused prior to adding relations for +the placement charm) as the placement charm will migrate existing placement +tables from the nova_api database to a new placement database. Once the new +placement endpoints are registered, nova-cloud-controller can be resumed. + +Here are example commands for the process just described: + +.. code-block:: none + + juju deploy --series bionic --config openstack-origin=cloud:bionic-train cs:placement + juju run-action --wait nova-cloud-controller/leader pause + juju add-relation placement percona-cluster + juju add-relation placement keystone + juju add-relation placement nova-cloud-controller + +List endpoints and ensure placement endpoints are now listening on the new +placement IP address. Follow this up by resuming nova-cloud-controller: + +.. code-block:: none + + openstack endpoint list + juju run-action --wait nova-cloud-controller/leader resume + +Finally, upgrade the nova-cloud-controller services. 
Below all units are +upgraded simultaneously but see the :ref:`cdg:paused_single_unit` service +upgrade method for a more controlled approach: + +.. code-block:: none + + juju config nova-cloud-controller openstack-origin=cloud:bionic-train + +The Compute service (nova-compute) should then be upgraded. + +.. LINKS +.. _placement: https://charmhub.io/placement