Unify ceph nova-compute credentials
Ceph credentials should be the same across all nova-compute charm apps to allow live-migration of VMs booted from image (and stored in ceph via use of libvirt-image-backend=rbd config) to succeed. Related-bug: #2028559 Change-Id: If73469af2a372c0afbdbf338677782d475fd02e2
This commit is contained in:
parent
873641735c
commit
2c3f8bd9c0
@ -16,7 +16,6 @@ Here you can find the specs, and spec template, for each release:
|
|||||||
specs/2023.1/index
|
specs/2023.1/index
|
||||||
specs/zed/index
|
specs/zed/index
|
||||||
specs/yoga/index
|
specs/yoga/index
|
||||||
specs/yoga/implemented/*
|
|
||||||
specs/xena/index
|
specs/xena/index
|
||||||
specs/wallaby/index
|
specs/wallaby/index
|
||||||
specs/victoria/index
|
specs/victoria/index
|
||||||
|
@ -22,3 +22,11 @@ Template:
|
|||||||
:maxdepth: 1
|
:maxdepth: 1
|
||||||
|
|
||||||
backlog/*
|
backlog/*
|
||||||
|
|
||||||
|
2024.1 (Caracal) approved specs:
|
||||||
|
|
||||||
|
.. toctree::
|
||||||
|
:glob:
|
||||||
|
:maxdepth: 1
|
||||||
|
|
||||||
|
approved/*
|
||||||
|
290
specs/2024.1/approved/unify-ceph-nova-compute-credentials.rst
Normal file
290
specs/2024.1/approved/unify-ceph-nova-compute-credentials.rst
Normal file
@ -0,0 +1,290 @@
|
|||||||
|
..
|
||||||
|
Copyright 2023 Canonical Ltd.
|
||||||
|
|
||||||
|
This work is licensed under a Creative Commons Attribution 3.0
|
||||||
|
Unported License.
|
||||||
|
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||||
|
|
||||||
|
..
|
||||||
|
|
||||||
|
===================================
|
||||||
|
Unify ceph nova-compute credentials
|
||||||
|
===================================
|
||||||
|
|
||||||
|
Ceph credentials should be the same across all nova-compute charm apps
|
||||||
|
to allow live-migration of VMs booted from image (and stored in ceph
|
||||||
|
via use of libvirt-image-backend=rbd config) to succeed.
|
||||||
|
|
||||||
|
|
||||||
|
Problem Description
|
||||||
|
===================
|
||||||
|
|
||||||
|
Currently every nova-compute charm app registers a ceph credential named
|
||||||
|
after the charm app itself. Therefore, if a cloud has two nova-compute charm
|
||||||
|
apps named nova-compute-haswell and nova-compute-skylake, then their ceph
|
||||||
|
credentials will be the respective names.
|
||||||
|
|
||||||
|
The result of this is that live-migrating a VM booted from image where both
|
||||||
|
nova-compute charm apps have been configured to use libvirt-image-backend=rbd
|
||||||
|
(resulting in the image being stored in ceph) fails because the libvirt XML
|
||||||
|
for the instance will have the ceph credential and key associated with the
|
||||||
|
name on the source, to which the destination does not possess the key to access
|
||||||
|
the image, resulting in a failed migration due to permission denied error, as
|
||||||
|
documented in [1]_.
|
||||||
|
|
||||||
|
|
||||||
|
Proposed Change
|
||||||
|
===============
|
||||||
|
|
||||||
|
In order to solve the problem, all nova-compute charm apps need to have access
|
||||||
|
to the same ceph credentials and by extent, to the same key, allowing
|
||||||
|
credentials specified in the libvirt XML to be compatible with both the host
|
||||||
|
and destination.
|
||||||
|
|
||||||
|
To achieve that, it is necessary to move away from having ceph credentials
|
||||||
|
deriving their name from the nova-compute charm app names. Here we propose
|
||||||
|
a new name to be a common one between all of
|
||||||
|
them: 'nova-compute-ceph-auth-<secret-uuid-first-block>'.
|
||||||
|
|
||||||
|
There are many reasons for choosing the above-mentioned new credentials name:
|
||||||
|
|
||||||
|
1) As part of the name transition, it is necessary to make changes to the
|
||||||
|
libvirt secret entry. The secret entry in libvirt consists of:
|
||||||
|
|
||||||
|
secret UUID - name/usage - key
|
||||||
|
|
||||||
|
For each registered libvirt secret, we can only update the key. In order
|
||||||
|
to allow continued function of existing VMs as we upgrade the charms
|
||||||
|
performing the credential transition, we cannot delete and replace the
|
||||||
|
entire secret. It is necessary then to add a new secret with the new
|
||||||
|
credentials and new UUID, while also preserving the old secret.
|
||||||
|
|
||||||
|
2) The upgrade path from the old name to the new name is simpler if every
|
||||||
|
single existing name is handled as old. For example, if we chose
|
||||||
|
'nova-compute' to be the new name, then deployments that had nova-compute
|
||||||
|
charm apps named 'nova-compute' would need to be handled differently. The
|
||||||
|
main technical difficulty of this approach is again, the libvirt secret
|
||||||
|
management limitations, preventing us from adding a new libvirt secret
|
||||||
|
with the same name and a new UUID, nor can we achieve the desired
|
||||||
|
functionality while avoiding the need of adding a new secret UUID.
|
||||||
|
|
||||||
|
3) Having a credential name that is very unlikely to clash with any
|
||||||
|
charm app name avoids the situation described above. By using the first
|
||||||
|
block of the secret UUID in the credential name we also make it easier to
|
||||||
|
identify the credential as being the new one. If more credential names are
|
||||||
|
required to be performed in the future, by following this pattern we can
|
||||||
|
easily have versioning between them.
|
||||||
|
|
||||||
|
|
||||||
|
Charm Impact
|
||||||
|
------------
|
||||||
|
|
||||||
|
Only the nova-compute charm is affected by the code changes. The ceph-mon
|
||||||
|
charm does not require code changes.
|
||||||
|
|
||||||
|
|
||||||
|
Upgrade Impact
|
||||||
|
--------------
|
||||||
|
|
||||||
|
On upgrade, transitioning from old credentials to new credentials consists of:
|
||||||
|
|
||||||
|
1) Every nova-compute unit needs to send the new application-name to
|
||||||
|
ceph-mon. Ceph-mon will then register the new credential when receiving
|
||||||
|
it for the first time (subsequent receipt of the new application-name
|
||||||
|
will not need to re-register the new credentials, as they would already
|
||||||
|
exist) and send back the key of the new credentials.
|
||||||
|
|
||||||
|
2) Nova-compute units, upon receiving the updated key, will rewrite the
|
||||||
|
config files with the new credentials, keys and UUID, and invoke libvirt
|
||||||
|
to register the new secret. New VMs created from this point onwards will
|
||||||
|
use the new credentials.
|
||||||
|
|
||||||
|
The nova-compute units will retain their old credentials key file in their
|
||||||
|
/etc/ceph folder and registered in libvirt, so existing workloads are not
|
||||||
|
affected and allows the units to be able to receive live-migration of
|
||||||
|
non-rebooted instances.
|
||||||
|
|
||||||
|
|
||||||
|
Existing Workloads Impact
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
Existing running VMs will have their old credentials declared in their
|
||||||
|
libvirt XML and continue to run fine as the old credentials as retained.
|
||||||
|
A full power cycle of those VMs is required to update their libvirt XML
|
||||||
|
and switch to the new credentials. Those VMs cannot be migrated to
|
||||||
|
nova-compute nodes operated by different nova-compute charm apps until
|
||||||
|
rebooted.
|
||||||
|
|
||||||
|
|
||||||
|
Scale-out Impact
|
||||||
|
----------------
|
||||||
|
|
||||||
|
New nodes deployed with the updated charm need to install both old and new
|
||||||
|
credentials. To achieve this, the nova-compute code will need to send the old
|
||||||
|
application-name on ceph-relation-joined, then upon successful configuration
|
||||||
|
of the old credential, send the new application-name and trigger the upgrade
|
||||||
|
path. The end result is that a newly deployed node will have both old and new
|
||||||
|
credentials and can receive migration of non-rebooted VMs that are still using
|
||||||
|
the old credentials. This was a concern already pointed out by [2]_.
|
||||||
|
|
||||||
|
|
||||||
|
Revert/Downgrade Possibility
|
||||||
|
----------------------------
|
||||||
|
|
||||||
|
Unfortunately, downgrading and reverting the changes in case something
|
||||||
|
unexpected happens is not smooth, but is fairly straightforward. The
|
||||||
|
main problem is that the current code does not have a way to re-send
|
||||||
|
the application-name to ceph-mon outside of the ceph-relation-joined hook
|
||||||
|
function, so this information has to be exchanged manually through the
|
||||||
|
juju relation-set command. This is enough to revert all the changes.
|
||||||
|
|
||||||
|
If something went wrong during the upgrade or downgrade, then libvirt secrets
|
||||||
|
may need to be manually maintained to address a potential error or conflict of
|
||||||
|
names, UUID, or re-set the keys.
|
||||||
|
|
||||||
|
|
||||||
|
Charm Configuration Options
|
||||||
|
---------------------------
|
||||||
|
|
||||||
|
No charm config options are affected or have to be added for this change.
|
||||||
|
The change, however, only affects users that have the config
|
||||||
|
libvir-image-backend set to 'rbd'.
|
||||||
|
|
||||||
|
|
||||||
|
Configuration Files
|
||||||
|
-------------------
|
||||||
|
|
||||||
|
The config files are affected in the following way:
|
||||||
|
|
||||||
|
* */etc/ceph/secret.xml* - This file will be updated to have the new UUID
|
||||||
|
and credential name. It is only used for the secret to be registered in
|
||||||
|
libvirt once.
|
||||||
|
|
||||||
|
* */etc/ceph/ceph.client.nova-compute-ceph-auth-<secret-uuid-first-block>* -
|
||||||
|
This file will be added with the key for the new credentials.
|
||||||
|
|
||||||
|
* */etc/nova/nova.conf* - The properties rbd_user and rbd_secret_uuid will be
|
||||||
|
updated with the new credentials and UUID respectively.
|
||||||
|
|
||||||
|
|
||||||
|
Non-Charm Configuration
|
||||||
|
-----------------------
|
||||||
|
|
||||||
|
The relation-data changes between the nova-compute and ceph charms affect
|
||||||
|
the following 2 fields:
|
||||||
|
|
||||||
|
* *application_name* - Ceph-mon will receive the application name
|
||||||
|
nova-compute-ceph-auth-<secret-uuid-first-block> instead of the
|
||||||
|
nova-compute charm app name.
|
||||||
|
|
||||||
|
* *key* - Nova-compute will receive the new credential's key instead of the
|
||||||
|
old credential's key.
|
||||||
|
|
||||||
|
|
||||||
|
OpenStack Versions
|
||||||
|
------------------
|
||||||
|
|
||||||
|
This feature will be enabled for Yoga and newer OpenStack releases.
|
||||||
|
|
||||||
|
Operating System Versions
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
This feature will be enabled for Ubuntu 20.04 (focal) and newer Ubuntu
|
||||||
|
releases.
|
||||||
|
|
||||||
|
Juju Version Dependencies
|
||||||
|
-------------------------
|
||||||
|
|
||||||
|
This feature has no dependency on Juju versions.
|
||||||
|
|
||||||
|
Alternatives
|
||||||
|
------------
|
||||||
|
|
||||||
|
A design alternative is possible to achieve the same result, although
|
||||||
|
with some advantages and disadvantages:
|
||||||
|
|
||||||
|
* Change the relation-data to exchange a dictionary containing all
|
||||||
|
nova-compute charm app names and keys between all nova-compute units. To
|
||||||
|
achieve this, either the nova-cloud-controller needs to be involved (as it
|
||||||
|
already currently is for exchanging SSH keys), or a new relation needs to be
|
||||||
|
created between nova-compute between different charm apps. This alternative
|
||||||
|
requires more code changes and relation data structure changes, but the end
|
||||||
|
result is generally more consistent and resilient against unexpected
|
||||||
|
behavior, such as hook errors or hooks running out of order.
|
||||||
|
|
||||||
|
|
||||||
|
Implementation
|
||||||
|
==============
|
||||||
|
|
||||||
|
Assignee(s)
|
||||||
|
-----------
|
||||||
|
|
||||||
|
Primary assignee:
|
||||||
|
ganso
|
||||||
|
|
||||||
|
Gerrit Topic
|
||||||
|
------------
|
||||||
|
|
||||||
|
Use Gerrit topic "lp2028559" for all patches related to this spec.
|
||||||
|
|
||||||
|
.. code-block:: bash
|
||||||
|
|
||||||
|
git-review -t lp2028559
|
||||||
|
|
||||||
|
|
||||||
|
Repositories
|
||||||
|
------------
|
||||||
|
|
||||||
|
No new repositories are required for this work.
|
||||||
|
|
||||||
|
Documentation
|
||||||
|
-------------
|
||||||
|
|
||||||
|
As part of this effort, the following documentation will need to be updated:
|
||||||
|
|
||||||
|
- Charm Guide
|
||||||
|
- Release Notes
|
||||||
|
|
||||||
|
Security
|
||||||
|
--------
|
||||||
|
|
||||||
|
The changes required in the charm do not introduce any additional security
|
||||||
|
implications beyond the security requirements and compromises already in place.
|
||||||
|
|
||||||
|
The extra credentials in ceph and extra keys in nova-compute are subject to the
|
||||||
|
same vulnerability as the existing credentials and keys.
|
||||||
|
|
||||||
|
Testing
|
||||||
|
-------
|
||||||
|
|
||||||
|
Unit tests and functional tests will be implemented for this feature. The
|
||||||
|
functional tests will validate whether the nova-compute nodes have
|
||||||
|
credential files for both the name derived from their charm app name, and
|
||||||
|
the new proposed unique name.
|
||||||
|
|
||||||
|
Currently there are not multiple nova-compute charm apps configured in
|
||||||
|
the CI, nor their presence supported by the functional tests code, nor any
|
||||||
|
live-migration tests. As future work there could be live-migration tests
|
||||||
|
across different nova-compute charm apps.
|
||||||
|
|
||||||
|
Work Items
|
||||||
|
----------
|
||||||
|
|
||||||
|
- Implement code changes in nova-compute charm
|
||||||
|
- Add functional tests to zaza-openstack-tests
|
||||||
|
- Provide user documentation on impact of changes
|
||||||
|
|
||||||
|
|
||||||
|
Dependencies
|
||||||
|
============
|
||||||
|
|
||||||
|
No hardware, software or version dependencies are required for this change
|
||||||
|
to be functional.
|
||||||
|
|
||||||
|
|
||||||
|
References
|
||||||
|
==========
|
||||||
|
|
||||||
|
.. [1] https://launchpad.net/bugs/2028559
|
||||||
|
.. [2] https://launchpad.net/bugs/2037003
|
||||||
|
|
Loading…
x
Reference in New Issue
Block a user