openstack-armada-app/openstack-helm
Lucas de Ataides eaa8b41cb0 Allow VMs to be created via volumes
After STX-Openstack was upversioned to Antelope, we noticed that it was
not possible to create VMs from volumes, as they would get stuck in
ERROR status. The first proposed solution was to create a patch
containing [1] and [2], because, as specified in [3], Nova now requires
a service token in order to be able to manipulate Cinder volumes. This
unfortunately did not solve the issue by itself, as a new error message
then showed up on the nova-conductor pods (only the relevant part is
shown): "nova.exception.RescheduledException: Build of instance
2f32c7ea-1720-4f61-bce8-dbe970c40b0c was re-scheduled: Secret not
found: no secret with matching uuid
'a7f3ae2e-cee7-4f04-9402-a78047747654'". This UUID was not the same as
the one listed by `virsh secret-list` on the Cinder, Nova and Libvirt
containers.

It turns out that openstack-helm and openstack-helm-infra have a Ceph
UUID hardcoded in their Cinder [4], Nova [5] [6] and Libvirt [7]
values. Changing these values to the UUID that libvirt was trying to
find (a7f3ae2e-cee7-4f04-9402-a78047747654) solved the issue. However,
hardcoding values is not good practice, and, when searching for where
this UUID came from, we found that it is defined by the platform's Ceph
configuration in `/etc/ceph/ceph.conf`.

This still leaves the question: why did this work on Ussuri and stop
working on Antelope? First of all, the official Ceph documentation
[8] [9] on using Ceph with OpenStack explains the process of adding a
secret to libvirt to store the Ceph admin keyring. There, the secret
UUID is generated "on the fly", and both docs mention the old
hard-coded value (i.e., 457eb676-33da-42ec-9a8c-9293d545c337). This is
the reason why it used to work until our upversion to OpenStack
Antelope/2023.1: the UUID itself does not really matter, as long as
Nova and libvirt have the same value for it. It is just the UUID given
to the libvirt secret that stores the Ceph keyring [10], and the
keyring ensures proper communication between our services and the
platform Ceph.

What changed between Ussuri and Antelope (2023.1) is that there is now
a specific method [11] that sets a default value (the Ceph cluster
UUID) for this UUID when it is not specified in the driver
configuration.
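
The fallback described above can be pictured with the following minimal
sketch. It is not the code from [11]; the function name and its
parameters are hypothetical and only illustrate the rule "use the
configured secret UUID if there is one, otherwise use the Ceph cluster
UUID".

```python
from typing import Optional


def resolve_secret_uuid(configured_uuid: Optional[str], cluster_uuid: str) -> str:
    """Hypothetical helper, not the actual driver method from [11].

    Returns the secret UUID from the driver configuration when it is set,
    and falls back to the Ceph cluster UUID otherwise.
    """
    if configured_uuid:
        return configured_uuid
    return cluster_uuid
```

Under this rule, a libvirt secret defined with a different, hard-coded
UUID would not match the default that is picked when nothing is
configured, which is consistent with the "Secret not found" error
quoted earlier.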

This change dynamically reads the `/etc/ceph/ceph.conf` file to find
the UUID value and uses it to override the values in [4], [5], [6] and
[7]. It also adds the patch containing the Nova service token
configuration. The combination of these two changes allows VMs to be
created from volumes.
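
As a rough illustration of the lookup, the sketch below parses the text
of a ceph.conf file and returns the cluster UUID. It is not the code
added by this change. It assumes the UUID is stored under the `fsid`
key of the `[global]` section, and the function names and the shape of
the returned overrides dictionary are hypothetical.

```python
import uuid
from typing import Dict, Optional


def read_ceph_fsid(conf_text: str) -> Optional[str]:
    """Return the `fsid` from the [global] section of ceph.conf text.

    Assumption: the cluster UUID lives under `fsid` in `[global]`.
    Returns None when no such key is found.
    """
    section = None
    for raw in conf_text.splitlines():
        # Drop trailing comments (# or ;) and surrounding whitespace.
        line = raw.split("#", 1)[0].split(";", 1)[0].strip()
        if not line:
            continue
        if line.startswith("[") and line.endswith("]"):
            section = line[1:-1].strip()
            continue
        if section == "global" and "=" in line:
            key, value = line.split("=", 1)
            if key.strip() == "fsid":
                # uuid.UUID raises ValueError on a malformed value.
                return str(uuid.UUID(value.strip()))
    return None


def build_secret_uuid_overrides(fsid: str) -> Dict[str, str]:
    """Hypothetical: map each chart to the secret UUID it should use."""
    return {chart: fsid for chart in ("cinder", "nova", "libvirt")}
```

A caller could read the file with
`open("/etc/ceph/ceph.conf").read()`, pass the text to
`read_ceph_fsid`, and feed the result into the chart overrides, so that
all three charts end up with the same value.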

[1] 91c8a5baf2
[2] 7d39af25fd
[3] https://docs.openstack.org/releasenotes/cinder/2023.1.html#upgrade-notes
[4] https://opendev.org/openstack/openstack-helm/src/branch/master/cinder/values.yaml#L942
[5] https://opendev.org/openstack/openstack-helm/src/branch/master/nova/values.yaml#L594
[6] https://opendev.org/openstack/openstack-helm/src/branch/master/nova/values.yaml#L1432
[7] https://opendev.org/openstack/openstack-helm-infra/src/branch/master/libvirt/values.yaml#L100
[8] https://github.com/ceph/ceph/blob/main/doc/rbd/rbd-openstack.rst
[9] https://docs.huihoo.com/ceph/v0.80.5/rbd/rbd-openstack/index.html
[10] https://opendev.org/starlingx/openstack-armada-app/src/branch/master/python3-k8sapp-openstack/k8sapp_openstack/k8sapp_openstack/helm/libvirt.py#L60
[11] 6464d37d16 (diff-9b122c182b4333b747e7fd7e07f73d68ff30256a)

Test Plan:
PASS: Build openstack-helm, python3-k8sapp-openstack and
      stx-openstack-helm-fluxcd
PASS: Upload / apply / remove STX-Openstack
PASS: Create a VM by an image
PASS: Create a volume and launch a VM from it
PASS: Create a VM using the `boot-from-volume` flag
PASS: Delete a VM created by a volume

Closes-Bug: 2037463

Change-Id: Ia00bb8dbe3460ce817d69049f97f56a96ad6a298
Signed-off-by: Lucas de Ataides <lucas.deataidesbarreto@windriver.com>
2023-10-17 10:03:20 -03:00
debian Allow VMs to be created via volumes 2023-10-17 10:03:20 -03:00
files Update user to execute commands in cinder related pods 2023-04-20 09:22:28 -03:00
Readme.rst Adding openstack-helm and openstack-helm-infra to the build 2018-11-06 09:38:06 -06:00

This repo is for https://github.com/openstack/openstack-helm

Changes to this repo are needed for StarlingX and those changes are not yet merged. Rather than clone and diverge the repo, the repo is extracted at a particular git SHA, and patches are applied on top.

As those patches are merged, the SHA can be updated and the local patches removed.