Merge "Add rbd-glance-multistore spec"

Zuul 2020-05-27 17:13:31 +00:00 committed by Gerrit Code Review
commit 7708a7dd2e
1 changed file with 246 additions and 0 deletions


..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

=======================================================
Libvirt RBD image backend support for glance multistore
=======================================================

https://blueprints.launchpad.net/nova/+spec/rbd-glance-multistore

Currently, Nova does not natively support a deployment where there are
multiple Ceph RBD backends that are known to Glance. If there is only
one, Nova and Glance collaborate for fast-and-light image-to-VM
cloning behavior. If there is more than one, Nova generally does not
handle the situation well: at worst it silently falls back to
slow-and-heavy behavior, and at best it fails the instance boot as a
failsafe. We can do better.

Problem description
===================

There are certain situations where it is desirable to have multiple
independent Ceph clusters in a single OpenStack deployment. The most
common is a multi-site or edge deployment where it is important that
the Ceph cluster is physically close to the compute nodes it
serves. Glance already has the ability to address multiple Ceph
clusters, but Nova is so naive about this that such a configuration
results in highly undesirable behavior.

Normally, when Glance and Nova collaborate on a single Ceph
deployment, images are stored in Ceph by Glance when uploaded by the
operator or the user. When Nova starts to boot an instance, it asks
Ceph to make a Copy-on-Write clone of that image, which is extremely
fast and efficient, resulting not only in reduced boot time and lower
network traffic, but also in a shared base image across all compute
nodes.

If, on the other hand, you have two groups of compute nodes, each with
its own Ceph deployment, extreme care must currently be taken to
ensure that an image stored in one is not booted on a compute node
assigned to the other. Glance can represent that a single logical
image is stored in one or both of those Ceph stores, and Nova looks at
this during instance boot. However, if the image is not in its local
Ceph cluster, Nova will quietly download the image from Glance and
then upload it to its local Ceph as a raw, flat image each time an
instance is booted from that image. This results in more network
traffic and disk usage than expected. We merged a workaround to make
Nova refuse this antithetical behavior, but it just causes a failed
instance boot.

Use Cases
---------

- As an operator I want to be able to have a multi-site single Nova
  deployment with one Ceph cluster per site and retain the
  high-performance copy-on-write behavior that I get with a single
  one.

- As a power user who currently has to pre-copy images to a
  remote-site Ceph backend with Glance before being able to boot an
  instance, I want to not have to worry about such things and just
  have Nova do that for me.

Proposed change
===============

Glance can already represent that a single logical image is stored in
multiple locations. Recently, it gained an API to facilitate copying
images between backend stores. This means that an API consumer can
request that it copy an image from one store to another by doing an
"import" operation where the method is "copy-image".

The change proposed in this spec is to augment the existing libvirt
RBD imagebackend code so that it can use this image copying API when
needed. Currently, we already look at all of the image's locations to
find the one that matches our Ceph cluster, and then use that to do
the clone. After this spec is implemented, that code will still
examine all the *current* locations and, if none match, ask Glance to
copy the image to the appropriate backend store so we can continue
without failure or other undesirable behavior.

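A minimal sketch of that flow, using hypothetical helper and parameter
names and assuming a python-glanceclient new enough to accept the
``stores`` argument to ``image_import``, might look like this::

  def find_or_copy_rbd_location(image, our_fsid, glance, store_name):
      """Find an RBD location in our Ceph cluster, or have Glance copy.

      ``our_fsid`` is the UUID of the local Ceph cluster and
      ``store_name`` the Glance store backed by it; both illustrative.
      """
      # RBD location URLs look like rbd://<fsid>/<pool>/<image>/<snap>,
      # so matching on our cluster fsid finds a cloneable location.
      for location in image.get('locations', []):
          url = location.get('url', '')
          if url.startswith('rbd://%s/' % our_fsid):
              return url  # CoW-clone from this location, as we do today

      # Not in our cluster: ask Glance to copy the image to our store.
      # The caller then polls for completion and re-reads the locations.
      glance.images.image_import(
          image['id'], method='copy-image', stores=[store_name])
      return None
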
In the case where we do need Glance to copy the image to our store,
Nova can monitor the progress of the operation through special image
properties that Glance maintains on the image. These indicate that the
process is in-progress (via ``os_glance_importing_to_stores``) and
also provide notice when an import has failed (via
``os_glance_failed_import``). Nova will need to poll the image,
waiting for the process to complete, and some configuration knobs will
be needed to allow for appropriate tuning.
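
A sketch of such a polling loop, with illustrative names and with
timeout and interval values standing in for the proposed tunables,
could look like this::

  import time

  def wait_for_copy(glance, image_id, store, timeout=600, interval=15):
      """Poll Glance until ``store`` has the image, or raise.

      Relies on the ``os_glance_importing_to_stores`` and
      ``os_glance_failed_import`` image properties that Glance
      maintains while an import is running.
      """
      deadline = time.monotonic() + timeout
      while time.monotonic() < deadline:
          image = glance.images.get(image_id)
          failed = (image.get('os_glance_failed_import') or '')
          if store in failed.split(','):
              raise Exception('Import of image %s to store %s failed'
                              % (image_id, store))
          pending = (image.get('os_glance_importing_to_stores') or '')
          if store not in pending.split(','):
              return  # not pending and not failed, so the copy landed
          time.sleep(interval)
      raise Exception('Timed out waiting for image %s to reach %s'
                      % (image_id, store))
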
Alternatives
------------

One alternative, as always, is to do nothing. This is enhanced
behavior on top of what we already support. We *could* just tell
people not to use multiple Ceph deployments, or add further checks to
make sure we do not do something stupid if they do.

We could teach Nova about multiple RBD stores in a more comprehensive
way, which would basically require either pulling Ceph information out
of Glance, or configuring Nova with all the same RBD backends that
Glance has. Either way, we would need to teach Nova about the topology
and configure it not to do stupid things like using a remote Ceph just
because the image is there.

Data model impact
-----------------

None.

REST API impact
---------------

None.

Security impact
---------------

Users can already use the image import mechanism in Glance, so Nova
using it on their behalf does not result in privilege escalation.

Notifications impact
--------------------

None.

Other end user impact
---------------------

This removes the need for users to know details about the deployment
configuration and topology, and eliminates the need to manually
pre-place images in stores.

Performance Impact
------------------

Image boot time will of course be impacted in the case where a copy
needs to happen. Overall performance will be much better, though,
because operators will be able to use more Ceph clusters if they wish
and locate them closer to the compute nodes they serve.

Other deployer impact
---------------------

Some additional configuration will be needed in order to make this
work. Specifically, Nova will need to know the Glance store name that
represents the RBD backend it is configured to use. Additionally,
there will be some tunables controlling how often we poll the Glance
server for status on the copy, as well as an overall timeout for how
long we are willing to wait.

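For example, the resulting configuration might look something like
the following, with option names and defaults that are illustrative
until the implementation settles::

  [libvirt]
  # Glance store name that represents this node's RBD backend
  images_rbd_glance_store_name = rbd-site-a

  # How often (in seconds) to poll Glance during a copy-image import
  images_rbd_glance_copy_poll_interval = 15

  # Overall time (in seconds) to wait before failing the copy
  images_rbd_glance_copy_timeout = 600
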
Developer impact
----------------

The actual impact to the imagebackend code is not large, as we are
just using a new mechanism in Glance's API to do the complex work of
copying images between backends.

Upgrade impact
--------------

In order to utilize this new functionality, at least Ussuri Glance
will be required for a Victoria Nova. Individual ``nova-compute``
services can utilize this new functionality immediately during a
partial upgrade scenario, so no minimum service version checks are
required. The control plane does not know which RBD backend each
compute node is connected to, and thus there is no need for
control-plane-level upgrade sensitivity to this feature.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  danms

Feature Liaison
---------------

Feature liaison:
  danms

Work Items
----------

* Plumb the ``image_import`` function through the
  ``nova.image.glance`` modules.
* Teach the libvirt RBD imagebackend module how to use the new API to
  copy images to its own backend when necessary and appropriate.
* Document the proper setup requirements for administrators.

Dependencies
============

* Glance requirements are already landed and available.

Testing
=======

* Functional testing should be relatively easy to implement for decent
  coverage.
* Devstack and tempest testing is *possible*, although probably not
  very fruitful. The simplest way to do this is to deploy a devstack
  with both Ceph and file stores. Uploading an image to the file store
  will cause it to be copied to the RBD backend on first boot,
  providing a very relevant (from Nova's perspective) single-RBD
  analog of the multi-RBD environment. That will require devstack
  changes, as well as a tempest test. It may be doable without tempest
  *config* changes that require tempest to be told about the stores.

Documentation Impact
====================

This is largely admin-focused. Users that are currently aware of this
limitation already have admin-level knowledge if they are working
around it. Successful implementation will simply eliminate the need to
care about multiple Ceph deployments going forward. Thus, admin and
configuration documentation should be sufficient.

References
==========

* https://blueprints.launchpad.net/glance/+spec/copy-existing-image
* https://docs.openstack.org/glance/latest/admin/interoperable-image-import.html
* https://review.opendev.org/#/c/699656/8

History
=======

.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - Victoria
     - Introduced