docs: Add reference docs for internal block device structures

It's time to shine a light on this area of the codebase ahead of some
much required cleanup. This documentation is based on an email sent
almost 5 years ago but is still accurate today.

Change-Id: I66cc2c5549833f269872748fb1532438f9ba8489
Lee Yarwood 2021-01-20 14:52:27 +00:00
parent 242002fa97
commit d5420bbb50
3 changed files with 232 additions and 2 deletions


@ -0,0 +1,224 @@
==========================
Driver BDM Data Structures
==========================
In addition to the :doc:`API BDM data format </user/block-device-mapping>`
there are also several internal data structures within Nova that map out how
block devices are attached to instances. This document aims to outline the two
general data structures and two additional specific data structures used by the
libvirt virt driver.
.. note::
This document is based on an email sent to the openstack-dev mailing
list by Matthew Booth, linked below, and is provided as a primer for developers
working on virt drivers and interacting with these data structures.
http://lists.openstack.org/pipermail/openstack-dev/2016-June/097529.html
.. note::
References to local disks in the following document refer to any
disk directly managed by nova compute. If nova is configured to use RBD or
NFS for instance disks then these disks won't actually be local, but they
are still managed locally and referred to as local disks. RBD volumes
provided by Cinder, by contrast, are not considered local.
Generic BDM data structures
===========================
``BlockDeviceMapping``
----------------------
The 'top level' data structure is the ``BlockDeviceMapping`` (BDM) object. It
is a ``NovaObject``, persisted in the DB. Current code creates a BDM object for
every disk associated with an instance, whether it is a volume or not.
The BDM object describes properties of each disk as specified by the user. It
is initially derived from a user request. For more details on the format of
these requests, see the :doc:`Block Device Mapping in Nova
<../user/block-device-mapping>` document.
The Compute API transforms and consolidates all BDMs to ensure that all disks,
explicit or implicit, have a BDM, and then persists them. Look in
``nova.objects.block_device`` for all BDM fields, but in essence they contain
information like (source_type='image', destination_type='local',
image_id='<image uuid>'), or equivalents describing ephemeral disks, swap disks
or volumes, and some associated data.
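As a rough illustration of the kind of information involved, the sketch below
uses plain dicts rather than real ``BlockDeviceMapping`` objects. The
``is_local`` helper and the UUID placeholders are hypothetical, and
``volume_id`` is an assumed field name for the volume case; check
``nova.objects.block_device`` for the authoritative field list.

.. code-block:: python

    # Plain-dict stand-ins for BDM records; real BDMs are NovaObjects.
    image_root = {
        "source_type": "image",
        "destination_type": "local",
        "image_id": "<image uuid>",
    }
    cinder_volume = {
        "source_type": "volume",
        "destination_type": "volume",
        "volume_id": "<volume uuid>",  # assumed field name
    }


    def is_local(bdm):
        """Hypothetical helper: True if nova compute manages the disk."""
        return bdm["destination_type"] == "local"


    local_disks = [b for b in (image_root, cinder_volume) if is_local(b)]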
.. note::
BDM objects are typically stored in variables called ``bdm`` with lists
in ``bdms``, although this is obviously not guaranteed (and unfortunately
not always true: ``bdm`` in ``libvirt.block_device`` is usually a
``DriverBlockDevice`` object). This is a useful reading aid (except when
it's actively confounding), as there is also something else typically
called ``block_device_mapping`` which is not a ``BlockDeviceMapping``
object.
``block_device_info``
---------------------
Drivers do not directly use BDM objects. Instead, they are transformed into a
different driver-specific representation. This representation is normally
called ``block_device_info``, and is generated by
``virt.driver.get_block_device_info()``. Its output is based on data in BDMs.
``block_device_info`` is a dict containing:
``root_device_name``
Hypervisor's notion of the root device's name
``ephemerals``
A list of all ephemeral disks
``block_device_mapping``
A list of all cinder volumes
``swap``
A swap disk, or None if there is no swap disk
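The shape of this dict can be sketched as follows. The values are invented for
illustration, and ``count_disks`` is a hypothetical helper rather than anything
defined by Nova; real entries would be driver block device representations,
not empty lists.

.. code-block:: python

    # Illustrative only: an instance with no ephemerals, volumes or swap.
    block_device_info = {
        "root_device_name": "/dev/vda",
        "ephemerals": [],
        "block_device_mapping": [],  # cinder volumes, despite the name
        "swap": None,
    }


    def count_disks(info):
        """Hypothetical helper: count the disks represented in the dict."""
        total = len(info["ephemerals"]) + len(info["block_device_mapping"])
        if info["swap"] is not None:
            total += 1
        return total


    represented = count_disks(block_device_info)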
The disks are represented in one of two ways, depending on the specific
driver currently in use. There's the 'new' representation, used by the libvirt
and VMwareAPI drivers, and the 'legacy' representation used by all other
drivers. The legacy representation is a plain dict. It does not contain the
same information as the new representation.
The new representation involves subclasses of
``nova.block_device.DriverBlockDevice``. As well as containing different
fields, the new representation also retains, significantly, a reference to the
underlying BDM object. This means that by manipulating the
``DriverBlockDevice`` object, the driver is able to persist data to the BDM
object in the DB.
.. note::
Common usage is to pull ``block_device_mapping`` out of this
dict into a variable called ``block_device_mapping``. This is not a
``BlockDeviceMapping`` object, or list of them.
.. note::
If ``block_device_info`` was passed to the driver by compute manager, it
was probably generated by ``_get_instance_block_device_info()``.
By default, this function filters out all cinder volumes from
``block_device_mapping`` which don't currently have ``connection_info``.
In other contexts this filtering will not have happened, and
``block_device_mapping`` will contain all volumes.
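The effect of that filtering can be sketched with plain dicts. This is not the
Nova implementation: ``filter_connected`` is a hypothetical helper, and the
``mount_device`` and ``driver_volume_type`` keys are used here purely as
illustrative content.

.. code-block:: python

    def filter_connected(volumes):
        """Keep only volumes that already carry connection_info."""
        return [vol for vol in volumes if vol.get("connection_info")]


    volumes = [
        {"mount_device": "/dev/vdb",
         "connection_info": {"driver_volume_type": "rbd"}},
        {"mount_device": "/dev/vdc", "connection_info": None},
    ]

    # Only the first volume survives the filter.
    connected = filter_connected(volumes)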
.. note::
Unlike BDMs, ``block_device_info`` does not currently represent all
disks that an instance might have. Significantly, it will not contain any
representation of an image-backed local disk, i.e. the root disk of a
typical instance which isn't boot-from-volume. Other representations used
by the libvirt driver explicitly reconstruct this missing disk.
libvirt driver specific BDM data structures
===========================================
``instance_disk_info``
----------------------
The virt driver API defines a method ``get_instance_disk_info``, which returns
a JSON blob. The compute manager calls this and passes the data over RPC
between calls without ever looking at it. This is driver-specific opaque data.
It is also only used by the libvirt driver, despite being part of the API for
all drivers. Other drivers do not return any data. The most interesting aspect
of ``instance_disk_info`` is that it is generated from the libvirt XML, not
from nova's state.
.. note::
``instance_disk_info`` is often named ``disk_info`` in code, which
is unfortunate as this clashes with the normal naming of the next
structure. Occasionally the two are used in the same block of code.
.. note::
RBD disks (including non-volume disks) and cinder volumes
are not included in ``instance_disk_info``.
``instance_disk_info`` is a list of dicts for some of an instance's disks. Each
dict contains the following:
``type``
libvirt's notion of the disk's type
``path``
libvirt's notion of the disk's path
``virt_disk_size``
The disk's virtual size in bytes (the size the guest OS sees)
``backing_file``
libvirt's notion of the backing file path
``disk_size``
The file size of ``path``, in bytes.
``over_committed_disk_size``
As-yet-unallocated disk size, in bytes.
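For orientation, a blob of this shape might be produced and consumed as in the
sketch below. Every value is invented, the path and backing file are
placeholders, and the relationship between the three size fields shown here is
only illustrative.

.. code-block:: python

    import json

    GiB = 1024 ** 3

    # A made-up single-disk example serialized the way the blob travels.
    blob = json.dumps([
        {
            "type": "qcow2",
            "path": "/var/lib/nova/instances/<instance uuid>/disk",
            "virt_disk_size": 10 * GiB,
            "backing_file": "<base image>",
            "disk_size": 2 * GiB,
            "over_committed_disk_size": 8 * GiB,
        }
    ])

    # The receiving side decodes it back into a list of dicts.
    disks = json.loads(blob)
    total_virtual = sum(disk["virt_disk_size"] for disk in disks)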
``disk_info``
-------------
.. note::
This structure is distinct from ``instance_disk_info``, which is
frequently also called ``disk_info`` in code.
This data structure is actually described pretty well in the comment block at
the top of ``nova.virt.libvirt.blockinfo``. It is internal to the libvirt
driver. It contains:
``disk_bus``
The default bus used by disks
``cdrom_bus``
The default bus used by cdrom drives
``mapping``
Defined below
``mapping`` is a dict which maps disk names to a dict describing how that disk
should be passed to libvirt. This mapping contains every disk connected to the
instance, both local and volumes.
First, a note on disk naming. Local disk names used by the libvirt driver are
well defined. They are:
``disk``
The root disk
``disk.local``
The flavor-defined ephemeral disk
``disk.ephX``
Where X is a zero-based index for BDM defined ephemeral disks
``disk.swap``
The swap disk
``disk.config``
The config disk
These names are hardcoded, reliable, and used in lots of places.
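The naming scheme can be sketched with a small helper. ``local_disk_names`` and
its parameters are hypothetical and exist only to show how the names above are
formed; the libvirt driver does not expose such a function.

.. code-block:: python

    def local_disk_names(flavor_ephemeral=False, bdm_ephemerals=0,
                         swap=False, config=False):
        """Hypothetical helper listing local disk names for an instance."""
        names = ["disk"]  # the root disk
        if flavor_ephemeral:
            names.append("disk.local")
        # BDM-defined ephemeral disks use a zero-based index.
        names.extend("disk.eph%d" % i for i in range(bdm_ephemerals))
        if swap:
            names.append("disk.swap")
        if config:
            names.append("disk.config")
        return names


    names = local_disk_names(bdm_ephemerals=2, swap=True)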
In ``disk_info``, volumes are keyed by device name, e.g. 'vda', 'vdb'. Different
buses will be named differently, approximately according to legacy Linux
device naming.
Additionally, ``disk_info`` will contain a mapping for 'root', which is the
root disk. This will duplicate one of the other entries, either 'disk' or a
volume mapping.
Each dict within the ``mapping`` dict contains three required fields, ``bus``,
``dev`` and ``type``, and two optional fields, ``format`` and ``boot_index``:
``bus``
The guest bus type ('ide', 'virtio', 'scsi', etc.)
``dev``
The device name, e.g. 'vda', 'hdc', 'sdf', 'xvde'
``type``
The type of device, e.g. 'disk', 'cdrom', 'floppy'
``format``
Which format to apply to the device if applicable
``boot_index``
Number designating the boot order of the device
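Putting the pieces together, a ``disk_info`` dict for an instance with an
image-backed root disk and one volume might look like the sketch below. The
values are invented, and ``missing_fields`` is a hypothetical helper that only
checks for the three required keys.

.. code-block:: python

    REQUIRED_FIELDS = {"bus", "dev", "type"}

    disk_info = {
        "disk_bus": "virtio",
        "cdrom_bus": "ide",
        "mapping": {
            # Local root disk, keyed by its well-defined name.
            "disk": {"bus": "virtio", "dev": "vda", "type": "disk",
                     "boot_index": 1},
            # A volume, keyed by device name.
            "vdb": {"bus": "virtio", "dev": "vdb", "type": "disk"},
        },
    }
    # 'root' duplicates whichever entry is the root disk.
    disk_info["mapping"]["root"] = dict(disk_info["mapping"]["disk"])


    def missing_fields(entry):
        """Hypothetical helper: required keys absent from a mapping entry."""
        return sorted(REQUIRED_FIELDS - entry.keys())


    problems = {name: missing_fields(entry)
                for name, entry in disk_info["mapping"].items()
                if missing_fields(entry)}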
.. note::
``BlockDeviceMapping`` and ``DriverBlockDevice`` store boot index
zero-based. However, libvirt's boot index is 1-based, so the value stored
here is 1-based.
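The conversion itself is a simple offset, sketched below. The function name is
hypothetical, and treating ``None`` or a negative index as "not bootable" is an
assumption made for this example rather than something stated above.

.. code-block:: python

    def to_libvirt_boot_index(bdm_boot_index):
        """Convert a zero-based BDM boot index to a 1-based one."""
        if bdm_boot_index is None or bdm_boot_index < 0:
            # Assumed convention: no boot order for this device.
            return None
        return bdm_boot_index + 1


    first = to_libvirt_boot_index(0)
    second = to_libvirt_boot_index(1)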
.. todo::
Add a section for the per disk ``disk.info`` file within instance
directory when using the libvirt driver.


@ -39,6 +39,8 @@ The following is a dive into some of the internals in nova.
works in nova to isolate groups of hosts.
* :doc:`/reference/attach-volume`: Describes the attach volume flow, using the
libvirt virt driver as an example.
* :doc:`/reference/block-device-structs`: Block Device Data Structures
.. # NOTE(amotoki): toctree needs to be placed at the end of the section to
# keep the document structure in the PDF doc.
@ -59,6 +61,7 @@ The following is a dive into some of the internals in nova.
isolate-aggregates
api-microversion-history
attach-volume
block-device-structs
Debugging
=========


@ -48,6 +48,9 @@ When we talk about block device mapping, we usually refer to one of two things
virt driver code). We will refer to this format as 'Driver BDMs' from now
on.
For more details on this please refer to the :doc:`Driver BDM Data
Structures <../reference/block-device-structs>` reference document.
.. note::
The maximum limit on the number of disk devices allowed to attach to
@ -55,8 +58,8 @@ When we talk about block device mapping, we usually refer to one of two things
:oslo.config:option:`compute.max_disk_devices_to_attach`.
Data format and its history
----------------------------
API BDM data format and its history
-----------------------------------
In the early days of Nova, block device mapping general structure closely
mirrored that of the EC2 API. During the Havana release of Nova, block device