Merge "docs: Add reference docs for internal block device structures"
commit
67c76de5f4
doc/source
224
doc/source/reference/block-device-structs.rst
Normal file
@@ -0,0 +1,224 @@
==========================
Driver BDM Data Structures
==========================

In addition to the :doc:`API BDM data format </user/block-device-mapping>`
there are also several internal data structures within Nova that map out how
block devices are attached to instances. This document aims to outline the two
general data structures and two additional specific data structures used by the
libvirt virt driver.

.. note::

    This document is based on the email below, sent to the openstack-dev
    mailing list by Matthew Booth, and is provided as a primer for developers
    working on virt drivers and interacting with these data structures.

    http://lists.openstack.org/pipermail/openstack-dev/2016-June/097529.html

.. note::

    References to local disks in the following document refer to any
    disk directly managed by nova compute. If nova is configured to use RBD or
    NFS for instance disks then these disks won't actually be local, but they
    are still managed locally and referred to as local disks. RBD volumes
    provided by Cinder, by contrast, are not considered local.

Generic BDM data structures
===========================

``BlockDeviceMapping``
----------------------

The 'top level' data structure is the ``BlockDeviceMapping`` (BDM) object. It
is a ``NovaObject``, persisted in the DB. Current code creates a BDM object for
every disk associated with an instance, whether it is a volume or not.

The BDM object describes properties of each disk as specified by the user. It
is initially created from a user request; for more details on the format of
these requests, see the :doc:`Block Device Mapping in Nova
<../user/block-device-mapping>` document.

The Compute API transforms and consolidates all BDMs to ensure that all disks,
explicit or implicit, have a BDM, and then persists them. Look in
``nova.objects.block_device`` for all BDM fields, but in essence they contain
information like ``(source_type='image', destination_type='local',
image_id='<image uuid>')``, or equivalents describing ephemeral disks, swap
disks or volumes, and some associated data.

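As a rough illustration of the field combinations mentioned above, the sketch
below uses plain dicts in place of real ``BlockDeviceMapping`` objects. The
``classify`` helper is hypothetical and not part of Nova, the UUIDs are
placeholders, and the rule used to tell swap from ephemeral disks
(``guest_format == 'swap'``) is an assumption of this sketch.

```python
# Plain dicts standing in for BlockDeviceMapping objects. Real BDMs are
# NovaObjects with many more fields; the UUIDs below are placeholders.
IMAGE_ROOT = {"source_type": "image", "destination_type": "local",
              "image_id": "<image uuid>", "boot_index": 0}
EPHEMERAL = {"source_type": "blank", "destination_type": "local",
             "guest_format": None}
SWAP = {"source_type": "blank", "destination_type": "local",
        "guest_format": "swap"}
VOLUME = {"source_type": "volume", "destination_type": "volume",
          "volume_id": "<volume uuid>"}


def classify(bdm):
    """Hypothetical helper: name the kind of disk a BDM-like dict describes."""
    if bdm["destination_type"] == "volume":
        return "volume"
    if bdm["source_type"] == "blank":
        # Assumption: blank local disks are swap only when guest_format says so.
        return "swap" if bdm.get("guest_format") == "swap" else "ephemeral"
    return "image-backed local disk"


if __name__ == "__main__":
    for bdm in (IMAGE_ROOT, EPHEMERAL, SWAP, VOLUME):
        print(classify(bdm))
```

The point of the sketch is only that every disk, volume or not, has some BDM
describing it; the authoritative field list lives in
``nova.objects.block_device``.
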
.. note::

    BDM objects are typically stored in variables called ``bdm`` with lists
    in ``bdms``, although this is obviously not guaranteed (and unfortunately
    not always true: ``bdm`` in ``libvirt.block_device`` is usually a
    ``DriverBlockDevice`` object). This is a useful reading aid (except when
    it's proactively confounding), as there is also something else typically
    called ``block_device_mapping`` which is not a ``BlockDeviceMapping``
    object.

``block_device_info``
---------------------

Drivers do not directly use BDM objects. Instead, they are transformed into a
different driver-specific representation. This representation is normally
called ``block_device_info``, and is generated by
``virt.driver.get_block_device_info()``. Its output is based on data in BDMs.
``block_device_info`` is a dict containing:

``root_device_name``
  Hypervisor's notion of the root device's name
``ephemerals``
  A list of all ephemeral disks
``block_device_mapping``
  A list of all cinder volumes
``swap``
  A swap disk, or None if there is no swap disk

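The shape of this dict can be sketched as follows. Only the four top-level
keys come from the description above; the per-disk fields
(``device_name``, ``size``, ``mount_device``, ``connection_info``), all of the
values, and the ``summarize`` helper are assumptions made for illustration.

```python
# Illustrative only: plain dicts stand in for the driver-specific disk
# representations, and every value below is made up.
block_device_info = {
    "root_device_name": "/dev/vda",
    "ephemerals": [{"device_name": "/dev/vdb", "size": 10}],
    "block_device_mapping": [
        {"mount_device": "/dev/vdc",
         "connection_info": {"driver_volume_type": "rbd"}},
    ],
    "swap": None,  # no swap disk on this instance
}


def summarize(info):
    """Hypothetical helper: count the disks described by block_device_info."""
    return {
        "root": info["root_device_name"],
        "ephemerals": len(info["ephemerals"]),
        "volumes": len(info["block_device_mapping"]),
        "has_swap": info["swap"] is not None,
    }


summary = summarize(block_device_info)
```

Note that the ``block_device_mapping`` key holds volumes only, which is why
the helper counts it as ``volumes``.
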
The disks are represented in one of two ways, depending on the specific
driver currently in use. There's the 'new' representation, used by the libvirt
and VMwareAPI drivers, and the 'legacy' representation used by all other
drivers. The legacy representation is a plain dict. It does not contain the
same information as the new representation.

The new representation involves subclasses of
``nova.virt.block_device.DriverBlockDevice``. As well as containing different
fields, the new representation also, significantly, retains a reference to the
underlying BDM object. This means that by manipulating the
``DriverBlockDevice`` object, the driver is able to persist data to the BDM
object in the DB.

.. note::

    Common usage is to pull ``block_device_mapping`` out of this
    dict into a variable called ``block_device_mapping``. This is not a
    ``BlockDeviceMapping`` object, or list of them.

.. note::

    If ``block_device_info`` was passed to the driver by compute manager, it
    was probably generated by ``_get_instance_block_device_info()``.
    By default, this function filters out all cinder volumes from
    ``block_device_mapping`` which don't currently have ``connection_info``.
    In other contexts this filtering will not have happened, and
    ``block_device_mapping`` will contain all volumes.

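The effect of that filtering can be sketched in a few lines. This is not the
compute manager's code: ``volumes_with_connection_info`` is a hypothetical
helper, the volumes are plain dicts, and the ``mount_device`` field and its
values are made up for the example.

```python
def volumes_with_connection_info(block_device_mapping):
    """Hypothetical helper: keep only volumes that have connection_info."""
    return [vol for vol in block_device_mapping if vol.get("connection_info")]


volumes = [
    {"mount_device": "/dev/vdb",
     "connection_info": {"driver_volume_type": "iscsi"}},
    {"mount_device": "/dev/vdc", "connection_info": None},  # not yet connected
]

connected = volumes_with_connection_info(volumes)
```

Code that receives ``block_device_info`` from elsewhere should not assume
this filtering has already been applied.
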
.. note::

    Unlike BDMs, ``block_device_info`` does not currently represent all
    disks that an instance might have. Significantly, it will not contain any
    representation of an image-backed local disk, i.e. the root disk of a
    typical instance which isn't boot-from-volume. Other representations used
    by the libvirt driver explicitly reconstruct this missing disk.

libvirt driver specific BDM data structures
===========================================

``instance_disk_info``
----------------------

The virt driver API defines a method ``get_instance_disk_info``, which returns
a JSON blob. The compute manager calls this and passes the data over RPC
between calls without ever looking at it. This is driver-specific opaque data.
It is also only used by the libvirt driver, despite being part of the API for
all drivers. Other drivers do not return any data. The most interesting aspect
of ``instance_disk_info`` is that it is generated from the libvirt XML, not
from nova's state.

.. note::

    ``instance_disk_info`` is often named ``disk_info`` in code, which
    is unfortunate as this clashes with the normal naming of the next
    structure. Occasionally the two are used in the same block of code.

.. note::

    RBD disks (including non-volume disks) and cinder volumes
    are not included in ``instance_disk_info``.

``instance_disk_info`` is a list of dicts for some of an instance's disks. Each
dict contains the following:

``type``
  libvirt's notion of the disk's type
``path``
  libvirt's notion of the disk's path
``virt_disk_size``
  The disk's virtual size in bytes (the size the guest OS sees)
``backing_file``
  libvirt's notion of the backing file path
``disk_size``
  The file size of ``path``, in bytes.
``over_committed_disk_size``
  As-yet-unallocated disk size, in bytes.

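A decoded blob might look like the sketch below. The keys are the ones listed
above; the path, sizes and the ``total_over_committed`` helper are invented
for illustration. The example values also assume that the over-committed size
equals the virtual size minus the file size, which is an assumption of this
sketch rather than something stated above.

```python
import json

GiB = 1024 ** 3

# A made-up blob in the shape described above; path and sizes are invented.
blob = json.dumps([{
    "type": "qcow2",
    "path": "/var/lib/nova/instances/<instance uuid>/disk",
    "virt_disk_size": 10 * GiB,
    "backing_file": "<backing file name>",
    "disk_size": 1 * GiB,
    # Assumption: virt_disk_size - disk_size for this example.
    "over_committed_disk_size": 9 * GiB,
}])


def total_over_committed(instance_disk_info_json):
    """Hypothetical helper: sum the as-yet-unallocated bytes across disks."""
    disks = json.loads(instance_disk_info_json)
    return sum(disk["over_committed_disk_size"] for disk in disks)


total = total_over_committed(blob)
```

Because the compute manager treats the blob as opaque, only driver code
should decode it in this way.
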
``disk_info``
-------------

.. note::

    As opposed to ``instance_disk_info``, which is frequently called
    ``disk_info``.

This data structure is actually described pretty well in the comment block at
the top of ``nova.virt.libvirt.blockinfo``. It is internal to the libvirt
driver. It contains:

``disk_bus``
  The default bus used by disks
``cdrom_bus``
  The default bus used by cdrom drives
``mapping``
  Defined below

``mapping`` is a dict which maps disk names to a dict describing how that disk
should be passed to libvirt. This mapping contains every disk connected to the
instance, both local and volumes.

First, a note on disk naming. Local disk names used by the libvirt driver are
well defined. They are:

``disk``
  The root disk
``disk.local``
  The flavor-defined ephemeral disk
``disk.ephX``
  Where X is a zero-based index for BDM defined ephemeral disks
``disk.swap``
  The swap disk
``disk.config``
  The config disk

These names are hardcoded, reliable, and used in lots of places.

In ``disk_info``, volumes are keyed by device name, e.g. 'vda', 'vdb'.
Different buses will be named differently, approximately according to legacy
Linux device naming.

Additionally, ``disk_info`` will contain a mapping for 'root', which is the
root disk. This will duplicate one of the other entries, either 'disk' or a
volume mapping.

Each dict within the ``mapping`` dict contains three required fields,
``bus``, ``dev`` and ``type``, and two optional fields, ``format`` and
``boot_index``:

``bus``
  The guest bus type ('ide', 'virtio', 'scsi', etc.)
``dev``
  The device name, e.g. 'vda', 'hdc', 'sdf', 'xvde'
``type``
  Type of device, e.g. 'disk', 'cdrom', 'floppy'
``format``
  Which format to apply to the device if applicable
``boot_index``
  Number designating the boot order of the device

.. note::

    ``BlockDeviceMapping`` and ``DriverBlockDevice`` store boot index
    zero-based. However, libvirt's boot index is 1-based, so the value stored
    here is 1-based.

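Putting the pieces together, a ``disk_info`` structure could be sketched as
below. The bus names, device names and the ``to_libvirt_boot_index`` helper
are assumptions chosen for illustration, and ``boot_index`` is kept as a
plain integer for simplicity; the comment block in
``nova.virt.libvirt.blockinfo`` remains the authoritative description.

```python
def to_libvirt_boot_index(bdm_boot_index):
    """Hypothetical helper: convert a zero-based boot index to 1-based."""
    return bdm_boot_index + 1


# The root disk entry, booting first (zero-based 0 becomes 1 for libvirt).
root = {"bus": "virtio", "dev": "vda", "type": "disk",
        "boot_index": to_libvirt_boot_index(0)}

disk_info = {
    "disk_bus": "virtio",
    "cdrom_bus": "ide",
    "mapping": {
        "root": root,  # duplicates the 'disk' entry below
        "disk": root,
        "disk.eph0": {"bus": "virtio", "dev": "vdb", "type": "disk",
                      "format": "ext4"},
        "disk.swap": {"bus": "virtio", "dev": "vdc", "type": "disk"},
        "vdd": {"bus": "virtio", "dev": "vdd", "type": "disk"},  # a volume
    },
}

REQUIRED = {"bus", "dev", "type"}

# Collect any mapping entries that lack one of the required fields.
missing = {name: REQUIRED - set(entry)
           for name, entry in disk_info["mapping"].items()
           if REQUIRED - set(entry)}
```

Local disks are keyed by their well-defined names, the volume is keyed by its
device name, and ``root`` simply points at the same entry as ``disk``.
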
.. todo::

    Add a section for the per disk ``disk.info`` file within instance
    directory when using the libvirt driver.

@@ -39,6 +39,8 @@ The following is a dive into some of the internals in nova.
   works in nova to isolate groups of hosts.
 * :doc:`/reference/attach-volume`: Describes the attach volume flow, using the
   libvirt virt driver as an example.
+* :doc:`/reference/block-device-structs`: Block Device Data Structures
+
 
 .. # NOTE(amotoki): toctree needs to be placed at the end of the section to
   # keep the document structure in the PDF doc.
@@ -59,6 +61,7 @@ The following is a dive into some of the internals in nova.
    isolate-aggregates
    api-microversion-history
    attach-volume
+   block-device-structs
 
 Debugging
 =========
@@ -48,6 +48,9 @@ When we talk about block device mapping, we usually refer to one of two things
   virt driver code). We will refer to this format as 'Driver BDMs' from now
   on.
 
+For more details on this please refer to the :doc:`Driver BDM Data
+Structures <../reference/block-device-structs>` reference document.
+
 .. note::
 
   The maximum limit on the number of disk devices allowed to attach to
@@ -55,8 +58,8 @@ When we talk about block device mapping, we usually refer to one of two things
   :oslo.config:option:`compute.max_disk_devices_to_attach`.
 
 
-Data format and its history
-----------------------------
+API BDM data format and its history
+-----------------------------------
 
 In the early days of Nova, block device mapping general structure closely
 mirrored that of the EC2 API. During the Havana release of Nova, block device