Merge "trivial: Fix spelling, formatting of vDPA spec"

Zuul 2021-04-22 11:40:08 +00:00 committed by Gerrit Code Review
commit 8411c164db
1 changed file with 96 additions and 93 deletions


http://creativecommons.org/licenses/by/3.0/legalcode

======================================
Libvirt: support vDPA based networking
======================================

https://blueprints.launchpad.net/nova/+spec/libvirt-vdpa-support

Over the years a number of different technologies have been developed to
offload networking and other functions to external processes in order to
accelerate QEMU instance performance. In kernel 5.7+ a new virtual bus known
as *vDPA* (vHost data path acceleration) was introduced to provide a vendor
neutral way to accelerate standard virtio devices using software or hardware
accelerator implementations. vDPA support was introduced in Libvirt 6.9.0 to
leverage the vDPA capabilities introduced in QEMU 5.1+. This blueprint tracks
enhancing nova to leverage these new capabilities for offloading to
hardware-based smart NICs via hardware-offloaded OVS.

Problem description
===================

Current hardware offloaded networking solutions require vendor-specific
drivers in the guest to function. The vDPA bus allows an abstraction layer to
exist between the accelerator and the VM without the CPU overhead of
traditional software vHost implementations. vDPA-enabled vSwitch offloads
allow the guest to use standard virtio drivers instead of a vendor-specific
driver.

.. note::

   While vDPA is expected to support live migration in the future, QEMU does
   not currently support live migration with vDPA devices. One of the main
   advantages of vDPA-based networking over SR-IOV is the ability to abstract
   the device state from the VM, allowing transparent live migration via a
   software fallback. Until that fallback is implemented in QEMU, live
   migration will be blocked at the API layer via an HTTP 409 (Conflict)
   error response so that we can enable it without a new microversion.

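To make the blocking concrete, below is a minimal sketch of such an API-layer
guard. It assumes the ``VNIC_TYPE_VDPA`` constant proposed later in this
spec; the helper name and exact call site are illustrative, not the final
implementation.

.. code-block:: python

   from webob import exc

   from nova.network import model as network_model


   def _block_vdpa_live_migration(vifs):
       """Illustrative pre-flight check for the live-migrate API.

       Rejects the request with HTTP 409 (Conflict) while QEMU cannot
       live migrate vDPA devices.
       """
       if any(vif.get('vnic_type') == network_model.VNIC_TYPE_VDPA
              for vif in vifs):
           raise exc.HTTPConflict(
               explanation='Live migration of instances with vDPA '
                           'interfaces is not currently supported.')
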
As Open vSwitch is currently the only control plane capable of managing vDPA
devices, and since that requires hardware offloads to function, this spec
will focus on enabling vDPA networking exclusively with hardware-offloaded
OVS. In a future release this functionality can be extended to other vSwitch
implementations such as VPP or Linux bridge if they become vDPA-enabled.

Use Cases
---------

As an operator, I want to offer hardware-accelerated networking without
requiring tenants to install vendor-specific drivers in their guests.

As an operator, I want to leverage hardware-accelerated networking while
maintaining the ability to have transparent live migration.

.. note::

   Transparent live migration will not initially be supported and will be
   enabled only after it is supported officially in a future QEMU release.

Proposed change
===============

* A new vnic-type ``vdpa`` has been introduced to neutron to request vDPA
  offloaded networking:
  https://github.com/openstack/neutron-lib/commit/8c6ab5e

* The ``nova.network.model`` class will be extended to define the new
  ``vdpa`` vnic-type constant.

* The libvirt driver will be extended to generate the vDPA interface XML
  (see the sketch after this list).

* The PCI tracker will be extended with a new device-type, ``VDPA``. While
  the existing whitelist mechanism is unchanged, if a device is whitelisted
  and is bound to a vDPA driver, it will be inventoried as ``VDPA``. In the
  libvirt driver this will be done by extending the private
  ``_get_pci_passthrough_devices`` and ``_get_device_type`` functions to
  detect if a VF is a parent of a vDPA nodedev (a detection sketch follows
  the note below). These functions are called by ``get_available_resources``
  in the resource tracker to generate the resources dictionary consumed by
  ``_setup_pci_tracker`` at the startup of the compute agent in
  ``_init_compute_node``.

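For reference, libvirt models a vDPA port as a dedicated interface type. A
minimal example of the XML the driver would generate might look like the
following; the ``/dev/vhost-vdpa-0`` chardev path is illustrative and would
be resolved from the selected device at plug time.

.. code-block:: xml

   <interface type='vdpa'>
     <source dev='/dev/vhost-vdpa-0'/>
   </interface>
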
.. note::

   The vDPA device type is required to ensure that the VFs associated with
   vDPA devices cannot be allocated to VMs via PCI alias or standard neutron
   SR-IOV support. VFs associated with vDPA devices cannot be managed using
   standard kernel control plane commands such as ``ip``. As a result,
   allocating them to an interface managed by the ``sriov-nic-agent`` or via
   alias-based PCI passthrough is not valid. This will also provide automatic
   NUMA affinity and a path to eventually report vDPA devices in placement as
   part of generic PCI device tracking in the future.

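The following is a rough sketch of the nodedev-based detection described
above, using the libvirt python bindings directly. The helper name is
hypothetical; the real logic would be folded into
``_get_pci_passthrough_devices``/``_get_device_type``.

.. code-block:: python

   import libvirt


   def _vdpa_parent_devices(conn):
       """Return the nodedev names of devices that parent a vDPA nodedev.

       Hypothetical helper: libvirt exposes vDPA devices as nodedevs
       with the ``vdpa`` capability, whose parent is the backing VF.
       """
       parents = set()
       for dev in conn.listAllDevices():
           if 'vdpa' in dev.listCaps():
               parents.add(dev.parent())
       return parents


   conn = libvirt.open('qemu:///system')
   print(_vdpa_parent_devices(conn))
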
Alternatives
------------

We could delegate vDPA support to cyborg. This would still require the
libvirt changes and neutron changes while also complicating the deployment.
Since vDPA-based NICs are fixed-function NICs there is not really any
advantage to this approach that justifies the added complexity of
inter-service interaction.

We could use the resources table added for vPMEM devices to track the devices
in the DB instead of the PCI tracker. This would complicate the code paths as
we would not be able to share any of the PCI NUMA affinity code that already
exists.

We could support live migration by treating vDPA devices as if they were
direct mode SR-IOV interfaces, meaning nova would hot unplug and plug the
interface during the migration. In the future this could be replaced with
transparent live migration if the QEMU version on both hosts is new enough.
Since we don't know when that will be, this option is deferred until a future
release to reduce complexity.

A new workaround config option could be added,
``enable_virtual_vdpa_devices=True|False`` (default: ``False``). When set to
``True`` it would allow virtual vDPA devices such as the ``vdpa_sim``
devices to be tracked and used. Virtual vDPA devices do not have a VF or PCI
device associated with them, so setting this value would result in the no-op
os-vif driver being used and a sentinel value being used to track the device
in the PCI tracker. This would allow testing without real vDPA hardware in CI
and is not intended for production use. This was declared out of scope;
functional tests will be used instead to ensure adequate code coverage.

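Were this option ever adopted, it would presumably be registered like any
other workaround flag via oslo.config; a hedged sketch, with the group name
and help text assumed:

.. code-block:: python

   from oslo_config import cfg

   workaround_opts = [
       cfg.BoolOpt('enable_virtual_vdpa_devices',
                   default=False,
                   help='Allow virtual vDPA devices, such as those created '
                        'by the vdpa_sim kernel module, to be tracked and '
                        'used. Intended for CI testing only; not supported '
                        'for production use.'),
   ]

   cfg.CONF.register_opts(workaround_opts, group='workarounds')
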
A new standard trait ``HW_NIC_VDPA`` could be reported by the libvirt driver
on hosts with vDPA devices and the required versions of QEMU and libvirt.
This would be used in combination with a new placement prefilter to append a
required trait request to the unnamed group if any VM interface has vnic-type
``vdpa``. This will not be done as it will not be required when PCI devices
are tracked in placement. Since standard traits cannot be removed, no new
trait will be added and the PCI passthrough filter will instead be used to
filter hosts on device type.

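Had the trait approach been taken, the prefilter would likely have followed
the pattern of the existing functions in ``nova.scheduler.request_filter``.
The sketch below is purely illustrative: ``HW_NIC_VDPA`` was never added to
os-traits, and the attribute access on the request spec is simplified.

.. code-block:: python

   def require_vdpa_trait(ctxt, request_spec):
       """Illustrative prefilter: require HW_NIC_VDPA on candidate hosts
       when any requested interface uses vnic-type ``vdpa``.
       """
       vifs = request_spec.requested_networks or []
       if any(getattr(vif, 'vnic_type', None) == 'vdpa' for vif in vifs):
           # ``root_required`` is how prefilters attach required traits
           # to the unnamed request group of the placement query.
           request_spec.root_required.add('HW_NIC_VDPA')
           return True
       return False
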
Data model impact
-----------------

The allowed values of the vnic-type in the ``nova.network.model.VIF`` class
will be extended. The allowed values of the ``device_type`` column in the
``pci_devices`` table will be extended.

Optionally, the ``VIFMigrateData`` or ``LibvirtLiveMigrateData`` object can
be extended to denote whether the destination host supports transparent vDPA
live migration. This is optional as it would currently always be false until
a QEMU and libvirt version are released that support this feature.

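If the optional field were added, it would be an ordinary versioned-object
change; a minimal sketch, with an invented field name, an illustrative
version number, and the existing fields elided:

.. code-block:: python

   from nova.objects import base as obj_base
   from nova.objects import fields


   @obj_base.NovaObjectRegistry.register
   class VIFMigrateData(obj_base.NovaObject):
       # Version bump is illustrative only.
       VERSION = '1.1'

       fields = {
           # ... existing fields elided ...
           # Hypothetical flag: the destination host supports transparent
           # vDPA live migration.
           'dst_supports_vdpa_migration': fields.BooleanField(default=False),
       }
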
REST API impact
---------------

The neutron port-binding extension vnic-types has been extended to add
vnic-type ``vdpa``. No nova API changes are required.

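As a usage illustration, a tenant would simply create a port with the new
vnic-type and boot against it. A hedged openstacksdk example, where the
cloud, network, image and flavor names are placeholders:

.. code-block:: python

   import openstack

   conn = openstack.connect(cloud='devstack')  # placeholder cloud name

   net = conn.network.find_network('offload-net')  # placeholder network
   port = conn.network.create_port(
       network_id=net.id,
       binding_vnic_type='vdpa',  # the new vnic-type
   )
   server = conn.compute.create_server(
       name='vdpa-guest',
       image_id=conn.compute.find_image('cirros').id,
       flavor_id=conn.compute.find_flavor('m1.small').id,
       networks=[{'port': port.id}],
   )
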
Security impact
---------------

None

Other end user impact
---------------------

vDPA ports will work like SR-IOV ports from an end user perspective;
however, the device model presented to the guest will be a virtio NIC, and
live migration will initially be blocked until supported by QEMU.

Performance Impact
------------------

None. Dataplane performance will be the same as SR-IOV, and there is no
impact on nova scheduling or VM creation.

Other deployer impact
---------------------

vDPA requires a very new kernel and very new versions of QEMU and libvirt to
use. Initial support for vDPA was added in kernel 5.7, QEMU 5.1 and libvirt
6.9.0. The operator will need to ensure all dependencies are present to use
this feature. Intel NIC support is present in kernel 5.7 but, at the time of
writing, no Intel NIC that supports vDPA is available on the market. The
first publicly available NICs with vDPA capabilities are the Mellanox/Nvidia
ConnectX-6 Dx/Lx NICs, which are only enabled in kernel 5.9.

Developer impact
----------------

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  sean-k-mooney

Work Items
----------

- Update libvirt driver
- Add prefilter
- Add docs
- Update tests

Dependencies
============

- libvirt 6.9.0+
- QEMU 5.1+
- Linux 5.7+

Testing
=======

This will be tested primarily via unit and functional tests, however a
Tempest job using the ``vdpa_sim`` kernel module may be created if it proves
practical to do so. The main challenge to this is creating a stable testing
environment with the required dependencies. Fedora Rawhide has all the
required dependencies but ships with Python 3.9, and OpenStack currently
does not work properly under Python 3.9. Alternative test environments such
as Ubuntu 20.04 do not provide a new enough kernel by default or do not ship
the required libvirt and QEMU versions. Compilation from source is an option
but we may or may not want to do that in the upstream CI.

Documentation Impact
====================

The existing admin networking document will be extended to introduce vDPA
and describe the requirements for use.

References
==========

The nova-neutron PTG discussion on this topic can be found on line 186 here:
https://etherpad.opendev.org/p/r.321f34cf3eb9caa9d87a9ec8349c3d29

An introduction to this topic is available as a blog post at