---
features:
  - |
    The libvirt driver now supports booting instances that ask for virtual
    GPUs. To enable this, operators must list the enabled vGPU types in the
    nova-compute configuration file with the configuration option
    ``[devices]/enabled_vgpu_types``. Only the enabled vGPU types can be used
    by instances.

    To find out which vGPU types the physical GPU driver supports for libvirt,
    the operator can look at sysfs by running::

        ls /sys/class/mdev_bus//mdev_supported_types

    Operators can request a VGPU resource in a flavor by adding it to the
    flavor's extra specs::

        nova flavor-key set resources:VGPU=1

    That said, Nova currently has some caveats for using vGPUs.

    * For the moment, only a single type can be supported on one compute
      node, which means that libvirt will create vGPUs using that specific
      type only. Two compute nodes can have different types, but there is no
      way yet to specify in the flavor which type should be used for an
      instance.

    * Suspending a guest that has vGPUs doesn't work yet because of a libvirt
      limitation (it can't hot-unplug mediated devices from a guest).
      Workarounds using other instance actions (like snapshotting the instance
      or shelving it) are recommended until libvirt supports that. If a user
      asks to suspend the instance, Nova will get an exception that sets the
      instance state back to ``ACTIVE``, and the suspend action will be
      reported as Error in the ``os-instance-action`` API.

    * Resizing an instance with a new flavor that has vGPU resources doesn't
      allocate those vGPUs to the instance (the instance is created without
      vGPU resources). We propose to work around this problem by rebuilding
      the instance once it has been resized, so that it then has vGPUs
      allocated.

    * Migrating an instance to another host has the same problem as resize.
      If you want to migrate an instance, make sure to rebuild it afterwards.

    * Rescuing an instance that has vGPUs means that the rescue image won't
      use the existing vGPUs. When unrescuing, the instance will again use the
      vGPUs that were already allocated to it. That said, given that Nova
      looks at all the allocated vGPUs when trying to find unallocated ones,
      there could be a race condition if an instance is rescued at the moment
      a new instance asking for vGPUs is created, because both instances could
      use the same vGPUs. If you want to rescue an instance, make sure to
      disable the host until we fix that in Nova.

    * Mediated devices that are created by the libvirt driver are not
      persisted upon reboot. Consequently, a guest startup would fail since
      the virtual device wouldn't exist. To prevent that issue, when the
      compute service restarts, the libvirt driver now looks at all the guest
      XMLs to check whether they have mediated devices, and if a mediated
      device no longer exists, Nova recreates it using the same UUID.

    * If you use NVIDIA GRID cards, please be aware of a limitation in the
      NVIDIA driver that prevents a guest from having more than one virtual
      GPU from the same physical card. One guest can have two or more virtual
      GPUs, but each vGPU must then be hosted by a separate physical card.
      Until that limitation is removed, please avoid creating flavors that ask
      for more than one vGPU.

    We are actively working to remove or work around those caveats, but please
    understand that for the moment this feature is experimental given all the
    above.
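
    As an illustration of the configuration option described above, a
    nova-compute configuration fragment enabling a single vGPU type might look
    like the following. The type name ``nvidia-35`` is only a placeholder
    assumed for this example; use one of the names listed under
    ``mdev_supported_types`` on the compute host::

        [devices]
        # Placeholder type name, assumed for illustration only.
        enabled_vgpu_types = nvidia-35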