
Attaching virtual GPU devices to guests

The virtual GPU feature in Nova allows a deployment to provide specific GPU types for instances using physical GPUs that can provide virtual devices.

For example, a single Intel GVT-g or an NVIDIA GRID vGPU physical Graphics Processing Unit (pGPU) can be virtualized as multiple virtual Graphics Processing Units (vGPUs) if the hypervisor supports the hardware driver and has the capability to create guests using those virtual devices.

This feature is highly dependent on the hypervisor, its version and the physical devices present on the host. In addition, the vendor's vGPU driver software must be installed and configured on the host at the same time.

Hypervisor-specific caveats are mentioned in the Caveats section.

To enable virtual GPUs, follow the steps below:

  1. Enable GPU types (Compute)
  2. Configure a flavor (Controller)

Enable GPU types (Compute)

  1. Specify which GPU type(s) the instances can use.

    Edit the [devices]/enabled_vgpu_types configuration option in nova.conf:

    [devices]
    enabled_vgpu_types = nvidia-35

    Note

    As of the Queens release, Nova only supports a single type. If more than one vGPU type is specified (as a comma-separated list), only the first one will be used.

    To know which specific type(s) to mention, please refer to How to discover a GPU type.

  2. Restart the nova-compute service.
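
    For example, on a systemd-based host, a minimal sketch (the service unit name varies by distribution: nova-compute on Debian/Ubuntu, openstack-nova-compute on RDO-based systems):

    # systemctl restart nova-compute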

    Warning

    Changing the type is possible, but since existing physical GPUs can't address multiple guests with different types, Nova will return a NoValidHost error if instances with the original type still exist. It is therefore highly recommended to instead deploy the new type to new compute nodes that don't already have workloads, and to rebuild instances on the nodes that need to change types.

Configure a flavor (Controller)

Configure a flavor to request one virtual GPU:

$ openstack flavor set vgpu_1 --property "resources:VGPU=1"

Note

As of the Queens release, all hypervisors that support virtual GPUs only accept a single virtual GPU per instance.
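
If the vgpu_1 flavor used above does not exist yet, create it first. A minimal sketch; the vCPU, RAM and disk values below are illustrative:

$ openstack flavor create --vcpus 4 --ram 4096 --disk 40 vgpu_1
$ openstack flavor set vgpu_1 --property "resources:VGPU=1"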

The enabled vGPU types on the compute hosts are not exposed to API users. Flavors configured for vGPU support can be tied to host aggregates as a means to properly schedule those flavors onto the compute hosts that support them. See /admin/aggregates for more information.
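
As a sketch of such a setup, assuming a compute host named gpu-host1 and the AggregateInstanceExtraSpecsFilter enabled in the scheduler (the aggregate name, host name and property values are illustrative):

$ openstack aggregate create --property vgpu=true vgpu-hosts
$ openstack aggregate add host vgpu-hosts gpu-host1
$ openstack flavor set vgpu_1 --property "aggregate_instance_extra_specs:vgpu=true"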

Create instances with virtual GPU devices

The nova-scheduler selects a destination host that has vGPU devices available by calling the Placement API for a specific VGPU resource class provided by compute nodes.

$ openstack server create --flavor vgpu_1 --image cirros-0.3.5-x86_64-uec --wait test-vgpu

Note

As of the Queens release, only the FilterScheduler scheduler driver uses the Placement API.
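
Once the server is ACTIVE, the virtual GPU should appear as a PCI device inside the guest. A quick check, assuming an NVIDIA vGPU and a guest image that ships pciutils:

$ lspci | grep -i nvidia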

How to discover a GPU type

Depending on your hypervisor:

  • For libvirt, virtual GPUs are seen as mediated devices. Physical PCI devices (the graphics card, in this case) supporting virtual GPUs expose mediated device (mdev) types. Since mediated devices are supported by the Linux kernel through sysfs files once the vendor's virtual GPU driver software is installed, you can see the required properties as follows (see also the sysfs sketch after this list):

    $ ls /sys/class/mdev_bus/*/mdev_supported_types
    /sys/class/mdev_bus/0000:84:00.0/mdev_supported_types:
    nvidia-35  nvidia-36  nvidia-37  nvidia-38  nvidia-39  nvidia-40  nvidia-41  nvidia-42  nvidia-43  nvidia-44  nvidia-45
    
    /sys/class/mdev_bus/0000:85:00.0/mdev_supported_types:
    nvidia-35  nvidia-36  nvidia-37  nvidia-38  nvidia-39  nvidia-40  nvidia-41  nvidia-42  nvidia-43  nvidia-44  nvidia-45
    
    /sys/class/mdev_bus/0000:86:00.0/mdev_supported_types:
    nvidia-35  nvidia-36  nvidia-37  nvidia-38  nvidia-39  nvidia-40  nvidia-41  nvidia-42  nvidia-43  nvidia-44  nvidia-45
    
    /sys/class/mdev_bus/0000:87:00.0/mdev_supported_types:
    nvidia-35  nvidia-36  nvidia-37  nvidia-38  nvidia-39  nvidia-40  nvidia-41  nvidia-42  nvidia-43  nvidia-44  nvidia-45
  • For XenServer, virtual GPU types are created by XenServer at startup depending on the available hardware and the config files present in dom0. You can run xe vgpu-type-list from dom0 to get the available vGPU types. The value of the model-name ( RO): field is the vGPU type's name, which can be used to set the nova config option [devices]/enabled_vgpu_types. See the following example:

    [root@trailblazer-2 ~]# xe vgpu-type-list
    uuid ( RO)              : 78d2d963-41d6-4130-8842-aedbc559709f
           vendor-name ( RO): NVIDIA Corporation
            model-name ( RO): GRID M60-8Q
             max-heads ( RO): 4
        max-resolution ( RO): 4096x2160
    
    
    uuid ( RO)              : a1bb1692-8ce3-4577-a611-6b4b8f35a5c9
           vendor-name ( RO): NVIDIA Corporation
            model-name ( RO): GRID M60-0Q
             max-heads ( RO): 2
        max-resolution ( RO): 2560x1600
    
    
    uuid ( RO)              : 69d03200-49eb-4002-b661-824aec4fd26f
           vendor-name ( RO): NVIDIA Corporation
            model-name ( RO): GRID M60-2A
             max-heads ( RO): 1
        max-resolution ( RO): 1280x1024
    
    
    uuid ( RO)              : c58b1007-8b47-4336-95aa-981a5634d03d
           vendor-name ( RO): NVIDIA Corporation
            model-name ( RO): GRID M60-4Q
             max-heads ( RO): 4
        max-resolution ( RO): 4096x2160
    
    
    uuid ( RO)              : 292a2b20-887f-4a13-b310-98a75c53b61f
           vendor-name ( RO): NVIDIA Corporation
            model-name ( RO): GRID M60-2Q
             max-heads ( RO): 4
        max-resolution ( RO): 4096x2160
    
    
    uuid ( RO)              : d377db6b-a068-4a98-92a8-f94bd8d6cc5d
           vendor-name ( RO): NVIDIA Corporation
            model-name ( RO): GRID M60-0B
             max-heads ( RO): 2
        max-resolution ( RO): 2560x1600
    
    ...
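
As referenced in the libvirt item above, each mdev type directory in sysfs also exposes the standard mdev attributes (name, description, available_instances), which help map an opaque type ID to a product name and show how many more vGPUs of that type the physical GPU can still create. A sketch, using the nvidia-35 type from the listing above:

$ cat /sys/class/mdev_bus/0000:84:00.0/mdev_supported_types/nvidia-35/name
$ cat /sys/class/mdev_bus/0000:84:00.0/mdev_supported_types/nvidia-35/available_instances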

Checking allocations and inventories for virtual GPUs

Note

The information below is only valid from the 19.0.0 Stein release and only for the libvirt driver. Before this release, or when using the Xen driver, inventories and allocations related to the VGPU resource class are on the root resource provider for the compute node. If upgrading from Rocky and using the libvirt driver, VGPU inventory and allocations are moved to child resource providers that represent actual physical GPUs.

The following examples use the osc-placement plugin for OpenStackClient. For details on specific commands, see its documentation.

  1. Get the list of resource providers

    $ openstack resource provider list
    +--------------------------------------+---------------------------------------------------------+------------+
    | uuid                                 | name                                                    | generation |
    +--------------------------------------+---------------------------------------------------------+------------+
    | 5958a366-3cad-416a-a2c9-cfbb5a472287 | virtlab606.xxxxxxxxxxxxxxxxxxxxxxxxxxx                  |          7 |
    | fc9b9287-ef5e-4408-aced-d5577560160c | virtlab606.xxxxxxxxxxxxxxxxxxxxxxxxxxx_pci_0000_86_00_0 |          2 |
    | e2f8607b-0683-4141-a8af-f5e20682e28c | virtlab606.xxxxxxxxxxxxxxxxxxxxxxxxxxx_pci_0000_85_00_0 |          3 |
    | 85dd4837-76f9-41f2-9f19-df386017d8a0 | virtlab606.xxxxxxxxxxxxxxxxxxxxxxxxxxx_pci_0000_87_00_0 |          2 |
    | 7033d860-8d8a-4963-8555-0aa902a08653 | virtlab606.xxxxxxxxxxxxxxxxxxxxxxxxxxx_pci_0000_84_00_0 |          2 |
    +--------------------------------------+---------------------------------------------------------+------------+

    In this example, we see the root resource provider 5958a366-3cad-416a-a2c9-cfbb5a472287 along with four child resource providers, each corresponding to a single physical GPU.
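
    To list only the providers that can supply vGPUs, the provider list can also be filtered by resource. A sketch; the --resource filter requires placement API microversion 1.4 or later:

    $ openstack --os-placement-api-version 1.4 resource provider list --resource VGPU=1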

  2. Check the inventory of each resource provider to see resource classes

    $ openstack resource provider inventory list 5958a366-3cad-416a-a2c9-cfbb5a472287
    +----------------+------------------+----------+----------+-----------+----------+-------+
    | resource_class | allocation_ratio | max_unit | reserved | step_size | min_unit | total |
    +----------------+------------------+----------+----------+-----------+----------+-------+
    | VCPU           |             16.0 |       48 |        0 |         1 |        1 |    48 |
    | MEMORY_MB      |              1.5 |    65442 |      512 |         1 |        1 | 65442 |
    | DISK_GB        |              1.0 |       49 |        0 |         1 |        1 |    49 |
    +----------------+------------------+----------+----------+-----------+----------+-------+
    $ openstack resource provider inventory list e2f8607b-0683-4141-a8af-f5e20682e28c
    +----------------+------------------+----------+----------+-----------+----------+-------+
    | resource_class | allocation_ratio | max_unit | reserved | step_size | min_unit | total |
    +----------------+------------------+----------+----------+-----------+----------+-------+
    | VGPU           |              1.0 |       16 |        0 |         1 |        1 |    16 |
    +----------------+------------------+----------+----------+-----------+----------+-------+

    Here you can see a VGPU inventory on the child resource provider while other resource class inventories are still located on the root resource provider.
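
    You can also ask Placement directly which providers could satisfy a vGPU request, mirroring what the scheduler does. A sketch; allocation candidates require placement API microversion 1.10 or later:

    $ openstack --os-placement-api-version 1.10 allocation candidate list --resource VGPU=1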

  3. Check allocations for each server that is using virtual GPUs

    $ openstack server list
    +--------------------------------------+-------+--------+---------------------------------------------------------+--------------------------+--------+
    | ID                                   | Name  | Status | Networks                                                | Image                    | Flavor |
    +--------------------------------------+-------+--------+---------------------------------------------------------+--------------------------+--------+
    | 5294f726-33d5-472a-bef1-9e19bb41626d | vgpu2 | ACTIVE | private=10.0.0.14, fd45:cdad:c431:0:f816:3eff:fe78:a748 | cirros-0.4.0-x86_64-disk | vgpu   |
    | a6811fc2-cec8-4f1d-baea-e2c6339a9697 | vgpu1 | ACTIVE | private=10.0.0.34, fd45:cdad:c431:0:f816:3eff:fe54:cc8f | cirros-0.4.0-x86_64-disk | vgpu   |
    +--------------------------------------+-------+--------+---------------------------------------------------------+--------------------------+--------+
    
    $ openstack resource provider allocation show 5294f726-33d5-472a-bef1-9e19bb41626d
    +--------------------------------------+------------+------------------------------------------------+
    | resource_provider                    | generation | resources                                      |
    +--------------------------------------+------------+------------------------------------------------+
    | 5958a366-3cad-416a-a2c9-cfbb5a472287 |          8 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} |
    | 7033d860-8d8a-4963-8555-0aa902a08653 |          3 | {u'VGPU': 1}                                   |
    +--------------------------------------+------------+------------------------------------------------+
    
    $ openstack resource provider allocation show a6811fc2-cec8-4f1d-baea-e2c6339a9697
    +--------------------------------------+------------+------------------------------------------------+
    | resource_provider                    | generation | resources                                      |
    +--------------------------------------+------------+------------------------------------------------+
    | e2f8607b-0683-4141-a8af-f5e20682e28c |          3 | {u'VGPU': 1}                                   |
    | 5958a366-3cad-416a-a2c9-cfbb5a472287 |          8 | {u'VCPU': 1, u'MEMORY_MB': 512, u'DISK_GB': 1} |
    +--------------------------------------+------------+------------------------------------------------+

    In this example, two servers were created using a flavor requesting 1 VGPU. Looking at the allocations for each consumer UUID (which is the server UUID), you can see that the VGPU allocation is against a child resource provider, while the other allocations are against the root resource provider. This means that the virtual GPU used by a6811fc2-cec8-4f1d-baea-e2c6339a9697 is actually provided by the physical GPU with PCI ID 0000:85:00.0.
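
    If needed, the mapping from a child provider to a physical GPU can be confirmed by showing that provider; its name embeds the PCI address, as seen in the provider list above:

    $ openstack resource provider show e2f8607b-0683-4141-a8af-f5e20682e28c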

Caveats

Note

This information is correct as of the 17.0.0 Queens release. Where improvements have been made or issues fixed, they are noted per item.

For libvirt:

  • Suspending a guest that has vGPUs doesn't yet work because of a libvirt limitation (it can't hot-unplug mediated devices from a guest). Workarounds using other instance actions (like snapshotting or shelving the instance; see the sketch after this list) are recommended until libvirt gains mdev hot-unplug support. If a user attempts to suspend the instance, the libvirt driver will raise an exception that causes the instance to be set back to ACTIVE. The suspend action in the os-instance-actions API will have an Error state.

  • Resizing an instance with a new flavor that has vGPU resources doesn't allocate those vGPUs to the instance (the instance is created without vGPU resources). The proposed workaround is to rebuild the instance after resizing it; the rebuild operation allocates vGPUs to the instance (see the sketch after this list).

  • Cold migrating an instance to another host will have the same problem as resize. If you want to migrate an instance, make sure to rebuild it after the migration.

  • Rescue images do not use vGPUs. An instance being rescued does not keep its vGPUs during rescue. During that time, another instance can receive those vGPUs. This is a known issue. The recommended workaround is to rebuild an instance immediately after rescue. However, rebuilding the rescued instance only helps if there are other free vGPUs on the host.

    Note

    This has been resolved in the Rocky release [1].
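
As a sketch of the workarounds referenced above (the server and image names are illustrative): shelving can stand in for suspend, and a rebuild reallocates vGPUs after a resize or cold migration:

$ openstack server shelve test-vgpu
$ openstack server unshelve test-vgpu
$ openstack server rebuild --image cirros-0.4.0-x86_64-disk test-vgpu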

For XenServer:

  • Suspend and live migration with vGPUs attached depend on support from the underlying XenServer version. Please see the XenServer release notes for up-to-date information on when a hypervisor supporting live migration and suspend/resume with vGPUs is available. If a suspend or live migration operation is attempted with a XenServer version that does not support it, an internal exception will occur that causes nova to set the instance to ERROR status. You can use openstack server set --state active <server> to set it back to ACTIVE.
  • Resizing an instance with a new flavor that has vGPU resources doesn't allocate those vGPUs to the instance (the instance is created without vGPU resources). The proposed workaround is to rebuild the instance after resizing it; the rebuild operation allocates vGPUs to the instance.
  • Cold migrating an instance to another host will have the same problem as resize. If you want to migrate an instance, make sure to rebuild it after the migration.

  1. https://bugs.launchpad.net/nova/+bug/1762688