Juju Charm - Nova Compute - plugin for NVidia vGPU
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
 
 
Alex Kavanagh c699c7d135 Add *.charm to gitignore 1 month ago
src Fix exception when passing empty string as config option 2 months ago
templates Disable nouveau driver 2 months ago
tests Drop support for impish 2 months ago
unit_tests Fix exception when passing empty string as config option 2 months ago
.gitignore Add *.charm to gitignore 1 month ago
.gitreview Add unit tests for _has_nvidia_gpu_hardware() 4 months ago
.stestr.conf Initial Operator framework boilerplate 6 months ago
.zuul.yaml Add unit tests for _has_nvidia_gpu_hardware() 4 months ago
README.md Add documentation 2 months ago
actions.yaml Add list-vgpu-types action 2 months ago
build-requirements.txt Add charmcraft.yaml for publishing to Charmhub 4 months ago
charmcraft.yaml Drop support for impish 2 months ago
config.yaml Install NVIDIA vGPU software 6 months ago
metadata.yaml Drop support for impish 2 months ago
osci.yaml Drop support for impish 2 months ago
pip.sh Initial Operator framework boilerplate 6 months ago
rename.sh Migrate test bundles to charmhub 3 months ago
requirements.txt Fix missing dependency on bionic 2 months ago
test-requirements.txt Add unit tests for _has_nvidia_gpu_hardware() 4 months ago
tox.ini Fix missing dependency on bionic 2 months ago

README.md

Overview

This subordinate charm provides the Nvidia vGPU support to the OpenStack Nova Compute service.

Usage

Deployment

We are assuming a pre-existing OpenStack deployment (Queens or newer).

Deploy nova-compute-nvidia-vgpu as a subordinate to the nova-compute charm:

juju deploy ch:nova-compute-nvidia-vgpu --channel=yoga/edge
juju add-relation nova-compute-nvidia-vgpu:nova-vgpu nova-compute:nova-vgpu

Pass the proprietary NVIDIA software package (510.47.03 or newer) as a resource to the charm:

juju attach nova-compute-nvidia-vgpu \
    nvidia-vgpu-software=./nvidia-vgpu-ubuntu-510_510.47.03_amd64.deb

Once the model settles, reboot the corresponding compute nodes:

juju run -a nova-compute-nvidia-vgpu -- sudo reboot

vGPU type definition

Each compute node has one or several physical GPUs. Each physical GPU can then be divided into one or several virtual GPUs of a given type. Virtual GPUs will later be claimed by and exposed to guests upon guest creation. Start by listing the available vGPU types for each physical GPU:

juju run-action nova-compute-nvidia-vgpu/0 list-vgpu-types --wait
[...]
    nvidia-256, 0000:41:00.0, GRID RTX6000-1Q, num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=24
    nvidia-257, 0000:41:00.0, GRID RTX6000-2Q, num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=7680x4320, max_instance=12
    nvidia-258, 0000:41:00.0, GRID RTX6000-3Q, num_heads=4, frl_config=60, framebuffer=3072M, max_resolution=7680x4320, max_instance=8
    nvidia-259, 0000:41:00.0, GRID RTX6000-4Q, num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=7680x4320, max_instance=6
[...]
    nvidia-105, 0000:c1:00.0, GRID V100-1Q, num_heads=4, frl_config=60, framebuffer=1024M, max_resolution=5120x2880, max_instance=16
    nvidia-106, 0000:c1:00.0, GRID V100-2Q, num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=7680x4320, max_instance=8
    nvidia-107, 0000:c1:00.0, GRID V100-4Q, num_heads=4, frl_config=60, framebuffer=4096M, max_resolution=7680x4320, max_instance=4
    nvidia-108, 0000:c1:00.0, GRID V100-8Q, num_heads=4, frl_config=60, framebuffer=8192M, max_resolution=7680x4320, max_instance=2
[...]

As we can see, nova-compute-nvidia-vgpu/0 has two physical GPUs: 0000:41:00.0 and 0000:c1:00.0. By selecting the vGPU type nvidia-108 on 0000:c1:00.0, two vGPUs will be available for future guests:

juju config nova-compute-nvidia-vgpu vgpu-device-mappings="{'nvidia-108': ['0000:c1:00.0']}"

NOTE: on releases older than Stein, only one vGPU type can be selected accross all available physical GPUs. Starting from Stein each physical GPU can be assigned a different vGPU type.

On OpenStack Stein and newer, once the model has settled, these vGPUs can be listed via the OpenStack CLI:

openstack resource provider list
+--------------------------------------+-----------------------------------+------------+--------------------------------------+--------------------------------------+
| uuid                                 | name                              | generation | root_provider_uuid                   | parent_provider_uuid                 |
+--------------------------------------+-----------------------------------+------------+--------------------------------------+--------------------------------------+
| 0883c2b5-bad2-4abc-a179-e33344361475 | node-sparky.maas                  |          2 | 0883c2b5-bad2-4abc-a179-e33344361475 | None                                 |
| 4b0dbc58-0c85-4a80-8dd6-d43d1bd6ec53 | node-sparky.maas_pci_0000_c1_00_0 |          1 | 0883c2b5-bad2-4abc-a179-e33344361475 | 0883c2b5-bad2-4abc-a179-e33344361475 |
+--------------------------------------+-----------------------------------+------------+--------------------------------------+--------------------------------------+

openstack resource provider inventory list 4b0dbc58-0c85-4a80-8dd6-d43d1bd6ec53
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used |
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| VGPU           |              1.0 |        1 |        2 |        0 |         1 |     2 |    0 |
+----------------+------------------+----------+----------+----------+-----------+-------+------+

Nova flavor definition

In order to expose a vGPU of the type defined earlier to any guest created with the m1.small flavor, create a new trait and assign it to the flavor:

openstack --os-placement-api-version 1.6 trait create CUSTOM_NVIDIA_108
openstack --os-placement-api-version 1.6 resource provider trait set --trait CUSTOM_NVIDIA_108 4b0dbc58-0c85-4a80-8dd6-d43d1bd6ec53
openstack flavor set m1.small --property resources:VGPU=1 --property trait:CUSTOM_NVIDIA_108=required

NOTE: on releases older than Stein, since there is only one vGPU type and it doesn't show up in the resource provider list, no trait can be created. The flavor can simply be modified with openstack flavor set m1.small --property resources:VGPU=1

After creating an instance of this flavor, the resource provider inventory list will show one vGPU being used:

openstack server create --flavor m1.small ...
openstack resource provider inventory list 4b0dbc58-0c85-4a80-8dd6-d43d1bd6ec53
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| resource_class | allocation_ratio | min_unit | max_unit | reserved | step_size | total | used |
+----------------+------------------+----------+----------+----------+-----------+-------+------+
| VGPU           |              1.0 |        1 |        2 |        0 |         1 |     2 |    1 |
+----------------+------------------+----------+----------+----------+-----------+-------+------+

Bugs

Please report bugs on Launchpad.

For general questions please refer to the OpenStack Charm Guide.