Maintain approved and completed specs for the Antelope release

Change-Id: I821474c6e8c19765b68cc35d5a4822c8c89e9919
2023-03-11 16:49:03 +08:00 · 2023-03-11 16:49:03 +08:00 · eb76508050
commit eb76508050
parent e4b326e4fe
4 changed files with 1054 additions and 0 deletions
--- a/specs/2023.1/approved/attribute-api-support.rst
+++ b/specs/2023.1/approved/attribute-api-support.rst
@@ -0,0 +1,301 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=====================
+Support attribute API
+=====================
+
+This spec adds a new group of APIs to manage the lifecycle of accelerator
+attributes.
+
+Problem description
+===================
+
+Attributes describe customized information about an accelerator. Currently
+they are generated only by drivers, and users cannot add, delete, or update
+them, which does not cover the use cases below.
+
+Use Cases
+---------
+
+An admin or operator needs a group of APIs to manage an accelerator's
+attributes.
+Here are some useful scenarios:
+
+* For a NIC accelerator, we need to add a phys_net attribute, which should be
+  created by the deployer or other components.
+* For some Function Volatile Accelerators, we can create the Function name as
+  an attribute.
+* Some machine-readable information, such as Function_UUID, can also be
+  stored as an attribute.
+
+Proposed change
+===============
+None
+
+Alternatives
+------------
+
+None
+
+Data model impact
+-----------------
+
+* Add attribute object to deployable object.
+
+REST API impact
+---------------
+
+URL: ``/v2/deployable/{uuid}/attribute``
+
+METHOD: ``GET``
+
+    List all attributes of the specified deployable.
+
+Normal response code (200) and body::
+
+ {
+     "attributes":[{
+         "key":"key1",
+         "value":"value1",
+         "uuid":"uuid1"
+         }
+     ]
+ }
+
+Error response code and body:
+
+* 401 (Unauthorized): Unauthorized
+
+* 403 (Forbidden): RBAC check failed
+
+* No response body
+
+
+URL: ``/v2/deployable/{uuid}/attribute/{uuid_or_key}``
+
+METHOD: ``GET``
+
+    Get the specified attribute of the specified deployable.
+
+Query Parameters: None
+
+Normal response code (200) and body::
+
+ {
+     "attribute":
+     {
+         "key":"key1",
+         "value":"value1",
+         "uuid":"uuid1",
+         "created_at":"2020-05-28T03:03:20",
+         "updated_at":"2020-05-28T03:03:20"
+     }
+ }
+
+Error response code and body:
+
+* 401 (Unauthorized): Unauthorized
+
+* 403 (Forbidden): RBAC check failed
+
+* 404 (NotFound): No deployable of that UUID or no attribute of that UUID or
+  key exists
+
+* No response body
+
+
+URL: ``/v2/deployable/{uuid}/attribute``
+
+METHOD: ``POST``
+
+    Create one or more deployable attribute(s).
+
+Request body::
+
+ [
+   {
+     "key": "key1",
+     "value": "value1"
+   },
+   {
+     "key": "key2",
+     "value": "value2"
+   },
+   ...
+ ]
+
+Normal response code and body:
+
+* 204 (No content)
+
+* No response body
+
+Error response code:
+
+* 401 (Unauthorized): Unauthorized
+
+* 403 (Forbidden): RBAC check failed
+
+* 409 (Conflict): Bad input or key is not unique
+
+Error response body::
+
+ {"error": "error-string"}
+
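The POST validation described above can be sketched as follows. This is only an illustration under stated assumptions: `create_attributes` and `AttributeConflict` are hypothetical names, the attributes are held in a plain dictionary rather than the Cyborg database, and the only checks are the two the spec names (bad input, key not unique).

```python
from typing import Dict, List


class AttributeConflict(Exception):
    """Hypothetical error that would map to the 409 (Conflict) response."""


def create_attributes(existing: Dict[str, str],
                      body: List[dict]) -> Dict[str, str]:
    """Validate a POST body and return the merged key -> value mapping.

    `existing` holds the attributes the deployable already has; it is
    not modified.
    """
    merged = dict(existing)
    for item in body:
        # Each entry must carry exactly the two documented fields.
        if not isinstance(item, dict) or set(item) != {"key", "value"}:
            raise AttributeConflict("bad input")
        # Keys must be unique, both against stored attributes and
        # against earlier entries of the same request.
        if item["key"] in merged:
            raise AttributeConflict("key %s is not unique" % item["key"])
        merged[item["key"]] = item["value"]
    return merged
```

Because the merged mapping is built on a copy, a request that fails halfway leaves the stored attributes untouched, which matches the all-or-nothing feel of a single 409 response.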
+
+URL: ``/v2/deployable/{uuid}/attribute/{uuid_or_key}``
+
+METHOD: ``DELETE``
+
+    Delete an existing deployable attribute.
+
+Query Parameters: None
+
+Normal response code and body:
+
+* 204 (No content)
+
+* No response body
+
+Error response code:
+
+* 401 (Unauthorized): Unauthorized
+
+* 403 (Forbidden): RBAC check failed
+
+* 404 (NotFound): No deployable of that UUID or no attribute of that UUID or
+  key exists
+
+Error response body::
+
+ {"error": "error-string"}
+
+
+URL: ``/v2/deployable/{uuid}/attribute``
+
+METHOD: ``DELETE``
+
+    Delete all attributes of a deployable.
+
+Query Parameters: None
+
+Normal response code and body:
+
+* 204 (No content)
+
+* No response body
+
+Error response code:
+
+* 401 (Unauthorized): Unauthorized
+
+* 403 (Forbidden): RBAC check failed
+
+Error response body::
+
+ {"error": "error-string"}
+
+
+URL: ``/v2/deployable/{uuid}/attribute/{uuid_or_key}``
+
+METHOD: ``PUT``
+
+    Update an existing deployable attribute.
+
+Query Parameters: None
+
+Request body (Value of deployable attribute)::
+
+ {"value": "value1"}
+
+Normal response code and body:
+
+* 204 (No content)
+
+* No response body
+
+Error response code and body:
+
+* 401 (Unauthorized): Unauthorized
+
+* 403 (Forbidden): RBAC check failed
+
+* 404 (NotFound): No deployable of that UUID or no attribute of that UUID or
+  key exists
+
+Error response body::
+
+ {"error": "error-string"}
+
+Security impact
+---------------
+None
+
+Notifications impact
+--------------------
+None
+
+Other end user impact
+---------------------
+* Change Cyborg Attribute table.
+
+
+Performance Impact
+------------------
+None
+
+Other deployer impact
+---------------------
+None
+
+Developer impact
+----------------
+* Users who want to use this feature should upgrade their Cyborg project to
+  the latest version, which supports these changes.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+Primary assignee:
+  hejunli
+
+Work Items
+----------
+
+* Change Cyborg REST APIs.
+* Change Cyborg Attribute table.
+* Change Cyborg deployable object.
+* Change cyborgclient to support Attribute management action.
+* Add related tests.
+
+Dependencies
+============
+None
+
+Testing
+=======
+Appropriate unit and functional tests should be added.
+
+Documentation Impact
+====================
+* Documentation is needed to record the microversion history.
+* Documentation is needed to explain the API usage.
+
+References
+==========
+None
+
+History
+=======
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - Antelope
+     - Introduced
--- a/specs/2023.1/approved/disable-enable-device.rst
+++ b/specs/2023.1/approved/disable-enable-device.rst
@@ -0,0 +1,221 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=============================
+Add disable/enable device API
+=============================
+
+https://blueprints.launchpad.net/openstack-cyborg/+spec/disable-enable-device
+
+Currently, each Cyborg driver discovers the devices on a compute node. All
+devices matching the driver's spec are discovered and reported to the
+Placement service as accelerator resources.
+This spec proposes a set of new APIs which allow admin users to
+disable/enable a device.
+
+
+Problem description
+===================
+
+Cyborg maintains a configuration file to configure the enabled drivers. Once
+a driver is enabled, the agent will discover all devices whose vendor ID and
+device ID match the driver's requirements. If an admin user does not want all
+devices to be used by virtual machines, there is currently no way to disable
+a device.
+
+
+Use Cases
+---------
+* Alice is an admin user. She wants to reserve some FPGAs for her own use and
+  not allow them to be allocated to a VM for the time being. For example, she
+  wants to program the FPGA device and use it as the OVS agent running on
+  the host.
+
+Proposed change
+===============
+We propose to add a new API to enable or disable a device. If the device is
+disabled, Cyborg will report this device as a reserved resource to Placement,
+so that Nova cannot schedule to this device. Conversely, if the device is
+enabled, the device should become available and the 'reserved' field in
+Placement should be set to 0.
+
+* Since the API layer is modified, a new microversion should be introduced.
+* A new field "is_maintaining" is also needed in the Device object and data
+  model to indicate whether the device is disabled. If a device is disabled,
+  the "is_maintaining" field should be set to "True", and if the device is
+  enabled, the field should be set to "False". The default value should be
+  "False".
+* Cyborg needs to call the Placement API to update the "reserved" field for
+  the device in this API.
+* Add a check of the "is_maintaining" field's value during the conductor's
+  periodic report.
+
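The relationship between "is_maintaining" and the Placement "reserved" field can be sketched as follows. This is a minimal illustration, not the Cyborg implementation: the function names are hypothetical, the device and inventory are plain dictionaries, and the rule that a disabled device reserves its whole `total` is taken from the Work Items of this spec.

```python
def reserved_for(total: int, is_maintaining: bool) -> int:
    """Return the value to report in the inventory's 'reserved' field.

    A disabled device reserves everything (reserved == total), so
    nothing is left to schedule; an enabled device reserves nothing.
    """
    return total if is_maintaining else 0


def set_maintaining(device: dict, inventory: dict, disable: bool) -> None:
    """Apply a disable (True) or enable (False) request in memory.

    In a real deployment the updated inventory would then be sent to
    Placement; that call is intentionally left out of this sketch.
    """
    device["is_maintaining"] = disable
    inventory["reserved"] = reserved_for(inventory["total"], disable)
```

Keeping the calculation in one small function would also let the conductor's periodic report reuse it when it checks the "is_maintaining" value.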
+Alternatives
+------------
+None
+
+Data model impact
+-----------------
+A new column `is_maintaining` should be added to the Device data model.
+
+
+REST API impact
+---------------
+A new microversion needs to be introduced because the Device API changes.
+
+List Device API
+^^^^^^^^^^^^^^^
+* Return a device list
+  URL: ``/devices``
+  METHOD: ``GET``
+  Return: 200
+
+.. code-block::
+
+    {
+        "devices": [
+            {
+                "uuid": "d2446439-0142-40b7-9eee-82d855f453d9",
+                "type": "FPGA",
+                "vendor": "0xABCD",
+                "model": "miss model info",
+                "std_board_info": "{\"device_id\": \"0xabcd\", \"class\": \"Fake class\"}",
+                "vendor_board_info": "fake_vendor_info",
+                "hostname": "devstack01",
+                "links": [
+                    {
+                        "href": "http://172.23.97.140/accelerator/v2/devices/d2446439-0142-40b7-9eee-82d855f453d9",
+                        "rel": "self"
+                    }
+                ],
+                "created_at": "2021-11-03T08:48:43+00:00",
+                "updated_at": null
+            }
+        ]
+    }
+
+Get Device API
+^^^^^^^^^^^^^^
+* Get a device by uuid and return the details
+  URL: ``/devices/{uuid}``
+  METHOD: ``GET``
+  Return: 200
+
+.. code-block::
+
+    {
+        "uuid": "d2446439-0142-40b7-9eee-82d855f453d9",
+        "type": "FPGA",
+        "vendor": "0xABCD",
+        "model": "miss model info",
+        "std_board_info": "{\"device_id\": \"0xabcd\", \"class\": \"Fake class\"}",
+        "vendor_board_info": "fake_vendor_info",
+        "hostname": "devstack01",
+        "links": [
+            {
+                "href": "http://172.23.97.140/accelerator/v2/devices/d2446439-0142-40b7-9eee-82d855f453d9",
+                "rel": "self"
+            }
+        ],
+        "created_at": "2021-11-03T08:48:43+00:00",
+        "updated_at": null
+    }
+
+
+Disable Device API
+^^^^^^^^^^^^^^^^^^
+* Disable a device
+  URL: ``/devices/disable/{device_uuid}``
+  METHOD: ``POST``
+  Return: 200
+  Error Code: 404 (the device is not found), 403 (the role is not admin)
+
+Enable Device API
+^^^^^^^^^^^^^^^^^
+* Enable a device
+  URL: ``/devices/enable/{device_uuid}``
+  METHOD: ``POST``
+  Return: 200
+  Error Code: 404 (the device is not found), 403 (the role is not admin)
+
+Security impact
+---------------
+None
+
+Notifications impact
+--------------------
+None
+
+Other end user impact
+---------------------
+None
+
+Performance Impact
+------------------
+None
+
+Other deployer impact
+---------------------
+The deployer needs to update Cyborg to a version whose microversion supports
+the disable/enable API. Otherwise disable/enable requests will be rejected.
+
+Developer impact
+----------------
+None
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+Primary assignee:
+  Xinran Wang(xin-ran.wang@intel.com)
+
+Work Items
+----------
+* Add new column `is_maintaining` for device table.
+* Add disable/enable API in DeviceController.
+* Update the RP `reserved` field according to the operation. For the `disable`
+  operation, the `reserved` field needs to be set to the same value as the
+  `total` field, and for the `enable` operation, the `reserved` field will be
+  set to zero.
+* Update the GET/LIST device API to include the `is_maintaining` field in the
+  returned value.
+* Add disable/enable operation in cyborgclient.
+* Add unit tests.
+
+Dependencies
+============
+None
+
+
+Testing
+=======
+Unit tests need to be added, and tempest tests if needed.
+
+
+Documentation Impact
+====================
+Related documentation needs to be added.
+
+References
+==========
+None
+
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - Xena
+     - Introduced
+   * - Yoga
+     - Reproposed
--- a/specs/2023.1/approved/pmem-namespace-support.rst
+++ b/specs/2023.1/approved/pmem-namespace-support.rst
@@ -0,0 +1,195 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=================================
+Cyborg Intel PMEM Driver Proposal
+=================================
+
+https://blueprints.launchpad.net/openstack-cyborg/+spec/add-pmem-driver
+
+This spec proposes the initial design for Cyborg's Intel PMEM driver.
+
+Problem description
+===================
+
+This spec adds an Intel PMEM driver for Cyborg to manage specific Intel
+PMEM devices.
+
+PMEM devices can be used as a large pool of low-latency, high-bandwidth memory
+in which instances can store data for computation. This can improve the
+performance of the instance.
+
+PMEM must be partitioned into PMEM namespaces [1]_ for applications to use.
+This vPMEM feature only uses PMEM namespaces in devdax mode as QEMU vPMEM
+backends [2]_. For background on the related concepts, see the NVDIMM Linux
+kernel document [3]_.
+
+Starting in the 20.0.0 (Train) release, the virtual persistent memory (vPMEM)
+feature in Nova allows a deployment using the libvirt compute driver to provide
+vPMEMs for instances using physical persistent memory (PMEM) that can provide
+virtual devices [4]_.
+
+Use Cases
+---------
+* As an operator, I would like the Cyborg agent to manage PMEM resources and
+  check them periodically. The Cyborg Intel PMEM driver should provide a
+  ``discover()`` function to enumerate the Intel PMEM devices and report the
+  details of all available Intel PMEM accelerators on the host, such as
+  PID (product ID), VID (vendor ID), and device ID.
+
+* As a user, I would like to boot a VM with an Intel PMEM device attached in
+  order to accelerate computing. Cyborg should be able to manage this
+  kind of acceleration resource and assign it to the VM (binding).
+
+Proposed change
+===============
+1. In general, the goal is to develop an Intel PMEM device driver that
+supports discover interfaces for the Intel PMEM accelerator framework. The
+driver should include the ``discover()`` function. This function works by
+executing the "ndctl list" command, which reports the devices' raw info, for
+example::
+
+  [
+    {
+    "vendor": "8086",
+    "product": "ns200_0",
+    "device": "dax0.0"
+    }
+  ]
+
+2. Generate Cyborg specific driver objects and resource provider modeling
+for the PMEM device. Below are the objects that describe a PMEM device and
+comply with the Cyborg database model and Placement data model.
+
+::
+
+  Hardware     Driver objects       Placement data model
+     |               |                      |
+  1 PMEM         1 device                   |
+     |               |                      |
+     |         1 deployable       ---> resource_provider
+     |               |            ---> parent resource_provider: compute node
+     |               |                      |
+  n Namespace  n attach_handle    ---> inventories(total:n)
+
+3. Add "enable_driver=intel_pmem_driver" to the Cyborg Agent
+   configuration file.
+
+4. Add "pmem_namespaces=$LABEL:$NSNAME|$NSNAME,$LABEL:$NSNAME|$NSNAME"
+   to the Cyborg Agent configuration file, for example:
+   "pmem_namespaces = 6GB:ns0|ns1|ns2,LARGE:ns3"
+
+5. The resource class uses the custom resource class format:
+    "CUSTOM_PMEM_NAMESPACE_$LABEL"
+
+6. Traits follow the Placement custom trait format. The Cyborg driver
+   will report two traits for a PMEM accelerator using the format below:
+   trait1:"CUSTOM_PMEM_NAMESPACE_$LABEL1"
+   trait2:"CUSTOM_PMEM_NAMESPACE_$LABEL2"
+
+
+7. The namespaces must be created before Cyborg discovers them. For how to
+   create a namespace, see [5]_ and [6]_.
+
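The "pmem_namespaces" format and the resulting resource class names can be illustrated with a short sketch. The function names here are hypothetical and the parsing rules are inferred only from the example value in this spec (groups separated by ",", a label before ":", namespace names separated by "|"); the real driver may validate more strictly.

```python
from typing import Dict, List


def parse_pmem_namespaces(value: str) -> Dict[str, List[str]]:
    """Parse '$LABEL:$NSNAME|$NSNAME,...' into {label: [namespace, ...]}."""
    mapping: Dict[str, List[str]] = {}
    for group in value.split(","):
        label, _, names = group.strip().partition(":")
        if not label or not names:
            raise ValueError("malformed pmem_namespaces entry: %r" % group)
        mapping.setdefault(label, []).extend(names.split("|"))
    return mapping


def resource_class(label: str) -> str:
    """Build the custom resource class name for one label."""
    return "CUSTOM_PMEM_NAMESPACE_" + label


if __name__ == "__main__":
    parsed = parse_pmem_namespaces("6GB:ns0|ns1|ns2,LARGE:ns3")
    for label, namespaces in parsed.items():
        # One inventory per label; its total is the number of namespaces.
        print(resource_class(label), len(namespaces))
```

With the example value from item 4, the label "6GB" groups three namespaces and "LARGE" groups one, which corresponds to the `inventories(total:n)` entry in the diagram above.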
+Alternatives
+------------
+
+None
+
+Data model impact
+-----------------
+
+A new type such as PMEM needs to be added to the devices and attach_handle
+tables.
+
+REST API impact
+---------------
+
+None.
+
+Security impact
+---------------
+
+None
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+Users can manage Intel PMEM devices through the Cyborg Intel PMEM driver, for
+example listing the Intel PMEM devices, reporting the details of all available
+Intel PMEM accelerators on the host, and binding an Intel PMEM device.
+
+Performance Impact
+------------------
+
+None
+
+Other deployer impact
+---------------------
+
+None.
+
+Developer impact
+----------------
+
+None
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  qiujunting(qiujunting@inspur.com)
+
+Work Items
+----------
+
+* Implement Intel PMEM driver in Cyborg
+* Add related test cases.
+
+
+Dependencies
+============
+
+None
+
+Testing
+========
+
+* Unit tests will be added to test this driver.
+
+Documentation Impact
+====================
+
+Document Intel PMEM driver in Cyborg project.
+Add test report in cyborg wiki.
+
+References
+==========
+.. [1] https://pmem.io/ndctl/ndctl-create-namespace.html
+.. [2] https://github.com/qemu/qemu/blob/19b599f7664b2ebfd0f405fb79c14dd241557452/docs/nvdimm.txt#L145
+.. [3] https://www.kernel.org/doc/Documentation/nvdimm/nvdimm.txt
+.. [4] https://docs.openstack.org/nova/latest/admin/virtual-persistent-memory.html
+.. [5] https://docs.openstack.org/nova/latest/admin/virtual-persistent-memory.html#configure-pmem-namespaces-compute
+.. [6] https://pmem.io/ndctl/ndctl-create-namespace.html
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release
+     - Description
+   * - Yoga
+     - Introduced
+
--- a/specs/2023.1/implemented/vgpu-driver-proposal.rst
+++ b/specs/2023.1/implemented/vgpu-driver-proposal.rst
@@ -0,0 +1,337 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+================================================
+Cyborg NVIDIA GPU Driver support vGPU management
+================================================
+
+The Cyborg NVIDIA GPU Driver implemented pGPU management in the Train
+release. This spec proposes supporting vGPU management in the same driver.
+
+Problem description
+===================
+
+GPU devices can provide supercomputing capabilities, and can replace the CPU
+to provide users with more efficient computing power at a lower cost. GPU cloud
+servers have great value in the following application scenarios, including:
+video encoding and decoding, scientific research and artificial intelligence
+(deep learning, machine learning).
+
+In the OpenStack ecosystem, users can now use Nova to pass GPU resources to
+a guest by two methods:
+
+* Pass the GPU hardware to the guest (PCI pass-through).
+
+* Pass the Mediated Device(vGPU) to the guest.
+
+With the long-term goal that Cyborg will manage heterogeneous accelerators
+including GPUs, Cyborg needs to support GPU management and integrate with Nova
+to provide users with GPU resource allocation through the aforementioned
+methods. The existing Cyborg GPU driver, NVIDIA GPU Driver, already supports
+the first method (PCI pass-through), while the second method is not yet
+supported. Please see ref [1]_ for the Nova-Cyborg vGPU integration spec.
+
+Use Cases
+---------
+
+* When users manage GPU devices with Cyborg, they want to boot a VM with an
+  NVIDIA GPU (pGPU or vGPU) attached in order to accelerate video encoding
+  and decoding. Cyborg should be able to manage this kind of acceleration
+  resource and assign it to the VM (binding).
+
+Proposed changes
+================
+
+To be clear, the following describes the whole process of how the NVIDIA GPU
+Driver discovers vGPU devices, generates Cyborg specific driver objects for
+them (complying with the Cyborg Database Model), and reports them to cyborg-db
+and Placement through cyborg-conductor. Features that are already supported in
+the current branch are marked as DONE, and new changes are marked as
+NEW CHANGE.
+
+1. Collect raw info of GPU devices from the compute node with "lspci" and
+grep for NVIDIA related keywords.(DONE)
+
+2. Parse details from each record, including ``vendor_id``, ``product_id``
+and ``pci_address``.(DONE)
+
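Steps 1 and 2 can be sketched as a small parser for one line of ``lspci -nnn -D`` output. The regular expression and the function name are illustrative assumptions, not the driver's actual code; the sample line is the Tesla T4 record quoted later in this spec.

```python
import re
from typing import Dict, Optional

# Assumed layout: "<domain:bus:slot.func> <class text> ... [vendor:product] ..."
LSPCI_RE = re.compile(
    r"^(?P<pci_address>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s"
    r".*\[(?P<vendor_id>[0-9a-f]{4}):(?P<product_id>[0-9a-f]{4})\]",
    re.IGNORECASE,
)


def parse_lspci_line(line: str) -> Optional[Dict[str, str]]:
    """Extract pci_address, vendor_id and product_id, or None if no match."""
    match = LSPCI_RE.search(line.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    sample = ("0000:af:00.0 3D controller [0302]: NVIDIA Corporation "
              "TU104GL [Tesla T4] [10de:1eb8] (rev a1)")
    print(parse_lspci_line(sample))
```

The class code in brackets (``[0302]``) is skipped because it contains no colon, so only the ``[vendor:product]`` pair matches the last group.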
+3. Generate Cyborg specific driver objects and resource provider modeling
+for the GPU device as well as its mediated devices. Below are the objects
+that describe a vGPU device and comply with the Cyborg database model [4]_
+and placement data model [5]_.(NEW CHANGE)
+
+::
+
+  Hardware     Driver objects       Placement data model
+     |               |                      |
+  1 GPU         1 device                    |
+     |               |                      |
+     |         1 deployable       ---> resource_provider
+     |               |            ---> parent resource_provider: compute node
+     |               |                      |
+  4 vGPUs     4 attach_handles    ---> inventories(total:4)
+
+4. Support setting the vGPU type for a specific GPU device in cyborg.conf. The
+implementation is similar to that in Nova [9]_.(NEW CHANGE)
+
+* Firstly, we propose [gpu_devices]/enabled_vgpu_types to define which vgpu
+  type Cyborg driver can use:
+
+  ::
+
+    [gpu_devices]
+    enabled_vgpu_types = [str_vgpu_type_1, str_vgpu_type_2, ...]
+
+* And also, we propose that Cyborg driver will accept configuration sections
+  that are related to the [gpu_devices]/enabled_vgpu_types and specifies which
+  exact pGPUs are related to the enabled vGPU types and will have a
+  device_addresses option defined like this:
+
+  ::
+
+    cfg.ListOpt('device_addresses',
+                default=[],
+                help="""
+    List of physical PCI addresses to associate with a specific vGPU type.
+
+    The particular physical GPU device address needs to be mapped to the vendor
+    vGPU type which that physical GPU is configured to accept. In order to
+    provide this mapping, there will be a CONF section with a name
+    corresponding to the following template: "vgpu_%(vgpu_type_name)s".
+
+    The vGPU type to associate with the PCI devices has to be the section name
+    prefixed by ``vgpu_``. For example, for 'nvidia-11', you would declare
+    ``[vgpu_nvidia-11]/device_addresses``.
+
+    Each vGPU type also has to be declared in ``[gpu_devices]/enabled_vgpu_types``.
+
+    Related options:
+
+    * ``[gpu_devices]/enabled_vgpu_types``
+    """),
+
+  For example, it would be set in cyborg.conf
+
+  ::
+
+    [gpu_devices]
+    enabled_vgpu_types = nvidia-223,nvidia-224
+    [vgpu_nvidia-223]
+    device_addresses = 0000:af:00.0,0000:86:00.0
+    [vgpu_nvidia-224]
+    device_addresses = 0000:87:00.0
+
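How the example configuration maps each physical GPU address to its vGPU type can be sketched as follows. This is only an illustration: it uses the standard library ``configparser`` as a stand-in for the option handling in Cyborg, and the function name is hypothetical.

```python
import configparser
from typing import Dict

SAMPLE_CONF = """
[gpu_devices]
enabled_vgpu_types = nvidia-223,nvidia-224
[vgpu_nvidia-223]
device_addresses = 0000:af:00.0,0000:86:00.0
[vgpu_nvidia-224]
device_addresses = 0000:87:00.0
"""


def vgpu_type_by_address(conf_text: str) -> Dict[str, str]:
    """Return {pci_address: vgpu_type} for every enabled vGPU type."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    enabled = parser["gpu_devices"]["enabled_vgpu_types"]
    mapping: Dict[str, str] = {}
    for vgpu_type in (t.strip() for t in enabled.split(",")):
        if not vgpu_type:
            continue
        # The section name is the vGPU type prefixed by "vgpu_".
        addresses = parser.get("vgpu_" + vgpu_type, "device_addresses",
                               fallback="")
        for address in (a.strip() for a in addresses.split(",")):
            if address:
                mapping[address] = vgpu_type
    return mapping


if __name__ == "__main__":
    print(vgpu_type_by_address(SAMPLE_CONF))
```

A type listed in ``enabled_vgpu_types`` without a matching ``vgpu_`` section simply contributes no addresses in this sketch; whether the real driver treats that as an error is not stated in the spec.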
+5. Generate resource_class and traits for device, which later will also be
+reported to Placement, and used by nova-scheduler to filter appropriate
+accelerators.(NEW CHANGE)
+
+* ``resource class`` follows standard resources classes used by OpenStack [6]_.
+  Pass-through GPU device will report 'PGPU' as its resource class,
+  Virtualized GPU device will report 'VGPU' as its resource class.
+
+* ``traits`` follows the placement custom trait format [7]_. In the Cyborg
+  driver, it will report two traits for vGPU accelerator using the format
+  below:
+
+  trait1: **OWNER_CYBORG**.
+
+  trait2: **CUSTOM_<VENDOR_NAME>_<PRODUCT_ID>_<Virtual_GPU_Type>**.
+
+  Meaning of each parameter is listed below.
+
+  * OWNER_CYBORG: a new namespace in os-traits to remark that a device is
+    reported by Cyborg when the inventory is reported to placement. It is used
+    to distinguish GPU devices reported by Nova.
+
+  * VENDOR_NAME: vendor name of the GPU device.
+
+  * PRODUCT_ID: product ID of the GPU device.
+
+  * Virtual_GPU_Type: this parameter is another format of the
+    enabled_vgpu_types value set for a specific device by the admin in
+    cyborg.conf. To generate this parameter, the driver first retrieves
+    ``enabled_vgpu_types`` and then maps it to Virtual_GPU_Type as shown
+    below. The name is exactly the Virtual_GPU_Type that will be
+    reported in traits. For more details about the valid Virtual GPU Types
+    for supported GPUs, please refer to [8]_.
+
+  ::
+
+    # find mapping relation between Virtual_GPU_Type and enabled_vgpu_type.
+    # The value in "name" file contains its corresponding Virtual_GPU_Type.
+    cat /sys/class/mdev_bus/{device_address}/mdev_supported_types/{enabled_vgpu_type}/name
+
+* Here is an example to show the traits of a GPU device in the real world.
+
+  * An NVIDIA Tesla T4 device has been successfully installed on the host;
+    its device address is 0000:af:00.0. In addition, the vendor’s vGPU driver
+    software must be installed and configured on the host at the same time.
+
+  ::
+
+    [vtu@ubuntudbs ~]# lspci -nnn -D|grep 1eb8
+    0000:af:00.0 3D controller [0302]: NVIDIA Corporation TU104GL [Tesla T4] [10de:1eb8] (rev a1)
+
+  * Enable GPU types (Accelerator)
+
+    1. Specify which specific GPU type(s) the instances would get from this
+    specific device.
+
+    Edit ``[gpu_devices]/enabled_vgpu_types`` and ``device_addresses`` in
+    cyborg.conf:
+
+    ::
+
+      [gpu_devices]
+      enabled_vgpu_types=nvidia-223
+      [vgpu_nvidia-223]
+      device_addresses = 0000:af:00.0
+
+    2. Restart the cyborg-agent service.
+
+  * Finally, traits reported for this device(RP) will be:
+
+    **OWNER_CYBORG** and **CUSTOM_NVIDIA_1EB8_T4_2B**
+
+.. NOTE::
+
+  For the last parameter "T4_2B" (<Virtual_GPU_Type>), we can validate the
+  mapping relation between "nvidia-223" and "T4_2B" by checking the mdev
+  sysfs path:
+
+  ::
+
+    [vtu@ubuntudbs mdev_supported_types]$ pwd
+    /sys/class/mdev_bus/0000:af:00.0/mdev_supported_types
+    [vtu@ubuntudbs mdev_supported_types]$ ls
+    nvidia-222  nvidia-225  nvidia-228  nvidia-231  nvidia-234  nvidia-320
+    nvidia-223  nvidia-226  nvidia-229  nvidia-232  nvidia-252  nvidia-321
+    nvidia-224  nvidia-227  nvidia-230  nvidia-233  nvidia-319
+    [vtu@ubuntudbs mdev_supported_types]$ cat nvidia-223/name
+    GRID T4-2B
+
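The trait format above can be illustrated with a short sketch. The normalization rule is an assumption inferred from the single example in this spec (the mdev name "GRID T4-2B" becoming "T4_2B"): drop a leading "GRID " and replace each run of other non-alphanumeric characters with "_". The function name is hypothetical, and the driver may handle other name layouts differently.

```python
import re


def vgpu_trait(vendor_name: str, product_id: str, mdev_name: str) -> str:
    """Build CUSTOM_<VENDOR_NAME>_<PRODUCT_ID>_<Virtual_GPU_Type>.

    `mdev_name` is the content of the mdev type's "name" file,
    for example "GRID T4-2B".
    """
    vgpu_type = mdev_name.strip()
    # Assumed rule: the "GRID " marketing prefix is not part of the type.
    if vgpu_type.upper().startswith("GRID "):
        vgpu_type = vgpu_type[len("GRID "):]
    # Trait names allow only upper-case letters, digits and underscores.
    vgpu_type = re.sub(r"[^A-Za-z0-9]+", "_", vgpu_type).strip("_").upper()
    return "CUSTOM_%s_%s_%s" % (vendor_name.upper(), product_id.upper(),
                                vgpu_type)


if __name__ == "__main__":
    print(vgpu_trait("NVIDIA", "1eb8", "GRID T4-2B"))
```

For the Tesla T4 example this yields the same trait string the spec lists for the device, next to **OWNER_CYBORG**.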
+6. Generate ``controlpath_id``, ``deployable``, ``attach_handle``,
+``attribute`` for vGPU.(NEW CHANGE)
+
+7. Create an mdev device in sysfs by echoing its UUID (which is actually the
+attach_handle UUID) to the create file when a vGPU is bound to a
+VM.(NEW CHANGE)
+
+create_file_path=
+/sys/class/mdev_bus/{pci_address}/mdev_supported_types/{type-id}/create
+
+8. Delete an mdev device from sysfs by echoing "1" to the remove file when a
+vGPU is unbound from a VM.(NEW CHANGE)
+
+remove_file_path=
+/sys/class/mdev_bus/{pci_address}/mdev_supported_types/{type-id}/UUID/remove
+
+Alternatives
+------------
+
+Using Nova to manage vGPU device [10]_.
+
+Data model impact
+-----------------
+
+None
+
+
+REST API impact
+---------------
+
+None
+
+
+Security impact
+---------------
+
+None
+
+Notifications impact
+--------------------
+
+None
+
+Other end user impact
+---------------------
+
+None
+
+Performance Impact
+------------------
+
+None
+
+Other deployer impact
+---------------------
+
+This feature is highly dependent on the version of libvirt and the physical
+devices present on the host.
+
+For vGPU management, deployers need to make sure that the GPU device has been
+successfully virtualized. Otherwise, Cyborg will report it as a pGPU device.
+
+Please see ref [2]_ and [3]_ for how to install the Virtual GPU Manager package
+to virtualize your GPU devices.
+
+Developer impact
+----------------
+
+None
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  <yumeng-bao>
+
+Work Items
+----------
+
+* Implement NVIDIA GPU Driver enhancement in Cyborg
+* Add related test cases.
+* Add test report to wiki and update the supported driver doc page
+
+Dependencies
+============
+
+None
+
+Testing
+========
+
+* Unit tests will be added to test this driver.
+
+Documentation Impact
+====================
+
+Document Nvidia GPU driver in Cyborg project.
+
+References
+==========
+.. [1] https://review.opendev.org/#/c/750116/
+.. [2] https://docs.nvidia.com/grid/6.0/grid-vgpu-user-guide/index.html
+.. [3] https://docs.nvidia.com/grid/6.0/grid-vgpu-user-guide/index.html#install-vgpu-package-generic-linux-kvm
+.. [4] https://specs.openstack.org/openstack/cyborg-specs/specs/stein/implemented/cyborg-database-model-proposal.html
+.. [5] https://docs.openstack.org/nova/rocky/user/placement.html#references
+.. [6] https://github.com/openstack/os-resource-classes/blob/master/os_resource_classes/__init__.py#L41
+.. [7] https://specs.openstack.org/openstack/nova-specs/specs/pike/implemented/resource-provider-traits.html
+.. [8] https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#virtual-gpu-types-grid-reference
+.. [9] https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vgpu-multiple-types.html
+.. [10] https://docs.openstack.org/nova/latest/admin/virtual-gpu.html
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release
+     - Description
+   * - Wallaby
+     - Introduced