Add NVIDIA GPU driver specification
This spec proposes to provide the initial design for Cyborg's NVIDIA pGPU driver. Change-Id: I9a6941eccebf65da4df90b95a1dc36f9bca40dc1
This commit is contained in:
parent
58e6de783b
commit
4281ff28da
|
@ -0,0 +1,164 @@
|
|||
..
|
||||
This work is licensed under a Creative Commons Attribution 3.0 Unported
|
||||
License.
|
||||
|
||||
http://creativecommons.org/licenses/by/3.0/legalcode
|
||||
|
||||
==================================
|
||||
Cyborg NVIDIA GPU Driver Proposal
|
||||
==================================
|
||||
|
||||
This spec proposes to provide the initial design for Cyborg's NVIDIA physical
|
||||
GPU management driver. Please note that the virtualized GPU is out of scope.
|
||||
we only support passthrough one pGPU card to one VM directly.
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
This spec will add a NVIDIA GPU driver for Cyborg to manage specific
|
||||
NVIDIA physical GPU devices.
|
||||
|
||||
Use Cases
|
||||
---------
|
||||
|
||||
* As an operator, I would like to use Cyborg agent starts or does resource
|
||||
checking periodically, the Cyborg NVIDIA GPU driver should provider
|
||||
``discover()`` function to enumerate the list of the NVIDIA GPU devices,
|
||||
and report the details of all available NVIDIA GPU accelerators on the
|
||||
host, such as PID(Product id), VID(Vendor id), Device.
|
||||
|
||||
* As a user, I would like to boot up a VM with NVIDIA GPU card attached in
|
||||
order to accelerate compute ability. Cyborg should be able to manage this
|
||||
kind of acceleration resources and assign it to the VM(binding).
|
||||
|
||||
|
||||
Proposed changes
|
||||
================
|
||||
|
||||
In general, the goal is to develop a Cyborg NVIDIA GPU driver that supports
|
||||
discover interfaces for NVIDIA GPU accelerator framework. The driver should
|
||||
include the ``discover()`` function. For physical GPU, this function works by
|
||||
executing ``lspci`` command and reports devices' raw info sample as
|
||||
following::
|
||||
|
||||
[
|
||||
{
|
||||
"vendor": "10de",
|
||||
"product": "1db6",
|
||||
"device": "0000:af:00:0"
|
||||
}
|
||||
]
|
||||
|
||||
Generate Cyborg specific driver objects and resource provider modeling
|
||||
for the GPU device. Below is the objects to describe a pGPU devices which
|
||||
complies with the Cyborg database mode and Placement data model.
|
||||
|
||||
::
|
||||
|
||||
Hardware Driver objects Placement data model
|
||||
| | |
|
||||
1 GPU 1 device |
|
||||
| | |
|
||||
| 1 deployable ---> resource_provider
|
||||
| | ---> parent resource_provider: compute node
|
||||
| | |
|
||||
1 pGPU 1 attach_handle ---> inventories(total:1)
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
None
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
|
||||
NVIDIA GPU driver will not touch Data model.
|
||||
The Cyborg Agent can call NVIDIA GPU driver to update the database
|
||||
during the discover operations.
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
None.
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
None
|
||||
|
||||
Notifications impact
|
||||
--------------------
|
||||
|
||||
None
|
||||
|
||||
Other end user impact
|
||||
---------------------
|
||||
|
||||
User can manage NVIDIA GPU cards by Cyborg NVIDIA GPU driver. Such as list
|
||||
of the NVIDIA GPU devices, report the details of all available NVIDIA GPU
|
||||
accelerators on the host, binding with NVIDIA GPU and so on.
|
||||
|
||||
Performance Impact
|
||||
------------------
|
||||
|
||||
None
|
||||
|
||||
Other deployer impact
|
||||
---------------------
|
||||
|
||||
Deployers need to make sure the GPU device hasn't been virtualized. Otherwise,
|
||||
we can't use it as a pGPU.
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
None
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
Wenping Song
|
||||
|
||||
Work Items
|
||||
----------
|
||||
|
||||
* Implement NVIDIA GPU driver in Cyborg
|
||||
* Add related test cases.
|
||||
|
||||
|
||||
Dependencies
|
||||
============
|
||||
|
||||
None
|
||||
|
||||
Testing
|
||||
========
|
||||
|
||||
* Unit tests will be added to test this driver.
|
||||
|
||||
Documentation Impact
|
||||
====================
|
||||
|
||||
Document NVIDIA pGPU driver in Cyborg project.
|
||||
Test report in Cyborg wiki
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
None
|
||||
|
||||
|
||||
History
|
||||
=======
|
||||
|
||||
.. list-table:: Revisions
|
||||
:header-rows: 1
|
||||
|
||||
* - Release
|
||||
- Description
|
||||
* - Train
|
||||
- Introduced
|
Loading…
Reference in New Issue