Add doc8 to pep8 check for cyborg project
This patch adds a doc8 check of .rst files to the current pep8 check. It includes fixes to the .rst files that didn't pass the check.

Change-Id: Ib7a63d1e579f77039172aa4f99d26a3ceeef83d7
parent
3670bbb83a
commit
38c1618e60
13
HACKING.rst
@@ -7,10 +7,11 @@ Before you commit your code run tox against your patch using the command.

 tox .

-If any of the tests fail correct the error and try again. If your code is valid Python
-but not valid pep8 you may find autopep8 from pip useful.
+If any of the tests fail correct the error and try again. If your code is valid
+Python but not valid pep8 you may find autopep8 from pip useful.

-Once you submit a patch integration tests will run and those may fail, -1'ing your patch
-you can make a gerrit comment 'recheck ci' if you have reviewed the logs from the jobs
-by clicking on the job name in gerrit and concluded that the failure was spurious or otherwise
-not related to your patch. If problems persist contact people on #openstack-cyborg or #openstack-infra.
+Once you submit a patch integration tests will run and those may fail,
+-1'ing your patch you can make a gerrit comment 'recheck ci' if you have
+reviewed the logs from the jobs by clicking on the job name in gerrit and
+concluded that the failure was spurious or otherwise not related to your patch.
+If problems persist contact people on #openstack-cyborg or #openstack-infra.
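The re-wrapping in this hunk is the kind of change a line-length check asks for. As a rough illustration only (this is not doc8's implementation; the 79-column limit and the helper names `find_long_lines` and `rewrap` are assumptions made for this sketch), the following flags over-long lines and re-flows a paragraph:

```python
import textwrap

MAX_LINE_LENGTH = 79  # assumed limit for this sketch


def find_long_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) pairs for lines longer than `limit`."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


def rewrap(paragraph, limit=MAX_LINE_LENGTH):
    """Re-flow one paragraph so that no line exceeds `limit` characters."""
    return textwrap.fill(" ".join(paragraph.split()), width=limit)


# The paragraph as it stood before this commit (first removed block above).
OLD_PARAGRAPH = (
    "If any of the tests fail correct the error and try again. "
    "If your code is valid Python\n"
    "but not valid pep8 you may find autopep8 from pip useful."
)

if __name__ == "__main__":
    print(find_long_lines(OLD_PARAGRAPH))
    print(rewrap(OLD_PARAGRAPH))
```

Re-flowing with a greedy wrapper is only one way to satisfy such a check; hand-wrapping at sentence boundaries would pass equally well.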
@@ -5,19 +5,19 @@ General Information
 ===================

 This document describes the basic REST API operation that Cyborg supports
-for Pike release.
+for Pike release::

-+--------+-----------------------+-------------------------------------------------------------------------------+
-| Verb   | URI                   | Description                                                                   |
-+========+=======================+===============================================================================+
-| GET    | /accelerators         | Return a list of accelerators                                                 |
-+--------+-----------------------+-------------------------------------------------------------------------------+
-| GET    | /accelerators/{uuid}  | Retrieve a certain accelerator info identified by `{uuid}`                    |
-+--------+-----------------------+-------------------------------------------------------------------------------+
-| POST   | /accelerators         | Create a new accelerator.                                                     |
-+--------+-----------------------+-------------------------------------------------------------------------------+
-| PUT    | /accelerators/{uuid}  | Update the spec for the accelerator identified by `{uuid}`                    |
-+--------+-----------------------+-------------------------------------------------------------------------------+
-| DELETE | /accelerators/{uuid}  | Delete the accelerator identified by `{uuid}`                                 |
-+--------+-----------------------+-------------------------------------------------------------------------------+
++--------+-----------------------+-----------------------------------------------------------+
+| Verb   | URI                   | Description                                               |
++========+=======================+===========================================================+
+| GET    | /accelerators         | Return a list of accelerators                             |
++--------+-----------------------+-----------------------------------------------------------+
+| GET    | /accelerators/{uuid}  | Retrieve a certain accelerator info identified by `{uuid}`|
++--------+-----------------------+-----------------------------------------------------------+
+| POST   | /accelerators         | Create a new accelerator.                                 |
++--------+-----------------------+-----------------------------------------------------------+
+| PUT    | /accelerators/{uuid}  | Update the spec for the accelerator identified by `{uuid}`|
++--------+-----------------------+-----------------------------------------------------------+
+| DELETE | /accelerators/{uuid}  | Delete the accelerator identified by `{uuid}`             |
++--------+-----------------------+-----------------------------------------------------------+

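The table in this hunk maps each verb and URI to one operation. A minimal sketch of that mapping as a lookup is given below; the `ROUTES` structure and the `describe` helper are hypothetical names introduced only for illustration and are not part of Cyborg.

```python
import re

# (verb, URI template) -> description, copied from the table above.
ROUTES = {
    ("GET", "/accelerators"): "Return a list of accelerators",
    ("GET", "/accelerators/{uuid}"): "Retrieve a certain accelerator info",
    ("POST", "/accelerators"): "Create a new accelerator",
    ("PUT", "/accelerators/{uuid}"): "Update the spec for the accelerator",
    ("DELETE", "/accelerators/{uuid}"): "Delete the accelerator",
}


def _compile(template):
    """Turn a URI template into a regex; {uuid} matches one path segment."""
    escaped = re.escape(template)
    placeholder = re.escape("{uuid}")
    return re.compile("^" + escaped.replace(placeholder, "(?P<uuid>[^/]+)") + "$")


def describe(verb, path):
    """Return (description, uuid or None) for a request, or None if unknown."""
    for (route_verb, template), description in ROUTES.items():
        if route_verb != verb:
            continue
        match = _compile(template).match(path)
        if match:
            return description, match.groupdict().get("uuid")
    return None


if __name__ == "__main__":
    print(describe("GET", "/accelerators"))
    print(describe("DELETE", "/accelerators/1234"))
```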
@@ -135,10 +135,11 @@ Finally, push the patch for review using,
 Adding functionality
 --------------------

-If you are adding new functionality to Cyborg please add testing for that functionality
-and provide a detailed commit message outlining the goals of your commit and how you
-achieved them.
+If you are adding new functionality to Cyborg please add testing for that
+functionality and provide a detailed commit message outlining the goals of
+your commit and how you achieved them.

-If the functionality you wish to add doesn't fit in an existing part of the Cyborg
-architecture diagram drop by our team meetings to discuss how it could be implemented
+If the functionality you wish to add doesn't fit in an existing part of the
+Cyborg architecture diagram drop by our team meetings to discuss how it
+could be implemented

@@ -119,7 +119,8 @@ It will speed up your installation if you have a local GIT_BASE.
 ##### Command line

 You can `source openrc YOUR_USER YOUR_USER (e.g. source openrc admin admin)` in
-your shell, and then use the `openstack` command line tool to manage your devstack.
+your shell, and then use the `openstack` command line tool to manage your
+devstack.

 ##### Horizon

@@ -6,8 +6,10 @@ Background Story

 OpenStack Acceleration Discussion Started from Telco Requirements:

-* High level requirements first drafted in the standard organization ETSI NFV ISG
-* High level requirements transformed into detailed requirements in OPNFV DPACC project.
+* High level requirements first drafted in the standard organization
+ETSI NFV ISG
+* High level requirements transformed into detailed requirements in
+OPNFV DPACC project.
 * New project called Nomad established to address the requirements.
 * BoF discussions back in OpenStack Austin Summit.

@@ -28,42 +28,44 @@ Use of accelerators attached to virtual machine instances in OpenStack
 Proposed change
 ===============

-Cyborg Agent resides on various compute hosts and monitors them for accelerators.
-On its first run Cyborg Agent will run the detect accelerator functions of all
-its installed drivers. The resulting list of accelerators available on the host
-will be reported to the conductor where it will be stored into the database and
-listed during API requests. By default accelerators will be inserted into the
-database in an inactive state. It will be up to the operators to manually set
-an accelerator to 'ready' at which point cyborg agent will be responsible for
-calling the driver's install function and ensuring that the accelerator is ready
-for use.
+Cyborg Agent resides on various compute hosts and monitors them for
+accelerators. On its first run Cyborg Agent will run the detect
+accelerator functions of all its installed drivers. The resulting list
+of accelerators available on the host will be reported to the conductor
+where it will be stored into the database and listed during API requests.
+By default accelerators will be inserted into the database in an inactive
+state. It will be up to the operators to manually set an accelerator to
+'ready' at which point cyborg agent will be responsible for calling the
+driver's install function and ensuring that the accelerator is ready for use.

 In order to mirror the current Nova model of using the placement API each Agent
-will send updates on its resources directly to the placement API endpoint as well
-as to the conductor for usage aggregation. This should keep placement API up to date
-on accelerators and their usage.
+will send updates on its resources directly to the placement API endpoint
+as well as to the conductor for usage aggregation. This should keep placement
+API up to date on accelerators and their usage.

 Alternatives
 ------------

 There are lots of alternate ways to lay out the communication between the Agent
-and the API endpoint or the driver. Almost all of them involve exactly where we
-draw the line between the driver, Conductor, and Agent. I've written my proposal
-with the goal of having the Agent act mostly as a monitoring tool, reporting to
-the cloud operator or other Cyborg components to take action. A more active role
-for Cyborg Agent is possible but either requires significant synchronization with
-the Conductor or potentially steps on the toes of operators.
+and the API endpoint or the driver. Almost all of them involve exactly where
+we draw the line between the driver, Conductor, and Agent. I've written my
+proposal with the goal of having the Agent act mostly as a monitoring tool,
+reporting to the cloud operator or other Cyborg components to take action.
+A more active role for Cyborg Agent is possible but either requires significant
+synchronization with the Conductor or potentially steps on the toes of
+operators.

 Data model impact
 -----------------

-Cyborg Agent will create new entries in the database for accelerators it detects;
-it will also update those entries with the current status of the accelerator
-at a high level. More temporary data like the current usage of a given accelerator
-will be broadcast via a message passing system and won't be stored.
+Cyborg Agent will create new entries in the database for accelerators it
+detects; it will also update those entries with the current status of the
+accelerator at a high level. More temporary data like the current usage of
+a given accelerator will be broadcast via a message passing system and won't
+be stored.

-Cyborg Agent will retain a local cache of this data with the goal of not losing accelerator
-state on system interruption or loss of connection.
+Cyborg Agent will retain a local cache of this data with the goal of not losing
+accelerator state on system interruption or loss of connection.


 REST API impact
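The first-run behaviour described in this spec can be sketched roughly as follows. Everything named here is hypothetical (`FakeDriver`, `FakeConductor`, `Agent`, `detect_accelerators`, `handle_ready`, and the in-memory record layout); the spec itself only states that detected accelerators are reported to the conductor, stored in an inactive state, and installed by the agent once an operator sets them to 'ready'.

```python
class FakeDriver:
    """Stand-in for an installed driver (hypothetical interface)."""

    def __init__(self, name, devices):
        self.name = name
        self._devices = list(devices)
        self.installed = []

    def detect_accelerators(self):
        return list(self._devices)

    def install(self, device):
        self.installed.append(device)


class FakeConductor:
    """Stand-in for the conductor; keeps records in memory only."""

    def __init__(self):
        self.records = {}

    def report(self, device, driver_name):
        # Newly reported accelerators start out inactive.
        self.records.setdefault(
            device, {"driver": driver_name, "state": "inactive"}
        )


class Agent:
    def __init__(self, drivers, conductor):
        self.drivers = {driver.name: driver for driver in drivers}
        self.conductor = conductor

    def first_run(self):
        """Run every driver's detect function and report the results."""
        for driver in self.drivers.values():
            for device in driver.detect_accelerators():
                self.conductor.report(device, driver.name)

    def handle_ready(self, device):
        """Call the driver's install function once an operator set 'ready'."""
        record = self.conductor.records[device]
        if record["state"] != "ready":
            return False
        self.drivers[record["driver"]].install(device)
        return True


if __name__ == "__main__":
    driver = FakeDriver("dummy", ["acc0", "acc1"])
    conductor = FakeConductor()
    agent = Agent([driver], conductor)
    agent.first_run()
    conductor.records["acc0"]["state"] = "ready"  # operator action
    print(agent.handle_ready("acc0"), driver.installed)
```

Keeping the 'ready' transition outside the agent mirrors the spec's stated goal of having the agent act mostly as a monitoring tool.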
@@ -23,11 +23,11 @@ Use Cases
 ---------

 * As a user I want to be able to spawn VM with dedicated hardware, so
-that I can utilize provided hardware.
+that I can utilize provided hardware.
 * As a compute service I need to know how requested resource should be
-attached to the VM.
+attached to the VM.
 * As a scheduler service I'd like to know on which resource provider
-requested resource can be found.
+requested resource can be found.

 Proposed change
 ===============
@@ -38,26 +38,28 @@ for Cyborg.
 Life Cycle Management Phases
 ----------------------------

-For cyborg, LCM phases include typical create, retrieve, update, delete operations.
-One thing should be noted that deprovisioning mainly refers to detach (delete) operation
-which deactivates an acceleration capability but preserves the resource itself
-for future usage. For Cyborg, from functional point of view, the LCM includes provision,
-attach, update, list, and detach. There is no notion of deprovisioning for Cyborg API
-in a sense that we decommission or disconnect an entire accelerator device from
-the bus.
+For cyborg, LCM phases include typical create, retrieve, update, delete
+operations. One thing should be noted that deprovisioning mainly refers to
+detach (delete) operation which deactivates an acceleration capability but
+preserves the resource itself for future usage. For Cyborg, from functional
+point of view, the LCM includes provision, attach, update, list, and detach.
+There is no notion of deprovisioning for Cyborg API in a sense that we
+decommission or disconnect an entire accelerator device from the bus.

 Difference between Provision and Attach/Detach
 ----------------------------------------------

-Note that while the APIs support provisioning via CRUD operations, attach/detach
-are considered different:
+Note that while the APIs support provisioning via CRUD operations,
+attach/detach are considered different:

 * Provision operations (create) will involve api->
-conductor->agent->driver workflow, whereas attach/detach (update/delete) could be taken
-care of at the driver layer without the involvement of the pre-mentioned workflow. This
-is similar to the difference between create a volume and attach/detach a volume in Cinder.
+conductor->agent->driver workflow, whereas attach/detach (update/delete)
+could be taken care of at the driver layer without the involvement of the
+pre-mentioned workflow. This is similar to the difference between create a
+volume and attach/detach a volume in Cinder.

-* The attach/detach in Cyborg API will mainly be involved in DB status modification.
+* The attach/detach in Cyborg API will mainly be involved in DB status
+modification.

 Difference between Attach/Detach To VM and Host
 -----------------------------------------------
@@ -66,23 +68,23 @@ Moreover there are also differences when we attach an accelerator to a VM or
 a host, similar to Cinder.

 * When the attachment happens to a VM, we are expecting that Nova could call
-the virt driver to perform the action for the instance. In this case Nova
-needs to support the acc-attach and acc-detach action.
+the virt driver to perform the action for the instance. In this case Nova
+needs to support the acc-attach and acc-detach action.

 * When the attachment happens to a host, we are expecting that Cyborg could
-take care of the action itself via Cyborg driver. Although currently there
-is the generic driver to accomplish the job, we should consider an os-brick
-like standalone lib for accelerator attach/detach operations.
+take care of the action itself via Cyborg driver. Although currently there
+is the generic driver to accomplish the job, we should consider an os-brick
+like standalone lib for accelerator attach/detach operations.

 Alternatives
 ------------

 * For attaching an accelerator to a VM, we could let Cyborg perform the action
-itself, however it runs into the risk of tight-coupling with Nova of which Cyborg
-needs to get instance related information.
-* For attaching an accelerator to a host, we could consider to use Ironic drivers
-however it might not bode well with the standalone accelerator rack scenarios where
-accelerators are not attached to server at all.
+itself, however it runs into the risk of tight-coupling with Nova of which
+Cyborg needs to get instance related information.
+* For attaching an accelerator to a host, we could consider to use Ironic
+drivers however it might not bode well with the standalone accelerator rack
+scenarios where accelerators are not attached to server at all.

 Data model impact
 -----------------
@@ -177,7 +179,7 @@ Example message body of the response to the GET operation::
 }

 'GET /accelerators/{uuid}'
-*************************
+**************************

 Retrieve a certain accelerator info identified by '{uuid}'

@@ -210,7 +212,7 @@ If the accelerator does not exist a `404 Not Found` must be
 returned.

 'POST /accelerators/{uuid}'
-*******************
+***************************

 Create a new accelerator

@@ -252,7 +254,7 @@ A `409 Conflict` response code will be returned if another accelerator
 exists with the provided name.

 'PUT /accelerators/{uuid}/{acc_spec}'
-*************************
+*************************************

 Update the spec for the accelerator identified by `{uuid}`.

@@ -289,7 +291,7 @@ The returned HTTP response code will be one of the following:


 'PUT /accelerators/{uuid}'
-*************************
+**************************

 Attach the accelerator identified by `{uuid}`.

@@ -322,11 +324,13 @@ The body of the request and the response is empty.

 The returned HTTP response code will be one of the following:

-* `204 No Content` if the request was successful and the accelerator was detached.
+* `204 No Content` if the request was successful and the accelerator was
+detached.
 * `404 Not Found` if the accelerator identified by `{uuid}` was
 not found.
 * `409 Conflict` if there exist allocation records for any of the
-accelerator resource that would be detached as a result of detaching the accelerator.
+accelerator resource that would be detached as a result of detaching
+the accelerator.


 Security impact
@@ -373,7 +377,7 @@ Work Items

 * Implement the APIs specified in this spec
 * Proposal to Nova about the new accelerator
-attach/detach api
+attach/detach api
 * Implement the DB specified in this spec


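The response codes quoted in these hunks can be exercised with a small in-memory model. This is a sketch under stated assumptions, not Cyborg code: the `AcceleratorStore` class, its `allocations` set, and the `attached` flag are hypothetical, and the success codes `201` (create) and `200` (get) are assumptions because the excerpt does not show them. Only `404`, `409` and `204` come from the text above.

```python
import uuid


class AcceleratorStore:
    """Toy model of the accelerator API's status codes (illustrative only)."""

    def __init__(self):
        self._accelerators = {}
        self.allocations = set()  # uuids that still have allocation records

    def create(self, name):
        # 409 Conflict if another accelerator exists with the provided name.
        if any(acc["name"] == name for acc in self._accelerators.values()):
            return 409, None
        acc_id = str(uuid.uuid4())
        # "attached" starts True only so that detach has something to change.
        self._accelerators[acc_id] = {
            "uuid": acc_id, "name": name, "attached": True,
        }
        return 201, self._accelerators[acc_id]  # 201 is an assumption

    def get(self, acc_id):
        acc = self._accelerators.get(acc_id)
        if acc is None:
            return 404, None
        return 200, acc  # 200 is an assumption

    def detach(self, acc_id):
        if acc_id not in self._accelerators:
            return 404
        if acc_id in self.allocations:
            return 409
        # Detach deactivates the capability but keeps the record itself.
        self._accelerators[acc_id]["attached"] = False
        return 204


if __name__ == "__main__":
    store = AcceleratorStore()
    status, acc = store.create("fpga0")
    print(status, store.detach(acc["uuid"]))
```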
@@ -30,23 +30,24 @@ Proposed change
 Cyborg Conductor will reside on the control node and will be
 responsible for stateful actions taken by Cyborg. Acting as both a cache to
 the database and as a method of combining reads and writes to the database.
-All other Cyborg components will go through the conductor for database operations.
+All other Cyborg components will go through the conductor for database
+operations.

 Alternatives
 ------------

 Having each Cyborg Agent instance hit the database on its own is a possible
-alternative, and it may even be feasible if the accelerator load monitoring rate is
-very low and the vast majority of operations are reads. But since we intend to store
-metadata about accelerator usage updated regularly this model probably will not scale
-well.
+alternative, and it may even be feasible if the accelerator load monitoring
+rate is very low and the vast majority of operations are reads. But since we
+intend to store metadata about accelerator usage updated regularly this model
+probably will not scale well.

 Data model impact
 -----------------

-Using the conductor 'properly' will result in little or no per instance state and stateful
-operations moving through the conductor with the exception of some local caching where it
-can be guaranteed to work well.
+Using the conductor 'properly' will result in little or no per instance state
+and stateful operations moving through the conductor with the exception of
+some local caching where it can be guaranteed to work well.

 REST API impact
 ---------------
@@ -120,8 +121,8 @@ CI using the dummy driver.
 Documentation Impact
 ====================

-Some configuration values tuning save out rate and other parameters on the controller
-will need to be documented for end users
+Some configuration values tuning save out rate and other parameters on the
+controller will need to be documented for end users

 References
 ==========
@@ -66,14 +66,15 @@ REST API impact
 ---------------

 This blueprint proposes to add the following APIs:
-*cyborg install-driver <driver_id>
-*cyborg uninstall-driver <driver_id>
-*cyborg attach-instance <instance_id>
-*cyborg detach-instance <instance_id>
-*cyborg service-list
-*cyborg driver-list
-*cyborg update-driver <driver_id>
-*cyborg discover-services
+
+* cyborg install-driver <driver_id>
+* cyborg uninstall-driver <driver_id>
+* cyborg attach-instance <instance_id>
+* cyborg detach-instance <instance_id>
+* cyborg service-list
+* cyborg driver-list
+* cyborg update-driver <driver_id>
+* cyborg discover-services

 Security impact
 ---------------
@@ -119,6 +120,7 @@ Work Items
 ----------

 This change would entail the following:
+
 * Add a feature to identify and discover attached accelerator backends.
 * Add a feature to list services running on the backend
 * Add a feature to attach accelerators to the generic backend.
@@ -17,10 +17,10 @@ Problem description

 A Field Programmable Gate Array (FPGA) is an integrated circuit designed to be
 configured by a customer or a designer after manufacturing. The advantage lies
-in that they are sometimes significantly faster for some applications because of
-their parallel nature and optimality in terms of the number of gates used for a
-certain process. Hence, using FPGA for application acceleration in cloud has been
-becoming desirable.
+in that they are sometimes significantly faster for some applications because
+of their parallel nature and optimality in terms of the number of gates used
+for a certain process. Hence, using FPGA for application acceleration in cloud
+has been becoming desirable.

 There is a management framework in Cyborg [1]_ for heterogeneous accelerators,
 tracking and deploying FPGAs. This spec will add an FPGA driver for Cyborg to
@@ -30,20 +30,20 @@ Use Cases
 ---------

 * When Cyborg agent starts or does resource checking periodically, the Cyborg
-FPGA driver should enumerate the list of the FPGA devices, and report the
-details of all available FPGA accelerators on the host, such as BDF(Bus,
-Device, Function), PID(Product id), VID(Vendor id), IMAGE_ID and PF(Physical
-Function)/VF(Virtual Function) type.
+FPGA driver should enumerate the list of the FPGA devices, and report the
+details of all available FPGA accelerators on the host, such as BDF(Bus,
+Device, Function), PID(Product id), VID(Vendor id), IMAGE_ID and PF(Physical
+Function)/VF(Virtual Function) type.

 * When user uses empty FPGA regions as their accelerators, Cyborg agent will
-call driver's program() interface. Cyborg agent should provide BDF
-of PF/VF, and local image path to the driver. More details can be found in ref
-[2]_.
+call driver's program() interface. Cyborg agent should provide BDF
+of PF/VF, and local image path to the driver. More details can be found in
+ref [2]_.

 * When there may be more than one vendor FPGA card on a host, or on different
-hosts in the cluster, Cyborg agent can discover the vendors easily and
-intelligently by Cyborg FPGA driver, and call the correct driver to execute
-its operations, such as discover() and program().
+hosts in the cluster, Cyborg agent can discover the vendors easily and
+intelligently by Cyborg FPGA driver, and call the correct driver to execute
+its operations, such as discover() and program().


 Proposed changes
@@ -54,27 +54,29 @@ discover/program interfaces for FPGA accelerator framework.

 The driver should include the following functions:
 1. discover()
-driver reports devices as follows:
-[{
-"vendor": "0x8086",
-"product": "bcc0",
-"pr_num": 1,
-"devices": "0000:be:00:0",
-"path": "/sys/class/fpga/intel-fpga-dev.0",
-"regions": [
-{"vendor": "0x8086",
-"product": "bcc1",
-"regions": 1,
-"devices": "0000:be:00:1",
-"path": "/sys/class/fpga/intel-fpga-dev.1"
-}]
-}]
-pr_num: partial reconfiguration region numbers.
+driver reports devices as follows::
+
+[{
+"vendor": "0x8086",
+"product": "bcc0",
+"pr_num": 1,
+"devices": "0000:be:00:0",
+"path": "/sys/class/fpga/intel-fpga-dev.0",
+"regions": [
+{"vendor": "0x8086",
+"product": "bcc1",
+"regions": 1,
+"devices": "0000:be:00:1",
+"path": "/sys/class/fpga/intel-fpga-dev.1"
+}]
+}]
+
+pr_num: partial reconfiguration region numbers.

 2. program(device_path, image)
-program the image to a PR region specified by device_path.
-device_path: the sys path of accelerator device.
-image: The local path of programming image.
+program the image to a PR region specified by device_path.
+device_path: the sys path of accelerator device.
+image: The local path of programming image.

 Image Format
 ----------------------------
@@ -161,7 +163,7 @@ Testing
 * Functional tests will be added to test Cyborg FPGA driver.

 Documentation Impact
-===================
+====================

 Document FPGA driver in the Cyborg project

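The two driver functions named in this spec, `discover()` and `program(device_path, image)`, could be modelled as an abstract interface along the following lines. This is an illustrative sketch, not the driver's actual code: the class names `FPGADriver` and `DummyFPGADriver`, the `programmed` dictionary, and the `ValueError` on an unknown path are assumptions. The sample record is copied from the spec text above.

```python
import abc


class FPGADriver(abc.ABC):
    """Hypothetical base class exposing the two functions named in the spec."""

    @abc.abstractmethod
    def discover(self):
        """Return a list of device dictionaries."""

    @abc.abstractmethod
    def program(self, device_path, image):
        """Program `image` into the PR region at `device_path`."""


class DummyFPGADriver(FPGADriver):
    """In-memory driver that returns the sample record from the spec."""

    def __init__(self):
        self.programmed = {}  # device_path -> image path

    def discover(self):
        return [{
            "vendor": "0x8086",
            "product": "bcc0",
            "pr_num": 1,
            "devices": "0000:be:00:0",
            "path": "/sys/class/fpga/intel-fpga-dev.0",
            "regions": [{
                "vendor": "0x8086",
                "product": "bcc1",
                "regions": 1,
                "devices": "0000:be:00:1",
                "path": "/sys/class/fpga/intel-fpga-dev.1",
            }],
        }]

    def program(self, device_path, image):
        known = {
            region["path"]
            for device in self.discover()
            for region in device["regions"]
        }
        if device_path not in known:
            raise ValueError("unknown region: %s" % device_path)
        self.programmed[device_path] = image


if __name__ == "__main__":
    driver = DummyFPGADriver()
    region = driver.discover()[0]["regions"][0]["path"]
    driver.program(region, "/tmp/example-image.bin")  # hypothetical image path
    print(driver.programmed)
```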
@@ -11,32 +11,35 @@
 Blueprint url is not available yet
 https://blueprints.launchpad.net/openstack-cyborg/+spec/cyborg-fpga-modelling

-This spec proposes the DB modelling schema for tracking reprogrammable resources
+This spec proposes the DB modelling schema for tracking reprogrammable
+resources

 Problem description
 ===================

 A field-programmable gate array (FPGA) is an integrated circuit designed to be
-configured by a customer or a designer after manufacturing. Their advantage lies
-in that they are sometimes significantly faster for some applications because of
-their parallel nature and optimality in terms of the number of gates used for a
-certain process. Hence, using FPGA for application acceleration in cloud has been
-becoming desirable. Cyborg as a management framework for heterogeneous accelerators
-,tracking and deploying FPGAs are much needed features.
+configured by a customer or a designer after manufacturing. Their advantage
+lies in that they are sometimes significantly faster for some applications
+because of their parallel nature and optimality in terms of the number of gates
+used for a certain process. Hence, using FPGA for application acceleration in
+cloud has been becoming desirable. Cyborg as a management framework for
+heterogeneous accelerators, tracking and deploying FPGAs are much needed
+features.


 Use Cases
 ---------

-When user requests FPGA resources, scheduler will use placement agent [1]_ to select
-appropriate hosts that have the requested FPGA resources.
+When user requests FPGA resources, scheduler will use placement agent [1]_ to
+select appropriate hosts that have the requested FPGA resources.

-When a FPGA type resource is allocated to a VM, Cyborg needs to track down which
-exact device has been assigned in the database. On the other hand, when the
-resource is released, Cyborg will need to detach and free the exact resource.
+When a FPGA type resource is allocated to a VM, Cyborg needs to track down
+which exact device has been assigned in the database. On the other hand, when
+the resource is released, Cyborg will need to detach and free the exact
+resource.

-When a new device is plugged in to the system(host), Cyborg needs to discover it
-and store it into the database
+When a new device is plugged in to the system(host), Cyborg needs to discover
+it and store it into the database

 Proposed change
 ===============
@@ -45,13 +48,14 @@ We need to add 2 more tables to Cyborg database, one for tracking all the
 deployables and one for arbitrary key-value pairs of deployable associated
 attributes. These tables are named as Deployables and Attributes.

-Deployables table consists of all the common attributes columns as well as a parent_id
-and a root_id. The parent_id will point to the associated parent deployable and the
-root_id will point to the associated root deployable. By doing this, we can form a
-nested tree structure to represent different hierarchies. In addition, there will be a
-foreign key named accelerator_id reference to the accelerators table. For the case
-where FPGA has not been loaded any bitstreams on it, they will still be tracked as
-a Deployable but no other Deployables referencing to it. For instance, a network of
+Deployables table consists of all the common attributes columns as well as
+a parent_id and a root_id. The parent_id will point to the associated parent
+deployable and the root_id will point to the associated root deployable.
+By doing this, we can form a nested tree structure to represent different
+hierarchies. In addition, there will be a foreign key named accelerator_id
+reference to the accelerators table. For the case where FPGA has not been
+loaded any bitstreams on it, they will still be tracked as a Deployable but
+no other Deployables referencing to it. For instance, a network of
 FPGA hierarchies can be formed using deployables in following scheme::

 -------------------
@@ -71,17 +75,18 @@ FPGA hierarchies can be formed using deployables in following scheme::
 ----------------- -----------------


-Attributes table consists of a key and a value columns to represent arbitrary k-v pairs.
+Attributes table consists of a key and a value columns to represent arbitrary
+k-v pairs.

-For instance, bitstream_id and function kpi can be tracked in this table.In addition,
-a foreign key deployable_id refers to the Deployables table and a parent_attribute_id
-to form nested structured attribute relationships.
+For instance, bitstream_id and function kpi can be tracked in this table.
+In addition, a foreign key deployable_id refers to the Deployables table and
+a parent_attribute_id to form nested structured attribute relationships.

-Cyborg needs to have object classes to represent different types of deployables(e.g.
-FPGA, Physical Functions, Virtual Functions etc).
+Cyborg needs to have object classes to represent different types of
+deployables(e.g. FPGA, Physical Functions, Virtual Functions etc).

-Cyborg Agent needs to add feature to discover the FPGA resources from FPGA driver
-and report them to the Cyborg DB through the conductor.
+Cyborg Agent needs to add feature to discover the FPGA resources from FPGA
+driver and report them to the Cyborg DB through the conductor.

 Conductor needs to add couple of sets of APIs for different types of deployable
 resources.
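The `parent_id`/`root_id` scheme described for the Deployables table turns a flat set of rows into a tree. A minimal sketch of how such rows can be walked is shown below; the sample rows, the `type` values, the convention that a root row has `parent_id` of `None` and points `root_id` at itself, and the helper names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical rows: one FPGA, one physical function, two virtual functions.
ROWS = [
    {"id": 1, "parent_id": None, "root_id": 1, "type": "FPGA"},
    {"id": 2, "parent_id": 1, "root_id": 1, "type": "PF"},
    {"id": 3, "parent_id": 2, "root_id": 1, "type": "VF"},
    {"id": 4, "parent_id": 2, "root_id": 1, "type": "VF"},
]


def build_children_index(rows):
    """Map each parent_id to the ids of its direct children."""
    children = defaultdict(list)
    for row in rows:
        children[row["parent_id"]].append(row["id"])
    return children


def descendants(children, node_id):
    """Return every id below `node_id`, following parent_id links."""
    found = []
    stack = list(children.get(node_id, []))
    while stack:
        current = stack.pop()
        found.append(current)
        stack.extend(children.get(current, []))
    return found


def subtree_by_root(rows, root_id):
    """Use root_id to fetch a whole hierarchy without walking the tree."""
    return [row["id"] for row in rows if row["root_id"] == root_id]


if __name__ == "__main__":
    index = build_children_index(ROWS)
    print(sorted(descendants(index, 1)))
    print(subtree_by_root(ROWS, 1))
```

Storing `root_id` alongside `parent_id` is what lets the last helper select a full hierarchy with a single equality filter instead of a recursive walk.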
@ -89,21 +94,23 @@ resources.
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Alternativly, instead of having a flat table to track arbitrary hierarchies, we can use
|
||||
two different tables in Cyborg database, one for physical functions and one for virtual
|
||||
functions. physical_functions should have a foreign key constraint to reference the id in
|
||||
Accelerators table. In addition, virtual_functions should have a foreign key constraint
|
||||
to reference the id in physical_functions.
|
||||
Alternativly, instead of having a flat table to track arbitrary hierarchies, we
|
||||
can use two different tables in Cyborg database, one for physical functions and
|
||||
one for virtual functions. physical_functions should have a foreign key
|
||||
constraint to reference the id in Accelerators table. In addition,
|
||||
virtual_functions should have a foreign key constraint to reference the id
|
||||
in physical_functions.
|
||||
|
||||
The problems with this design are as follows. First, it can only track up to 3 hierarchies
|
||||
of resources. In case we need to add another layer, a lot of migration work will
|
||||
be required. Second, even if we only need to add some new attribute to the existing
|
||||
resource type, we need to create new migration scripts for them. Overall the maintenance
|
||||
work is tedious.
|
||||
The problems with this design are as follows. First, it can only track up to
|
||||
3 hierarchies of resources. In case we need to add another layer, a lot of
|
||||
migration work will be required. Second, even if we only need to add some new
|
||||
attribute to the existing resource type, we need to create new migration
|
||||
scripts for them. Overall the maintenance work is tedious.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
As discussed in previous sections, two tables will be added: Deployables and Attributes::
|
||||
As discussed in previous sections, two tables will be added: Deployables and
|
||||
Attributes::
|
||||
|
||||
|
||||
CREATE TABLE Deployables
|
||||
@ -143,7 +150,8 @@ As discussed in previous sections, two tables will be added: Deployables and Att
|
||||
|
||||
RPC API impact
|
||||
---------------
|
||||
Two sets of conductor APIs need to be added. 1 set for physical functions, 1 set for virtual functions
|
||||
Two sets of conductor APIs need to be added. 1 set for physical functions,
|
||||
1 set for virtual functions
|
||||
|
||||
Physical function apis::
|
||||
|
||||
@ -161,9 +169,9 @@ Virtual function apis::
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
Since these tables are not exposed to users for modifying/adding/deleting, Cyborg
|
||||
will only add two extra REST APIs to allow users to query information related to
|
||||
deployables and their attributes.
|
||||
Since these tables are not exposed to users for modifying/adding/deleting,
|
||||
Cyborg will only add two extra REST APIs to allow users to query information
|
||||
related to deployables and their attributes.
|
||||
|
||||
API for retrieving Deployable's information::
|
||||
|
||||
|
@ -71,6 +71,8 @@ Driver 'POST /discovery'
|
||||
|
||||
Trigger the discovery and setup process for a specific driver
|
||||
|
||||
.. code_block:: init
|
||||
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
@ -85,6 +87,8 @@ ready to use entires available by the public API. Hardware are
|
||||
physical devices on nodes that may or may not be ready to use or
|
||||
even fully supported.
|
||||
|
||||
.. code_block:: init
|
||||
|
||||
200 OK
|
||||
Content-Type: application/json
|
||||
|
||||
@ -125,6 +129,8 @@ Driver 'POST /hello'
|
||||
Registers that a driver has been installed on the machine and is ready to use.
|
||||
As well as its endpoint and hardware support.
|
||||
|
||||
.. code_block:: init
|
||||
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
@ -154,6 +160,8 @@ Conductor 'POST /hello'
|
||||
|
||||
Registers that an Agent has been installed on the machine and is ready to use.
|
||||
|
||||
.. code_block:: init
|
||||
|
||||
Content-Type: application/json
|
||||
|
||||
{
|
||||
|
@ -13,10 +13,10 @@ https://blueprints.launchpad.net/cyborg/+spec/cyborg-nova-interaction
|
||||
Cyborg, as a service for managing accelerators of any kind needs to cooperate
|
||||
with Nova on two planes: Cyborg should be able to inform Nova about the
|
||||
resources through placement API[1], so that scheduler can leverage user
|
||||
requests for particular functionality into assignment of specific resource using
|
||||
resource provider which possess an accelerator, and second, Cyborg should be
|
||||
able to provide information on how Nova compute can attach particular resource
|
||||
to VM.
|
||||
requests for particular functionality into assignment of specific resource
|
||||
using resource provider which possess an accelerator, and second, Cyborg should
|
||||
be able to provide information on how Nova compute can attach particular
|
||||
resource to VM.
|
||||
|
||||
In a nutshell, this blueprint will define how information between Nova and
|
||||
Cyborg will be exchanged.
|
||||
@ -24,14 +24,14 @@ Cyborg will be exchanged.
|
||||
Problem description
|
||||
===================
|
||||
|
||||
Currently in OpenStack the use of non-standard accelerator hardware is supported
|
||||
in that features exist across many of the core servers that allow these resources
|
||||
to be allocated, passed through, and eventually used.
|
||||
Currently in OpenStack the use of non-standard accelerator hardware is
|
||||
supported in that features exist across many of the core servers that allow
|
||||
these resources to be allocated, passed through, and eventually used.
|
||||
|
||||
What remains a challenge though is the lack of an integrated workflow; there is no
|
||||
way to configure many of the accelerator features without significant by hand effort
|
||||
and service disruptions that go against the goals of having an easy, stable, and
|
||||
flexible cloud.
|
||||
What remains a challenge though is the lack of an integrated workflow; there
|
||||
is no way to configure many of the accelerator features without significant
|
||||
by hand effort and service disruptions that go against the goals of having
|
||||
an easy, stable, and flexible cloud.
|
||||
|
||||
Cyborg exists to bring these disjoint efforts together into a more standard
|
||||
workflow. While many components of this workflow already exist, some don't
|
||||
@ -53,7 +53,7 @@ used:
|
||||
|
||||
|
||||
Proposed Workflow
|
||||
===============
|
||||
=================
|
||||
|
||||
Using a method not relevant to this proposal Cyborg Agent inspects hardware
|
||||
and finds accelerators that it is interested in setting up for use.
|
||||
@ -66,21 +66,25 @@ One of the primary responsibilities of the Cyborg conductor is to keep the
|
||||
placement API in sync with reality. For example if there is a device with
|
||||
an FPGA or a virtual function with a given program Cyborg may be tasked with
|
||||
changing the virtual function on the NIC or the program on the FPGA. At which
|
||||
point the previously specified traits and resources need to be updated. Likewise
|
||||
Cyborg will be monitoring Nova's instances to ensure that doing this
|
||||
doesn't pull resources out from under an allocated instance.
|
||||
point the previously specified traits and resources need to be updated.
|
||||
Likewise Cyborg will be monitoring Nova's instances to ensure that
|
||||
doing this doesn't pull resources out from under an allocated instance.
|
||||
|
||||
At a high level what we need to be able to do is the following
|
||||
|
||||
1. Add a PCI device to Nova's whitelist live (config only / needs implementation)
|
||||
2. Add information about this device to the placement API (existing / being worked)
|
||||
3. Hotplug and unplug PCI devices from instances (existing / not sure how well maintained)
|
||||
1. Add a PCI device to Nova's whitelist live
|
||||
(config only / needs implementation)
|
||||
2. Add information about this device to the placement API
|
||||
(existing / being worked)
|
||||
3. Hotplug and unplug PCI devices from instances
|
||||
(existing / not sure how well maintained)
|
||||
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
Don't use Cyborg, struggle with bouncing services and grub config changes yourself.
|
||||
Don't use Cyborg, struggle with bouncing services and grub config changes
|
||||
yourself.
|
||||
|
||||
Data model impact
|
||||
-----------------
|
||||
@ -146,8 +150,8 @@ Dependencies
|
||||
This design depends on the changes which may or may not be accepted in Nova
|
||||
project. Other than that is ongoing work on Nested resource providers:
|
||||
http://specs.openstack.org/openstack/nova-specs/specs/ocata/approved/nested-resource-providers.html
|
||||
Which would be an essential feature in Placement API, which will be leveraged by
|
||||
Cyborg.
|
||||
Which would be an essential feature in Placement API, which will be leveraged
|
||||
by Cyborg.
|
||||
|
||||
|
||||
Testing
|
||||
|
@ -24,14 +24,14 @@ Use Cases
|
||||
---------
|
||||
|
||||
* When Cinder uses Ceph as its backend, the user should be able to
|
||||
use the Cyborg SPDK driver to discover the SPDK accelerator backend,
|
||||
enumerate the list of the Ceph nodes that have installed the SPDK.
|
||||
use the Cyborg SPDK driver to discover the SPDK accelerator backend,
|
||||
enumerate the list of the Ceph nodes that have installed the SPDK.
|
||||
* When Cinder directly uses SPDK's BlobStore as its backend, the user
|
||||
should be able to accomplish the same life cycle management operations
|
||||
for SPDK as mentioned above. After enumerating the SPDK, the user can
|
||||
attach (install) SPDK on that node. When the task completes, the user
|
||||
can also detach the SPDK from the node. Last but not least the user
|
||||
should be able to update the latest and available SPDK.
|
||||
should be able to accomplish the same life cycle management operations
|
||||
for SPDK as mentioned above. After enumerating the SPDK, the user can
|
||||
attach (install) SPDK on that node. When the task completes, the user
|
||||
can also detach the SPDK from the node. Last but not least the user
|
||||
should be able to update the latest and available SPDK.
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
@ -42,18 +42,18 @@ discover/list/update/attach/detach operations for SPDK framework.
|
||||
SPDK framework
|
||||
--------------
|
||||
|
||||
The SPDK framework comprises of the following components:
|
||||
The SPDK framework comprises of the following components::
|
||||
|
||||
+-----------userspace--------+ +--------------+
|
||||
| +------+ +------+ +------+ | | +-----------+ |
|
||||
+---+ | |DPDK | |NVMe | |NVMe | | | | Ceph | |
|
||||
| N +-+-+NIC | |Target| |Driver+-+-+ |NVMe Device| |
|
||||
| I | | |Driver| | | | | | | +-----------+ |
|
||||
| C | | +------+ +------+ +------+ | | +-----------+ |
|
||||
+---+ | +------------------------+ | | | Blobstore | |
|
||||
| | DPDK Libraries | | | |NVMe Device| |
|
||||
| +------------------------+ | | +-----------+ |
|
||||
+----------------------------+ +---------------+
|
||||
+-----------userspace--------+ +--------------+
|
||||
| +------+ +------+ +------+ | | +-----------+ |
|
||||
+---+ | |DPDK | |NVMe | |NVMe | | | | Ceph | |
|
||||
| N +-+-+NIC | |Target| |Driver+-+-+ |NVMe Device| |
|
||||
| I | | |Driver| | | | | | | +-----------+ |
|
||||
| C | | +------+ +------+ +------+ | | +-----------+ |
|
||||
+---+ | +------------------------+ | | | Blobstore | |
|
||||
| | DPDK Libraries | | | |NVMe Device| |
|
||||
| +------------------------+ | | +-----------+ |
|
||||
+----------------------------+ +---------------+
|
||||
|
||||
BlobStore NVMe Device Format
|
||||
----------------------------
|
||||
@ -87,25 +87,25 @@ avoids the filesystem, which improves efficiency.
|
||||
Life Cycle Management Phases
|
||||
----------------------------
|
||||
* We should be able to add a judgement whether the backend node has SPDK kit
|
||||
in generic driver module. If true, initialize the DPDK environment (such as
|
||||
hugepage).
|
||||
in generic driver module. If true, initialize the DPDK environment (such as
|
||||
hugepage).
|
||||
* Import the generic driver module, and then we should be able to
|
||||
discover (probe) the system for SPDK.
|
||||
discover (probe) the system for SPDK.
|
||||
* Determined by the backend storage scenario, enumerate (list) the optimal
|
||||
SPDK node, returning a boolean value to judge whether the SPDK should be
|
||||
attached.
|
||||
SPDK node, returning a boolean value to judge whether the SPDK should be
|
||||
attached.
|
||||
* After the node where SPDK will be running is attached, we can now send a
|
||||
request about the information of namespaces, and then create an I/O queue
|
||||
pair to submit read/write requests to a namespace.
|
||||
request about the information of namespaces, and then create an I/O queue
|
||||
pair to submit read/write requests to a namespace.
|
||||
* When Ceph is used as the backend, as the latest Ceph (such as Luminous)
|
||||
uses the BlueStore to be the storage engine, BlueStore and BlobStore are
|
||||
very similar things. We will not be able to use BlobStore to accelerate
|
||||
Ceph, but we can use Ioat and poller to boost speed for storage.
|
||||
uses the BlueStore to be the storage engine, BlueStore and BlobStore are
|
||||
very similar things. We will not be able to use BlobStore to accelerate
|
||||
Ceph, but we can use Ioat and poller to boost speed for storage.
|
||||
* When SPDK is used as the backend, we should be able to use BlobStore to
|
||||
improve performance.
|
||||
improve performance.
|
||||
* Whenever user requests, we should be able to detach the SPDK device.
|
||||
* Whenever user requests, we should be able to update SPDK to the latest and
|
||||
stable release.
|
||||
stable release.
|
||||
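The life cycle phases listed above can be sketched as a small driver object. This is a hedged illustration only: the class, method and node names are hypothetical, and the stand-in performs no real SPDK or DPDK calls.

```python
class FakeSPDKDriver:
    """In-memory stand-in for a driver; all names here are hypothetical."""

    def __init__(self, nodes):
        # nodes: mapping of node name -> True if the SPDK kit is present there
        self._nodes = dict(nodes)
        self.attached = set()
        self.version = "unknown"

    def discover(self):
        """Probe phase: report whether any node has SPDK installed."""
        return any(self._nodes.values())

    def list_nodes(self):
        """Enumerate phase: return the nodes that have SPDK installed."""
        return sorted(name for name, has_spdk in self._nodes.items() if has_spdk)

    def attach(self, node):
        """Attach phase: only nodes returned by list_nodes() can be attached."""
        if node not in self.list_nodes():
            raise ValueError("SPDK is not available on node %r" % node)
        self.attached.add(node)

    def detach(self, node):
        """Detach phase: detaching an unattached node is a no-op."""
        self.attached.discard(node)

    def update(self, version):
        """Update phase: record the release the driver was moved to."""
        self.version = version


def run_workflow(driver):
    """Run discover -> list -> attach and return the attached node names."""
    if not driver.discover():
        return []
    nodes = driver.list_nodes()
    for node in nodes:
        driver.attach(node)
    return nodes
```

A functional test of the discover, list, attach workflow mentioned in the Testing section could drive an object of this shape and then check which nodes ended up attached.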
|
||||
Alternatives
|
||||
------------
|
||||
@ -116,19 +116,20 @@ Data model impact
|
||||
-----------------
|
||||
|
||||
* The Cyborg SPDK driver will notify Cyborg Agent to update the database
|
||||
when discover/list/update/attach/detach operations take place.
|
||||
when discover/list/update/attach/detach operations take place.
|
||||
|
||||
REST API impact
|
||||
---------------
|
||||
|
||||
This blueprint proposes to add the following APIs:
|
||||
*cyborg discover-driver(driver_type)
|
||||
*cyborg driver-list(driver_type)
|
||||
*cyborg install-driver(driver_id, driver_type)
|
||||
*cyborg attach-instance <instance_id>
|
||||
*cyborg detach-instance <instance_id>
|
||||
*cyborg uninstall-driver(driver_id, driver_type)
|
||||
*cyborg update-driver <driver_id, driver_type>
|
||||
|
||||
* cyborg discover-driver(driver_type)
|
||||
* cyborg driver-list(driver_type)
|
||||
* cyborg install-driver(driver_id, driver_type)
|
||||
* cyborg attach-instance <instance_id>
|
||||
* cyborg detach-instance <instance_id>
|
||||
* cyborg uninstall-driver(driver_id, driver_type)
|
||||
* cyborg update-driver <driver_id, driver_type>
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
@ -176,7 +177,7 @@ Work Items
|
||||
|
||||
* Implement the cyborg-spdk-driver in this spec.
|
||||
* Propose SPDK to py-spdk. The py-spdk is designed as a SPDK client
|
||||
which provides the python binding.
|
||||
which provides the python binding.
|
||||
|
||||
|
||||
Dependencies
|
||||
@ -192,10 +193,10 @@ Testing
|
||||
|
||||
* Unit tests will be added to test Cyborg SPDK driver.
|
||||
* Functional tests will be added to test Cyborg SPDK driver. For example:
|
||||
discover-->list-->attach,whether the workflow can be passed successfully.
|
||||
discover-->list-->attach,whether the workflow can be passed successfully.
|
||||
|
||||
Documentation Impact
|
||||
===================
|
||||
====================
|
||||
|
||||
Document SPDK driver in the Cyborg project
|
||||
|
||||
|
@ -99,8 +99,8 @@ If this is one part of a larger effort make it clear where this piece ends. In
|
||||
other words, what's the scope of this effort?
|
||||
|
||||
At this point, if you would like to just get feedback on if the problem and
|
||||
proposed change fit in Cyborg, you can stop here and post this for review to get
|
||||
preliminary feedback. If so please say:
|
||||
proposed change fit in Cyborg, you can stop here and post this for review to
|
||||
get preliminary feedback. If so please say:
|
||||
Posting to get preliminary feedback on the scope of this spec.
|
||||
|
||||
Alternatives
|
||||
|
@ -20,3 +20,4 @@ sphinxcontrib-seqdiag # BSD
|
||||
reno # Apache-2.0
|
||||
os-api-ref # Apache-2.0
|
||||
tempest # Apache-2.0
|
||||
doc8>=0.6.0 # Apache-2.0
|
||||
|
5
tox.ini
@ -31,6 +31,7 @@ commands =
|
||||
|
||||
[testenv:pep8]
|
||||
commands = pep8 {posargs}
|
||||
doc8 {posargs}
|
||||
|
||||
[testenv:pep8-constraints]
|
||||
install_command = {[testenv:common-constraints]install_command}
|
||||
@ -42,6 +43,10 @@ commands = {posargs}
|
||||
[testenv:cover]
|
||||
commands = python setup.py testr --coverage --testr-args='{posargs}'
|
||||
|
||||
[doc8]
|
||||
ignore-path = .venv,.git,.tox,*cyborg/locale*,*lib/python*,*cyborg.egg*,api-ref/build,doc/build,doc/source/contributor/api
|
||||
|
||||
|
||||
[testenv:docs]
|
||||
commands =
|
||||
python setup.py build_sphinx
|
||||
|
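Most of the .rst changes in this patch are rewraps to satisfy doc8's line-length check, which defaults to 79 characters. The sketch below is not doc8's implementation, only a minimal approximation of that one rule, and the function name `long_lines` is hypothetical. It is applied to the old and new first line of the HACKING.rst paragraph from this patch.

```python
MAX_LINE_LENGTH = 79  # doc8's default maximum line length


def long_lines(text, limit=MAX_LINE_LENGTH):
    """Return (line_number, length) for every line longer than the limit."""
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


# First line of the HACKING.rst paragraph before and after the rewrap.
OLD_LINE = (
    "If any of the tests fail correct the error and try again. "
    "If your code is valid Python"
)
NEW_LINE = (
    "If any of the tests fail correct the error and try again. "
    "If your code is valid"
)

if __name__ == "__main__":
    print("old:", long_lines(OLD_LINE))
    print("new:", long_lines(NEW_LINE))
```

The real tool applies further rules and exemptions, so the authoritative check remains running the pep8 tox environment that this patch extends.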