Valence integration

This specs proposes to integrate with OpenStack Valence to support
compose node on the fly.

Change-Id: Id890263db8a62c7c74a11eedf75c87b148afa546
This commit is contained in:
Zhenguo Niu 2017-03-06 16:13:36 +08:00
parent 26c5e79070
commit 605c0698c0
2 changed files with 560 additions and 0 deletions

View File

@ -0,0 +1,384 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
==========================================
Example Spec - The title of your blueprint
==========================================
Include the URL of your launchpad blueprint:
https://blueprints.launchpad.net/mogan/+spec/example
Introduction paragraph -- why are we doing anything? A single paragraph of
prose that operators can understand. The title and this first paragraph
should be used as the subject line and body of the commit message
respectively.
Some notes about the mogan-spec and blueprint process:
* Not all blueprints need a spec. For more information see
http://docs.openstack.org/developer/mogan/blueprints.html#specs
* The aim of this document is first to define the problem we need to solve,
and second agree the overall approach to solve that problem.
* This is not intended to be extensive documentation for a new feature.
For example, there is no need to specify the exact configuration changes,
nor the exact details of any DB model changes. But you should still define
that such changes are required, and be clear on how that will affect
upgrades.
* You should aim to get your spec approved before writing your code.
While you are free to write prototypes and code before getting your spec
approved, its possible that the outcome of the spec review process leads
you towards a fundamentally different solution than you first envisaged.
* But, API changes are held to a much higher level of scrutiny.
As soon as an API change merges, we must assume it could be in production
somewhere, and as such, we then need to support that API change forever.
To avoid getting that wrong, we do want lots of details about API changes
upfront.
Some notes about using this template:
* Your spec should be in ReSTructured text, like this template.
* Please wrap text at 79 columns.
* The filename in the git repository should match the launchpad URL, for
example a URL of: https://blueprints.launchpad.net/mogan/+spec/awesome-thing
should be named awesome-thing.rst
* Please do not delete any of the sections in this template. If you have
nothing to say for a whole section, just write: None
* For help with syntax, see http://sphinx-doc.org/rest.html
* To test out your formatting, build the docs using tox and see the generated
HTML file in doc/build/html/specs/<path_of_your_file>
* If you would like to provide a diagram with your spec, ascii diagrams are
required. http://asciiflow.com/ is a very nice tool to assist with making
ascii diagrams. The reason for this is that the tool used to review specs is
based purely on plain text. Plain text will allow review to proceed without
having to look at additional files which can not be viewed in gerrit. It
will also allow inline feedback on the diagram itself.
* If your specification proposes any changes to the Mogan REST API such
as changing parameters which can be returned or accepted, or even
the semantics of what happens when a client calls into the API, then
you should add the APIImpact flag to the commit message. Specifications with
the APIImpact flag can be found with the following query:
https://review.openstack.org/#/q/status:open+project:openstack/mogan-specs+message:apiimpact,n,z
Problem description
===================
A detailed description of the problem. What problem is this blueprint
addressing?
Use Cases
---------
What use cases does this address? What impact on actors does this change have?
Ensure you are clear about the actors in each use case: Developer, End User,
Deployer etc.
Proposed change
===============
Here is where you cover the change you propose to make in detail. How do you
propose to solve this problem?
If this is one part of a larger effort make it clear where this piece ends. In
other words, what's the scope of this effort?
At this point, if you would like to just get feedback on if the problem and
proposed change fit in mogan, you can stop here and post this for review to get
preliminary feedback. If so please say:
Posting to get preliminary feedback on the scope of this spec.
Alternatives
------------
What other ways could we do this thing? Why aren't we using those? This doesn't
have to be a full literature review, but it should demonstrate that thought has
been put into why the proposed solution is an appropriate one.
Data model impact
-----------------
Changes which require modifications to the data model often have a wider impact
on the system. The community often has strong opinions on how the data model
should be evolved, from both a functional and performance perspective. It is
therefore important to capture and gain agreement as early as possible on any
proposed changes to the data model.
Questions which need to be addressed by this section include:
* What new data objects and/or database schema changes is this going to
require?
* What database migrations will accompany this change.
* How will the initial set of new data objects be generated, for example if you
need to take into account existing instances, or modify other existing data
describe how that will work.
REST API impact
---------------
Each API method which is either added or changed should have the following
* Specification for the method
* A description of what the method does suitable for use in
user documentation
* Method type (POST/PUT/GET/DELETE)
* Normal http response code(s)
* Expected error http response code(s)
* A description for each possible error code should be included
describing semantic errors which can cause it such as
inconsistent parameters supplied to the method, or when an
instance is not in an appropriate state for the request to
succeed. Errors caused by syntactic problems covered by the JSON
schema definition do not need to be included.
* URL for the resource
* URL should not include underscores, and use hyphens instead.
* Parameters which can be passed via the url
* JSON schema definition for the request body data if allowed
* Field names should use snake_case style, not CamelCase or MixedCase
style.
* JSON schema definition for the response body data if any
* Field names should use snake_case style, not CamelCase or MixedCase
style.
* Example use case including typical API samples for both data supplied
by the caller and the response
* Discuss any policy changes, and discuss what things a deployer needs to
think about when defining their policy.
Note that the schema should be defined as restrictively as
possible. Parameters which are required should be marked as such and
only under exceptional circumstances should additional parameters
which are not defined in the schema be permitted (eg
additionaProperties should be False).
Reuse of existing predefined parameter types such as regexps for
passwords and user defined names is highly encouraged.
Security impact
---------------
Describe any potential security impact on the system. Some of the items to
consider include:
* Does this change touch sensitive data such as tokens, keys, or user data?
* Does this change alter the API in a way that may impact security, such as
a new way to access sensitive information or a new way to login?
* Does this change involve cryptography or hashing?
* Does this change require the use of sudo or any elevated privileges?
* Does this change involve using or parsing user-provided data? This could
be directly at the API level or indirectly such as changes to a cache layer.
* Can this change enable a resource exhaustion attack, such as allowing a
single API interaction to consume significant server resources? Some examples
of this include launching subprocesses for each connection, or entity
expansion attacks in XML.
For more detailed guidance, please see the OpenStack Security Guidelines as
a reference (https://wiki.openstack.org/wiki/Security/Guidelines). These
guidelines are a work in progress and are designed to help you identify
security best practices. For further information, feel free to reach out
to the OpenStack Security Group at openstack-security@lists.openstack.org.
Notifications impact
--------------------
Please specify any changes to notifications. Be that an extra notification,
changes to an existing notification, or removing a notification.
Other end user impact
---------------------
Aside from the API, are there other ways a user will interact with this
feature?
* Does this change have an impact on python-moganclient? What does the user
interface there look like?
Performance Impact
------------------
Describe any potential performance impact on the system, for example
how often will new code be called, and is there a major change to the calling
pattern of existing code.
Examples of things to consider here include:
* A periodic task might look like a small addition but if it calls conductor or
another service the load is multiplied by the number of nodes in the system.
* Scheduler filters get called once per host for every instance being created,
so any latency they introduce is linear with the size of the system.
* A small change in a utility function or a commonly used decorator can have a
large impacts on performance.
* Calls which result in a database queries (whether direct or via conductor)
can have a profound impact on performance when called in critical sections of
the code.
* Will the change include any locking, and if so what considerations are there
on holding the lock?
Other deployer impact
---------------------
Discuss things that will affect how you deploy and configure OpenStack
that have not already been mentioned, such as:
* What config options are being added? Should they be more generic than
proposed (for example a flag that other hypervisor drivers might want to
implement as well)? Are the default values ones which will work well in
real deployments?
* Is this a change that takes immediate effect after its merged, or is it
something that has to be explicitly enabled?
* If this change is a new binary, how would it be deployed?
* Please state anything that those doing continuous deployment, or those
upgrading from the previous release, need to be aware of. Also describe
any plans to deprecate configuration values or features. For example, if we
change the directory name that instances are stored in, how do we handle
instance directories created before the change landed? Do we move them? Do
we have a special case in the code? Do we assume that the operator will
recreate all the instances in their cloud?
Developer impact
----------------
Discuss things that will affect other developers working on OpenStack,
such as:
* If the blueprint proposes a change to the driver API, discussion of how
other hypervisors would implement the feature is required.
Implementation
==============
Assignee(s)
-----------
Who is leading the writing of the code? Or is this a blueprint where you're
throwing it out there to see who picks it up?
If more than one person is working on the implementation, please designate the
primary author and contact.
Primary assignee:
<launchpad-id or None>
Other contributors:
<launchpad-id or None>
Work Items
----------
Work items or tasks -- break the feature up into the things that need to be
done to implement it. Those parts might end up being done by different people,
but we're mostly trying to understand the timeline for implementation.
Dependencies
============
* Include specific references to specs and/or blueprints in mogan, or in other
projects, that this one either depends on or is related to.
* If this requires functionality of another project that is not currently used
by Mogan, document that fact.
* Does this feature require any new library dependencies or code otherwise not
included in OpenStack? Or does it depend on a specific version of library?
Testing
=======
Please discuss the important scenarios needed to test here, as well as
specific edge cases we should be ensuring work correctly. For each
scenario please specify if this requires specialized hardware, a full
openstack environment, or can be simulated inside the Mogan tree.
Please discuss how the change will be tested. We especially want to know what
tempest tests will be added. It is assumed that unit test coverage will be
added so that doesn't need to be mentioned explicitly, but discussion of why
you think unit tests are sufficient and we don't need to add more tempest
tests would need to be included.
Is this untestable in gate given current limitations (specific hardware /
software configurations available)? If so, are there mitigation plans (3rd
party testing, gate enhancements, etc).
Documentation Impact
====================
Which audiences are affected most by this change, and which documentation
titles on docs.openstack.org should be updated because of this change? Don't
repeat details discussed above, but reference them here in the context of
documentation for multiple audiences. For example, the Operations Guide targets
cloud operators, and the End User Guide would need to be updated if the change
offers a new feature available through the CLI or dashboard. If a config option
changes or is deprecated, note here that the documentation needs to be updated
to reflect this specification's change.
References
==========
Please add any useful references here. You are not required to have any
reference. Moreover, this specification should still make sense when your
references are unavailable. Examples of what you could include are:
* Links to mailing list or IRC discussions
* Links to notes from a summit session
* Links to relevant research, if appropriate
* Related specifications as appropriate (e.g. if it's an EC2 thing, link the
EC2 docs)
* Anything else you feel it is worthwhile to refer to
.. list-table:: Revisions
:header-rows: 1
* - Release Name
- Description
* - Ocata
- Introduced

View File

@ -0,0 +1,176 @@
..
This work is licensed under a Creative Commons Attribution 3.0 Unported
License.
http://creativecommons.org/licenses/by/3.0/legalcode
===================
Valence Integration
===================
https://blueprints.launchpad.net/mogan/+spec/rsd-integration
The current Mogan implementation only supports pre-set configuration servers.
For custom servers, Mogan should to be able to compose bare metal through
integration with Valence that leverages the Redfish API to compose nodes using
disaggregated resources.
Problem description
===================
Mogan currently can only provision pre-set configuration servers, but users may
want to request a custom server with specific configurations like CPU, RAM, and
DISK.
Use Cases
---------
* An enterprise user wants to manage the RSD and general servers in a
unified manner.
* An enterprise user wants to apply a custom server with CPU, RAM, and DISK
specified himself.
Proposed change
===============
First, we need to refactor our flavor to pass Valence required parameters when
composing a node, need to align with Valence team. But for non-rack servers
we can keep the current way of scheduling a node to provision.
When a request come with the Valence specific flavor, We can invoke Valence to
compose the node on the fly, then register the composed node into Ironic with
Redfish driver(not supported yet). When nodes are enrolled in Ironic, there's
no difference with non-rack nodes. And these works are all done before the
current instance create workflow, so we can create a new taskflow [1]_ for
Valence which includes compose and enroll tasks:
ComposeNodeTask:
* execute: Invoke Valence to compose a node according the specified flavor.
* revert: Release the composed node if there's something wrong when enrolling.
EnrollNodeTask:
* execute: Enroll the composed node to Ironic.
* revert: If some exception raised and the node has been enrolled, need to
remove it from Ironic.
For Valence node, we should skip the scheduling task in provison workflow.
Currently there are ScheduleCreateInstanceTask and OnFailureRescheduleTask,
we can get rid of these two tasks when initialize the task flow in Valence
scenario. Or maybe can handle this like select which node instances are
launched(not supported yet).
Also, if there's some exception raised when provisioning, we should release the
composed node to Valence pool and remove it from Ironic.
When deleting a node we should remove it from ironic first, then release the
resources to Valence pool. For this, we can add a new field to instance to
indicate whether it's a valence instance or not.
Alternatives
------------
It will automatically invoke valence to compose node if scheduling max attempts
exceeds instead of using a specific flavor to indicate it's a Valence instance.
Data model impact
-----------------
The proposed change will be add the following fields to the instance object
with their data type and default value for migrations.
+-----------------------+--------------+-----------------+
| Field Name | Field Type | Migration Value |
+=======================+==============+=================+
| composed | bool | None |
+-----------------------+--------------+-----------------+
REST API impact
---------------
None
Security impact
---------------
None
Notifications impact
--------------------
None
Other end user impact
---------------------
None
Performance Impact
------------------
There's one potential performance impact on the instance creating process,
as we need to composing the node from Valence first.
Other deployer impact
---------------------
None
Developer impact
----------------
* As Mogan plans to support not only Ironic driver but also CloudBoot, need
to figure out whether CloudBoot has supported Redfish already or there's not
a plan to support it.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
<niu-zglinux>
Work Items
----------
* Refactor flavor(instance type) to meet Valence's requirements.
* Add `composed` filed to instance object.
* Add a new taskflow for node composing and enrolling.
* Change delete instance process to handle composed node gracefully.
* Add Valence installation in Mogan devstack plugin as an option
Dependencies
============
* Need valence client to be ready to integrate.
* Redfish driver landed in ironic.
* Valence PodManager simulator need to be improved, maybe return a fake
node(VM) and maybe we can test it with ssh driver before Redfish driver
available.
Testing
=======
Unit Testing will be added.
Documentation Impact
====================
Docs about Valence integration will be added.
References
==========
.. [1] http://wiki.openstack.org/wiki/TaskFlow
* https://wiki.openstack.org/wiki/Valence