
Valence integration

This spec proposes integrating with OpenStack Valence to support
composing nodes on the fly.

Change-Id: Id890263db8a62c7c74a11eedf75c87b148afa546
changes/90/441790/3
Zhenguo Niu 2 years ago
parent
commit
605c0698c0
2 changed files with 560 additions and 0 deletions
  1. 384
    0
      specs/pike/approved/template.rst
  2. 176
    0
      specs/pike/approved/valence-integration.rst

+ 384
- 0
specs/pike/approved/template.rst

@@ -0,0 +1,384 @@
1
+..
2
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
3
+ License.
4
+
5
+ http://creativecommons.org/licenses/by/3.0/legalcode
6
+
7
+==========================================
8
+Example Spec - The title of your blueprint
9
+==========================================
10
+
11
+Include the URL of your launchpad blueprint:
12
+
13
+https://blueprints.launchpad.net/mogan/+spec/example
14
+
15
+Introduction paragraph -- why are we doing anything? A single paragraph of
16
+prose that operators can understand. The title and this first paragraph
17
+should be used as the subject line and body of the commit message
18
+respectively.
19
+
20
+Some notes about the mogan-spec and blueprint process:
21
+
22
+* Not all blueprints need a spec. For more information see
23
+  http://docs.openstack.org/developer/mogan/blueprints.html#specs
24
+
25
+* The aim of this document is first to define the problem we need to solve,
26
+  and second agree the overall approach to solve that problem.
27
+
28
+* This is not intended to be extensive documentation for a new feature.
29
+  For example, there is no need to specify the exact configuration changes,
30
+  nor the exact details of any DB model changes. But you should still define
31
+  that such changes are required, and be clear on how that will affect
32
+  upgrades.
33
+
34
+* You should aim to get your spec approved before writing your code.
35
+  While you are free to write prototypes and code before getting your spec
36
+  approved, it's possible that the outcome of the spec review process leads
37
+  you towards a fundamentally different solution than you first envisaged.
38
+
39
+* But, API changes are held to a much higher level of scrutiny.
40
+  As soon as an API change merges, we must assume it could be in production
41
+  somewhere, and as such, we then need to support that API change forever.
42
+  To avoid getting that wrong, we do want lots of details about API changes
43
+  upfront.
44
+
45
+Some notes about using this template:
46
+
47
+* Your spec should be in reStructuredText, like this template.
48
+
49
+* Please wrap text at 79 columns.
50
+
51
+* The filename in the git repository should match the launchpad URL, for
52
+  example a URL of: https://blueprints.launchpad.net/mogan/+spec/awesome-thing
53
+  should be named awesome-thing.rst
54
+
55
+* Please do not delete any of the sections in this template.  If you have
56
+  nothing to say for a whole section, just write: None
57
+
58
+* For help with syntax, see http://sphinx-doc.org/rest.html
59
+
60
+* To test out your formatting, build the docs using tox and see the generated
61
+  HTML file in doc/build/html/specs/<path_of_your_file>
62
+
63
+* If you would like to provide a diagram with your spec, ascii diagrams are
64
+  required.  http://asciiflow.com/ is a very nice tool to assist with making
65
+  ascii diagrams.  The reason for this is that the tool used to review specs is
66
+  based purely on plain text.  Plain text will allow review to proceed without
67
+  having to look at additional files which cannot be viewed in gerrit.  It
68
+  will also allow inline feedback on the diagram itself.
69
+
70
+* If your specification proposes any changes to the Mogan REST API such
71
+  as changing parameters which can be returned or accepted, or even
72
+  the semantics of what happens when a client calls into the API, then
73
+  you should add the APIImpact flag to the commit message. Specifications with
74
+  the APIImpact flag can be found with the following query:
75
+
76
+  https://review.openstack.org/#/q/status:open+project:openstack/mogan-specs+message:apiimpact,n,z
77
+
78
+
79
+Problem description
80
+===================
81
+
82
+A detailed description of the problem. What problem is this blueprint
83
+addressing?
84
+
85
+Use Cases
86
+---------
87
+
88
+What use cases does this address? What impact on actors does this change have?
89
+Ensure you are clear about the actors in each use case: Developer, End User,
90
+Deployer etc.
91
+
92
+Proposed change
93
+===============
94
+
95
+Here is where you cover the change you propose to make in detail. How do you
96
+propose to solve this problem?
97
+
98
+If this is one part of a larger effort make it clear where this piece ends. In
99
+other words, what's the scope of this effort?
100
+
101
+At this point, if you would like to just get feedback on if the problem and
102
+proposed change fit in mogan, you can stop here and post this for review to get
103
+preliminary feedback. If so please say:
104
+Posting to get preliminary feedback on the scope of this spec.
105
+
106
+Alternatives
107
+------------
108
+
109
+What other ways could we do this thing? Why aren't we using those? This doesn't
110
+have to be a full literature review, but it should demonstrate that thought has
111
+been put into why the proposed solution is an appropriate one.
112
+
113
+Data model impact
114
+-----------------
115
+
116
+Changes which require modifications to the data model often have a wider impact
117
+on the system.  The community often has strong opinions on how the data model
118
+should be evolved, from both a functional and performance perspective. It is
119
+therefore important to capture and gain agreement as early as possible on any
120
+proposed changes to the data model.
121
+
122
+Questions which need to be addressed by this section include:
123
+
124
+* What new data objects and/or database schema changes is this going to
125
+  require?
126
+
127
+* What database migrations will accompany this change?
128
+
129
+* How will the initial set of new data objects be generated, for example if you
130
+  need to take into account existing instances, or modify other existing data
131
+  describe how that will work.
132
+
133
+REST API impact
134
+---------------
135
+
136
+Each API method which is either added or changed should have the following:
137
+
138
+* Specification for the method
139
+
140
+  * A description of what the method does suitable for use in
141
+    user documentation
142
+
143
+  * Method type (POST/PUT/GET/DELETE)
144
+
145
+  * Normal http response code(s)
146
+
147
+  * Expected error http response code(s)
148
+
149
+    * A description for each possible error code should be included
150
+      describing semantic errors which can cause it such as
151
+      inconsistent parameters supplied to the method, or when an
152
+      instance is not in an appropriate state for the request to
153
+      succeed. Errors caused by syntactic problems covered by the JSON
154
+      schema definition do not need to be included.
155
+
156
+  * URL for the resource
157
+
158
+    * URL should not include underscores, and use hyphens instead.
159
+
160
+  * Parameters which can be passed via the url
161
+
162
+  * JSON schema definition for the request body data if allowed
163
+
164
+    * Field names should use snake_case style, not CamelCase or MixedCase
165
+      style.
166
+
167
+  * JSON schema definition for the response body data if any
168
+
169
+    * Field names should use snake_case style, not CamelCase or MixedCase
170
+      style.
171
+
172
+* Example use case including typical API samples for both data supplied
173
+  by the caller and the response
174
+
175
+* Discuss any policy changes, and discuss what things a deployer needs to
176
+  think about when defining their policy.
177
+
178
+Note that the schema should be defined as restrictively as
179
+possible. Parameters which are required should be marked as such and
180
+only under exceptional circumstances should additional parameters
181
+which are not defined in the schema be permitted (e.g.
182
+additionalProperties should be False).
183
+
184
+Reuse of existing predefined parameter types such as regexps for
185
+passwords and user defined names is highly encouraged.
186
+
187
+Security impact
188
+---------------
189
+
190
+Describe any potential security impact on the system.  Some of the items to
191
+consider include:
192
+
193
+* Does this change touch sensitive data such as tokens, keys, or user data?
194
+
195
+* Does this change alter the API in a way that may impact security, such as
196
+  a new way to access sensitive information or a new way to login?
197
+
198
+* Does this change involve cryptography or hashing?
199
+
200
+* Does this change require the use of sudo or any elevated privileges?
201
+
202
+* Does this change involve using or parsing user-provided data? This could
203
+  be directly at the API level or indirectly such as changes to a cache layer.
204
+
205
+* Can this change enable a resource exhaustion attack, such as allowing a
206
+  single API interaction to consume significant server resources? Some examples
207
+  of this include launching subprocesses for each connection, or entity
208
+  expansion attacks in XML.
209
+
210
+For more detailed guidance, please see the OpenStack Security Guidelines as
211
+a reference (https://wiki.openstack.org/wiki/Security/Guidelines).  These
212
+guidelines are a work in progress and are designed to help you identify
213
+security best practices.  For further information, feel free to reach out
214
+to the OpenStack Security Group at openstack-security@lists.openstack.org.
215
+
216
+Notifications impact
217
+--------------------
218
+
219
+Please specify any changes to notifications. Be that an extra notification,
220
+changes to an existing notification, or removing a notification.
221
+
222
+Other end user impact
223
+---------------------
224
+
225
+Aside from the API, are there other ways a user will interact with this
226
+feature?
227
+
228
+* Does this change have an impact on python-moganclient? What does the user
229
+  interface there look like?
230
+
231
+Performance Impact
232
+------------------
233
+
234
+Describe any potential performance impact on the system, for example
235
+how often will new code be called, and is there a major change to the calling
236
+pattern of existing code.
237
+
238
+Examples of things to consider here include:
239
+
240
+* A periodic task might look like a small addition but if it calls conductor or
241
+  another service the load is multiplied by the number of nodes in the system.
242
+
243
+* Scheduler filters get called once per host for every instance being created,
244
+  so any latency they introduce is linear with the size of the system.
245
+
246
+* A small change in a utility function or a commonly used decorator can have a
247
+  large impact on performance.
248
+
249
+* Calls which result in database queries (whether direct or via conductor)
250
+  can have a profound impact on performance when called in critical sections of
251
+  the code.
252
+
253
+* Will the change include any locking, and if so what considerations are there
254
+  on holding the lock?
255
+
256
+Other deployer impact
257
+---------------------
258
+
259
+Discuss things that will affect how you deploy and configure OpenStack
260
+that have not already been mentioned, such as:
261
+
262
+* What config options are being added? Should they be more generic than
263
+  proposed (for example a flag that other hypervisor drivers might want to
264
+  implement as well)? Are the default values ones which will work well in
265
+  real deployments?
266
+
267
+* Is this a change that takes immediate effect after it's merged, or is it
268
+  something that has to be explicitly enabled?
269
+
270
+* If this change is a new binary, how would it be deployed?
271
+
272
+* Please state anything that those doing continuous deployment, or those
273
+  upgrading from the previous release, need to be aware of. Also describe
274
+  any plans to deprecate configuration values or features.  For example, if we
275
+  change the directory name that instances are stored in, how do we handle
276
+  instance directories created before the change landed?  Do we move them?  Do
277
+  we have a special case in the code? Do we assume that the operator will
278
+  recreate all the instances in their cloud?
279
+
280
+Developer impact
281
+----------------
282
+
283
+Discuss things that will affect other developers working on OpenStack,
284
+such as:
285
+
286
+* If the blueprint proposes a change to the driver API, discussion of how
287
+  other hypervisors would implement the feature is required.
288
+
289
+
290
+Implementation
291
+==============
292
+
293
+Assignee(s)
294
+-----------
295
+
296
+Who is leading the writing of the code? Or is this a blueprint where you're
297
+throwing it out there to see who picks it up?
298
+
299
+If more than one person is working on the implementation, please designate the
300
+primary author and contact.
301
+
302
+Primary assignee:
303
+  <launchpad-id or None>
304
+
305
+Other contributors:
306
+  <launchpad-id or None>
307
+
308
+Work Items
309
+----------
310
+
311
+Work items or tasks -- break the feature up into the things that need to be
312
+done to implement it. Those parts might end up being done by different people,
313
+but we're mostly trying to understand the timeline for implementation.
314
+
315
+
316
+Dependencies
317
+============
318
+
319
+* Include specific references to specs and/or blueprints in mogan, or in other
320
+  projects, that this one either depends on or is related to.
321
+
322
+* If this requires functionality of another project that is not currently used
323
+  by Mogan, document that fact.
324
+
325
+* Does this feature require any new library dependencies or code otherwise not
326
+  included in OpenStack? Or does it depend on a specific version of a library?
327
+
328
+
329
+Testing
330
+=======
331
+
332
+Please discuss the important scenarios needed to test here, as well as
333
+specific edge cases we should be ensuring work correctly. For each
334
+scenario please specify if this requires specialized hardware, a full
335
+openstack environment, or can be simulated inside the Mogan tree.
336
+
337
+Please discuss how the change will be tested. We especially want to know what
338
+tempest tests will be added. It is assumed that unit test coverage will be
339
+added so that doesn't need to be mentioned explicitly, but discussion of why
340
+you think unit tests are sufficient and we don't need to add more tempest
341
+tests would need to be included.
342
+
343
+Is this untestable in gate given current limitations (specific hardware /
344
+software configurations available)? If so, are there mitigation plans (3rd
345
+party testing, gate enhancements, etc.)?
346
+
347
+
348
+Documentation Impact
349
+====================
350
+
351
+Which audiences are affected most by this change, and which documentation
352
+titles on docs.openstack.org should be updated because of this change? Don't
353
+repeat details discussed above, but reference them here in the context of
354
+documentation for multiple audiences. For example, the Operations Guide targets
355
+cloud operators, and the End User Guide would need to be updated if the change
356
+offers a new feature available through the CLI or dashboard. If a config option
357
+changes or is deprecated, note here that the documentation needs to be updated
358
+to reflect this specification's change.
359
+
360
+References
361
+==========
362
+
363
+Please add any useful references here. You are not required to have any
364
+reference. Moreover, this specification should still make sense when your
365
+references are unavailable. Examples of what you could include are:
366
+
367
+* Links to mailing list or IRC discussions
368
+
369
+* Links to notes from a summit session
370
+
371
+* Links to relevant research, if appropriate
372
+
373
+* Related specifications as appropriate (e.g.  if it's an EC2 thing, link the
374
+  EC2 docs)
375
+
376
+* Anything else you feel it is worthwhile to refer to
377
+
378
+.. list-table:: Revisions
379
+   :header-rows: 1
380
+
381
+   * - Release Name
382
+     - Description
383
+   * - Ocata
384
+     - Introduced

+ 176
- 0
specs/pike/approved/valence-integration.rst

@@ -0,0 +1,176 @@
1
+..
2
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
3
+ License.
4
+
5
+ http://creativecommons.org/licenses/by/3.0/legalcode
6
+
7
+===================
8
+Valence Integration
9
+===================
10
+
11
+https://blueprints.launchpad.net/mogan/+spec/rsd-integration
12
+
13
+The current Mogan implementation supports only preconfigured servers.
14
+For custom servers, Mogan should be able to compose bare metal nodes through
15
+integration with Valence, which uses the Redfish API to compose nodes from
16
+disaggregated resources.
17
+
18
+
19
+Problem description
20
+===================
21
+
22
+Mogan can currently provision only preconfigured servers, but users may
23
+want to request a custom server with specific configurations like CPU, RAM, and
24
+disk.
25
+
26
+Use Cases
27
+---------
28
+
29
+* An enterprise user wants to manage the RSD and general servers in a
30
+  unified manner.
31
+
32
+* An enterprise user wants to request a custom server with CPU, RAM, and disk
33
+  specified by themselves.
34
+
35
+
36
+Proposed change
37
+===============
38
+
39
+First, we need to refactor our flavor to pass the parameters Valence requires
40
+when composing a node, which needs to be agreed with the Valence team. For
41
+non-rack servers, we can keep the current scheduling-based provisioning.
42
+
43
+When a request comes with a Valence-specific flavor, we can invoke Valence to
44
+compose the node on the fly, then register the composed node in Ironic with
45
+the Redfish driver (not supported yet). Once nodes are enrolled in Ironic, they
46
+are no different from non-rack nodes. All of this work happens before the
47
+current instance create workflow, so we can create a new taskflow [1]_ for
48
+Valence which includes compose and enroll tasks:
49
+
50
+ComposeNodeTask:
51
+  * execute: Invoke Valence to compose a node matching the specified flavor.
52
+  * revert: Release the composed node if something goes wrong when enrolling.
53
+
54
+EnrollNodeTask:
55
+  * execute: Enroll the composed node in Ironic.
56
+  * revert: If an exception is raised after the node has been enrolled,
57
+    remove it from Ironic.
58
+
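The compose and enroll tasks above can be sketched in plain Python. This is only an illustration of the execute/revert ordering, not the TaskFlow API and not Mogan code: `FakeValence`, `FakeIronic`, `run_flow`, and the `ctx` dictionary are all hypothetical names invented for the sketch, and the real clients and flavor format are assumptions.

```python
class FakeValence:
    """Hypothetical stand-in for a Valence client (not a real API)."""

    def __init__(self):
        self.composed = set()

    def compose_node(self, flavor):
        node_id = "node-%d" % (len(self.composed) + 1)
        self.composed.add(node_id)
        return node_id

    def release_node(self, node_id):
        self.composed.discard(node_id)


class FakeIronic:
    """Hypothetical stand-in for an Ironic client (not a real API)."""

    def __init__(self, fail_enroll=False):
        self.nodes = set()
        self.fail_enroll = fail_enroll

    def enroll(self, node_id):
        if self.fail_enroll:
            raise RuntimeError("enroll failed")
        self.nodes.add(node_id)

    def remove(self, node_id):
        self.nodes.discard(node_id)


class ComposeNodeTask:
    def __init__(self, valence):
        self.valence = valence

    def execute(self, ctx):
        ctx["node_id"] = self.valence.compose_node(ctx["flavor"])

    def revert(self, ctx):
        # Give the node back to the pool if anything later in the flow failed.
        node_id = ctx.pop("node_id", None)
        if node_id is not None:
            self.valence.release_node(node_id)


class EnrollNodeTask:
    def __init__(self, ironic):
        self.ironic = ironic

    def execute(self, ctx):
        self.ironic.enroll(ctx["node_id"])
        ctx["enrolled"] = True

    def revert(self, ctx):
        # Only remove the node if enrollment actually happened.
        if ctx.pop("enrolled", False):
            self.ironic.remove(ctx["node_id"])


def run_flow(tasks, ctx):
    """Run tasks in order; on failure, revert started tasks in reverse order."""
    started = []
    try:
        for task in tasks:
            started.append(task)
            task.execute(ctx)
    except Exception:
        for task in reversed(started):
            task.revert(ctx)
        raise
    return ctx
```

In this sketch a failed enrollment leaves nothing behind: the enroll task has nothing to undo, and the compose task releases the node it obtained.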
59
+For Valence nodes, we should skip the scheduling tasks in the provision
60
+workflow. Currently these are ScheduleCreateInstanceTask and
61
+OnFailureRescheduleTask; we can omit both when initializing the task flow in
62
+the Valence scenario. Alternatively, this could be handled like selecting the
63
+node on which instances are launched (not supported yet).
64
+
65
+Also, if an exception is raised during provisioning, we should release the
66
+composed node to the Valence pool and remove it from Ironic.
67
+
68
+When deleting a node, we should remove it from Ironic first, then release the
69
+resources to the Valence pool. For this, we can add a new field to the instance
70
+to indicate whether it is a Valence instance.
71
+
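The delete ordering described above might look roughly like the following. This is a sketch under assumptions: `Recorder` and `cleanup_node` are hypothetical helpers, the instance is modelled as a plain dictionary, and a missing or `None` value for `composed` is treated as "not a Valence instance".

```python
class Recorder:
    """Hypothetical client stub that records the calls made on it."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def call(self, action, node_id):
        self.log.append((self.name, action, node_id))


def cleanup_node(instance, ironic, valence):
    """Remove a composed node from Ironic first, then release it to Valence."""
    if not instance.get("composed"):
        # Pre-existing (non-composed) node: nothing to release.
        return
    ironic.call("remove", instance["node_id"])
    valence.call("release", instance["node_id"])
```

Treating `None` as false keeps instances created before the new field existed on the current delete path.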
72
+
73
+Alternatives
74
+------------
75
+
76
+Automatically invoke Valence to compose a node when the maximum scheduling
77
+attempts are exceeded, instead of relying on a Valence-specific flavor.
78
+
79
+Data model impact
80
+-----------------
81
+
82
+The proposed change will add the following field to the instance object,
83
+shown with its data type and default value for migrations.
84
+
85
++-----------------------+--------------+-----------------+
86
+| Field Name            | Field Type   | Migration Value |
87
++=======================+==============+=================+
88
+| composed              | bool         | None            |
89
++-----------------------+--------------+-----------------+
90
+
91
+
92
+REST API impact
93
+---------------
94
+
95
+None
96
+
97
+Security impact
98
+---------------
99
+
100
+None
101
+
102
+Notifications impact
103
+--------------------
104
+
105
+None
106
+
107
+Other end user impact
108
+---------------------
109
+
110
+None
111
+
112
+Performance Impact
113
+------------------
114
+
115
+There's one potential performance impact on the instance creation process,
116
+as we need to compose the node from Valence first.
117
+
118
+Other deployer impact
119
+---------------------
120
+
121
+None
122
+
123
+Developer impact
124
+----------------
125
+
126
+* As Mogan plans to support not only the Ironic driver but also CloudBoot, we
127
+  need to figure out whether CloudBoot already supports Redfish or has no plan
128
+  to support it.
129
+
130
+
131
+Implementation
132
+==============
133
+
134
+Assignee(s)
135
+-----------
136
+
137
+Primary assignee:
138
+  <niu-zglinux>
139
+
140
+Work Items
141
+----------
142
+
143
+* Refactor flavor (instance type) to meet Valence's requirements.
144
+* Add a `composed` field to the instance object.
145
+* Add a new taskflow for node composing and enrolling.
146
+* Change the instance deletion process to handle composed nodes gracefully.
147
+* Add Valence installation to the Mogan devstack plugin as an option.
148
+
149
+Dependencies
150
+============
151
+
152
+* The Valence client needs to be ready for integration.
153
+
154
+* The Redfish driver needs to land in Ironic.
155
+
156
+* The Valence PodManager simulator needs to be improved, maybe to return a fake
157
+  node (VM); we could then test it with the ssh driver before the Redfish
158
+  driver is available.
159
+
160
+
161
+Testing
162
+=======
163
+
164
+Unit tests will be added.
165
+
166
+Documentation Impact
167
+====================
168
+
169
+Docs about Valence integration will be added.
170
+
171
+References
172
+==========
173
+
174
+.. [1] http://wiki.openstack.org/wiki/TaskFlow
175
+
176
+* https://wiki.openstack.org/wiki/Valence
