Merge "Specs for pegleg site manifest generation tool"

Zuul 5 months ago
commit 515e36ea03
1 changed file with 409 additions and 0 deletions

specs/approved/data_config_generator.rst

@@ -0,0 +1,409 @@
1
+..
2
+  This work is licensed under a Creative Commons Attribution 3.0 Unported
3
+  License.
4
+
5
+  http://creativecommons.org/licenses/by/3.0/legalcode
6
+
7
+========
8
+Spyglass
9
+========
10
+
11
+Spyglass is a data extraction tool which can interface with
12
+different input data sources to generate site manifest YAML files.
13
+The data sources will provide all the configuration data needed
14
+for a site deployment. These site manifest YAML files generated
15
+by spyglass will be saved in a Git repository, from where Pegleg
16
+can access and aggregate them. This aggregated file can then be
17
+fed to Shipyard for site deployment / updates.
18
+
19
+Problem description
20
+===================
21
+
22
+During the deployment of an Airship Genesis node, Pegleg expects the
23
+deployment engineer to provide all the information pertaining to Genesis,
24
+Controller & Compute nodes such as PXE IPs, VLANs pertaining to Storage
25
+network, Kubernetes network, Storage disks, Host profiles, etc. as
26
+manifests/YAMLs that are easily understandable by Pegleg.
27
+Currently there exist multiple data sources, and these inputs are processed
28
+manually by deployment engineers. Considering the fact that there are
29
+multiple sites for which we need to generate such data, the current process
30
+is cumbersome, error-prone and time-intensive.
31
+
32
+The solution to this problem is to automate the overall process so that
33
+the resultant workflow has standardized operations to handle multiple data
34
+sources and generate site YAMLs considering site type and version.
35
+
36
+Impacted components
37
+===================
38
+
39
+None.
40
+
41
+Proposed change
42
+===============
43
+
44
+The proposal is to develop a standalone stateless automation utility to
45
+extract relevant information from a given site data source and process
46
+it against site specific templates to generate site manifests which can
47
+be consumed by Pegleg. The data sources could be different engineering packages
48
+or data extracted from remote external sources. One example of a remote data source
49
+can be an API endpoint.
50
+
51
+The application shall perform the automation in two stages. In the first stage
52
+it shall generate a standardized intermediary YAML object after parsing extracted
53
+information from the data source. In the second stage the intermediary YAML shall be
54
+processed by a site processor using site specific templates to generate
55
+site manifests.
56
+
57
+Overall Architecture
58
+====================
59
+
60
+::
61
+
62
+        +-----------+           +-------------+
63
+        |           |           |  +-------+  |
64
+        |           |   +------>|  |Generic|  |
65
+    +-----------+   |   |       |  |Object |  |
66
+    |Tugboat(Xl)| I |   |       |  +-------+  |
67
+    |Plugin     | N |   |       |     |       |
68
+    +-----------+ T |   |       |     |       |
69
+        |         E |   |       |  +------+   |
70
+   +------------+ R |   |       |  |Parser|   +------> Intermediary YAML
71
+   |Remote Data | F |---+       |  +------+   |
72
+   |SourcePlugin| A |           |     |       |
73
+   +------------+ C |           |     |(Intermediary YAML)
74
+        |         E |           |     |       |
75
+        |           |           |     |       |
76
+        |         H |           |     v       |
77
+        |         A |           |  +---------+|(templates)    +------------+
78
+        |         N |           |  |Site     |+<--------------|Repository  |
79
+        |         D |           |  |Processor||-------------->|Adapter     |
80
+        |         L |           |  +---------+|(Generated     +------------+
81
+        |         E |           |      ^      | Site Manifests)
82
+        |         R |           |  +---|-----+|
83
+        |           |           |  |  J2     ||
84
+        |           |           |  |Templates||
85
+        |           |           |  +---------+|
86
+        +-----------+           +-------------+
87
+
88
+--
89
+
90
+1)Interface handler: Acts as an interface to support multiple plugins like Excel,
91
+  Remote Data Source, etc. The interface would define abstract APIs which would be overridden
92
+  by different plugins. A plugin would implement these APIs based on the type of data source
93
+  to collect raw site data and convert it to a generic object for further processing.
94
+  For example: Consider the APIs connect_data_source() and get_host_profile(). For the Excel plugin,
95
+  the connect_data_source API would implement file-open methods and the get_host_profile API would
96
+  extract host profile related information from the Excel file.
97
+
98
+  In the case of a remote data source (for example an API endpoint), the API "connect_data_source"
99
+  shall authenticate (if required) and establish a connection to the remote site and the
100
+  "get_host_profile" API shall implement the logic to extract appropriate details over the established
101
+  connection. In order to support future plugins, one needs to override these interface handler
102
+  APIs and develop logic to extract site data from the corresponding data source.
103
+
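The interface handler contract in item 1 can be sketched as an abstract base class. This is a minimal sketch under stated assumptions, not the tool's actual code: only the method names connect_data_source() and get_host_profile() come from the spec. The class names, the dict standing in for an Excel workbook, and the return values are hypothetical.

```python
from abc import ABC, abstractmethod


class BaseDataSourcePlugin(ABC):
    """Hypothetical interface handler; only the two method names are from the spec."""

    @abstractmethod
    def connect_data_source(self):
        """Open a file or establish a (possibly authenticated) connection."""

    @abstractmethod
    def get_host_profile(self, host):
        """Return host profile information for one host."""


class InMemoryExcelPlugin(BaseDataSourcePlugin):
    """Stand-in for an Excel plugin: a dict plays the role of the workbook."""

    def __init__(self, workbook):
        self._workbook = workbook
        self._sheet = None

    def connect_data_source(self):
        # A real Excel plugin would open the file here (assumption).
        self._sheet = self._workbook["hosts"]

    def get_host_profile(self, host):
        if self._sheet is None:
            raise RuntimeError("call connect_data_source() first")
        return self._sheet[host]


plugin = InMemoryExcelPlugin({"hosts": {"node01": "compute-profile"}})
plugin.connect_data_source()
profile = plugin.get_host_profile("node01")
print(profile)
```

A remote data source plugin would subclass the same base class and put its authentication and connection logic in connect_data_source().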
104
+2)Parser: It processes the information obtained from the generic YAML object to create an
105
+  intermediary YAML using the following inputs:
106
+  a) Global Design Rules: Common rules for generating manifests for any kind of site.
107
+  These rules are used for every plugin. For example: IPs to skip before allocating addresses to hosts.
108
+  b) Site Config Rules: These are settings specific to a particular site.
109
+  For example: http_proxy, BGP ASN, etc. They can be referenced by all plugins. Sometimes this
110
+  site specific information can also be received from plugin data sources. In such cases the
111
+  information from plugin data sources would be used instead of the values specified in site config rules.
112
+
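The precedence described in item 2 (values from the plugin data source win over Site Config Rules) can be sketched as a dictionary merge. This is an illustrative sketch only: the function name, the example keys and values, and the choice to treat None as "not supplied by the plugin" are assumptions, not part of the spec.

```python
def merge_site_settings(site_config_rules, plugin_data):
    """Return settings where plugin-provided values override site config rules."""
    merged = dict(site_config_rules)
    # Only override with values the plugin actually supplied (None = missing).
    merged.update({k: v for k, v in plugin_data.items() if v is not None})
    return merged


# Hypothetical example values.
site_config_rules = {"http_proxy": "http://proxy.example:8080", "bgp_asn": 64512}
plugin_data = {"bgp_asn": 64671, "http_proxy": None}

settings = merge_site_settings(site_config_rules, plugin_data)
print(settings)
```

The same merge also covers the opposite case, where the data source is incomplete and the missing values fall back to the site config rules.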
113
+3)Intermediary YAML: It holds the complete site information after obtaining it from the interface
114
+  handler plugin and after application of site specific rules. It maintains a common format agnostic
115
+  of the corresponding data source used, so it acts as the primary input to the Site Processor for generating
116
+  site manifests.
117
+
118
+4)Tugboat(Excel Parser) Plugin: It uses the interface handler APIs to open and parse the Excel file to
119
+  extract site details and create an in-memory generic YAML object. This generic object is further processed
120
+  using site specific config rules and global rules to generate an intermediary YAML. The name "Tugboat"
121
+  here is used to identify the "Excel Parser". For the Excel parser, the plugin shall use a site specification file
122
+  which defines the locations of the site information items in the file. Each location is specified by
123
+  the row and column of the spreadsheet cell containing the specific site data.
124
+
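The role of the site specification file in item 4 (mapping each site information item to a row and column) can be sketched without any Excel library. In this sketch a list of lists stands in for one sheet. The function name, the item names, the spec layout, and the 1-based indexing are all assumptions made for illustration.

```python
def extract_site_data(sheet, spec):
    """Read each named item from the (row, column) cell given in the spec."""
    data = {}
    for item, (row, col) in spec.items():
        # Rows and columns are 1-based here, as in a spreadsheet UI (assumption).
        data[item] = sheet[row - 1][col - 1]
    return data


# A list of rows stands in for one sheet of the engineering package.
sheet = [
    ["Site", "Name", "OAM VLAN"],
    ["", "site-a", "1321"],
]
# Hypothetical specification: item name -> (row, column).
spec = {"site_name": (2, 2), "oam_vlan": (2, 3)}

site_data = extract_site_data(sheet, spec)
print(site_data)
```

Keeping the cell locations in a separate specification means a change in the spreadsheet layout only requires a new spec, not a change to the plugin.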
125
+5)Remote Data Source Plugin: It uses the interface handler APIs to connect to the data source and extract
126
+  site specific information and then construct a generic in-memory YAML object. This object is then parsed
127
+  to generate an intermediary YAML. There may be situations wherein the information extracted from API
128
+  endpoints is incomplete. In such scenarios, the missing information can be supplied from Site Config Rules.
129
+
130
+6)Site Processor: The site processor consumes the intermediary YAML and generates site manifests
131
+  based on the corresponding site templates, which are written in Jinja2 (a Python template engine).
132
+  For example, for template file "baremetal.yaml.j2", the site processor will generate "baremetal.yaml"
133
+  with the information obtained from intermediary YAML and also by following the syntax present in the
134
+  corresponding template file.
135
+
136
+7)Site Templates(J2 templates): These define the manifest file formats for various entities like
137
+  baremetal, network, host-profiles, etc. The site processor applies these templates to an intermediary
138
+  YAML and generates the corresponding site manifests.
139
+  For example: calico-ip-rules.yaml.j2 will generate calico-ip-rules.yaml when processed by the
140
+  site processor.
141
+
142
+8)Repository Adapter: This helps in importing site specific templates from a repository and in
143
+  pushing generated site manifest YAMLs. The aim of the repository adapter shall be to abstract the
144
+  specific repository operations and maintain a uniform interface irrespective of the type of
145
+  repository used. It shall be possible to add newer repositories in the future without any change
146
+  to this interface. The access to this repository can be regulated by credentials if required, and
147
+  those will be passed as parameters in the site specific config file.
148
+
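The uniform interface described in item 8 can be sketched as an abstract adapter with one concrete backend. This is a sketch under stated assumptions: the spec names no methods, so fetch_templates() and push_manifest(), the class names, and the templates/ and manifests/ directory layout are hypothetical. A local directory stands in for a real repository.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class RepositoryAdapter(ABC):
    """Hypothetical uniform interface; method names are not from the spec."""

    @abstractmethod
    def fetch_templates(self):
        """Return a mapping of template file name to template text."""

    @abstractmethod
    def push_manifest(self, name, content):
        """Store one generated manifest."""


class LocalDirectoryAdapter(RepositoryAdapter):
    """Backend that uses a plain directory instead of a remote repository."""

    def __init__(self, root):
        self.root = Path(root)

    def fetch_templates(self):
        paths = sorted(self.root.glob("templates/*.j2"))
        return {p.name: p.read_text() for p in paths}

    def push_manifest(self, name, content):
        target = self.root / "manifests" / name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content)
        return target


with tempfile.TemporaryDirectory() as tmp:
    template_dir = Path(tmp) / "templates"
    template_dir.mkdir()
    (template_dir / "baremetal.yaml.j2").write_text("name: {{ name }}\n")

    repo = LocalDirectoryAdapter(tmp)
    templates = repo.fetch_templates()
    pushed_text = repo.push_manifest("baremetal.yaml", "name: node01\n").read_text()

print(sorted(templates), pushed_text)
```

A Git-backed adapter would implement the same two methods, so the site processor would not need to change when the repository type does.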
149
+9)Sample data flow: for example, generating OAM network information for site manifests.
150
+
151
+  - Raw rack information from plugin:
152
+
153
+     vlan_network_data:
154
+         oam:
155
+             subnet: 12.0.0.64/26
156
+             vlan: '1321'
157
+
158
+  - Rules to define gateway, ip ranges from subnet:
159
+
160
+     rule_ip_alloc_offset:
161
+         name: ip_alloc_offset
162
+         ip_alloc_offset:
163
+             default: 10
164
+             gateway: 1
165
+
166
+   The above rule specifies the IP offsets used to derive the gateway address and the reserved
166
+   and static IP ranges from the subnet pool.
167
+   So the usable IP range for 12.0.0.64/26 is: 12.0.0.65 ~ 12.0.0.126
169
+   The rule "ip_alloc_offset" now helps to define additional information as follows:
170
+
171
+     - gateway: 12.0.0.65 (the first offset as defined by the field 'gateway')
172
+     - reserved ip ranges: 12.0.0.65 ~ 12.0.0.76 (the range is defined by adding
173
+         "default" to start ip range)
174
+     - static ip ranges: 12.0.0.77 ~ 12.0.0.126 (it follows the rule that we need
175
+         to skip first 10 ip addresses as defined by "default")
176
+
177
+  - Intermediary YAML file information generated after applying the above rules
178
+    to the raw rack information:
179
+
180
+::
181
+
182
+       network:
183
+            vlan_network_data:
184
+               oam:
185
+                network: 12.0.0.64/26
186
+                gateway: 12.0.0.65 --------+
187
+                reserved_start: 12.0.0.65  |
188
+                reserved_end: 12.0.0.76    |
189
+                routes:                    +--> Newly derived information
190
+                 - 0.0.0.0/0               |
191
+                static_start: 12.0.0.77    |
192
+                static_end: 12.0.0.126 ----+
193
+                vlan: '1321'
194
+
195
+--
196
+
197
+   - J2 templates for specifying oam network data: It represents the format in
198
+     which the site manifests will be generated with values obtained from
199
+     Intermediary YAML
200
+
201
+::
202
+
203
+      ---
204
+      schema: 'drydock/Network/v1'
205
+      metadata:
206
+        schema: 'metadata/Document/v1'
207
+        name: oam
208
+        layeringDefinition:
209
+          abstract: false
210
+          layer: 'site'
211
+          parentSelector:
212
+            network_role: oam
213
+            topology: cruiser
214
+          actions:
215
+            - method: merge
216
+              path: .
217
+        storagePolicy: cleartext
218
+      data:
219
+        cidr: {{ data['network']['vlan_network_data']['oam']['network'] }}
220
+        routes:
221
+          - subnet: {{ data['network']['vlan_network_data']['oam']['routes'] }}
222
+            gateway: {{ data['network']['vlan_network_data']['oam']['gateway'] }}
223
+            metric: 100
224
+        ranges:
225
+          - type: reserved
226
+            start: {{ data['network']['vlan_network_data']['oam']['reserved_start'] }}
227
+            end: {{ data['network']['vlan_network_data']['oam']['reserved_end'] }}
228
+          - type: static
229
+            start: {{ data['network']['vlan_network_data']['oam']['static_start'] }}
230
+            end: {{ data['network']['vlan_network_data']['oam']['static_end'] }}
231
+      ...
232
+
233
+--
234
+
235
+   - OAM Network information in site manifests after applying intermediary YAML to J2
236
+     templates:
237
+
238
+::
239
+
240
+      ---
241
+      schema: 'drydock/Network/v1'
242
+      metadata:
243
+        schema: 'metadata/Document/v1'
244
+        name: oam
245
+        layeringDefinition:
246
+          abstract: false
247
+          layer: 'site'
248
+          parentSelector:
249
+            network_role: oam
250
+            topology: cruiser
251
+          actions:
252
+            - method: merge
253
+              path: .
254
+        storagePolicy: cleartext
255
+      data:
256
+        cidr: 12.0.0.64/26
257
+        routes:
258
+          - subnet: 0.0.0.0/0
259
+            gateway: 12.0.0.65
260
+            metric: 100
261
+        ranges:
262
+          - type: reserved
263
+            start: 12.0.0.65
264
+            end: 12.0.0.76
265
+          - type: static
266
+            start: 12.0.0.77
267
+            end: 12.0.0.126
268
+      ...
269
+
270
+--
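The derivation step in the sample data flow above (raw subnet plus offsets giving gateway, reserved and static ranges) can be sketched with Python's standard ipaddress module. This is a sketch under stated assumptions, not the tool's implementation. The function name and the last_reserved_offset parameter are hypothetical: the spec does not spell out exactly how the "default" value maps onto the end of the reserved range, so the boundary is passed in explicitly. The value 12 below is chosen only so that the output reproduces the example values shown above.

```python
import ipaddress


def derive_oam_ranges(subnet, gateway_offset, last_reserved_offset):
    """Derive gateway, reserved and static ranges from offsets into a subnet.

    Offsets are counted from the network address (assumption).
    """
    network = ipaddress.ip_network(subnet)
    base = network.network_address
    # Last usable host address is the one just below the broadcast address.
    last_usable = network.broadcast_address - 1
    return {
        "network": str(network),
        "gateway": str(base + gateway_offset),
        "reserved_start": str(base + 1),
        "reserved_end": str(base + last_reserved_offset),
        "static_start": str(base + last_reserved_offset + 1),
        "static_end": str(last_usable),
    }


oam = derive_oam_ranges("12.0.0.64/26", gateway_offset=1, last_reserved_offset=12)
for key, value in oam.items():
    print(f"{key}: {value}")
```

The resulting dictionary has the same keys as the oam entry of the intermediary YAML, apart from routes and vlan, which are not derived from the offsets.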
271
+
272
+Security impact
273
+---------------
274
+The impact would be limited to the use of credentials for accessing the data source, templates and
275
+also for uploading generated manifest files.
276
+
277
+Performance impact
278
+------------------
279
+
280
+None.
281
+
282
+Alternatives
283
+------------
284
+
285
+No existing utilities are available to transform site information automatically.
286
+
287
+Implementation
288
+==============
289
+
290
+The following high-level implementation tasks are identified:
291
+a) Interface Handler
292
+b) Plugins (Excel and a sample Remote data source plugin)
293
+c) Parser
294
+d) Site Processor
295
+e) Repository Adapter
296
+
297
+Usage
298
+=====
299
+The tool will support Excel and remote data source plugins from the beginning.
300
+The section below lists the required input files for each of the aforementioned
301
+plugins.
302
+
303
+* Preparation: The preparation steps differ based on selected data source.
304
+
305
+  A. Excel Based Data Source.
306
+
307
+  - Gather the following input files:
308
+
309
+    1) Excel based site Engineering package. This file contains detailed specifications
310
+    covering IPMI, Public IPs, Private IPs, VLAN, Site Details, etc.
311
+
312
+    2) Excel Specification to aid parsing of the above Excel file. It contains
313
+    details about specific rows and columns in various sheets which contain the
314
+    necessary information to build site manifests.
315
+
316
+    3) Site specific configuration file containing additional configuration like
317
+    proxy, bgp information, interface names, etc.
318
+
319
+    4) Intermediary YAML file. In this case, the Site Engineering Package and Excel
320
+    specification are not required.
321
+
322
+  B.  Remote Data Source
323
+
324
+  - Gather the following input information:
325
+
326
+    1) End point configuration file containing credentials to enable its access.
327
+    Each end-point type shall have its access governed by its respective plugin
328
+    and associated configuration file.
329
+
330
+    2) Site specific configuration file containing additional configuration like
331
+    proxy, bgp information, interface names, etc. These will be used if information
332
+    extracted from remote site is insufficient.
333
+
334
+* Program execution
335
+    1) CLI Options:
336
+
337
+      -g, --generate_intermediary  Dump intermediary file from passed Excel and
338
+                                   Excel spec.
339
+      -m, --generate_manifests     Generate manifests from the generated
340
+                                   intermediary file.
341
+      -x, --excel PATH             Path to engineering Excel file, to be passed
342
+                                   with generate_intermediary. The -s option is
343
+                                   mandatory with this option. Multiple engineering
344
+                                   files can be used. For example: -x file1.xls -x file2.xls
345
+      -s, --excel_spec PATH        Path to Excel spec, to be passed with
346
+                                   generate_intermediary. The -x option is
347
+                                   mandatory along with this option.
348
+      -i, --intermediary PATH      Path to intermediary file, to be passed
349
+                                   with generate_manifests. The -g and -x options
350
+                                   are not required with this option.
351
+      -d, --site_config PATH       Path to the site specific YAML file  [required]
352
+      -l, --loglevel INTEGER       Loglevel NOTSET:0, DEBUG:10, INFO:20,
353
+                                   WARNING:30, ERROR:40, CRITICAL:50  [default:20]
354
+      -e, --end_point_config       File containing end-point configurations like user-name,
355
+                                   password, certificates, URL, etc.
356
+      --help                       Show this message and exit.
357
+
358
+     2) Example:
359
+
360
+     2-1) Using Excel spec as input data source:
361
+
362
+      Generate Intermediary: spyglass -g -x <DesignSpec> -s <excel spec> -d <site-config>
363
+
364
+      Generate Manifest & Intermediary: spyglass -mg -x <DesignSpec> -s <excel spec> -d <site-config>
365
+
366
+      Generate Manifest with Intermediary: spyglass -m -i <intermediary>
367
+
368
+
369
+     2-2) Using external data source as input:
370
+
371
+      Generate Manifest and Intermediary: spyglass -m -g -e <end_point_config> -d <site-config>
372
+      Generate Manifest: spyglass -m -e <end_point_config> -d <site-config>
373
+
374
+      Note: The end_point_config shall include attributes of the external data source that are
375
+      necessary for its access. Each external data source type shall have its own plugin to configure
376
+      its corresponding credentials.
377
+
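The option rules listed under "CLI Options" can be sketched with Python's standard argparse module. This is only an illustration of the documented flag combinations, not the tool's CLI code, and the real tool may use a different CLI library. The function names and the dest names are assumptions; only the short flags and the rule that -x and -s must be given together come from the text above. Other rules, such as -d being required, are not enforced here.

```python
import argparse


def build_parser():
    """Build a parser for the short flags listed in the spec."""
    parser = argparse.ArgumentParser(prog="spyglass")
    parser.add_argument("-g", dest="generate_intermediary", action="store_true")
    parser.add_argument("-m", dest="generate_manifests", action="store_true")
    # -x may be repeated: -x file1.xls -x file2.xls
    parser.add_argument("-x", dest="excel", action="append", metavar="PATH")
    parser.add_argument("-s", dest="excel_spec", metavar="PATH")
    parser.add_argument("-i", dest="intermediary", metavar="PATH")
    parser.add_argument("-d", dest="site_config", metavar="PATH")
    parser.add_argument("-l", dest="loglevel", type=int, default=20)
    parser.add_argument("-e", dest="end_point_config", metavar="PATH")
    return parser


def parse_cli(argv):
    """Parse argv and check that -x and -s are used together."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if bool(args.excel) != bool(args.excel_spec):
        # parser.error() prints a usage message and exits with status 2.
        parser.error("-x and -s must be given together")
    return args


args = parse_cli(
    ["-mg", "-x", "file1.xls", "-x", "file2.xls", "-s", "spec.yaml", "-d", "site.yaml"]
)
print(args.excel, args.generate_manifests, args.generate_intermediary)
```

Passing -x without -s (or the reverse) to parse_cli() ends the program with a usage error instead of returning arguments.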
378
+* Program output:
379
+    a) Site Manifests: As an initial release, the program shall output manifest files for
380
+       "airship-seaworthy" site. For example: baremetal, deployment, networks, pki, etc.
381
+       Reference: https://github.com/openstack/airship-treasuremap/tree/master/site/airship-seaworthy
382
+    b) Intermediary YAML: Containing aggregated site information generated from data sources that is
383
+       used to generate the above site manifests.
384
+
385
+Future Work
386
+============
387
+1) Schema based manifest generation instead of Jinja2 templates. It shall
388
+be possible to cleanly transition to this schema based generation keeping a unique
389
+mapping between schema and generated manifests. Currently this is managed by
390
+considering a mapping of j2 templates with schemas and site type.
391
+
392
+2) UI editor for intermediary YAML
393
+
394
+
395
+Alternatives
396
+============
397
+1) Schema based manifest generation instead of Jinja2 templates.
398
+2) Develop the data source plugins as an extension to Pegleg.
399
+
400
+Dependencies
401
+============
402
+1) Availability of a repository to store Jinja2 templates.
403
+2) Availability of a repository to store generated manifests.
404
+
405
+References
406
+==========
407
+
408
+None
409
+
