
Add LLDP processing hook and new CLI commands

It would be useful to display the contents of Link Layer Discovery
Protocol (LLDP) TLVs received from network switches that are cached
by IPA.  The LLDP data can help with deployment validation and
troubleshooting.  The spec presents a hook to process and store
the cached LLDP info and new 'openstack baremetal' commands to
display the processed LLDP data.

Change-Id: Ife9a1901b8f21be2a31969a5fb6bc777162f1e95
Related-Bug: 1626253
Bob Fournier 2 years ago
parent
commit
ac70bef7b8
1 changed files with 525 additions and 0 deletions
specs/lldp-reporting.rst

@@ -0,0 +1,525 @@
+This work is licensed under a Creative Commons Attribution 3.0 Unported
+License.
+
+http://creativecommons.org/licenses/by/3.0/legalcode
+
+=============================
+ Introspection LLDP reporting
+=============================
+
+`LLDP reporting RFE`_
+
+Link Layer Discovery Protocol (LLDP) packets are transmitted periodically
+by network switches out of each switch port in accordance with
+`IEEE Std 802.1AB-2016`_.  The protocol is used to advertise the switch port's
+capabilities and configuration.  The LLDP packets are gathered
+by Ironic Python Agent (IPA) running on each node and stored per interface
+in the Ironic Inspector database in the same type, length, value (TLV) format
+as they are received, per the `IPA change to add LLDP support`_.  The LLDP data
+contains switch port VLAN assignments, MTU sizes, link aggregation
+configuration, etc., offering a rich set of networking information that can be
+used for planning and troubleshooting a deployment from the point of view of
+Ironic nodes.
+
+This proposal is for additional Ironic Inspector hooks to parse the raw LLDP
+TLVs and store the results back to Swift as part of the interface data.  The
+processed LLDP data can be accessed directly or displayed via a client-side
+CLI.  This proposal includes the python-ironic-inspector-client CLI
+commands to display the processed LLDP data.  The data should be displayed
+in a format that allows a user to quickly see the network switch
+port configuration and detect mismatches within and between nodes.  For
+example, it should be easy to see whether a particular VLAN is configured
+on each node's eth0 interface, or to verify that all switch ports connected
+to all nodes are configured to support jumbo frames.
+
+Problem description
+===================
+
+Network switch misconfiguration can be a major source of problems
+in OpenStack deployments and is difficult to detect and diagnose.
+Multiple baremetal nodes may be connected to many network switch ports,
+and the switch port configuration may not match the deployment
+parameters.  The user doing the OpenStack deployment may not have access
+to the network switches to view the port configurations, or may not be
+familiar with the particular switch user interface.
+
+Some potential mismatches between switch port configuration and the
+OpenStack deployment's node interface settings are:
+
+        - VLAN configuration
+        - Untagged VLAN ID on provisioning network
+        - MTU sizes
+        - Link aggregation (aka bonding) configuration
+
+Proposed change
+===============
+
+The scope of the proposed change is to have Ironic Inspector plugins parse
+the LLDP TLVs captured by Ironic Python Agent, as defined in
+`IEEE Std 802.1AB-2016`_, and to display the data in a user-friendly format.
+Determining whether an OpenStack deployment matches the switch port
+configuration is out of scope, but the CLI commands can be used to
+verify and validate the deployment configuration.
+
+A new Ironic Inspector hook (aka plugin) will be added to parse all
+standard LLDP data and store the data per interface.  Not all LLDP TLVs
+defined in the specification are sent by every network switch.  However,
+it appears to be standard practice that switches that support the standard
+will implement the basic management, `IEEE Std 802.1Q-2014`_, and
+`IEEE Std 802.3-2012`_ optional TLVs.  The new Ironic Inspector hook must
+support the following mandatory and optional TLVs:
+
+        - Chassis ID TLV (mandatory)
+        - Port ID TLV (mandatory)
+        - Basic Management TLV Set (optional)
+                - Port Description TLV
+                - System Name TLV
+                - System Description TLV
+                - System Capabilities TLV
+                - Management Address TLV
+        - IEEE 802.1 Organizationally Specific TLV Set (optional)
+                - Port VLAN ID TLV
+                - Port And Protocol VLAN ID TLV
+                - VLAN Name TLV
+                - Protocol Identity TLV
+                - Management VID TLV
+                - Link Aggregation TLV
+        - IEEE 802.3 Organizationally Specific TLV Set (optional)
+                - MAC/PHY Config/Status TLV (includes duplex/speed/autoneg)
+                - Link Aggregation TLV
+                - Maximum Frame Size TLV (MTU)
+
+The `LLDP-MED`_ TLV set was developed to support IP telephony and as such is
+not relevant for a LAN environment.  The LLDP-MED TLVs that are received
+could be handled by a separate processing hook that will parse and store
+the decoded TLVs.  Processing hooks can be stacked such that the standard
+LLDP processing hook can be run in addition to the LLDP-MED hook.  The
+development of an LLDP-MED processing hook is out of scope for this effort.
+
+Some network switches send vendor-specific TLVs.  The definition of these
+TLVs may not be widely available or consistent across releases.  In order
+to support the parsing of these TLVs, vendor-specific hooks could be added to
+the other LLDP hooks in the future, although vendor-specific hooks are out of
+scope for this effort.
+
+The hooks can be enabled in the ``inspector.conf`` file.  By default, these
+LLDP plugins will not be enabled.  Background information on ironic-inspector
+plugins is here: `ironic-inspector plugins`_.
+
+The new plugins will reside in ironic-inspector.  The new CLI commands will
+be in python-ironic-inspector-client.
+
+Alternatives
+------------
+
+This proposed implementation uses data captured by IPA during introspection;
+it is not a real-time LLDP monitoring tool like ``lldpd``.  The data used could
+be outside the Time To Live (TTL) window and therefore considered not valid.
+Any configuration changes made at the switch would not be detected until the
+user runs introspection again.
+
+The major advantage of this approach is that the data is cached by inspector
+for all nodes and their interfaces.  This implementation is most useful as a
+tool to point to potential mismatches between the switch and the deployment
+configuration, rather than as an absolute source of truth about the real-time
+switch configuration.
+
+Data model impact
+-----------------
+
+Currently Ironic Inspector stores received LLDP data in Swift in raw
+type/value format for each interface.  For example, here is a subset of
+stored LLDP TLVs showing Chassis ID (Type 1), Port ID (Type 2), System Name
+(Type 5), and VLANs (Type 127 with the 802.1 OUI, 0x0080c2).  Two VLAN TLVs
+are shown that map to VLANs 100 and 101.
+
+::
+
+    "lldp": [
+      [
+        1,
+        "0464649b32f300"
+      ],
+      [
+        2,
+        "07373231"
+      ],
+      [
+        5,
+        "737730312d646973742d31622d6231322e72647532"
+      ],
+      [
+        127,
+        "0080c203006407766c616e313030"
+      ],
+      [
+        127,
+        "0080c203006507766c616e313031"
+      ],
+      ...
+
+
+The proposed Ironic Inspector processing hooks will parse this LLDP data
+and update the data store with an ``lldp_processed`` struct per interface
+containing name/value pairs.  This new struct will be stored under
+``all_interfaces``.
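
A minimal sketch of how a single raw TLV from the example above could be decoded, assuming well-formed input. The function name ``decode_tlv`` and the constants are hypothetical and not part of the spec; the processed names follow the ``switch_*`` names used in this spec's examples. Only the Chassis ID (MAC address subtype 4), Port ID (locally assigned subtype 7), System Name, and 802.1 VLAN Name TLVs are handled here:

```python
# TLV type numbers and the IEEE 802.1 OUI, as described in the spec text.
TYPE_CHASSIS_ID = 1
TYPE_PORT_ID = 2
TYPE_SYSTEM_NAME = 5
TYPE_ORG_SPECIFIC = 127
OUI_IEEE_8021 = bytes.fromhex("0080c2")
SUBTYPE_VLAN_NAME = 3


def decode_tlv(tlv_type, hex_value):
    """Hypothetical helper: return (name, value) for one TLV, or None."""
    data = bytes.fromhex(hex_value)
    if tlv_type == TYPE_CHASSIS_ID and data[:1] == b"\x04":
        # Chassis ID subtype 4 carries a MAC address.
        return "switch_chassis_id", ":".join("%02x" % b for b in data[1:])
    if tlv_type == TYPE_PORT_ID and data[:1] == b"\x07":
        # Port ID subtype 7 carries a locally assigned string.
        return "switch_port_id", data[1:].decode("utf-8")
    if tlv_type == TYPE_SYSTEM_NAME:
        return "switch_system_name", data.decode("utf-8")
    if (tlv_type == TYPE_ORG_SPECIFIC and data[:3] == OUI_IEEE_8021
            and data[3:4] == bytes([SUBTYPE_VLAN_NAME])):
        # VLAN Name TLV: 2-byte VLAN ID, 1-byte name length, then the name.
        vlan_id = int.from_bytes(data[4:6], "big")
        name_len = data[6]
        name = data[7:7 + name_len].decode("utf-8")
        return "switch_port_vlans", {"name": name, "id": vlan_id}
    return None  # TLV not handled by this sketch


# The raw TLVs from the example above.
RAW = [
    [1, "0464649b32f300"],
    [2, "07373231"],
    [5, "737730312d646973742d31622d6231322e72647532"],
    [127, "0080c203006407766c616e313030"],
    [127, "0080c203006507766c616e313031"],
]
DECODED = [decode_tlv(tlv_type, value) for tlv_type, value in RAW]
for pair in DECODED:
    print(pair)
```

Each VLAN TLV decodes to its own pair, so a later step still has to collect them into one list per interface.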
+
+Note that in the raw data there may be multiple TLVs with the same TLV
+type/subtype.  In some cases this is expected, for example there are individual
+VLAN TLVs for each configured VLAN.  In other cases, multiple TLVs for the same
+type/subtype are unexpected, perhaps due to the switch sending the same TLV
+twice or IPA receiving them out of order, etc.  The unexpected case still needs
+to be handled.
+
+Depending on the TLV type, the hook will store the data as either a name/list
+or name/value binding.  The name/value will be for TLVs that should only have
+a single value, as with ``chassis_id``, while the name/list is for data that
+can incorporate multiple TLVs with the same type/subtype, for example VLANs.
+Each entry stored in a list must be unique; duplicate list entries are not
+allowed.  ``system_capabilities`` and ``port_capabilities`` TLVs can be
+handled as a list in the same way as VLANs.
+
+For TLVs that map to a single name/value pair, e.g. ``chassis_id``,
+``port_id``, ``autonegotiation_enabled`` etc., a check must be made to ensure
+that duplicate TLV(s) are not processed.  In other words, if a name/value pair
+for ``chassis_id`` has already been stored it will not be overwritten.
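
The two storage rules above can be sketched as follows. This is an illustration under stated assumptions, not the hook's actual code: ``store_pair`` and ``LIST_NAMES`` are hypothetical names, and the set of list-valued names is chosen only for the example.

```python
# Names that may legitimately be built from several TLVs (assumed set).
LIST_NAMES = {"switch_port_vlans", "switch_system_capabilities"}


def store_pair(processed, name, value):
    """Hypothetical helper applying the name/list and name/value rules."""
    if name in LIST_NAMES:
        # name/list binding: append, but never store a duplicate entry.
        entries = processed.setdefault(name, [])
        if value not in entries:
            entries.append(value)
    elif name not in processed:
        # name/value binding: the first value wins and is not overwritten.
        processed[name] = value
    return processed


processed = {}
pairs = [
    ("switch_chassis_id", "64:64:9b:32:f3:00"),
    ("switch_port_vlans", {"name": "vlan100", "id": 100}),
    ("switch_chassis_id", "00:00:00:00:00:01"),  # unexpected duplicate TLV
    ("switch_port_vlans", {"name": "vlan100", "id": 100}),  # repeated VLAN
    ("switch_port_vlans", {"name": "vlan101", "id": 101}),
]
for name, value in pairs:
    store_pair(processed, name, value)
print(processed)
```

Keeping the first value rather than the last matches the statement that a stored ``chassis_id`` is not overwritten.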
+
+Example processed content is shown below.
+
+::
+
+   "all_interfaces": {"eth0":
+       {"ip": null, "mac": "a0:36:xx:xx:xx:xx",
+        "lldp_processed": {
+           "switch_chassis_id": "64:64:9b:xx:xx:xx",
+           "switch_port_id": "734",
+           "switch_system_name": "sw01-bld2",
+           "switch_port_physical_capabilities": ["100BASE-TX hdx",
+                                                 "100BASE-TX fdx",
+                                                 "1000BASE-T fdx"],
+           "switch_port_mtu": "9216",
+           "switch_port_link_aggregation_support": "True",
+           "switch_port_link_aggregation_enabled": "False",
+           "switch_port_autonegotiation_support": "True",
+           "switch_port_autonegotiation_enabled": "True",
+           "switch_port_vlans": [{"name": "vlan101", "id": 101},
+                                 {"name": "vlan102", "id": 102},
+                                 {"name": "vlan103", "id": 103}],
+      ...
+           }
+       }
+   }
+
+Each processing hook will add additional name/value pairs to
+``lldp_processed`` per interface.  This allows standard and vendor-specific
+hooks to run together so that all received LLDP TLVs can be interpreted.
+Vendor-specific plugins will only process TLVs that correspond to the
+particular vendor as identified by the OUI in the Organizationally Specific
+TLV (type 127).  For example, Juniper uses OUI ``0x009069``.  Likewise an
+LLDP-MED hook will only process Organizationally Specific TLVs with OUI
+``0x0012bb``.  In this way, individual TLVs are not processed more than once.
+However, clashes between the processed names used by the standard LLDP plugin
+and vendor or LLDP-MED plugins need to be avoided.  For that reason the
+additional plugins (beyond the standard plugin) will use the naming format
+``<OUI>_<OUIsubtype>``, where "OUI" is the string corresponding to the vendor
+OUI and "OUIsubtype" is the vendor-specific subtype, e.g.::
+
+   "juniper_chassis_id": "0123456789"
+
+Likewise, some examples for the LLDP-MED plugin::
+
+   "lldp_med_location_id": "5567892"
+   "lldp_med_device_type": "Network connectivity"
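
A rough sketch of the OUI-based ownership and naming scheme described above. The helper names and the prefix table are assumptions made for illustration; the Juniper and LLDP-MED OUIs come from the text, and ``00120f`` is the IEEE 802.3 OUI.

```python
# OUIs claimed by the standard LLDP hook (IEEE 802.1 and IEEE 802.3).
STANDARD_OUIS = {"0080c2", "00120f"}
# Prefix strings are assumptions modelled on the examples in this spec.
EXTRA_HOOK_PREFIXES = {"009069": "juniper", "0012bb": "lldp_med"}


def hook_for_oui(hex_value):
    """Return which hook should claim a type 127 TLV, based on its OUI."""
    oui = hex_value[:6].lower()
    if oui in STANDARD_OUIS:
        return "standard"
    return EXTRA_HOOK_PREFIXES.get(oui)  # None when no hook claims the TLV


def processed_name(hex_value, subtype_name):
    """Build the <OUI>_<OUIsubtype> key for a vendor or LLDP-MED TLV."""
    prefix = EXTRA_HOOK_PREFIXES[hex_value[:6].lower()]
    return "%s_%s" % (prefix, subtype_name)


print(hook_for_oui("0080c203006407766c616e313030"))
print(processed_name("00906901", "chassis_id"))
```

Because each OUI maps to exactly one hook, a TLV is claimed at most once, and the prefix keeps vendor names from clashing with the standard ``switch_*`` names.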
+
+
+HTTP API impact
+---------------
+
+None.
+
+Client (CLI) impact
+-------------------
+
+To display the LLDP data collected by Ironic Python Agent, a new set of
+commands under ``openstack baremetal introspection`` is proposed as follows,
+with example output.
+
+1. List interfaces for each node with key LLDP data.
+
+::
+
+   $ openstack baremetal introspection interface list
+     5f428939-698d-4942-b164-ff645a768e4a
+
++-----------+--------+-----------------+-------------------+-------------+
+| Interface | MAC    | Switch VLAN IDs | Switch Chassis    | Switch Port |
++-----------+--------+-----------------+-------------------+-------------+
+| eth0      | b0...  | [101, 102, 103] | 64:64:9b:xx:xx:xx | 554         |
++-----------+--------+-----------------+-------------------+-------------+
+| eth1      | b0...  | [101, 102, 103] | 64:64:9b:xx:xx:xx | 734         |
++-----------+--------+-----------------+-------------------+-------------+
+| eth2      | b0...  | [101, 102, 103] | 64:64:9b:xx:xx:xx | 587         |
++-----------+--------+-----------------+-------------------+-------------+
+| eth3      | b0...  | [101, 102]      | 64:64:9b:xx:xx:xx | 772         |
++-----------+--------+-----------------+-------------------+-------------+
+
+2. Show all LLDP values for an interface.  The field names will come directly
+from the names stored in the processed data.
+
+::
+
+   $ openstack baremetal introspection interface show
+     5f428939-698d-4942-b164-ff645a768e4a eth0
+
++--------------------------------------+--------------------------------------+
+| Field                                | Value                                |
++--------------------------------------+--------------------------------------+
+| node                                 | 5f428939-698d-4942-b164-ff645a768e4a |
++--------------------------------------+--------------------------------------+
+| interface                            | eth0                                 |
++--------------------------------------+--------------------------------------+
+| interface_mac_address                | b0:83:fe:xx:xx:xx                    |
++--------------------------------------+--------------------------------------+
+| switch_chassis_id                    | 64:64:9b:xx:xx:xx                    |
++--------------------------------------+--------------------------------------+
+| switch_port_id                       | 554                                  |
++--------------------------------------+--------------------------------------+
+| switch_system_name                   | sw01-dist-1b-b12.rdu2                |
++--------------------------------------+--------------------------------------+
+| switch_system_capabilities           | ['Bridge', 'Router']                 |
++--------------------------------------+--------------------------------------+
+| switch_port_description              | host2.lab.eng                        |
+|                                      | port 1 (Prov/Trunked VLANs)          |
++--------------------------------------+--------------------------------------+
+| switch_port_autonegotiation_support  | True                                 |
++--------------------------------------+--------------------------------------+
+| switch_port_autonegotiation_enabled  | True                                 |
++--------------------------------------+--------------------------------------+
+| switch_port_physical_capabilities    | ['100BASE-TX hdx', '100BASE-TX fdx', |
+|                                      | '1000BASE-T fdx']                    |
++--------------------------------------+--------------------------------------+
+| switch_port_mau_type                 | Unknown                              |
++--------------------------------------+--------------------------------------+
+| switch_port_link_aggregation_support | True                                 |
++--------------------------------------+--------------------------------------+
+| switch_port_link_aggregation_enabled | False                                |
++--------------------------------------+--------------------------------------+
+| switch_port_link_aggregation_id      | 0                                    |
++--------------------------------------+--------------------------------------+
+| switch_port_mtu                      | 9216                                 |
++--------------------------------------+--------------------------------------+
+| switch_port_untagged_vlan_id         | 102                                  |
++--------------------------------------+--------------------------------------+
+| switch_port_vlans                    | [{'name': 'vlan101', 'id': 101},     |
+|                                      |  {'name': 'vlan102', 'id': 102},     |
+|                                      |  {'name': 'vlan103', 'id': 103}]     |
++--------------------------------------+--------------------------------------+
+
+3. Show interface data filtered by particular VLANs.
+
+::
+
+   $ openstack baremetal introspection interface list
+     5f428939-698d-4942-b164-ff645a768e4a --vlan=103
+
++-----------+--------+-----------------+-------------------+-------------+
+| Interface | MAC    | Switch VLAN IDs | Switch Chassis    | Switch Port |
++-----------+--------+-----------------+-------------------+-------------+
+| eth0      | b0...  | [101, 102, 103] | 64:64:9b:xx:xx:xx | 554         |
++-----------+--------+-----------------+-------------------+-------------+
+| eth1      | b0...  | [101, 102, 103] | 64:64:9b:xx:xx:xx | 734         |
++-----------+--------+-----------------+-------------------+-------------+
+| eth2      | b0...  | [101, 102, 103] | 64:64:9b:xx:xx:xx | 587         |
++-----------+--------+-----------------+-------------------+-------------+
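
The ``--vlan`` filter above could work along these lines on the processed data. This is only a sketch: ``interfaces_with_vlan`` is a hypothetical helper, and the sample data mirrors the ``all_interfaces`` / ``lldp_processed`` layout shown in the data model section.

```python
def interfaces_with_vlan(all_interfaces, vlan_id):
    """Hypothetical helper: names of interfaces whose port carries vlan_id."""
    matches = []
    for name, data in sorted(all_interfaces.items()):
        vlans = data.get("lldp_processed", {}).get("switch_port_vlans", [])
        if any(vlan["id"] == vlan_id for vlan in vlans):
            matches.append(name)
    return matches


def _vlans(*ids):
    # Build VLAN entries in the same shape as the processed data.
    return [{"name": "vlan%d" % i, "id": i} for i in ids]


ALL_INTERFACES = {
    "eth0": {"lldp_processed": {"switch_port_vlans": _vlans(101, 102, 103)}},
    "eth1": {"lldp_processed": {"switch_port_vlans": _vlans(101, 102, 103)}},
    "eth2": {"lldp_processed": {"switch_port_vlans": _vlans(101, 102, 103)}},
    "eth3": {"lldp_processed": {"switch_port_vlans": _vlans(101, 102)}},
}
print(interfaces_with_vlan(ALL_INTERFACES, 103))
```

As in the table above, ``eth3`` is left out because its switch port does not carry VLAN 103.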
+
+4. Show the value of the provided fields for each node/interface, using the
+field names stored in the processed data and shown by the ``interface show``
+command.
+
+To show switch port MTU on a node for all interfaces:
+
+::
+
+   $ openstack baremetal introspection interface list
+     5f428939-698d-4942-b164-ff645a768e4a --fields interface,
+     switch_port_mtu
+
++-----------+-----------------+
+| Interface | switch_port_mtu |
++-----------+-----------------+
+| eth0      | 9216            |
++-----------+-----------------+
+| eth1      | 9216            |
++-----------+-----------------+
+| eth2      | 1514            |
++-----------+-----------------+
+| eth3      | 1514            |
++-----------+-----------------+
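
A per-field listing like the one above makes mismatches easy to spot by eye; the same check could be scripted against the processed data. The sketch below assumes the ``all_interfaces`` / ``lldp_processed`` layout from the data model section, and ``group_by_field`` is a hypothetical helper, not part of the proposed CLI.

```python
from collections import defaultdict


def group_by_field(all_interfaces, field):
    """Hypothetical helper: map each value of `field` to its interfaces."""
    groups = defaultdict(list)
    for name, data in sorted(all_interfaces.items()):
        value = data.get("lldp_processed", {}).get(field)
        groups[value].append(name)
    return dict(groups)


# MTU values from the table above, stored as strings as in the spec example.
ALL_INTERFACES = {
    "eth0": {"lldp_processed": {"switch_port_mtu": "9216"}},
    "eth1": {"lldp_processed": {"switch_port_mtu": "9216"}},
    "eth2": {"lldp_processed": {"switch_port_mtu": "1514"}},
    "eth3": {"lldp_processed": {"switch_port_mtu": "1514"}},
}
MTU_GROUPS = group_by_field(ALL_INTERFACES, "switch_port_mtu")
if len(MTU_GROUPS) > 1:
    print("MTU mismatch:", MTU_GROUPS)
```

More than one group means the switch ports disagree on the field, which is the kind of mismatch the commands are meant to expose.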
+
+To show the switch port link aggregation (aka bonding) configuration for
+a node:
+
+::
+
+   $ openstack baremetal introspection interface list
+     22aadc81-e134-4ff0-ac53-229126e77f62 --fields interface,
+     switch_port_link_aggregation_enabled
+
++-----------+--------------------------------------+
+| Interface | switch_port_link_aggregation_enabled |
++-----------+--------------------------------------+
+| eth0      | False                                |
++-----------+--------------------------------------+
+| eth1      | False                                |
++-----------+--------------------------------------+
+| eth2      | True                                 |
++-----------+--------------------------------------+
+| eth3      | True                                 |
++-----------+--------------------------------------+
+
+To show the switch port native VLAN configuration for a node and interface:
+
+::
+
+   $ openstack baremetal introspection interface list --interface eth0
+     --fields interface, switch_port_untagged_vlan_id
+
++-----------+------------------------------+
+| Interface | switch_port_untagged_vlan_id |
++-----------+------------------------------+
+| eth0      | 102                          |
++-----------+------------------------------+
+
+5. To display the full LLDP processed report for all nodes in JSON format,
+the ``interface list`` command can be run for all nodes using the
+built-in arguments ``--long`` (to display all fields) and ``--format json``
+(to output in JSON format), for example::
+
+   $ openstack baremetal introspection interface list
+     5f428939-698d-4942-b164-ff645a768e4a --long --format json
+
+Ironic Python Agent impact
+--------------------------
+
+LLDP data collection is available in Newton but it must be enabled by the
+kernel flag ``ipa-collect-lldp``.
+
+Performance and scalability impact
+----------------------------------
+
+Each time the new LLDP commands are invoked, Ironic Inspector will be queried
+to get the LLDP data.  Since the data has already been processed by the
+Inspector hook, there will be little additional processing that needs to
+be done to display the data.
+
+Security impact
+---------------
+
+No sensitive or proprietary data will be displayed by these commands.
+All LLDP data is received in unencrypted link-layer (Ethernet) frames.
+
+These commands may provide a benefit for security audits of the deployment as
+they will make it possible to ensure that no systems are attached to
+unintended VLANs, thus reducing the possibility of accidental exposure.
+
+In order for a switch to send LLDP packets, the network administrator must
+enable LLDP on the ports connected to node interfaces.  A user on the OpenStack
+CLI will be able to see everything that is sent in the LLDP packets including
+VLANs, management IP, switch model, port number, and firmware version.  This
+information may potentially be used to organize attacks against networking
+equipment.  For this reason the System Description TLV, which can include
+switch model, version, and build info, will be processed but not displayed;
+the Management Address TLV will be handled the same way.  This will reduce the
+information available while still maintaining enough data for networking
+related validations.
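
The "processed but not displayed" rule could be applied on the display side roughly as follows. The two hidden field names are hypothetical, since the spec does not name the processed keys for the System Description and Management Address TLVs, and ``displayable`` is an illustrative helper.

```python
# Hypothetical processed names for the two TLVs that are stored but hidden.
HIDDEN_FIELDS = {"switch_system_description", "switch_mgmt_addresses"}


def displayable(lldp_processed):
    """Return a copy of the processed data without the hidden fields."""
    return {
        name: value
        for name, value in lldp_processed.items()
        if name not in HIDDEN_FIELDS
    }


SAMPLE = {
    "switch_chassis_id": "64:64:9b:xx:xx:xx",
    "switch_port_mtu": "9216",
    "switch_system_description": "example model and firmware string",
    "switch_mgmt_addresses": ["192.0.2.1"],
}
print(displayable(SAMPLE))
```

The stored data is left untouched; only the view handed to the CLI omits the two fields.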
+
+Deployer impact
+---------------
+
+As discussed, these new commands may facilitate deployments as they
+could help detect mismatches between network switch configurations
+and deployment settings in areas such as VLANs, MTUs, bonding, port
+speed etc.
+
+By default, the new plugins will not be enabled.  The deployer should
+set the standard LLDP hook in ``inspector.conf`` when in a baremetal
+environment.
+
+In order to enable data collection in IPA, the deployer should set
+the kernel flag ``ipa-collect-lldp=1``.  Examples of setting kernel parameters
+can be seen in `configuring PXE`_.
+
+Developer impact
+----------------
+
+When the CLI is implemented, vendors will be able to develop vendor-specific
+plugins to handle vendor LLDP TLVs and expand the functionality.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee::
+  bfournie@redhat.com
+
+Work Items
+----------
+
+* Add processing hook to parse standard LLDP data and write to data store.
+
+* Integrate OSC commands with python-ironic-inspector-client.
+
+* Add unit tests.
+
+* Test with multiple vendors' network switches.
+
+Dependencies
+============
+
+The API for listing all introspection statuses affects similar commands, so
+it would be good to wait until that work is complete.
+https://review.openstack.org/#/c/344921/
+
+Testing
+=======
+
+In addition to functional testing, if baremetal CI is available, a test to
+ensure that LLDP collection is enabled and working would be useful, along with
+a test of the standard LLDP plugin as defined in the spec.
+
+References
+==========
+
+* `IEEE Std 802.1AB-2016`_
+
+* `IEEE Std 802.1Q-2014`_
+
+* `IEEE Std 802.3-2012`_
+
+.. _ironic-inspector plugins:
+   http://docs.openstack.org/developer/ironic-inspector/usage.html#plugins
+
+.. _LLDP reporting RFE:
+   https://bugs.launchpad.net/python-ironic-inspector-client/+bug/1626253
+
+.. _IEEE Std 802.1AB-2016:
+   https://standards.ieee.org/findstds/standard/802.1AB-2016.html
+
+.. _IPA change to add LLDP support:
+   https://review.openstack.org/#/c/320584/
+
+.. _IEEE Std 802.1Q-2014:
+   https://standards.ieee.org/findstds/standard/802.1Q-2014.html
+
+.. _IEEE Std 802.3-2012:
+   https://standards.ieee.org/findstds/standard/802.3-2012.html
+
+.. _LLDP-MED:
+   http://www.docfoc.com/ansi-tia-1057-2006-telecommunications-ip-telephone-\
+   infrastructure-1Fp2
+
+.. _configuring PXE:
+   http://docs.openstack.org/developer/ironic-inspector/install.html\
+   #configuring-pxe
