
Multiple PXE filtering backends

Change-Id: I7022d10fd22e6e141e59d0596402f43d2dcde056
Partial-Bug: 1665666
dparalen 2 years ago
commit e30c6916ed
1 changed file with 195 additions and 0 deletions

specs/multiple-pxe-filtering-backends.rst

@@ -0,0 +1,195 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+===============================
+Multiple PXE filtering backends
+===============================
+
+https://bugs.launchpad.net/ironic-inspector/+bug/1665666
+
+This is part of the HA inspector effort [1]_, of the TripleO routed networks
+ironic inspector effort [2]_ and of the Pike PTG inspector architectural
+session outcome [3]_.
+
+Problem description
+===================
+
+To prevent interference with the normal PXE boot of **ironic** bare metal
+nodes, the **inspector** has to filter the "inspection" PXE traffic.
+Therefore a filter has to block nodes not being inspected while nodes being
+inspected have to be explicitly white-listed. Considering the *discovery*
+feature, unknown nodes have to be allowed to boot the inspection image.
+
+**inspector** currently supports only an L2-based ``iptables`` filter or no
+filtering at all. While functional in the flat-network scenario, the
+``iptables`` filter is a scaling bottleneck and a safety issue. For the
+leaf-and-spine_ use case, where the DHCP PXE requests are relayed through a
+Top-Of-Rack (TOR) switch, the current ``iptables`` black-listing can no longer
+be used, as the source MAC address of the original DHCP frames is replaced
+with the TOR MAC address when crossing the L2 broadcast domain. In the case of
+a dedicated discovery network, PXE filtering is not necessary at all.
+
+To support these use cases and to allow vendor-specific solutions we'd like to
+propose abstracting the inspection PXE traffic filtering into a *driver
+interface*. This could be implemented in an \*aaS fashion, such as **neutron**,
+or by directly controlling the DHCP service, e.g. by talking to ``dnsmasq``
+over its D-Bus interface. An intelligent TOR switch might be capable of
+filtering the relay traffic directly. A noop driver would be used in the case
+of the dedicated discovery network.
+
+Proposed change
+===============
+
+Since the filtering is essentially an **ironic** vs **inspector** vs filter
+synchronization problem, we propose a discrete PXE filtering *driver interface*
+that consists of these *idempotent* methods that *must not lock* any node
+items:
+
+* ``__init__(self)`` synchronous; creates a per-process "singleton" instance of
+  the filter driver; called by stevedore_ to configure the filter driver.
+
+* ``init_filter(self)`` may be synchronous; initializes internal filter state.
+  This method may perform system-wide filter state changes.
+
+* ``whitelist_node_ids([<node_id>, <node_id>, ...])`` should be asynchronous;
+  enables the DHCP requests from these nodes.
+
+* ``blacklist_node_ids([<node_id>, <node_id>, ...])`` should be asynchronous;
+  disables the DHCP requests from the specified nodes.
+
+* ``remove_node_ids([<node_id>, <node_id>, ...])`` should be asynchronous;
+  removes nodes no longer tracked by **ironic/inspector** from both filter
+  lists.
+
+* ``tear_down_filter(self)`` may be synchronous; resets internal filter state.
+  This method may perform system-wide filter state changes.
+
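The method list above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the class names ``BasePXEFilter``, ``NoopFilter`` and ``InMemoryFilter`` are hypothetical and not defined by this spec, the set-based driver is a toy that touches no real DHCP or firewall state, and the asynchronous behaviour asked for above is left out for brevity.

```python
import abc


class BasePXEFilter(abc.ABC):
    """Hypothetical base class; method names follow the list above."""

    @abc.abstractmethod
    def init_filter(self):
        """Initialize internal filter state."""

    @abc.abstractmethod
    def whitelist_node_ids(self, node_ids):
        """Enable DHCP requests from the given nodes."""

    @abc.abstractmethod
    def blacklist_node_ids(self, node_ids):
        """Disable DHCP requests from the given nodes."""

    @abc.abstractmethod
    def remove_node_ids(self, node_ids):
        """Drop the given nodes from both filter lists."""

    @abc.abstractmethod
    def tear_down_filter(self):
        """Reset internal filter state."""


class NoopFilter(BasePXEFilter):
    """Does nothing; matches the dedicated discovery network case."""

    def init_filter(self):
        pass

    def whitelist_node_ids(self, node_ids):
        pass

    def blacklist_node_ids(self, node_ids):
        pass

    def remove_node_ids(self, node_ids):
        pass

    def tear_down_filter(self):
        pass


class InMemoryFilter(BasePXEFilter):
    """Toy driver (illustration only) that tracks node IDs in two sets."""

    def __init__(self):
        self.whitelist = set()
        self.blacklist = set()

    def init_filter(self):
        self.whitelist.clear()
        self.blacklist.clear()

    def whitelist_node_ids(self, node_ids):
        ids = set(node_ids)
        self.blacklist -= ids   # a node is in at most one list
        self.whitelist |= ids   # set union keeps repeated calls idempotent

    def blacklist_node_ids(self, node_ids):
        ids = set(node_ids)
        self.whitelist -= ids
        self.blacklist |= ids

    def remove_node_ids(self, node_ids):
        ids = set(node_ids)
        self.whitelist -= ids
        self.blacklist -= ids

    def tear_down_filter(self):
        self.init_filter()
```

Because every operation is expressed as a set update, calling any method twice with the same node IDs leaves the state unchanged, which is the idempotency property the interface requires.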
+This abstract interface shall reside in the **inspector** tree, together with
+an ``iptables`` and a ``noop`` driver implementation.
+
+Any driver-specific High-Availability concerns (such as leader election) are
+out of scope of this spec and the **inspector** code base and should be
+addressed by particular drivers internally.
+
+We also suggest dropping the introspection status cache cleaning to reduce the
+synchronization between the filter and **ironic**, and removing the periodic
+firewall update procedure in favor of the periodic **ironic** synchronization
+procedure.
+
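One possible reading of the periodic **ironic** synchronization, sketched in Python. The function name ``plan_filter_sync`` and its three inputs are assumptions made for illustration; the spec does not prescribe how the node sets are obtained or how the sync step is structured. The sketch only shows how the three driver calls could be derived from set differences.

```python
def plan_filter_sync(ironic_ids, inspected_ids, filter_ids):
    """Compute arguments for the three list-manipulating driver methods.

    ironic_ids:    node IDs currently known to ironic (assumed input)
    inspected_ids: node IDs currently being inspected (assumed input)
    filter_ids:    node IDs the filter currently holds in either list
    """
    ironic_ids = set(ironic_ids)
    inspected_ids = set(inspected_ids)
    filter_ids = set(filter_ids)

    # Nodes being inspected must be explicitly allowed to PXE boot.
    whitelist = inspected_ids
    # Known nodes that are not being inspected must be blocked.
    blacklist = ironic_ids - inspected_ids
    # Anything tracked by neither ironic nor inspector is dropped.
    remove = filter_ids - (ironic_ids | inspected_ids)

    return sorted(whitelist), sorted(blacklist), sorted(remove)
```

The three results are disjoint by construction, so they could be passed to ``whitelist_node_ids``, ``blacklist_node_ids`` and ``remove_node_ids`` in any order. Nodes unknown to all three inputs appear in no list, which is consistent with the *discovery* requirement that unknown nodes may boot the inspection image.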
+Alternatives
+------------
+
+Support only a fixed set of in-tree filters, without the possibility for
+vendors to extend the set.
+
+Data model impact
+-----------------
+
+None
+
+HTTP API impact
+---------------
+
+None
+
+Client (CLI) impact
+-------------------
+
+None
+
+Ironic python agent impact
+--------------------------
+
+None
+
+Performance and scalability impact
+----------------------------------
+
+We hope to see custom PXE filter drivers help the **inspector** to scale beyond
+the current firewall-based filtering bottleneck.
+
+Security impact
+---------------
+
+None
+
+Deployer impact
+---------------
+
+* A new configuration option ``pxe_filter_driver`` is introduced, pointing
+  **inspector** to a particular filtering driver. The default value shall be
+  ``iptables``.
+
+* The ``firewall.*`` configuration options are *deprecated* and renamed to
+  ``iptables.*``.
+
+* The ``pxe_filter_driver`` configuration option shall take *precedence* over
+  the ``iptables.*`` configuration options.
+
+* The ``iptables.manage_firewall`` configuration option shall be *deprecated
+  and ignored*.
+
+* The ``firewall.firewall_update_period`` configuration option shall be
+  *deprecated and ignored*.
+
+* The inspector ``node_status_keep_time`` option shall be *deprecated and
+  ignored*, implying that a node's inspection status is cached for the
+  lifetime of the node.
+
+* Deployers might consider custom drivers fitting their needs.
+
+* A "standard" **grenade** test with the firewall-based driver will be
+  performed in the upstream **inspector** CI gate to assert upgradability.
+
+Developer impact
+----------------
+
+Developers of custom PXE filter drivers should adhere to the proposed driver
+interface. Any High-Availability considerations should be addressed by the
+drivers internally. The `stevedore`_ library will be used to implement the
+driver loading mechanism.
+
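A minimal sketch of how the driver loading could look with stevedore's ``DriverManager``. The entry-point namespace ``ironic_inspector.pxe_filter``, the function name ``load_pxe_filter`` and the ``manager_cls`` parameter are assumptions for illustration; the spec names neither a namespace nor a loader function.

```python
# Assumed entry-point namespace; this spec does not name one.
PXE_FILTER_NAMESPACE = "ironic_inspector.pxe_filter"


def load_pxe_filter(name="iptables", manager_cls=None):
    """Return the filter driver instance registered under ``name``.

    ``name`` would come from the ``pxe_filter_driver`` option.
    ``manager_cls`` is a hypothetical hook that lets tests substitute
    a fake for stevedore's DriverManager.
    """
    if manager_cls is None:
        # Imported lazily so the sketch can be exercised without stevedore.
        from stevedore import driver
        manager_cls = driver.DriverManager

    manager = manager_cls(
        namespace=PXE_FILTER_NAMESPACE,
        name=name,
        invoke_on_load=True,  # instantiate the driver class when loading
    )
    return manager.driver
```

A custom driver would then be made available by registering its class under the same namespace in the ``entry_points`` section of its package metadata and selecting it by name through ``pxe_filter_driver``.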
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  <milan k (vetrisko)>
+
+Work Items
+----------
+
+* introduce the abstract driver interface
+* refactor the current firewall-based filter
+* deprecate the ``node_status_keep_time`` configuration option and make the
+  status records last for the node lifetime
+
+Dependencies
+============
+
+The `stevedore`_ library will be used to implement the driver loading
+mechanism.
+
+Testing
+=======
+
+Unit tests covering the interface and default implementations will be added. A
+"standard" Grenade CI gate job will assert upgradability of **inspector** with
+the default firewall-based filter.
+
+References
+==========
+
+.. [1] `HA Inspector effort <http://specs.openstack.org/openstack/ironic-inspector-specs/specs/HA_inspector.html>`_
+
+.. [2] `Tripleo routed networks ironic inspector effort <https://review.openstack.org/#/c/421011/>`_
+
+.. [3] `Pike PTG inspector architectural session outcome <https://etherpad.openstack.org/p/ironic-pike-ptg-inspector-arch>`_
+
+.. _leaf-and-spine: http://blog.westmonroepartners.com/a-beginners-guide-to-understanding-the-leaf-spine-network-topology
+
+.. _stevedore: https://docs.openstack.org/developer/stevedore/index.html
