Support for Software RAID

This spec proposes to add support for software RAID to Ironic.

Change-Id: I64fa27eac172016da5588156be3c0f2e3c5c6c31
Arne Wiebalck 4 months ago
commit f4ec59ae86
2 changed files with 237 additions and 0 deletions
  1. specs/approved/software-raid.rst (+236, -0)
  2. specs/not-implemented/software-raid.rst (+1, -0)

specs/approved/software-raid.rst (+236, -0)

@@ -0,0 +1,236 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=========================
+Support for Software RAID
+=========================
+
+https://storyboard.openstack.org/#!/story/2004581
+
+This spec proposes to add support for the configuration of software RAIDs.
+
+In analogy to the way hardware RAIDs are currently set up, the RAID setup
+shall be done as part of the cleaning ("clean-time software RAID"). Admin
+Users define the target RAID config which will be applied whenever the
+node is cleaned, i.e. before it becomes available for instance creation.
+
+In order to allow the End User to provide details on how the software RAID
+shall be configured, the RAID setup should eventually become part of the
+deployment steps. Integrating this into the deployment steps framework,
+however, is beyond the scope of this spec.
+
+
+Problem description
+===================
+
+As it is hardware agnostic, flexible, reliable, and easy to use, software RAID
+has become a popular choice to protect against disk device failures, including
+in production setups. Large deployments, such as the ones at Oath or CERN, rely
+on software RAID for their various services.
+
+Ironic's current lack of support for such setups requires Deployers and Admins
+to resort to workarounds in order to provide their End Users with physical
+instances based on a software RAID configuration. These workarounds may require
+maintaining an additional installation infrastructure which is then either
+integrated into the installation process or requires the End User to re-install
+a machine after it has already been provisioned by Ironic to
+eventually end up with the desired configuration of the disk devices. This
+increases the complexity for Deployers and Admins, and can also lead to a
+decrease of the End Users' satisfaction with the overall provisioning and
+installation process.
+
+
+Proposed change
+===============
+
+The proposal is to extend Ironic to support software RAID by:
+
+* using a node's ``target_raid_config`` to specify the desired s/w RAID layout
+  (with some restrictions, see below);
+* adding support in the ``ironic-python-agent`` to understand a software
+  RAID config as specified in a node's ``target_raid_config`` and be able to
+  create and delete such configurations;
+* allowing the ``ironic-python-agent`` to consider s/w RAID devices for
+  deployment, e.g. via root device hints (considering them at all is
+  already addressed in [1]);
+* adding support in Ironic and the ``ironic-python-agent`` to take the
+  necessary steps to boot from a s/w RAID, e.g. installing the boot loader
+  on the correct device(s).
+
+Initially, only the following configurations will be supported for the
+``target_raid_config``, which is to be set by the Admin:
+
+* a single RAID-1 spanning the available devices and serving as the deploy
+  target device, or
+* a RAID-1 serving as the deploy target device plus a RAID-N where the RAID
+  level N is configurable by the Admin. N can be 0, 1, 5, 6, or 10.
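
The two supported layouts can be checked mechanically. The following is a minimal Python sketch, not the proposed implementation. It assumes that the software RAID layout is expressed with the ``logical_disks``, ``size_gb`` and ``raid_level`` keys which ``target_raid_config`` already uses for hardware RAID; the spec itself does not fix the schema. The helper name ``validate_software_raid_config`` is hypothetical.

```python
# Hypothetical helper; the key names are assumed to mirror the schema
# that target_raid_config uses for hardware RAID.
SUPPORTED_SECOND_LEVELS = {"0", "1", "5", "6", "10"}


def validate_software_raid_config(target_raid_config):
    """Return None if the layout is one of the two supported ones.

    Raises ValueError otherwise.
    """
    disks = target_raid_config.get("logical_disks", [])
    if len(disks) not in (1, 2):
        raise ValueError("expected one or two logical disks, got %d"
                         % len(disks))
    # The first logical disk is the deploy target and must be a RAID-1.
    if str(disks[0].get("raid_level")) != "1":
        raise ValueError("the first logical disk must be a RAID-1")
    # An optional second logical disk may use any of the listed levels.
    if len(disks) == 2:
        level = str(disks[1].get("raid_level"))
        if level not in SUPPORTED_SECOND_LEVELS:
            raise ValueError("unsupported RAID level: %s" % level)


# Layout 1: a single RAID-1 serving as the deploy target.
single = {"logical_disks": [{"size_gb": "MAX", "raid_level": "1"}]}
# Layout 2: a RAID-1 deploy target plus a RAID-N (here N = 5).
combined = {"logical_disks": [{"size_gb": 100, "raid_level": "1"},
                              {"size_gb": "MAX", "raid_level": "5"}]}

validate_software_raid_config(single)
validate_software_raid_config(combined)
```

Rejecting everything else up front keeps unsupported layouts, such as a RAID-5 deploy target, from reaching the agent at all.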
+
+The supported configurations have been limited to these two options in order
+to avoid issues when booting from RAID devices. Having a (small) RAID-1 device
+to boot from is a common approach when setting up more advanced RAID
+configurations: a RAID-1 holder device can look like a standalone disk and does
+not require the bootloader to have any knowledge or capabilities to understand
+more complex RAID configurations.
+
+Support for more than one RAID-N, support for the selection of a subset of
+drives to act as holder devices, as well as support to partition the created
+RAID-N device are left for follow-up enhancements and are beyond the scope of
+this specification.
+
+A first prototype very close to the proposal is available from [2][3][4].
+
+Alternatives
+------------
+
+As mentioned above, the alternative is to use other methods to create s/w RAID
+setups on physical nodes and integrate these out-of-band approaches into the
+provisioning workflow of individual deployments. This increases complexity on
+the Deployer/Admin side and can have a negative impact on the user experience
+when creating physical instances which need to have a software RAID setup.
+
+
+Data model impact
+-----------------
+
+None.
+
+
+State Machine Impact
+--------------------
+
+None.
+
+
+REST API impact
+---------------
+
+None.
+
+
+Client (CLI) impact
+-------------------
+
+None.
+
+"ironic" CLI
+~~~~~~~~~~~~
+None.
+
+"openstack baremetal" CLI
+~~~~~~~~~~~~~~~~~~~~~~~~~
+None.
+
+RPC API impact
+--------------
+
+None.
+
+Driver API impact
+-----------------
+
+The proposed functionality could be consolidated into a new RAID interface.
+
+Nova driver impact
+------------------
+
+None.
+
+Ramdisk impact
+--------------
+
+The ``ironic-python-agent`` will need to be able to:
+* set up and clean up software RAID devices
+* consider software RAID devices for deployment
+* configure the holder devices of the RAID-1 device so that they are bootable
+
+This functionality could be consolidated in an additional RAID interface.
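
As an illustration of the set-up and clean-up capability, the sketch below builds the ``mdadm`` command lines an agent could run. It is a sketch under assumptions, not the agent's actual code: the spec does not name the tool used, the function names are hypothetical, and the device paths are placeholders. The commands are only assembled as argument lists, never executed.

```python
def build_create_command(md_device, level, holder_devices):
    """Build the mdadm argv that would create a new array (not executed)."""
    return ["mdadm", "--create", md_device, "--run",
            "--level=%s" % level,
            "--raid-devices=%d" % len(holder_devices)] + list(holder_devices)


def build_delete_commands(md_device, holder_devices):
    """Build the argv lists that would stop an array and wipe its metadata."""
    commands = [["mdadm", "--stop", md_device]]
    # Clear the RAID superblock so the holder devices are not re-assembled.
    commands += [["mdadm", "--zero-superblock", dev]
                 for dev in holder_devices]
    return commands


# Placeholder device names, chosen for illustration only.
holders = ["/dev/sda1", "/dev/sdb1"]
print(" ".join(build_create_command("/dev/md0", 1, holders)))
for command in build_delete_commands("/dev/md0", holders):
    print(" ".join(command))
```

Keeping command construction separate from execution makes the logic testable without touching real disks.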
+
+Security impact
+---------------
+
+None.
+
+Other end user impact
+---------------------
+
+While the predefined RAID-1 ensures that a system should be able to boot,
+End Users need to be aware that the kernel of the started image needs to
+be able to understand software RAID devices.
+
+Scalability impact
+------------------
+
+None.
+
+Performance Impact
+------------------
+
+None.
+
+Other deployer impact
+---------------------
+
+Deployers will need to be aware that the configuration and cleanup of
+the RAID-N devices are only done during cleaning, so any changes require
+the node to be cleaned. Also, the config is not configurable by the End
+User, but limited to admins (as the ``target_raid_config`` is a node
+property). All of this, however, already holds true for hardware RAID
+configurations.
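
To make the cleaning dependency concrete, the sketch below renders the manual clean steps an Admin would submit after changing the RAID config. It assumes that software RAID reuses the ``raid`` interface steps ``delete_configuration`` and ``create_configuration`` that Ironic provides for hardware RAID. That reuse is an assumption; this spec does not settle it. The function name is hypothetical.

```python
import json

# Assumption: the same manual clean steps as for hardware RAID apply.
RAID_CLEAN_STEPS = [
    {"interface": "raid", "step": "delete_configuration"},
    {"interface": "raid", "step": "create_configuration"},
]


def render_clean_steps():
    """Serialize the clean steps as a JSON document for manual cleaning."""
    return json.dumps(RAID_CLEAN_STEPS, indent=2)


print(render_clean_steps())
```

Deleting the old configuration before creating the new one means that a changed ``target_raid_config`` only takes effect once these steps run again, i.e. the next time the node is cleaned.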
+
+Developer impact
+----------------
+
+None.
+
+Implementation
+==============
+
+An initial proof-of-concept is available from [2][3][4].
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  None.
+
+Other contributors:
+  Arne.Wiebalck@cern.ch (arne_wiebalck)
+
+Work Items
+----------
+
+This is to be defined once the overall idea is accepted and there's agreement
+on a design.
+
+Dependencies
+============
+
+None.
+
+Testing
+=======
+
+TBD
+
+Upgrades and Backwards Compatibility
+====================================
+
+None.
+
+Documentation Impact
+====================
+
+How to configure a software RAID, along with the limitations outlined in
+'Other deployer impact', needs to be documented.
+
+References
+==========
+
+[1] https://review.openstack.org/#/c/592639
+[2] CERN Hardware Manager: https://github.com/cernops/cern-ironic-hardware-manager/commit/7f6d892ec4848a09000ed1f28f3137bf8ba917f0
+[3] Patched Ironic Python Agent: https://github.com/cernops/ironic-python-agent/commit/bddac76c4d100af0103a6bc08b81dd71681a9c02
+[4] Patched Ironic: https://github.com/cernops/ironic/commit/581e65f1d8986ac3e859678cb9aadd5a5b06ba60
+

specs/not-implemented/software-raid.rst (+1, -0)

@@ -0,0 +1 @@
+../approved/software-raid.rst
