
Introduce magnum nodegroups

Introduce the concept of nodegroups. This approach gives the ability
to create heterogeneous clusters by defining groups of nodes with
different properties.

This spec tries to summarize the changes needed in order to support
nodegroups with Magnum.

story: 2004169
task: 27644

Change-Id: I4e64e928b8d8c7a3b9fd1772875edd4d60cae6ee
Theodoros Tsioutsias 6 months ago
commit e1c67df536
1 changed files with 275 additions and 0 deletions

specs/stein/magnum-nodegroups.rst (+275, -0)

@@ -0,0 +1,275 @@
+Magnum Nodegroups
+=================
+
+Launchpad blueprint:
+
+https://blueprints.launchpad.net/magnum/+spec/magnum-nodegroups
+
+This is a proposal to extend the Magnum API by adding support for nodegroups.
+
+Problem Description
+-------------------
+
+Currently, Magnum supports the creation of clusters with all the nodes in the
+same availability zone. In addition, the user can choose only one flavor for
+the master nodes and one for the worker nodes.
+
+The concept of nodegroups provides users with the ability to specify groups of
+nodes with different properties. Within the scope of a group, users are able
+to define labels, image, flavor, etc., depending on the purpose these nodes
+are going to be used for.
+
+This proposal tries to address the changes needed to support nodegroups with
+Magnum.
+
+Use Cases
+---------
+
+1. As a user, I want to deploy heterogeneous workloads in the same cluster.
+   These can include SQL databases with high IOPS requirements, caches
+   requiring large amounts of memory and batch jobs requiring a larger number
+   of CPUs or even GPUs.
+
+2. As a user, I want to create highly available clusters with Magnum.
+
+Proposed Changes
+----------------
+
+The proposed change includes:
+
+* Add a new '/clusters/{cluster_id}/nodegroups' REST API endpoint to Magnum
+  providing management of the given cluster's nodegroups. This includes
+  nodegroup creation, update and deletion.
+
+* Add a new object to the data model to represent a nodegroup.
+
+* Change the cluster create procedure to create two default nodegroups, one
+  containing the master node(s) of the cluster and one containing the worker
+  node(s).
+
+* Adapt the cluster delete procedure to also delete the nodegroups associated
+  with the cluster being deleted.
+
+Check sections `Data Model Impact`_ and `REST API Impact`_ for more details.
+
+.. note::
+
+    As a first step, users will be able to create nodegroups containing only
+    worker nodes. This is because the scripts used for scaling up do not
+    support adding new master nodes to the cluster. This change is left as
+    future work and will be handled by another spec.
+
+Alternatives
+------------
+
+As an alternative to the proposed solution, a user could create multiple
+independent clusters and connect them in one single federated control plane,
+acting as one heterogeneous cluster.
+
+The problem is that there is no feature parity between the cluster and the
+federation APIs and, for the time being, cluster federation is supported only
+by the Kubernetes COE.
+
+The concept of nodegroups addresses the same need in a more complete way.
+
+Data Model Impact
+-----------------
+
+A new entity will be added, together with the corresponding table:
+
+* **nodegroup**
+
+  * uuid
+  * name
+  * cluster_uuid (the uuid of the cluster where the nodegroup belongs)
+  * project_id
+  * docker_volume_size
+  * labels
+  * flavor_id
+  * image_id
+  * node_addresses
+  * node_count
+  * role (indicates whether the nodegroup contains master or worker nodes,
+    for now)
+
+The project id could be fetched from the cluster, but it is also stored in the
+nodegroup for future use. This covers the scenario where the master nodes
+belong to an operator tenant and the cluster nodegroups belong to different
+projects.
+
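A minimal Python sketch of the nodegroup entity listed above, for illustration only. The field names come from the list; the ``NodeGroup`` class name, the types and the default values are assumptions and do not represent Magnum's actual object definition.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NodeGroup:
    """Illustrative container for the nodegroup fields listed in the spec."""

    uuid: str
    name: str
    cluster_uuid: str  # uuid of the cluster the nodegroup belongs to
    project_id: str
    flavor_id: str
    image_id: str
    role: str  # "master" or "worker" for now
    node_count: int = 1  # default value is an assumption
    docker_volume_size: int = 0  # default value is an assumption
    labels: Dict[str, str] = field(default_factory=dict)
    node_addresses: List[str] = field(default_factory=list)


# Placeholder identifiers, mirroring the "nodegroup2" row of the list example.
workers = NodeGroup(
    uuid="ng-uuid-2",
    name="nodegroup2",
    cluster_uuid="cluster-uuid-1",
    project_id="project-uuid-1",
    flavor_id="flavor-2",
    image_id="image-1",
    role="worker",
    node_count=5,
)
print(workers.name, workers.role, workers.node_count)
```

Keeping ``project_id`` on the nodegroup itself, rather than reading it from the cluster, is what leaves room for nodegroups that live in a different project than the master nodes.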
+Adding the nodegroup entity means that some information currently stored in
+the cluster table should be moved to the nodegroup table. The cluster columns
+that need to be dropped are the following:
+
+* node_count
+* master_count
+* node_addresses
+* master_addresses
+
+.. note::
+
+    Moving information from the cluster to the nodegroup table will NOT change
+    the output of the existing CLIs. The only thing that will change is the
+    way this information is stored and subsequently fetched from the database.
+    For example, the cluster show output will still contain the node_count
+    information, but it will be calculated at the API level by summing the
+    node_count of all the associated worker nodegroups.
+
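The API-level calculation described in the note could be sketched as follows. This is a hedged illustration: nodegroups are modelled as plain dictionaries, the function names are hypothetical, and deriving ``master_count`` from the master nodegroups is an assumption made by symmetry (the spec only spells out the ``node_count`` case).

```python
from typing import Dict, Iterable, List


def _count_by_role(nodegroups: Iterable[Dict], role: str) -> int:
    """Sum node_count over the nodegroups that have the given role."""
    return sum(ng["node_count"] for ng in nodegroups if ng["role"] == role)


def cluster_node_count(nodegroups: Iterable[Dict]) -> int:
    """node_count of a cluster: the sum over its worker nodegroups."""
    return _count_by_role(nodegroups, "worker")


def cluster_master_count(nodegroups: Iterable[Dict]) -> int:
    """master_count of a cluster (assumed: the sum over master nodegroups)."""
    return _count_by_role(nodegroups, "master")


# Example data: one master nodegroup and two worker nodegroups.
nodegroups: List[Dict] = [
    {"name": "nodegroup1", "role": "master", "node_count": 3},
    {"name": "nodegroup2", "role": "worker", "node_count": 5},
    {"name": "nodegroup3", "role": "worker", "node_count": 2},
]
print(cluster_node_count(nodegroups), cluster_master_count(nodegroups))
```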
+REST API Impact
+---------------
+
+This change leads to a minor version increase in the Magnum API, the
+addition of a new REST endpoint and a new set of CLI commands.
+
+Below is a description of the commands to manage nodegroups:
+
+* add a new nodegroup to an existing cluster::
+
+    openstack coe node-group create <params> <cluster> <nodegroup>
+
+* delete an existing nodegroup::
+
+    openstack coe node-group delete <cluster> <nodegroup>
+
+* update an existing nodegroup::
+
+    openstack coe node-group update <params> <cluster> <nodegroup>
+
+* list the nodegroups of an existing cluster::
+
+    openstack coe node-group list <cluster>
+
+    +------+-------------+-------------+------------+-----------+
+    | uuid | name        |  flavor id  | node count |   role    |
+    +------+-------------+-------------+------------+-----------+
+    | ...  | nodegroup1  |  flavor-1   |      3     |   master  |
+    +------+-------------+-------------+------------+-----------+
+    | ...  | nodegroup2  |  flavor-2   |      5     |   worker  |
+    +------+-------------+-------------+------------+-----------+
+
+* show details of an existing nodegroup::
+
+    openstack coe node-group show <cluster> <nodegroup>
+
+    +---------------------+-------------------------------------------+
+    | Property            | Value                                     |
+    +---------------------+-------------------------------------------+
+    | uuid                | 5b2ee3b5-2f85-4917-be7c-11a2c82031ad      |
+    | name                | nodegroup1                                |
+    | cluster uuid        | <uuid-cluster1>                           |
+    | project id          | <uuid-project1>                           |
+    | docker volume size  | 5                                         |
+    | labels              | <label1>, <label2>, <label3>              |
+    | flavor id           | flavor1                                   |
+    | node count          | 3                                         |
+    | node addresses      | <ip-node1>, <ip-node2>, <ip-node3>        |
+    | role                | master                                    |
+    +---------------------+-------------------------------------------+
+
+Backward Compatibility
+----------------------
+
+In this section we refer to the clusters created before the introduction of
+Magnum Nodegroups as "old clusters".
+
+During the upgrade, the existing stacks will not be modified. For this reason,
+adding nodegroups to, or deleting nodegroups from, old clusters will not be
+permitted.
+
+Showing details for a nodegroup in an old cluster should work correctly.
+
+Security Impact
+---------------
+
+There is no keypair added in the nodegroup object, as all nodegroups will
+inherit the one set on the cluster. This approach was chosen so that the use
+of keypairs is not propagated to the nodegroup level, which would further
+complicate their removal in the future.
+
+Notifications Impact
+--------------------
+
+New notifications will be added for:
+
+* nodegroup creation
+* nodegroup deletion
+* nodegroup update
+
+Other End User Impact
+---------------------
+
+New subcommands will be added to the openstack client as described in
+`REST API Impact`_.
+
+At the same time, some of the existing commands for managing clusters have to
+be adapted:
+
+Cluster Create
+~~~~~~~~~~~~~~
+
+The existing cluster create CLI command will result in a cluster with two
+default nodegroups, one for the master node(s) and one for the worker(s).
+
+Cluster Delete
+~~~~~~~~~~~~~~
+
+When the user deletes a cluster, all the associated nodegroups will be deleted
+as well. There is no point in making the user delete all the nodegroups
+separately before deleting the cluster.
+
+Cluster Update
+~~~~~~~~~~~~~~
+
+Cluster update should continue working for the already existing clusters and
+should be deprecated for new ones. All scaling operations for new clusters
+should be done using the "node-group update" command.
+
+Cluster Show
+~~~~~~~~~~~~
+
+Firstly, the node count of the cluster should reflect the sum of the node count
+fields of all its nodegroups.
+
+Secondly, the status of the cluster has to be handled. The cluster show CLI
+command should summarize the status of the cluster's nodegroups, since each
+stack has its own status.
+
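The spec does not define how the nodegroup statuses are summarized. The sketch below shows one possible rule, purely as an assumption: a failed nodegroup takes precedence over one that is in progress, which in turn takes precedence over a completed one. The function name and the suffix-based priority are hypothetical; the status strings follow the ``<ACTION>_<STATE>`` pattern.

```python
from typing import Iterable

# Assumed priority: the first suffix found among the statuses decides.
_SUFFIX_PRIORITY = ("_FAILED", "_IN_PROGRESS", "_COMPLETE")


def summarize_status(statuses: Iterable[str]) -> str:
    """Reduce the statuses of all nodegroups to a single cluster status."""
    statuses = list(statuses)
    if not statuses:
        raise ValueError("expected the status of at least one nodegroup")
    for suffix in _SUFFIX_PRIORITY:
        for status in statuses:
            if status.endswith(suffix):
                return status
    # No known suffix matched: fall back to the first status as-is.
    return statuses[0]


print(summarize_status(["CREATE_COMPLETE", "UPDATE_IN_PROGRESS"]))
```

With this rule a cluster is reported as complete only when none of its nodegroups has failed or is still changing.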
+Developer Impact
+----------------
+
+None.
+
+Implementation
+--------------
+
+The implementation will be done in four phases.
+
+1. Add the new API endpoint and data model entity, and the corresponding
+   controller implementation linked to each driver. At this point, all
+   drivers will declare every nodegroup operation as 'Not Implemented'. In
+   the same step, we need to adapt all the operations for cluster management.
+
+2. Implement the nodegroup functionality for all drivers.
+
+3. Add the new command line tools to the openstack client.
+
+4. Implement the Magnum nodegroup notifications, for creation, deletion and
+   update.
+
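Phases 1 and 2 can be pictured with the following sketch. The class and method names are hypothetical and are not taken from Magnum's driver interface: a base class declares every nodegroup operation as not implemented (phase 1), and a concrete driver then overrides them one by one (phase 2).

```python
class BaseDriver:
    """Hypothetical base driver: all nodegroup operations are stubs."""

    def create_nodegroup(self, cluster, nodegroup):
        raise NotImplementedError("create_nodegroup is not implemented")

    def update_nodegroup(self, cluster, nodegroup):
        raise NotImplementedError("update_nodegroup is not implemented")

    def delete_nodegroup(self, cluster, nodegroup):
        raise NotImplementedError("delete_nodegroup is not implemented")


class ExampleDriver(BaseDriver):
    """Hypothetical driver that so far implements only creation."""

    def create_nodegroup(self, cluster, nodegroup):
        # A real driver would trigger the orchestration work here.
        return f"created {nodegroup} in {cluster}"


driver = ExampleDriver()
print(driver.create_nodegroup("cluster1", "nodegroup2"))
try:
    driver.delete_nodegroup("cluster1", "nodegroup2")
except NotImplementedError as exc:
    print(exc)
```

Starting from stubs lets the API endpoint and data model land first, while each driver gains nodegroup support independently.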
+Assignee(s)
+-----------
+
+Primary assignee:
+  <ttsiouts>
+
+Work Items
+----------
+
+See `Implementation`_.
+
+Testing
+-------
+
+A new set of unit and functional tests covering creation, deletion and update
+of nodegroups is needed. At the same time, the existing tests for cluster
+creation, deletion and update should be adapted.
+
+Documentation Impact
+--------------------
+
+New documentation will be added to describe the new API endpoint and its
+functionality as well as the changes in the existing cluster API.
+
+References
+----------
+
+Magnum Nodegroups Blueprint:
+https://blueprints.launchpad.net/magnum/+spec/magnum-nodegroups
