From e1c67df536e3f06388fbd2a07c8fc0b4968b9573 Mon Sep 17 00:00:00 2001
From: Theodoros Tsioutsias
Date: Tue, 2 Oct 2018 20:59:23 +0000
Subject: [PATCH] Introduce magnum nodegroups

Introduce the concept of nodegroups. This approach gives the ability to
create heterogeneous clusters by defining groups of nodes with different
properties. This spec tries to summarize the changes needed in order to
support nodegroups with Magnum.

story: 2004169
task: 27644

Change-Id: I4e64e928b8d8c7a3b9fd1772875edd4d60cae6ee
---
 specs/stein/magnum-nodegroups.rst | 275 ++++++++++++++++++++++++++++++
 1 file changed, 275 insertions(+)
 create mode 100644 specs/stein/magnum-nodegroups.rst

diff --git a/specs/stein/magnum-nodegroups.rst b/specs/stein/magnum-nodegroups.rst
new file mode 100644
index 0000000..22d0391
--- /dev/null
+++ b/specs/stein/magnum-nodegroups.rst
@@ -0,0 +1,275 @@
+Magnum Nodegroups
+=================
+
+Launchpad blueprint:
+
+https://blueprints.launchpad.net/magnum/+spec/magnum-nodegroups
+
+This is a proposal to extend the Magnum API by adding support for nodegroups.
+
+Problem Description
+-------------------
+
+Currently, Magnum supports the creation of clusters with all the nodes in the
+same availability zone. At the same time, the user has the ability to choose
+one flavor for master nodes and one for worker nodes.
+
+The concept of nodegroups provides users with the ability to specify groups of
+nodes with different properties. Within the scope of a group, users are able to
+define labels, image, flavor, etc., depending on the purpose these nodes are
+going to be used for.
+
+This proposal tries to address the changes needed to support nodegroups with
+Magnum.
+
+Use Cases
+---------
+
+1. As a user, I want to deploy heterogeneous workloads in the same cluster.
+   These can include SQL databases with high IOPS requirements, caches
+   requiring large amounts of memory and batch jobs requiring a larger number
+   of CPUs or even GPUs.
+
+2. As a user, I want to create highly available clusters with Magnum.
+
+Proposed Changes
+----------------
+
+The proposed changes include:
+
+* Add a new '/clusters/{cluster_id}/nodegroups' REST API endpoint to Magnum
+  providing management of the given cluster's nodegroups. This includes
+  nodegroup creation, update and deletion.
+
+* Add a new object to the data model to represent a nodegroup.
+
+* Change the cluster create procedure to create two default nodegroups, one
+  containing the master node(s) of the cluster and one containing the worker
+  node(s).
+
+* Adapt the cluster delete procedure to also delete the nodegroups associated
+  with the cluster being deleted.
+
+Check sections `Data Model Impact`_ and `REST API Impact`_ for more details.
+
+  NOTE::
+    As a first step, users will be able to create nodegroups containing only
+    worker nodes. This is because the scripts used for scaling up do not
+    support adding new master nodes to the cluster. This change is left as
+    future work and will be handled by another spec.
+
+Alternatives
+------------
+
+As an alternative to the proposed solution, a user could create multiple
+independent clusters and connect them in one single federated control plane,
+acting as one heterogeneous cluster.
+
+The problem is that there is no feature parity between the cluster and the
+federation APIs and, for the time being, cluster federation is supported only
+by the Kubernetes COE.
+
+It seems that the concept of nodegroups takes care of the matter at hand, in a
+more complete way.
+
+Data Model Impact
+-----------------
+
+A new entity will be added, along with its corresponding database table:
+
+* **nodegroup**
+
+  * uuid
+  * name
+  * cluster_uuid (the uuid of the cluster where the nodegroup belongs)
+  * project_id
+  * docker_volume_size
+  * labels
+  * flavor_id
+  * image_id
+  * node_addresses
+  * node_count
+  * role (shows if the nodegroup contains master or worker nodes for now)
+
+The project id could be fetched from the cluster, but we also add it here for
+future use, for example a scenario where the master nodes belong to an operator
+tenant and the cluster nodegroups belong to different projects.
+
+Adding the nodegroup entity means that some information currently stored in
+the cluster should be moved to the nodegroup table. The cluster columns that
+need to be dropped are the following:
+
+* node_count
+* master_count
+* node_addresses
+* master_addresses
+
+  NOTE::
+    It is really important to point out that moving information from the
+    cluster to the nodegroup table will NOT result in changing the output of
+    the existing CLIs. The only thing that will change is the way this
+    information is stored and subsequently fetched from the database.
+    For example, the cluster show output will contain the node_count
+    information, but it will be calculated at the API level by summing the
+    node_count of all the associated worker nodegroups.
+
+REST API Impact
+---------------
+
+This change leads to a minor version increase in the Magnum API, the
+addition of a new REST endpoint and a new set of CLI commands.
+
+Below is a description of the commands to manage nodegroups:
+
+* add a new nodegroup, in an existing cluster::
+
+    openstack coe node-group create
+
+* delete an existing nodegroup::
+
+    openstack coe node-group delete
+
+* update an existing nodegroup::
+
+    openstack coe node-group update
+
+* list existing nodegroups given an existing cluster::
+
+    openstack coe node-group list
+
+    +------+-------------+-------------+------------+-----------+
+    | uuid | name        | flavor id   | node count | role      |
+    +------+-------------+-------------+------------+-----------+
+    | ...  | nodegroup1  | flavor-1    | 3          | master    |
+    +------+-------------+-------------+------------+-----------+
+    | ...  | nodegroup2  | flavor-2    | 5          | worker    |
+    +------+-------------+-------------+------------+-----------+
+
+* show details of an existing nodegroup::
+
+    openstack coe node-group show
+
+    +---------------------+-------------------------------------------+
+    | Property            | Value                                     |
+    +---------------------+-------------------------------------------+
+    | uuid                | 5b2ee3b5-2f85-4917-be7c-11a2c82031ad      |
+    | name                | nodegroup1                                |
+    | cluster uuid        |                                           |
+    | project id          |                                           |
+    | docker volume size  | 5                                         |
+    | labels              | , ,                                       |
+    | flavor id           | flavor1                                   |
+    | node count          | 3                                         |
+    | node addresses      | , ,                                       |
+    | role                | master                                    |
+    +---------------------+-------------------------------------------+
+
+Backward Compatibility
+----------------------
+
+In this section we refer to the clusters created before the introduction of
+Magnum Nodegroups as "old clusters".
+
+During the upgrade, the existing stacks will not be modified. For this reason,
+adding nodegroups to old clusters, as well as deleting nodegroups from them,
+will not be permitted.
+
+Showing details for a nodegroup in an old cluster should work correctly.
+
+Security Impact
+---------------
+
+No keypair is added to the nodegroup object, as all nodegroups will inherit
+the one set on the cluster. This approach was chosen in order not to propagate
+the use of keypairs to the nodegroup level, which would further complicate
+their removal in the future.
+
+Notifications Impact
+--------------------
+
+New notifications will be added for:
+* nodegroup creation
+* nodegroup deletion
+* nodegroup update
+
+Other End User Impact
+---------------------
+
+New subcommands will be added to the openstack client as described above.
+
+At the same time, some of the existing commands for managing clusters have to
+be adapted:
+
+### Cluster Create ###
+The existing create cluster CLI will result in a cluster with two default
+nodegroups, one for the master node(s) and one for the worker(s).
+
+### Cluster Delete ###
+When the user deletes a cluster, all the associated nodegroups will be deleted
+as well. There is no point in making the user delete all the nodegroups
+separately before deleting the cluster.
+
+### Cluster Update ###
+Cluster update should continue working for the already existing clusters and it
+should be deprecated for the new ones. All scaling operations for new clusters
+should be done using the "node-group update" command.
+
+### Cluster Show ###
+Firstly, the node count of the cluster should reflect the sum of the node count
+fields of all its worker nodegroups.
+Another thing that has to be handled is showing the status of the cluster. The
+show cluster CLI should summarize the status of its nodegroups since each stack
+has its own status.
+
+Developer Impact
+----------------
+
+None.
+
+Implementation
+--------------
+
+The implementation will be done in 4 phases.
+
+1. Add the new API endpoint and data model entity, and the corresponding
+   controller implementation linked to each driver. At this point we will
+   have all drivers declaring every operation regarding nodegroups as
+   'Not Implemented'. In the same phase, we need to adapt all the operations
+   for cluster management.
+
+2. Implement the nodegroup functionality for all drivers.
+
+3. Add the new command line tools to the openstack client.
+
+4. Implement the Magnum nodegroup notifications, for creation, deletion and
+   update.
+
+Assignee(s)
+-----------
+
+Primary assignee:
+
+
+Work Items
+----------
+
+See `Implementation`_.
+
+Testing
+-------
+
+A new set of unit and functional tests covering creation, deletion and update
+of nodegroups is needed. At the same time, the existing tests for cluster
+creation, deletion and update should be adapted.
+
+Documentation Impact
+--------------------
+
+New documentation will be added to describe the new API endpoint and its
+functionality as well as the changes in the existing cluster API.
+
+References
+----------
+
+Magnum Nodegroups Blueprint:
+https://blueprints.launchpad.net/magnum/+spec/magnum-nodegroups