This story will introduce a new node personality 'edgeworker' to StarlingX.
The biggest difference between an 'edgeworker' node and a 'worker' node is that the OS of an 'edgeworker' node is not installed or configured by the StarlingX controller, and it may vary from case to case: for example Ubuntu, Debian, or Fedora. The basic idea is to deploy the containerd and kubelet services to the 'edgeworker' nodes, so that the StarlingX Kubernetes platform is extended to them.
The second difference is that 'edgeworker' nodes are usually deployed close to edge devices, while 'worker' nodes are usually servers deployed in a server room. The 'edgeworker' personality is suitable for nodes on which users want to install their own customized OS, or which need to be deployed physically close to the data producer or consumer devices.
The way to leverage the advantages of StarlingX functionality is to containerize most flock agents and enable them on edgeworker nodes. This is also aligned with the long-term strategy of flock service containerization.
The whole topic is broken down into approximately four phases:
This spec focuses on Phase One.
In a typical IoT or industrial use case, StarlingX is used to facilitate the setup and management of the whole edge cluster. However, the cluster also contains types of nodes that are outside the current StarlingX management scope. Various reasons, from software to hardware, hinder the administrator from deploying these nodes as 'worker' nodes. In particular, the common setbacks are:
In this story, these nodes are categorized under a new personality to distinguish them from 'worker' nodes. The new personality is called 'edgeworker' since these nodes are usually deployed close to the edge devices. An edge device could be an I/O device, a camera, a servo motor, or a sensor.
The first three setbacks will be addressed in phase one, while the network requirements and manageability enhancements will be addressed in the next few phases. Separate specs for the later phases will be submitted in future releases.
Adding a new personality requires changes in the sysinv db, sysinv api, and sysinv conductor, as well as in cgts-client.
In order to get the 'edgeworker' personality into sysinv, the 'edgeworker' value will be added to the enum type invPersonalityEnum in the sysinv db. Accordingly, 'edgeworker' must be added to the db models as well. After this change, a host can, from the sysinv db perspective, be assigned the edgeworker personality.
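The shape of this change can be sketched as follows. This is a minimal illustration, not the actual sysinv source; the class and function names below are hypothetical, while the personality values and the enum name come from this spec.

```python
import enum


class InvPersonality(str, enum.Enum):
    """Host personalities; mirrors the invPersonalityEnum values in sysinv."""
    CONTROLLER = 'controller'
    WORKER = 'worker'
    STORAGE = 'storage'
    EDGEWORKER = 'edgeworker'  # new value introduced by this spec


def is_valid_personality(value: str) -> bool:
    """Return True if the value is a recognized host personality."""
    return value in {p.value for p in InvPersonality}
```

With this in place, a host-add request carrying `personality=edgeworker` passes the personality validation that previously accepted only the existing values.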
The changes mainly focus on the host api, adding checks during host-add for 'edgeworker' hosts. Possible checks:
The sysinv conductor is responsible for mgmt IP allocation when the mgmt network is of dynamic type.
Add an 'edgeworker' choice for the 'personality' argument of the host-add/host-update commands.
After the underlying changes are applied, the administrator is able to use
# system host-add -n <hostname> -p edgeworker
or
# system host-update <id> hostname=<hostname> personality=edgeworker
to add an edgeworker node to the inventory.
When an edgeworker node is added to the inventory, sysinv can provide the following services:
Functions that will not be supported on edgeworker nodes:
An edgeworker node is not a server but a normal PC, such as an industrial PC, NUC, or workstation. BMC is not a required feature for these nodes. Node life cycle management is done in-band or manually by the maintainer. The use cases that employ edgeworker nodes do not expect out-of-band management for them.
Additional semantic checks will be added for these functions.
Other functions will be described in detail in each phase's spec.
Ansible playbook for provisioning edgeworker nodes
The main steps for provisioning an edgeworker node are installing the kubelet, kubeadm, and containerd packages appropriate to the node's Linux distribution, and joining the node to the StarlingX Kubernetes platform. Besides these steps, system configuration such as NTP setup, interface configuration, and DNS setup is needed as well.
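The distribution-specific part of the install step can be sketched as below. This is an assumption-laden illustration, not the playbook itself: the package manager invocations and version-pin syntax are typical for Ubuntu and CentOS but would be encoded in the real Ansible roles, and containerd is left unpinned here because its releases do not track Kubernetes versions.

```python
# Kubernetes packages whose versions must match the controller's.
VERSIONED_PACKAGES = ['kubelet', 'kubeadm']


def install_command(distro: str, k8s_version: str) -> list:
    """Build an illustrative, distribution-specific install command that
    pins the Kubernetes packages to the controller's version."""
    if distro == 'ubuntu':
        base = ['apt-get', 'install', '-y']
        pkgs = ['{}={}'.format(p, k8s_version) for p in VERSIONED_PACKAGES]
    elif distro == 'centos':
        base = ['yum', 'install', '-y']
        pkgs = ['{}-{}'.format(p, k8s_version) for p in VERSIONED_PACKAGES]
    else:
        raise ValueError('unsupported distribution: {}'.format(distro))
    # containerd is installed unpinned; its versioning is independent.
    return base + pkgs + ['containerd']
```

For example, `install_command('ubuntu', '1.18.1')` yields an `apt-get install` command with `kubelet=1.18.1` and `kubeadm=1.18.1` pinned.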
The first two Linux distributions we propose to support for edgeworker nodes are Ubuntu and CentOS.
The versions of all the Kubernetes packages on edgeworker nodes must exactly match those on the controllers. If they do not, the playbook will reinstall the packages at the proper version.
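The reconciliation that the playbook performs can be expressed as a small helper. This is a sketch under the assumption that the playbook gathers installed package versions per node; the function name is hypothetical.

```python
def packages_to_reinstall(node_versions: dict, controller_version: str) -> list:
    """Return the names of Kubernetes packages whose installed version
    differs from the controller's, so the playbook can reinstall them."""
    return sorted(name for name, version in node_versions.items()
                  if version != controller_version)
```

For instance, if a node reports `{'kubelet': '1.18.1', 'kubeadm': '1.17.0'}` and the controllers run 1.18.1, only kubeadm is reinstalled.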
The playbook sequence to provision an edgeworker node:
There will be one playbook with different roles included.
There are several open source projects that can provision a Kubernetes node.
Kubespray1 is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for generic OS and Kubernetes cluster configuration management tasks. Kubespray performs generic OS configuration as well as Kubernetes cluster bootstrapping.
Kubespray provides the full functionality of provisioning a Kubernetes node, just like the edgeworker provisioning playbook does. However, Kubespray supports multiple container runtimes, multiple CNI plugins, and control plane bootstrapping, which is more functionality than provisioning an edgeworker node requires.
What edgeworker needs is a playbook that targets one specific container runtime and CNI plugin and provisions a Kubernetes worker node only.
KubeEdge2 is an open source system for extending native containerized application orchestration capabilities to hosts at the edge. KubeEdge runs on top of an existing Kubernetes cluster and deploys a customized kubelet service called 'edged' to the edge node. Between the apiserver and edged, the EdgeController is the bridge that manages edge node and pod metadata so that data can be targeted to a specific edge node.
KubeEdge is able to provision edge nodes from the cloud, but its kubelet service is customized to meet a specific requirement: allowing the administrator to manage the pods running on edge nodes from a public cloud platform. The customized kubelet (edged) brings compatibility issues when Kubernetes is upgraded to a newer release, which would require extra effort to test and upgrade KubeEdge during each Kubernetes upgrade, since edgeworker provisioning is a key step in enabling these nodes.
Besides, KubeEdge includes a complete edge device management layer that is outside the current StarlingX platform scope.
The only data model change is to add 'edgeworker' to 'invPersonalityEnum' in the sysinv db model.
The potential security threats and mitigations are:
It must be guaranteed by the administrator that no unauthorized node can physically connect to the management network. Authentication of edgeworker node onboarding will be introduced in later phases.
Malicious packages on an edgeworker node
It must be guaranteed by the administrator that the packages running on edgeworker nodes are secure, since the OS is managed by the administrator.
The deployer is required to run the edgeworker provisioning playbook after adding or updating a node with the edgeworker personality.
The kubelet needs to be upgraded during the Kubernetes upgrade process. The upgrade process will trigger an additional script/playbook to check the package versions on edgeworker nodes and upgrade them according to each node's distribution.
The distribution's repo may not carry the newest version of the corresponding packages. Due to the Kubernetes version skew support policy3, this is acceptable: kubelet and kube-proxy may be up to two minor versions older than the apiserver.
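The skew rule stated above can be made concrete with a small check. This is an illustrative sketch of the policy as described in this spec (up to two minor versions older, never newer); the function name and the simple `major.minor` parsing are assumptions.

```python
def within_skew(component_version: str, apiserver_version: str,
                max_minor_skew: int = 2) -> bool:
    """True if a kubelet/kube-proxy version is the same minor as the
    apiserver, or older by at most max_minor_skew minor versions."""
    c_major, c_minor = (int(x) for x in component_version.split('.')[:2])
    a_major, a_minor = (int(x) for x in apiserver_version.split('.')[:2])
    # Same major series, and the component must not be newer than the
    # apiserver nor lag it by more than the allowed minor skew.
    return c_major == a_major and 0 <= a_minor - c_minor <= max_minor_skew
```

For example, a 1.18 kubelet is acceptable against a 1.20 apiserver, but a 1.17 kubelet (too old) or a 1.21 kubelet (newer than the apiserver) is not.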
SW patching/updating will be addressed in phase four. It could be either a 3rd party solution or a plugin to the current SW management, because the current SW management can only patch/update RPM packages, while the OSes of edgeworker nodes may use other package formats.
The work items are already introduced in the Proposed change section above.
|stx.5.0||Edgeworker management phase one introduced|