Integrate Kata Containers into StarlingX

This specification describes the integration of Kata Containers into
StarlingX, so that StarlingX can support running containers with the Kata
Containers runtime.

Story: 2006145
Task: 36054

Change-Id: I2de786ff678581f48026dff072f6d9841a3f4e85
Signed-off-by: Shuicheng Lin <shuicheng.lin@intel.com>
..
   This work is licensed under a Creative Commons Attribution 3.0 Unported
   License. http://creativecommons.org/licenses/by/3.0/legalcode
========================================
Integrate Kata Containers into StarlingX
========================================
Storyboard: https://storyboard.openstack.org/#!/story/2006145
This story integrates Kata Containers into StarlingX so that StarlingX can
support running containers with the Kata Containers runtime.
Problem description
===================
Kata Containers is an open source community working to build a secure
container runtime with lightweight virtual machines that feel and perform
like containers, but provide stronger workload isolation using hardware
virtualization technology as a second layer of defense.[0]
This story integrates Kata Containers into StarlingX to address customers'
concerns about container security.
Because there is still a performance gap between Kata Containers and
conventional containers [1], and a Kubernetes cluster cannot run entirely on
Kata Containers [2], Kata Containers is not used by default unless an explicit
requirement is found in the Kubernetes pod configuration file. In other words,
regardless of whether Kata Containers is enabled by this story, the containers
currently in StarlingX will not use the Kata Containers runtime by default
unless an extra change is made. The method to declare the requirement is
described in the "Proposed change" section.
Use Cases
---------
With Kata Containers supported in StarlingX, system developers, testers,
operators, and administrators can choose which container runtime to use when
running a container image, based on their needs. For example, if there is a
security concern for a pod/container, it can be selected to run with Kata
Containers. For the default case, runc is selected as the low-level container
runtime, which is the default low-level runtime for Kubernetes.
Proposed change
===============
Kubernetes is the only container orchestration tool in StarlingX. To integrate
Kata Containers into StarlingX, Kubernetes will be configured to support the
Kata Containers runtime.
For Kubernetes, there are two methods to select the Kata Containers runtime:

1. Use an annotation: add the "io.kubernetes.cri.untrusted-workload"
   annotation to the pod file. [3]
2. Use a Runtime Class. [4]

The first method is simpler, but it is deprecated; the second method is
recommended. Runtime Class has been supported in Kubernetes since v1.14, and
the current Kubernetes version in StarlingX master is v1.15, so using a
Runtime Class to select the Kata Containers runtime is recommended.
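As an illustration, the RuntimeClass object could look like the following
minimal sketch. The handler name "kata" is an assumption here; it must match
the runtime handler name configured in containerd.

.. code-block:: yaml

   # Minimal RuntimeClass sketch; node.k8s.io/v1beta1 is the API version
   # available since Kubernetes v1.14. The "kata" handler name is assumed
   # to match the kata runtime entry in the containerd configuration.
   apiVersion: node.k8s.io/v1beta1
   kind: RuntimeClass
   metadata:
     name: kata
   handler: kata

A pod then opts in by setting "runtimeClassName: kata" in its spec; pods
without this field continue to use runc.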
There are three major container runtimes for Kubernetes: Docker, CRI-O, and
containerd. [5]
The Docker runtime in Kubernetes supports runc only; it cannot be configured
to support the kata runtime. So in order to use the Kata Containers runtime,
Kubernetes has to switch to containerd or CRI-O.
Per Kata Containers [6]: "As of Kata Containers 1.5, using shimv2 with
containerd 1.2.0 or above is the preferred way to run Kata Containers with
Kubernetes (see the howto). The CRI-O will catch up soon." Therefore,
containerd is chosen as the new container runtime for Kubernetes to support
Kata Containers.
This leaves two possible solutions for Kata Containers support in Kubernetes.
Solution 1:
A global flag is added to the ansible playbook local override file
(localhost.yml) to let the deployer choose whether Kata Containers support is
enabled in StarlingX.
If the flag is not set, or is set to False, Kata Containers support is
disabled and Kubernetes uses Docker as the container runtime. The flow is:
kubelet -> docker -> runc -> container image
If the flag is set to True, Kata Containers support is enabled and Kubernetes
uses containerd as the container runtime. For a container that does not
require Kata Containers, the flow is:
kubelet -> containerd -> runc -> container image
For a container that requires Kata Containers, the flow is:
kubelet -> containerd -> kata container runtime -> container image
Advantage:
For users who do not use Kata Containers, the system behaves the same as
before. The code implementation does not affect the current design, so it
could be implemented in master directly, piece by piece.
Disadvantage:
Users cannot enable Kata Containers support after deployment. The code
implementation needs to consider both the Docker and containerd cases.
Solution 2:
No choice for the user; Kubernetes always uses containerd as the container
runtime. For a container that does not require Kata Containers, the flow is:
kubelet -> containerd -> runc -> container image
For a container that requires Kata Containers, the flow is:
kubelet -> containerd -> kata container runtime -> container image
Advantage:
Users can always use Kata Containers. The code implementation is simpler,
since only the containerd case needs to be considered.
Disadvantage:
Docker tools cannot be used for containers deployed by Kubernetes. Users
should operate on containers with kubectl, or with a newer tool such as
crictl or ctr.
There is a cutover from Docker to containerd, so the code needs to be fully
verified before merge.
To keep the code clean and easy to maintain, Solution 2 is chosen for this
feature implementation.
More details on the Solution 2 implementation are given in the "Work Items"
section.
Alternatives
------------
CRI-O is another option to enable Kata Containers support in Kubernetes.
Containerd is selected instead of CRI-O because Kata Containers prefers
it. [6]
Data model impact
-----------------
None. This story does not change any existing data models.
REST API impact
---------------
None. This story does not change any existing REST APIs.
Security impact
---------------
This feature does not touch sensitive data, except that containerd needs
access to the certificate/key of the local secure registry "registry.local".
Kata Containers helps enhance system security because it "Runs in a dedicated
kernel, providing isolation of network, I/O and memory and can utilize
hardware-enforced isolation with virtualization VT extensions." [0]
Other end user impact
---------------------
With this feature implemented, users are able to run containers with
kata-runtime.
The major change is that Kubernetes switches to containerd as the container
runtime instead of Docker. This change is transparent if users access
containers with kubectl. Users will find that containers created by
Kubernetes cannot be accessed by the Docker CLI; another CLI such as crictl
must be used instead.
Performance Impact
------------------
For containers that do not use kata-runtime, performance should be the same
as before.
For containers that use kata-runtime, some performance drop is expected. Per
the Kata Containers website: "Delivers consistent performance as standard
Linux containers; increased isolation with the performance tax of standard
virtual machines." [0]
There is also a report on the I/O performance of Kata Containers. [1]
Other deployer impact
---------------------
There is no change to the StarlingX deployment process.
In the node installation stage, a few RPMs for Kata Containers will be
installed automatically on each node.
In the node configuration stage, docker and Kubernetes will be automatically
configured to support the Kata Containers runtime.
After deployment, users who want to use kata-runtime need to update their
application tarball to select it, as sketched below. Current application
tarballs will still work with the runc runtime, just as before.
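For illustration only, a pod template in an application's chart might request
the kata runtime as in the sketch below; the Deployment name, labels, and
image are hypothetical, and a RuntimeClass named "kata" is assumed to exist.

.. code-block:: yaml

   # Hypothetical Deployment fragment selecting the kata runtime;
   # omitting runtimeClassName keeps the default runc path.
   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: example-app
   spec:
     selector:
       matchLabels:
         app: example-app
     template:
       metadata:
         labels:
           app: example-app
       spec:
         runtimeClassName: kata
         containers:
         - name: app
           image: registry.local:9001/example/app:latest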
Developer impact
----------------
The major change for this feature is that Kubernetes switches to containerd
as the container runtime. Any Docker-specific code used with Kubernetes
needs to switch to containerd.
Upgrade impact
--------------
This feature introduces several new RPMs for Kata Containers. Kubernetes may
be initialized with a different container runtime during upgrade. Per
testing, container runtime selection is per node, meaning each node's
container runtime is independent. So this feature implementation should not
affect upgrade.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
Shuicheng Lin
Yi Wang
Repos Impacted
--------------
* stx-integ - add containerd configuration files
* stx-config - add containerd configuration
* stx-ansible-playbook - configure Kubernetes to use containerd
* stx-root - add kata container rpms to ISO
* stx-metal - add containerd partition
* stx-tools - add kata container repo and rpms
Work Items
----------
Add kata container
* Add kata container RPMs to the ISO and install them on each node
  There is a repo with Kata Containers RPMs for CentOS 7. [7]
  The kata-runtime RPM will be added to the ISO explicitly, and the other
  RPMs will be pulled in automatically as dependencies.
  Kata Containers uses QEMU as the hypervisor. Kata could also be configured
  to use Firecracker, which is out of scope for this story and could be
  handled later if there is a request.
Support kata container in Kubernetes
* Configure the containerd/kubelet services
  The containerd binary is already installed in StarlingX as part of Docker.
  A package with containerd-related configuration files will be added. Files
  with parameters that need to be set dynamically will be configured through
  puppet; files that do not need updating will be installed by the package.
* Switch kubeadm to use containerd
  This will be done in the ansible playbook for controller-0 and in the
  puppet configuration for controller-1, both via the kubeadm init command.
  Worker nodes will be configured to use containerd via the kubeadm join
  command. See the kubeadm configuration sketch after this list.
* Configure containerd/docker to share the same partition for container
  storage
  By default, docker and containerd use different paths for container image
  storage: /var/lib/docker for docker, and /var/lib/containerd for
  containerd. There is already a 30G LVM partition (/var/lib/docker) created
  for docker. This partition mainly stores container images and container
  runtime ephemeral data. In the current StarlingX system, two container
  images (armada and tiller) are pulled by docker directly, and two
  applications (platform-integ-apps and stx-openstack) are pulled by
  Kubernetes. The armada and tiller images consume 550 MB of disk, while the
  two applications consume 17G.
  To avoid creating one more LVM partition for containerd, containerd will
  also be configured to use the /var/lib/docker path, and all container
  images will be pulled by containerd by default. Users are still able to
  pull and run containers with docker.
* Add crictl as the CLI for containerd
  crictl is a CLI for CRI-compatible container runtimes. It allows CRI
  runtime developers to debug their runtime without needing to set up the
  Kubernetes components. [9] It helps debug containerd-related issues.
  There is an RPM for this tool with the latest version in the OpenSuSE
  repo, but it is configured to contact CRI-O by default; the configuration
  file needs to be updated to contact containerd instead (see the crictl
  configuration sketch at the end of this section).
* Configure containerd to support an insecure registry, the local registry,
  proxies, etc.
  This replaces the Docker-specific functionality used for Kubernetes;
  containerd needs to be configured to support the features above.
  The Docker SDK is used in sysinv to pull container images and push them to
  the local registry. Containerd does not have a similar Python SDK, so the
  code needs to be rewritten to call containerd or a CLI to achieve the same
  function.
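The kubeadm switch mentioned above could be expressed as the following
minimal sketch, assuming containerd's default socket path; the
JoinConfiguration used for worker nodes carries the same nodeRegistration
field.

.. code-block:: yaml

   # Sketch of a kubeadm InitConfiguration pointing the kubelet at
   # containerd instead of the Docker shim; kubeadm.k8s.io/v1beta2 matches
   # the Kubernetes v1.15 currently in StarlingX master, and the socket
   # path is assumed to be containerd's default.
   apiVersion: kubeadm.k8s.io/v1beta2
   kind: InitConfiguration
   nodeRegistration:
     criSocket: /run/containerd/containerd.sock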
Besides the above work items, other work items may need to be added during
the implementation of this story. Please check the story link at the
beginning of this spec for the full work item list.
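For the crictl work item above, redirecting crictl from CRI-O to containerd
is a small configuration change, as in the sketch below, assuming
containerd's default socket path.

.. code-block:: yaml

   # Sketch of /etc/crictl.yaml pointing crictl at the containerd socket
   # (the path is assumed to be containerd's default).
   runtime-endpoint: unix:///run/containerd/containerd.sock
   image-endpoint: unix:///run/containerd/containerd.sock
   timeout: 10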
Dependencies
============
This specification depends on the upstream open source projects
Kata Containers and containerd:
https://github.com/kata-containers/kata-containers
https://github.com/containerd/containerd
Testing
=======
This feature can be tested on a fully provisioned StarlingX system.
* If the kata container support flag is not set, or is set to False, the
  system should perform as before.
* If the kata container support flag is set to True, the system should also
  pass deploy/sanity tests; containers managed by Kubernetes should also be
  managed by containerd and use runc as the default low-level runtime. A
  container that requires Kata Containers should run with the kata runtime.
To check whether a running container uses the kata runtime, log into the
container and use the "uname" command to check the kernel version. The
host's kernel version is 3.10, while the kata kernel version should be 4.x.
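A minimal sketch of such a test pod, assuming a RuntimeClass named "kata"
has been created:

.. code-block:: yaml

   # Hypothetical pod for verifying the kata runtime; once it is running,
   # "kubectl exec uname-test -- uname -r" should report a 4.x kernel
   # rather than the 3.10 host kernel.
   apiVersion: v1
   kind: Pod
   metadata:
     name: uname-test
   spec:
     runtimeClassName: kata
     containers:
     - name: test
       image: busybox
       command: ["sleep", "3600"]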
Documentation Impact
====================
A new page should be added to show how to enable Kata Containers support and
how to explicitly require the Kata Containers runtime in Kubernetes. This
feature adds a new container runtime, containerd, for Kubernetes, so any
Docker-specific operations documented for Kubernetes may need to be updated
to include the containerd equivalents.
References
==========
[0]: https://katacontainers.io/
[1]: https://www.stackhpc.com/kata-io-1.html
[2]: https://github.com/kata-containers/runtime/issues/1853
[3]: https://github.com/kata-containers/documentation/blob/master/how-to/how-to-use-k8s-with-cri-containerd-and-kata.md#create-an-untrusted-pod-using-kata-containers
[4]: https://kubernetes.io/docs/concepts/containers/runtime-class/
[5]: https://kubernetes.io/docs/setup/production-environment/container-runtimes/
[6]: https://github.com/kata-containers/documentation/blob/master/Developer-Guide.md#install-a-cri-implementation
[7]: http://download.opensuse.org/repositories/home:/katacontainers:/releases:/x86_64:/master/CentOS_7/
[8]: https://github.com/kata-containers/documentation/blob/master/install/docker/centos-docker-install.md
[9]: https://github.com/kubernetes-sigs/cri-tools/blob/master/docs/crictl.md
History
=======
.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - stx-3.0
     - Introduced