===================================
Deploy Kolla images with Kubernetes
===================================

https://blueprints.launchpad.net/kolla/+spec/kolla-kubernetes

Kubernetes was evaluated by the Kolla team in the first two months of the
project and was found to be problematic because it did not support the
net=host, pid=host, and --privileged features of Docker. Since then, it has
developed these features [1].

The objective is to manage the lifecycle of containerized OpenStack services by
using Kubernetes container management tools in order to obtain the self-healing
and upgrade capabilities inherent in Kubernetes.

Problem description
===================

Kubernetes provides:

- life-cycle management: service monitoring, HA, load balancing, and
  health checks
- upgrades: rolling

Kubernetes has services that provide kolla-kubernetes with service monitoring,
health checks, service scaling, and upgrades. The community can use the
scheduler and the node affinity trait to assign workloads to appropriate
nodes [2]. Kubernetes also has built-in health checks that monitor a
container's status. Finally, kolla-kubernetes can use the replication
controller to scale the stack up and down [3].

For upgrades, Kubernetes has an object called a 'Deployment' [4], which detects
when a pod needs to change. It then scales down the currently running pods and
scales up the new pods.

Use Cases
=========

- Kubernetes as an underlay for OpenStack.
- Kubernetes to handle container scheduling.
- Feedback loop when using Kubernetes health checks during deployment and
  upgrade.
- High availability for individual containers.

Proposed change
===============

- Add a deployment-specific git repo (kolla-kubernetes) under Kolla
  governance that contains the Kubernetes deployment code.

Orchestration
-------------

OpenStack on Kubernetes will be orchestrated by outside tools in order to create
a production-ready OpenStack environment. The kolla-kubernetes repo is where
any deployment tool can join the community and be a part of orchestrating a
kolla-kubernetes deployment.

Service Config Management
-------------------------

Config generation will be completely decoupled from the deployment. The
containers only expect a config file to land in a specific directory in
the container in order to run. With this decoupled model, any tool could be
used to generate config files. The kolla-kubernetes community will evaluate
any config generation tool, but will likely use Ansible for config generation
in order to reuse existing work from the community. This solution uses
customized Ansible and Jinja2 templates to generate the config. There will
also be a maintained set of defaults and a global yaml file that can override
the defaults.

The config files will be injected into Kubernetes configmaps and loaded into
the containers. There will be one configmap per configuration file, so a
service can have multiple configmaps. The containers will configure themselves
using the configuration files loaded into the appropriate directories [5][6].

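
As a rough illustration of the configmap approach described above, the sketch
below stores one configuration file in a configmap and mounts it into a
container as a volume. The object names, the image reference, the file
contents, and the mount path are all placeholders chosen for this example and
are not taken from the kolla-kubernetes code.

.. code-block:: yaml

   # One configmap per configuration file (names are hypothetical).
   apiVersion: v1
   kind: ConfigMap
   metadata:
     name: keystone-config
   data:
     keystone.conf: |
       [DEFAULT]
       debug = False
   ---
   apiVersion: v1
   kind: Pod
   metadata:
     name: keystone
   spec:
     containers:
       - name: keystone
         image: example.org/kolla/keystone:latest   # placeholder image
         volumeMounts:
           - name: keystone-config
             # Assumed directory where the container looks for its config.
             mountPath: /var/lib/kolla/config_files
     volumes:
       - name: keystone-config
         configMap:
           name: keystone-config

Each key under ``data`` appears as a file of the same name inside the mount
path, so the generated file can be produced by any tool before it is loaded.
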
Bootstrapping
-------------

Bootstrapping the Kolla containers involves running a single task per service
that will initialize the databases and create the users. The bootstrapping task
will be a Kubernetes Job, which will run the task until completion and then
terminate the pods [7].

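
A bootstrap task of this kind could look roughly like the Job below. This is a
minimal sketch only: the Job name and image are placeholders, and
``keystone-manage db_sync`` stands in for whatever bootstrap command a given
service actually needs.

.. code-block:: yaml

   apiVersion: batch/v1
   kind: Job
   metadata:
     name: keystone-bootstrap            # hypothetical name
   spec:
     template:
       spec:
         # A Job's pod template must use Never or OnFailure.
         restartPolicy: OnFailure
         containers:
           - name: keystone-bootstrap
             image: example.org/kolla/keystone:latest   # placeholder image
             # Illustrative one-shot task; the Job is complete once it exits 0.
             command: ["keystone-manage", "db_sync"]
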
Each service will have a bootstrap task so that when the operator upgrades,
the bootstrap tasks are reused to upgrade the database. This will allow
deployment and upgrades to follow the same pipeline.

The Kolla containers will communicate with the Kubernetes API server in order
to be aware of whether any bootstrapping processes are occurring.

Alternative bootstrap approaches:

1) Create two pods per OpenStack service. One pod is designed to do the
   bootstrapping/db_sync while the other pod runs as the normal service. This
   will require some orchestration, and the bootstrap pod will need to be set
   up to never restart or be replicated.

2) Use a sidecar container in the pod to handle the database sync, with proper
   health checking to make sure the services are coming up healthy. The big
   difference between Kolla's old docker-compose solution and Kubernetes is
   that docker-compose would only restart the containers, whereas Kubernetes
   will completely reschedule them, which means removing the pod and starting
   it again. This would fix the race condition failure Kolla saw with
   docker-compose because glance would be rescheduled on failure, giving
   keystone a chance to sync with the database and become active instead of
   being constantly flooded with glance requests. There can also be health
   checks around this to help determine order.

If kolla-kubernetes used this sidecar approach, it would regain the use of
native Kubernetes upgrades [16].

Dependencies
------------

- Kubernetes >= 1.3.0
- Docker >= 1.10.0
- Jinja2 >= 2.8.0

Kubernetes does not support dependencies between pods. The operator will launch
all the services and use Kubernetes health checks to bring the deployment to an
operational state.

With orchestration around Kubernetes, the operator can determine which tasks
are run and when they are run. This way, dependencies can be handled at the
orchestration level, but they are not required because proper health checking
will bring up the cluster in a healthy state.

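
The health checks mentioned above map onto readiness and liveness probes in a
pod spec. The following is a sketch under assumptions: the pod name, image,
and timing values are illustrative, and a plain TCP check on the glance-api
default port (9292) stands in for a real per-service health check.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: glance-api                    # hypothetical name
   spec:
     containers:
       - name: glance-api
         image: example.org/kolla/glance-api:latest   # placeholder image
         ports:
           - containerPort: 9292
         # The pod receives traffic only once this probe succeeds.
         readinessProbe:
           tcpSocket:
             port: 9292
           initialDelaySeconds: 5
           periodSeconds: 10
         # The container is restarted if this probe keeps failing.
         livenessProbe:
           tcpSocket:
             port: 9292
           initialDelaySeconds: 30
           periodSeconds: 20
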
Upgrades
--------

Kubernetes has an object called a Deployment, where the operator defines a
desired state for the pods and the Deployment will move the cluster to the
desired state when a change is detected.

Kolla-kubernetes will provide Jobs that give the operator the flexibility
needed to undergo a stepwise upgrade. In future releases, kolla-kubernetes will
look to Kubernetes to provide a means for operators to plug these Jobs into a
Deployment.

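
To make the desired-state model concrete, here is a minimal Deployment sketch
with a rolling update strategy. The names, replica count, and image tag are
placeholders. The manifest uses the ``apps/v1`` API group found on current
clusters; on the Kubernetes 1.3 release that this spec targets, Deployments
were served from ``extensions/v1beta1`` instead.

.. code-block:: yaml

   apiVersion: apps/v1
   kind: Deployment
   metadata:
     name: keystone                      # hypothetical name
   spec:
     replicas: 3
     selector:
       matchLabels:
         app: keystone
     strategy:
       type: RollingUpdate
       rollingUpdate:
         maxUnavailable: 1               # at most one old pod down at a time
         maxSurge: 1                     # at most one extra new pod at a time
     template:
       metadata:
         labels:
           app: keystone
       spec:
         containers:
           - name: keystone
             # Changing this tag changes the pod template, which triggers
             # a rolling replacement of the running pods.
             image: example.org/kolla/keystone:new-tag   # placeholder image
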
Reconfigure
-----------

The operator generates a new config and loads it into the Kubernetes configmap
by changing the configmap version in the service yaml file. Then, the operator
will trigger a rolling upgrade, which will scale down old pods and bring up new
ones that will run with the updated configuration files.

There is an open issue upstream in Kubernetes that plans to add support for
detecting whether a configmap used by a pod has changed [6]. Depending on what
the solution is, kolla-kubernetes may or may not use it. The rolling
upgrade feature will provide kolla-kubernetes with an elegant way to handle
restarting the services.

HA Architecture
---------------

Kubernetes uses health checks to bring up the services. Therefore,
kolla-kubernetes will use the same checks when monitoring whether a service is
healthy. When a service fails, the replication controller will be responsible
for bringing up a new container in its place [8][9].

However, Kubernetes does not cover all the HA corner cases, for instance
fencing. There are some practices known to operators that can be used to get
around this [10]. For example, to implement storage fencing, the operator can
use Ceph-backed storage [11][12]. This is an option that the community can
document in order to provide kolla-kubernetes with a production-ready solution
if Kubernetes cannot.

.. note:: There is a known issue in Kubernetes with releasing volumes from a
   node that disappeared from the cluster. This is expected to be fixed in the
   1.3 release [13].

Persistent Storage
------------------

Kubernetes has many types of persistent storage [14]. Since Kubernetes doesn't
guarantee that a pod will always be scheduled to the same host, node-based
persistent storage is impractical unless the community uses labels for every
pod.

Persistent storage in kolla-kubernetes will come from volumes backed by
different storage offerings. Kolla-kubernetes will provide a default solution
using Ceph RBD, which the community will use for multinode deployments. From
there, kolla-kubernetes can add additional persistent storage options as well
as options for the operator to reference an existing storage solution.

To deploy Ceph, the community will use the Ansible playbooks from Kolla to
deploy a containerized Ceph, at least for the 1.0 release. After the Kubernetes
deployment matures, the community can evaluate building its own Ceph deployment
solution.

Existing external Ceph deployments will require additional documentation
to describe how to integrate them with a Kubernetes deployment.

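
For illustration, a pod could consume a Ceph RBD image through the in-tree
``rbd`` volume type described in [11], roughly as sketched below. The monitor
address, pool, image, user, and secret names are all placeholders, and newer
Kubernetes releases favour the Ceph CSI driver over this in-tree volume type.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: mariadb                       # hypothetical name
   spec:
     containers:
       - name: mariadb
         image: example.org/kolla/mariadb:latest   # placeholder image
         volumeMounts:
           - name: mariadb-data
             mountPath: /var/lib/mysql
     volumes:
       - name: mariadb-data
         rbd:
           monitors:
             - 192.0.2.10:6789           # placeholder Ceph monitor address
           pool: rbd
           image: mariadb-data           # placeholder RBD image name
           user: admin
           secretRef:
             name: ceph-secret           # placeholder secret holding the key
           fsType: ext4
           readOnly: false

Because the data lives in Ceph rather than on the node, the pod can be
rescheduled to another host and reattach the same volume.
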
Service Roles
-------------

At the broadest level, OpenStack can be split up into two main roles,
Controller and Compute. With Kubernetes, the role definition layer changes.
Kolla-kubernetes will still need to define Compute nodes, but not Controller
nodes. Compute nodes hold the libvirt container and the running VMs. That
service cannot migrate because the VMs associated with it exist on the node.
However, the Controller role is more flexible. The Kubernetes layer provides IP
persistence so that APIs will remain active and abstracted from the operator's
view [15]. Kolla-kubernetes can direct Controller services away from the
Compute nodes using labels, while managing Compute services more strictly.

The Kubernetes Label field will be configurable to allow the operator to
define roles and direct where services will land.

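
One way to express such a role is a node label combined with a
``nodeSelector`` in the pod spec. The label key and value, the pod name, and
the image below are hypothetical examples, not names defined by this spec.

.. code-block:: yaml

   # Assumes the operator has labelled compute nodes beforehand, e.g.:
   #   kubectl label node <node-name> kolla_compute=true
   apiVersion: v1
   kind: Pod
   metadata:
     name: nova-libvirt                  # hypothetical name
   spec:
     # Only nodes carrying this label are eligible to run the pod.
     nodeSelector:
       kolla_compute: "true"
     containers:
       - name: nova-libvirt
         image: example.org/kolla/nova-libvirt:latest   # placeholder image
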
Security impact
---------------

Kolla-kubernetes will run the containers as non-root wherever possible.
SELinux or AppArmor will be in place to limit the damage from container
breakouts.

Kubernetes is planning to add capabilities at the pod level that will enable
the community to restrict container privileges even further [16].

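
As a sketch of what running as non-root can look like in a manifest, the
container-level ``securityContext`` below refuses to start as root and drops
all Linux capabilities. The pod name, image, and UID are placeholders, and the
image is assumed to work under an unprivileged user.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: keystone                      # hypothetical name
   spec:
     containers:
       - name: keystone
         image: example.org/kolla/keystone:latest   # placeholder image
         securityContext:
           runAsNonRoot: true            # reject the container if it would run as UID 0
           runAsUser: 1000               # hypothetical unprivileged UID
           privileged: false
           capabilities:
             drop:
               - ALL
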
Performance Impact
------------------

Since kolla-kubernetes is not using dependencies for the service deployment,
the services will take a different amount of time to start up in each
deployment, because the order in which the services become active will vary.
As such, it's hard to quantify the exact performance impact beyond expecting
it to be small.

Networking
----------

Kolla-kubernetes will initially use 'net=host' everywhere to get the project
going. As the project matures, the community needs to reevaluate which services
can run without 'net=host' in order to gain additional functionality.
For example, controller services will float between nodes, potentially landing
two of the same pods on the same node. Those pods will be listening on the same
ports in the host's network stack, which could prevent the pods from working.

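
In a pod spec, the Docker 'net=host' option corresponds to the ``hostNetwork``
field. The sketch below uses a placeholder pod name and image; port 9696 is
the neutron-server default.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: neutron-server                # hypothetical name
   spec:
     # Share the node's network namespace instead of getting a pod IP.
     hostNetwork: true
     containers:
       - name: neutron-server
         image: example.org/kolla/neutron-server:latest   # placeholder image
         ports:
           # With hostNetwork, this port is bound directly on the node, so
           # two copies of this pod on one node would collide.
           - containerPort: 9696
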
Logging & Monitoring
--------------------

To reuse Kolla's containers, kolla-kubernetes will use Elasticsearch, Heka, and
Kibana as the default logging mechanism.

The community will implement centralized logging by using a 'sidecar' container
in the Kubernetes pod [17]. The logging service will read the logs from the
shared volume of the running service and send the data to Elasticsearch. This
solution is ideal because volumes are shared among the containers in a pod.

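
The sidecar pattern can be sketched as two containers mounting the same
``emptyDir`` volume. The pod name, images, and log directory are placeholders,
and the sidecar's own configuration (where it ships the logs) is left out.

.. code-block:: yaml

   apiVersion: v1
   kind: Pod
   metadata:
     name: keystone-with-logging         # hypothetical name
   spec:
     containers:
       - name: keystone
         image: example.org/kolla/keystone:latest   # placeholder image
         volumeMounts:
           - name: logs
             mountPath: /var/log/kolla   # assumed log directory
       - name: heka
         image: example.org/kolla/heka:latest       # placeholder image
         volumeMounts:
           - name: logs
             mountPath: /var/log/kolla
             readOnly: true              # the sidecar only reads the logs
     volumes:
       # Scratch volume that lives as long as the pod and is visible to
       # both containers.
       - name: logs
         emptyDir: {}
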
Implementation
==============

Primary Assignee(s)
-------------------

Ryan Hallisey (rhallisey)

Other contributor(s):

- kolla-core team [18]
- Alex Polvi (polvi)
- Andrew Battye
- Brandon Jozsa (v1k0d3n)
- Britt Houser (britthouser)
- Davanum Srinivas (dims)
- David Wang (dcwangmit01)
- Egor Guz (eghobo)
- Greg Herlein (gherlein)
- Hui Kang (huikang)
- Ian Main (Slower)
- Jinay Vora (jvora)
- Keith Byrne (kbyrne)
- Ken Wronkiewicz (wirehead)
- Kevin Fox (kfox1111)
- Marga Millet (fragatina)
- Marian Schwarz
- Mark Casey (mark-casey)
- Mauricio Lima (mlima)
- Md Nadeem (mail2nadeem92)
- Michael Schmidt
- Michal Rostecki (mrostecki)
- Qiu Yu (unicell)
- Rajath Agasthya (rajathagasthya)
- Rob Mason
- Sean Mooney (sean-k-mooney)
- Serguei Bezverkhi (sbezverk)
- Sidharth Surana (ssurana)
- Zdenek Janda (xdeu)

<Please add your name here if you are getting involved in kolla-kubernetes>

Milestones
----------

Target Milestone for tech-preview code: Newton

Work Items
----------

1. Create kolla-kubernetes repo
2. Build yaml files for each service
3. Build a CLI to handle templated yaml files
4. Build an all-in-one environment
5. Drop net=host on a set of services
6. Write per-service health checks
7. Write startup docs
8. Add orchestration tools around the pods
9. All-in-one gating
10. Convert each service to a 'Deployment'
11. Build multinode environment
12. Config generation tools
13. Multinode docs
14. Implement reconfigure by templating configmaps
15. Centralized logging
16. Implement upgrades
17. Advanced deployment docs

<Please add new work items that are worth mentioning in the spec>

Testing
=======

Functional tests will be implemented in the OpenStack check/gating system to
automatically test that the Kubernetes deployment works for an all-in-one (AIO)
environment [19].

Documentation Impact
====================

Add a quick start guide that explains how to deploy kolla-kubernetes.
Add a developer guide that explains how to contribute and how the deployment
works.

References
==========

- [1] https://github.com/kubernetes/kubernetes/releases/tag/v1.2.0
- [2] http://kubernetes.io/docs/user-guide/node-selection/
- [3] http://kubernetes.io/v1.0/docs/user-guide/managing-deployments.html
- [4] https://cloud.google.com/container-engine/docs/replicationcontrollers/
- [5] https://github.com/kubernetes/kubernetes/blob/master/docs/design/configmap.md
- [6] https://github.com/kubernetes/kubernetes/issues/24957
- [7] http://kubernetes.io/docs/user-guide/jobs/
- [8] http://kubernetes.io/docs/user-guide/replication-controller/
- [9] http://kubernetes.io/docs/user-guide/replicasets/
- [10] http://kubernetes.io/docs/admin/high-availability/#master-elected-components
- [11] http://kubernetes.io/docs/user-guide/volumes/#rbd
- [12] http://docs.ceph.com/docs/master/cephfs/eviction/
- [13] https://github.com/kubernetes/kubernetes/issues/20262
- [14] http://kubernetes.io/docs/user-guide/volumes/
- [15] http://kubernetes.io/docs/user-guide/node-selection/
- [16] https://github.com/kubernetes/kubernetes/blob/master/docs/proposals/pod-security-context.md
- [17] http://blog.kubernetes.io/2015/06/the-distributed-system-toolkit-patterns.html
- [18] https://review.openstack.org/#/admin/groups/460,members
- [19] https://etherpad.openstack.org/p/kolla-newton-summit-kolla-gate-walkthru
- https://github.com/kubernetes/kubernetes