=============================
Kolla-kubernetes Architecture
=============================
OpenStack deployment requires multiple sequential steps to occur in a
workflow-like fashion. This puts the onus on Kubernetes to handle dependency
management and task ordering for each OpenStack service, which it currently
does not support. Using the kolla-kubernetes fencing pod [1] and etcd-operator
[2] as examples, the community will evaluate adding Kubernetes Operators [3]
as a means of handling the deployment of OpenStack.

In addition, there will be an effort to use the Kubernetes native package
manager, the Helm project [4]. Kolla-kubernetes will evaluate using Helm
charts as a means of making each OpenStack service available in the native
Kubernetes ecosystem and increasing the consumability of Kolla containers in
real world deployments and across communities.

Problem description
===================
In order to execute the OpenStack day 1 and day 2 workflows (deploy, upgrade,
and reconfigure), the community needs a way to run a series of tasks in a
certain order. Kubernetes does not natively support workflow operations.

OpenStack as a whole is highly customizable and flexible. A deployment tool
needs to be equally customizable and flexible in order for a user to get the
most benefit from it.

Proposed change
===============
1. The kolla-kubernetes community will write Kubernetes Operators in
   **Python** as part of the 1.0 release and revisit writing them in **Go** at
   a later date. The reason is that Go lacks the ecosystem required by the TC
   for it to be used as part of OpenStack, specifically around tooling and
   code distribution.
2. The kolla-kubernetes community will write Helm charts for each of the
   OpenStack services.
3. The kolla-kubernetes community will implement Dependency Init Containers.

Dependencies
------------

- Helm >= v2.0
- Kubernetes >= 1.4

Dependency Init Container
=========================
The dependency init container is a Kubernetes Init Container object. The init
container is the first container run in a pod, and its job is to check whether
it is safe for the service to run. This layer is well suited to solving
low-level dependency issues for which an operator would be overkill.

A good example of this is openvswitch needing to wait until ovs-db is running
in order to proceed.

Dependency List for the Keystone Pod
------------------------------------
The init container for the keystone pod blocks until these dependencies are
met:

- MariaDB pod is ready
- Keystone bootstrap job is done
- Keystone configmap is present

After these conditions are met, the init container is marked as ready and the
keystone pod can be deployed.

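As a sketch of how such an init container could gate startup, the hedged
Python example below polls a set of dependency checks until all of them pass.
The check functions are hypothetical stand-ins; a real init container would
query the Kubernetes API for pod readiness, job completion, and configmap
presence.

```python
import time

# Hypothetical stand-ins: a real init container would query the Kubernetes
# API for pod readiness, job completion, and configmap presence.
def mariadb_pod_ready():
    return True

def keystone_bootstrap_job_done():
    return True

def keystone_configmap_present():
    return True

def wait_for_dependencies(checks, interval=5, max_retries=60):
    """Poll every dependency check until all pass, then return True."""
    unmet = list(checks)
    for _ in range(max_retries):
        unmet = [name for name, check in checks.items() if not check()]
        if not unmet:
            return True
        time.sleep(interval)
    raise TimeoutError("dependencies not met: %s" % ", ".join(unmet))

keystone_checks = {
    "mariadb-pod-ready": mariadb_pod_ready,
    "keystone-bootstrap-job-done": keystone_bootstrap_job_done,
    "keystone-configmap-present": keystone_configmap_present,
}
```

Once ``wait_for_dependencies(keystone_checks)`` returns, the init container
exits successfully and Kubernetes starts the keystone container.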
Kubernetes Operator
===================
*Kubernetes Controller* - A Kubernetes Controller is a piece of code that
manages the lifecycle of a complex application [5].

*Kubernetes Operator* - A Kubernetes Operator is defined as a containerized
Kubernetes Controller that uses the Kubernetes ThirdPartyResource [6]. The
Kubernetes Operator is meant to be directed by input, configuration, and
actions from the user.

The Kubernetes Operator for deploying Keystone will perform the following
tasks:

1. Check that MariaDB exists
2. Check whether Keystone already exists
3. Read the Keystone configuration file from the Kubernetes ThirdPartyResource
   and register it as a configmap
4. Perform any user directed actions from the Kubernetes ThirdPartyResource
5. Use Helm to run a Kubernetes Job to create the Keystone database
6. Use Helm to run a Kubernetes Job to create the users and roles
7. Execute any additional startup tasks to bring the service to a ready state
8. Use Helm to run Kubernetes pods for Keystone
9. The Dependency Init Container determines it is safe to run Keystone
10. The Keystone pod starts

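A minimal sketch of the steps above, using a hypothetical in-memory stand-in
for cluster state; a real operator would query the Kubernetes API and invoke
Helm for each step rather than mutate a dictionary.

```python
# Hypothetical in-memory stand-in for cluster state; a real operator would
# query the Kubernetes API and shell out to Helm for each step.
def reconcile_keystone(cluster, tpr_config):
    """Walk through the deployment steps for Keystone."""
    # 1. Check that MariaDB exists
    if "mariadb" not in cluster["services"]:
        raise RuntimeError("mariadb must be deployed first")
    # 2. Check whether Keystone already exists; nothing to do if so
    if "keystone" in cluster["services"]:
        return cluster
    # 3. Register the ThirdPartyResource configuration as a configmap
    cluster["configmaps"]["keystone"] = tpr_config
    # 5./6. Helm-run Kubernetes Jobs: create the database, then users/roles
    cluster["jobs"] += ["keystone-create-db", "keystone-create-users"]
    # 8. Helm-run the Keystone pods; the Dependency Init Container then
    #    gates startup (steps 9 and 10)
    cluster["pods"].append("keystone")
    cluster["services"].add("keystone")
    return cluster

cluster = {"services": {"mariadb"}, "configmaps": {}, "jobs": [], "pods": []}
reconcile_keystone(cluster, {"keystone.conf": "[DEFAULT]"})
```

Running the reconcile a second time is a no-op because step 2 detects that
Keystone already exists, which is the idempotent behavior an operator needs.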
User perspective of a Kubernetes Operator
------------------------------------------
OpenStack lifecycle management is a workflow, and Kubernetes Operators are the
implementation of it. However, using a Kubernetes Operator needs to be
optional because a user may want to handle all the necessary lifecycle steps
by hand. Therefore, the Kubernetes Operator needs to be detached from the
services themselves so that it can be flexible.

Code Outline
------------
The Controller code will live in the kolla-kubernetes repo. The Kubernetes
Operator container(s) will be added as kolla image(s) and exist in the
kolla-kubernetes repo.

Layering
========
The kolla-kubernetes project is broken down into multiple consumable building
blocks. The building blocks are referred to as **layers**.

**layer** - A layer in kolla-kubernetes has the following properties:

1. Optional - it can be skipped or turned off by an operator
2. Performs a specific task
3. All layers can be used for deployment

The diagram below is a representation of option 1 outlined in the
Kolla-kubernetes Design section.

As an example, a user needs the ability to choose the layer that best fits
their needs and preferences. That could mean skipping all upper layers and
using layer 2.1 for deployment, but using layer 4 for upgrades. Also, a user
could jump in at layer 5 for deployment, then use layer 3 for upgrades or
other lifecycle operations.

_____________________
| OpenStack |
| Kubernetes Operator | layer 5 - The Kubernetes Operator that will have
|_________ __________| oversight of all the other Kubernetes
|| Operators
_________\/__________
| Service |
| Kubernetes Operator | layer 4 - Executes Helm packages for a service.
|_________ __________|
||
_________\/__________
| |
| Helm Package | layer 3 - Organizes the Kubernetes resources for each
|_________ __________| pod
||
_________\/__________
| |
| Kubernetes resource | layer 2 - Deploys pods/deployments/replication-
| _________________ | controllers/daemon-sets for an OpenStack
| | Dependency Init | | service.
| | Container | |layer 2.1- The Dependency Init Container checks to be
| |_________________| | sure dependencies are met before starting
|_________ __________| the service.
||
_________\/__________
| |
| OpenStack Service | layer 1 - Kolla container
|_____________________|

5. OpenStack Kubernetes Operator (optional)
4. Service Kubernetes Operator (optional)
3. Helm package (optional)
2. Kubernetes service pod
2.1 Dependency Init Container (optional)
1. OpenStack service

Kolla-kubernetes Design
=======================
The goal of this section is to provide the best kolla-kubernetes design so
that OpenStack lifecycle management will abide by the following principles:

- Granular - Each resource is a well defined building block with a purpose
- Consumable - Any user or project is able to easily use kolla-kubernetes
- Flexible - The project is capable of being used in different ways
- Customizable - The project is capable of adopting new use cases
- Debuggable - It is clear when a resource is misbehaving and why

There were six models considered for how the community will deploy OpenStack
on Kubernetes. Option 1 is highlighted below as the option the community found
the most appealing:

**Kolla-kubernetes will be using option 1 for its design model**

Option 1:

There are multiple Kubernetes Operators: one Kubernetes Operator per
**service** (nova, keystone, neutron). There is a single OpenStack Kubernetes
Operator that orchestrates each **service** Kubernetes Operator. The
Dependency Init Container exists as a Kubernetes Init object within a
**service** pod. The Dependency Init Container sits at the pod level to handle
simpler dependency resolution that is not required by Kubernetes Operators.

- This option provides a good amount of granularity in that there is a clear
  separation between the different layers of abstraction. *Every layer is*
  *optional and can be used for deployment*.


Other Options That Were Considered
----------------------------------
Option 2:

There are multiple Kubernetes Operators: one Kubernetes Operator per
**service** (nova, keystone, neutron). There is a single OpenStack Kubernetes
Operator that orchestrates each **service** Kubernetes Operator.

- This option provides a good amount of granularity in that there is a clear
  separation between the different layers of abstraction. This could have
  more layers, or it might be the right amount of abstraction a user is
  looking for.

Option 3:

There is a single OpenStack Kubernetes Operator that handles the deployment of
OpenStack.

- Though the simplicity is appealing here, this option is concerning because
  it does not reflect the complexity, consumability, or granularity OpenStack
  lifecycle management requires.

Option 4:

There are multiple Kubernetes Operators: one Kubernetes Operator per
**micro service** (nova-api, nova-conductor, nova-scheduler). Above that,
there is a Kubernetes Operator per **service** that controls the micro service
Kubernetes Operators. At the top level, there is a single OpenStack Kubernetes
Operator that orchestrates each **service** Kubernetes Operator.

- This option provides a high amount of granularity, with every service being
  thinned out to the microservice level. This option would be the most costly
  in terms of code written, since every service will have multiple Kubernetes
  Operators. Also, the additional layering must be accounted for as it becomes
  more granular. More layers will add flexibility for the user.

Option 5:

There are multiple Kubernetes Operators: one Kubernetes Operator per
**role** (compute, controller, networking, monitoring). There is a single
OpenStack Kubernetes Operator that orchestrates each **role** Kubernetes
Operator.

- This option provides an average amount of granularity. The roles would
  have to be completely customizable and exposed to the Kubernetes Operator.
  There is no defined layering to handle an individual lifecycle task for a
  specific service of a role in this model.

Option 6:

There is no OpenStack Kubernetes Operator. There are only Kubernetes Operators
for MariaDB, RabbitMQ, and Memcached, and to handle operations like backups or
disaster recovery. Entrypoints are used for workflow management.

- This is a simpler approach, which pushes the orchestration layer down to
  the pod level.
- There is a concern here about going too far to achieve maximum usability.
  In Kolla's past, this has been shown to strip away the complexity and
  flexibility that are both required for an OpenStack deployment.
- OpenStack is a complex application where day 2 operations demand a lot of
  care. If we handle complex operations at layer 1 (in the diagram), the
  containers will each have to carry all the logic required to perform all
  operations at run time. Therefore, a user can easily run into a situation
  where the cluster is expected to do something, but the code underneath does
  something else entirely.
- Debugging is a huge pain at layer 1 because it is a challenge to know where
  the workflow failed and how it failed, with the logic scattered across all
  the containers.
- Visibility and consumability are difficult because, as Kolla learned,
  deploying logic from the client side is more effective than having logic
  run server side. Operators are client side controllers, while entrypoints
  run in the containers (server side). Kolla has shown that client side logic
  is far more effective for deploying a complex tool like OpenStack because a
  user has more control over what is happening.

Workflow Example
================
As an example, if a user had a cluster with Keystone and MariaDB already
running and wanted to run Glance, this is what would occur at each layer:

User - The user creates a custom config file for glance-api and saves it in
       /etc/kolla/glance-api/config.json.
       Next, the user runs the OpenStack Kubernetes Operator and inputs the
       customized glance-api config file as part of the Kubernetes
       ThirdPartyResource. Glance-registry will use the default config file
       in this example.

Layer 5 - The OpenStack Kubernetes Operator will look at the config files and
          merge any changes, overwriting the existing config files. Then, the
          Kubernetes Operator looks at what the user requested for deployment.
          The Kubernetes Operator checks whether the cluster has deployed the
          service, then resolves the dependencies required for the service to
          be deployed.
          The OpenStack Kubernetes Operator spawns the Glance Kubernetes
          Operator.

Layer 4 - The Glance Kubernetes Operator will gather the config data placed
          in its Kubernetes ThirdPartyResource and create ConfigMaps for the
          config files.
          The Glance Kubernetes Operator will use Helm to start a Kubernetes
          Job which creates the Glance DB user and password. It will also use
          Helm to run the Kubernetes Job that creates the Keystone user and
          password.
          Finally, the Glance Kubernetes Operator will run the Glance
          services using Helm.

Layer 3 - The Glance Helm package will run the glance-api and glance-registry
          pods based on the Glance kolla-kubernetes templates.

Layer 2 - The Kubernetes pods run glance-api and glance-registry as
          Deployments. The templates will map the configmaps to the location
          where Kolla expects config files to appear.

        - The Dependency Init Container checks to be sure glance-api and
          glance-registry are safe to run. Once they are, the services start.

Layer 1 - The Kolla containers run. They each pick up the mounted config
          files and run their service.

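The config merge performed at Layer 5 can be sketched as a recursive overlay
of user supplied values on top of the default configuration. This is a hedged
illustration with hypothetical glance-api settings; the actual merge semantics
in kolla-kubernetes may differ.

```python
def merge_config(defaults, overrides):
    """Recursively overlay user overrides on top of the default config."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged

# Hypothetical glance-api settings: the user override wins per key, while
# untouched defaults are preserved.
defaults = {"DEFAULT": {"bind_host": "0.0.0.0", "workers": 1}}
user_config = {"DEFAULT": {"workers": 4}}
merged = merge_config(defaults, user_config)
```

Here ``merged`` keeps the default ``bind_host`` while the user's ``workers``
value overrides the default, which matches the "merge any changes, overwriting
the existing config files" behavior described above.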

Helm
====
Helm inserts kolla-kubernetes into the Kubernetes app distribution system.
That way, a Kubernetes Operator can search for and consume the pieces of
OpenStack as building blocks to assemble a real world deployment. The user
experience is better using the Kubernetes native distribution system. In
addition, Helm provides a templating engine to ensure the templates are
flexible and easy to version.

OpenStack and Kubernetes are different communities. In order to grow the
interop between the two, it makes sense for OpenStack to be distributed using
the Kubernetes native package manager.

Code Outline
------------
There will be a Helm chart for each OpenStack service [7]. The Helm charts
will premiere in the kolla-kubernetes repo. After reaching some stability, the
community can decide to publish the charts to the incubation directory in the
Kubernetes charts repo [8].

Implementation
==============
Community discussion etherpads [9][10].

Primary Assignee(s)
-------------------
Ryan Hallisey (rhallisey)
Steven Dake (sdake)
Kevin Fox (kfox1111)
Pete Birley (portdirect)
Michal Jastrzebski (inc0)
Mark Giles (mgiles)
Takashi Sogabe (sogabe)
Steve Wilkerson (srwilkers)
Duong Ha-Quang (duonghq)
Serguei Bezverkhi (sbezverk)
Surya Prakash Singh (sp_)
kolla-kubernetes team
< add your name here >

Work Items
----------
1. Write Kubernetes Operators required to run OpenStack
2. Write scripts that will execute the Kubernetes Operator
3. Write Helm charts for OpenStack services
4. Adjust the CLI to work with Kubernetes Operators and Helm
<Please add new work items that are worth mentioning in the spec>

Documentation Impact
====================
< more docs >

References
==========
- [1] - https://review.openstack.org/#/c/383922/
- [2] - https://github.com/coreos/etcd-operator
- [3] - https://coreos.com/blog/introducing-operators.html
- [4] - https://github.com/kubernetes/helm
- [5] - https://coreos.com/blog/introducing-the-etcd-operator.html
- [6] - https://github.com/coreos/etcd-operator/blob/master/doc/design/arch.png
- [7] - https://github.com/sapcc/openstack-helm
- [8] - https://github.com/kubernetes/charts/tree/master/incubator
- [9] - https://etherpad.openstack.org/p/161115-kolla-kubernetes-cn-discussion
- [10] - https://etherpad.openstack.org/p/operator-base-class