Spec: Deploy Kolla images using Mesos
Co-Authored-By: Georgy Okrokvertskhov <gokrokvertskhov@mirantis.com>
related blueprint mesos
Change-Id: I0bc2d9e69819079b8cdaf9a47c5363a3017dced9
parent
e9c47f3d64
commit
c019854a2b
188
specs/mesos-deployment.rst
Normal file
@@ -0,0 +1,188 @@
==============================
Deploy Kolla images with Mesos
==============================

https://blueprints.launchpad.net/kolla/+spec/mesos

Kolla deploys its containers using Ansible, but this is just one way
to deploy them. For example, TripleO deploys Kolla containers using
Heat in-guest agents.

This specification defines the support for deploying Kolla containers
using Mesos and Marathon.

What is Mesos?

According to http://mesos.apache.org/, Mesos "provides efficient resource
isolation and sharing across distributed applications, or frameworks".
The software enables resource sharing in a fine-grained manner,
improving cluster utilization.

What is Marathon?

According to https://mesosphere.github.io/marathon/, Marathon is
"A cluster-wide init and control system for services in cgroups or
Docker containers".

Adding Mesos/Marathon support to Kolla will enable those interested in
deploying OpenStack with Mesos to contribute to the Kolla community
in a more direct way.

Problem description
===================

The current Ansible deployment is somewhat serial: some services
depend on others, and the deployment is driven by a user from the
command line. In addition to deployment, Mesos/Marathon provides the
following features that will eventually be used:

- life-cycle management: service monitoring, restart, scaling, and
  rolling restarts/upgrades
- constraints [1]: the Marathon scheduler will be used to place
  containers more effectively (especially during scaling/recovery)
- integration with core infrastructure services like DNS, Load
  Balancing, Service Discovery and Service components.

In order to reuse a large amount of functionality, it would be best
to use an existing framework that provides a proven, stable, and
mature solution.
Given that Mesos/Marathon is used and tested at scale by many large
companies, it will give operators the confidence to adopt
OpenStack to meet their scaling requirements.

Marathon [2] will be used to manage the containers. Marathon is a
framework that runs on top of Mesos and is intended for long-running
services.
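
As a rough illustration of what handing a container to Marathon could look like, the sketch below builds an app definition with a placement constraint and prepares (without sending) the REST request that would create it. The app id, image name, Marathon URL, and helper names are placeholders invented for this example, not part of kolla-mesos; the ``hostname``/``UNIQUE`` constraint and host networking are just one plausible choice.

```python
import json
import urllib.request


def build_app_definition(app_id, image, cpus=0.5, mem=512, instances=1):
    """Return a Marathon app definition for one Docker container.

    Hypothetical helper: the resource values are arbitrary defaults.
    """
    return {
        "id": app_id,
        "cpus": cpus,
        "mem": mem,
        "instances": instances,
        "container": {
            "type": "DOCKER",
            # Host networking is assumed here for illustration only.
            "docker": {"image": image, "network": "HOST"},
        },
        # Marathon constraint: at most one instance of this app per host.
        "constraints": [["hostname", "UNIQUE"]],
    }


def build_deploy_request(marathon_url, app):
    """Build, but do not send, the POST /v2/apps request for the app."""
    return urllib.request.Request(
        url=marathon_url.rstrip("/") + "/v2/apps",
        data=json.dumps(app).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder id, image, and endpoint; none of these come from the spec.
app = build_app_definition("/openstack/keystone", "example/keystone:latest")
request = build_deploy_request("http://marathon.example.com:8080", app)
```

Passing ``request`` to ``urllib.request.urlopen`` would submit it to a running Marathon instance; the sketch stops short of that so it can be read and run without a cluster.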

Part of this change is to start all the containers at the same time
(in parallel) so that there are as few dependencies as possible from
the deployment tool’s point of view. This should enable several things:

- faster initial deployment
- fewer unnecessary restarts during upgrades
- more self-sufficient containers

Proposed change
===============

- Add a deployment-specific git repo (kolla-mesos) to contain the
  Mesos/Marathon-specific deployment code and bootstrapping.
- Enhance the Kolla container API (config.json) to permit loading
  a custom startup script while maintaining immutability with copy_once.
- Implement a basic all-in-one (AIO) OpenStack deployment.
- Implement a separate controller/compute setup similar to the Ansible one.
- Add docs throughout to assist users and contributors/reviewers.

Bootstrapping
-------------

At first, Mesos/Marathon/ZooKeeper bootstrapping will be done by
setting up Docker containers. Later, bootstrapping will be handled by
Ironic/PXE (the aim is to be practical and do what is easiest for the AIO).

Dependency management
---------------------

Instead of serialising the dependent steps, each container is
launched right away but only starts its service once its requirements
are fulfilled.

These dependencies will come in the form of:

- service discovery (service X needs service Y running).
  Note that Marathon DNS and LB can be self-configured based on service
  registry information.
  To achieve this, the container also needs to register itself once
  it has started.
- checking whether service configuration is complete
  (does Keystone have the required service user, is the DB
  schema complete, etc.).
  Use ZooKeeper to watch for these configuration steps.
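
The gating described above could be sketched roughly as follows. This is a minimal illustration rather than kolla-mesos code: the registry paths, function names, and parameters are all assumptions made for the example. Polling stands in for a ZooKeeper watch to keep the sketch self-contained, and ``exists`` and ``register`` are plain callables, so an in-memory set replaces ZooKeeper here.

```python
import time


def missing_dependencies(exists, required_paths):
    """Return the required registry paths that are not present yet."""
    return [path for path in required_paths if not exists(path)]


def wait_for_dependencies(exists, required_paths, timeout=60.0,
                          poll_interval=1.0, sleep=time.sleep,
                          clock=time.monotonic):
    """Poll until every required path exists; return False on timeout."""
    deadline = clock() + timeout
    while True:
        if not missing_dependencies(exists, required_paths):
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)


def start_service(exists, register, own_path, required_paths, launch,
                  **wait_kwargs):
    """Launch a service only once its dependencies are registered.

    After a successful launch the service registers its own path so
    that services depending on it can proceed.
    """
    if not wait_for_dependencies(exists, required_paths, **wait_kwargs):
        return False
    launch()
    register(own_path)
    return True


# In-memory stand-in for the service registry (hypothetical paths).
registry = {"/kolla/services/mariadb"}
launched = []

started = start_service(
    exists=registry.__contains__,
    register=registry.add,
    own_path="/kolla/services/keystone",
    required_paths=["/kolla/services/mariadb"],
    launch=lambda: launched.append("keystone"),
)

blocked = start_service(
    exists=registry.__contains__,
    register=registry.add,
    own_path="/kolla/services/nova-api",
    required_paths=["/kolla/services/glance-api"],
    launch=lambda: launched.append("nova-api"),
    timeout=0,
)
```

With a real ZooKeeper client such as kazoo, ``exists`` could plausibly be backed by ``KazooClient.exists`` and the polling loop replaced by a watch; that wiring is left out because the spec does not fix those details.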

One time tasks
--------------

Ansible runs a number of scripts to set up the database, Keystone, etc.
These can be run as a Mesos Executor (a command line run in the
container of choice).
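
Since containers can be relaunched, a one-time task presumably needs some record that it has already run. The spec does not say how this would be tracked, so the following is only a sketch under that assumption: ``run_once``, the task name, and the in-memory ``completed`` set are hypothetical, and a real deployment would need a shared store with an atomic create to avoid races between hosts.

```python
def run_once(completed, task_name, task):
    """Run task unless task_name is already recorded as completed.

    Returns True if the task ran, False if it was skipped.
    """
    if task_name in completed:
        return False
    task()
    # Record completion only after the task finished without raising.
    completed.add(task_name)
    return True


# In-memory stand-in for a shared completion record (hypothetical).
completed = set()
calls = []

first = run_once(completed, "keystone-db-sync", lambda: calls.append("db-sync"))
second = run_once(completed, "keystone-db-sync", lambda: calls.append("db-sync"))
```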

Security impact
---------------

Mesos and Marathon are mature products used by various companies in
production. The central configuration storage will require careful
security risk assessment. The deployed OpenStack’s security should not
be affected by the deployment tool.

Performance Impact
------------------

Given that the Mesos slaves are distributed and all containers will be
started in parallel, the deployment *may* be faster, though this is
not the main focus.

Alternatives
------------

Kubernetes was evaluated by the Kolla team 6 months ago and found not
to work at that time, as it did not support the net=host and pid=host
features of Docker. It has since gained these features, so if
Mesos/Marathon fails to produce results, going back to Kubernetes
is an option. However, at the time of writing, Mesos/Marathon was
deemed to be more mature and stable.

Implementation
==============

Primary Assignee(s)
-------------------

- Angus Salkeld (asalkeld)
- Kirill Proskurin (kproskurin)
- Michal Rostecki (nihilifer)

Other contributor(s):

- Harm Weites (harmw)
- Jeff Peeler (jpeeler)
- Michal Jastrzebski (inc0)
- Sam Yaple (SamYaple)
- Steven Dake (sdake)
- <Please add your name here if you are getting involved in kolla-mesos>

Milestones
----------

Target Milestone for completion:
mitaka

Work Items
----------

1. Allow a custom startup script to run (change in Kolla).
2. Add startup scripts to kolla-mesos to read config from ZooKeeper
   instead of a bind-mounted directory. Propose oslo.config changes to
   use this method (oslo work done in parallel; initially this will be
   done in the startup script).
3. Add startup scripts for service discovery so that services only
   start once their needs are fulfilled.

   a. Register a service once it is running.
   b. Wait for the services it depends on, if any, before starting
      a service.
   c. Self-configure DNS and LB based on service registry information.

4. Add bootstrapping code to install Marathon, ZooKeeper,
   Mesos master and slave.
5. Add calls to Marathon to deploy containers.
6. Add support for kolla-mesos to kolla-cli.
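
The startup-script step in work item 2 (reading config from a central store rather than a bind-mounted directory) might look something like the sketch below. It is an illustration under assumptions: ``write_config_files``, the store paths, and the destination layout are invented for the example, and ``fetch`` is a plain callable, so a dict stands in for ZooKeeper.

```python
import os
import tempfile


def write_config_files(fetch, file_map):
    """Write config files whose contents come from a key-value store.

    fetch(store_path) must return the file contents as bytes.
    file_map maps each store path to its destination file path.
    Returns the destination paths that were written.
    """
    written = []
    for store_path, dest in sorted(file_map.items()):
        # Create the parent directory, as a bind mount would have provided it.
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with open(dest, "wb") as handle:
            handle.write(fetch(store_path))
        written.append(dest)
    return written


# Dict standing in for the central store (hypothetical path and contents).
store = {"/kolla/config/keystone/keystone.conf": b"[DEFAULT]\ndebug = true\n"}
dest_root = tempfile.mkdtemp()
dest = os.path.join(dest_root, "etc", "keystone", "keystone.conf")

written = write_config_files(
    store.__getitem__,
    {"/kolla/config/keystone/keystone.conf": dest},
)
```

With kazoo, ``fetch`` could plausibly be ``lambda path: client.get(path)[0]``, since ``KazooClient.get`` returns a ``(data, stat)`` pair; the spec itself does not prescribe a client library.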

Testing
=======

Functional tests will be implemented in the OpenStack check/gating system to
automatically check that the Mesos/Marathon deployment works for an AIO environment.

Documentation Impact
====================

A quick start guide will be written to explain how to deploy.
A developer guide will be written on how to contribute and how the deployment works.

References
==========

- [1] https://mesosphere.github.io/marathon/docs/constraints.html
- [2] https://mesosphere.github.io/marathon/
- http://radar.oreilly.com/2015/10/swarm-v-fleet-v-kubernetes-v-mesos.html
- https://www.wehkamplabs.com/blog/2015/10/15/applying-consul-within-the-blaze-microservices-platform/