diff --git a/specs/mesos-deployment.rst b/specs/mesos-deployment.rst
new file mode 100644
index 0000000000..741758b674
--- /dev/null
+++ b/specs/mesos-deployment.rst
@@ -0,0 +1,188 @@
+==============================
+Deploy Kolla images with Mesos
+==============================
+
+https://blueprints.launchpad.net/kolla/+spec/mesos
+
+Kolla deploys the containers using Ansible; however, this is just one
+way to deploy the containers. For example, TripleO deploys Kolla
+containers using Heat in-guest agents.
+
+This specification defines the support for deploying Kolla containers
+using Mesos and Marathon.
+
+What is Mesos?
+From (http://mesos.apache.org/) Mesos "provides efficient resource
+isolation and sharing across distributed applications, or frameworks".
+The software enables resource sharing in a fine-grained manner,
+improving cluster utilization.
+
+What is Marathon?
+From (https://mesosphere.github.io/marathon/):
+"A cluster-wide init and control system for services in cgroups or
+Docker containers".
+
+Adding Mesos/Marathon support to Kolla will enable those interested in
+deploying OpenStack with Mesos to contribute to the Kolla community
+in a more direct way.
+
+Problem description
+===================
+
+The current deployment (Ansible) is done somewhat serially, meaning
+that some services depend on others, and the deployment is controlled
+by the command line (a user). In addition to deployment, Mesos/Marathon
+provides the following features that will eventually be used:
+
+- life-cycle management: service monitoring, restart, scaling
+  and rolling restarts/upgrades
+- constraints [1]: the Marathon scheduler will be used to more
+  effectively place containers (esp. during scaling/recovery)
+- integration with core infrastructure services like DNS, Load
+  Balancing, Service Discovery and Service components.
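As an illustration of the constraints feature listed above, the following is a minimal sketch of a Marathon app definition that asks the scheduler to place at most one instance per host. The helper ``marathon_app``, the app id and the image name are hypothetical and not part of this spec; the ``constraints`` field and the ``UNIQUE`` operator are described in [1].

```python
import json


def marathon_app(app_id, image, instances=1, cpus=0.5, mem=512, constraints=None):
    """Build a Marathon app definition (a plain dict) for a Docker container.

    Hypothetical helper for illustration only; it does not talk to Marathon.
    """
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem,
        "container": {
            "type": "DOCKER",
            # Host networking, as the Kolla containers expect net=host.
            "docker": {"image": image, "network": "HOST"},
        },
        # Each constraint is [field, operator] or [field, operator, value].
        "constraints": constraints or [],
    }


# Hypothetical example: three instances, at most one per host.
app = marathon_app(
    "/openstack/keystone",   # hypothetical app id
    "example/keystone",      # hypothetical image name
    instances=3,
    constraints=[["hostname", "UNIQUE"]],
)
print(json.dumps(app, indent=2))
```

The resulting JSON document is the kind of payload that would be submitted to Marathon to create the app; the actual submission is left out of the sketch.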
+
+In order to reuse a large amount of functionality, it would be best
+to use an existing framework that provides a proven, stable and
+mature solution.
+Given that Mesos/Marathon is used and tested at scale by many large
+companies, it will give operators the confidence to adopt
+OpenStack to meet any scaling requirements they have.
+
+Marathon [2], a framework that runs on top of Mesos for long-running
+services, will be used to manage the containers.
+
+Part of this change is to start all the containers at the same time
+(in parallel) so that there are as few dependencies as possible from the
+deployment tool’s point of view. This should enable a couple of things:
+
+- faster initial deployment
+- fewer unnecessary restarts during upgrades
+- more self-sufficient containers
+
+Proposed change
+===============
+
+- Add a deployment-specific git repo (kolla-mesos) to contain the
+  Mesos/Marathon-specific deployment code and bootstrapping.
+- Enhance the Kolla container API (config.json) to permit loading
+  of a custom startup script while maintaining immutability with copy_once.
+- Implement an all-in-one (AIO) basic OpenStack deployment.
+- Implement a separate controller/compute setup similar to the Ansible one.
+- Add docs throughout to assist users and contributors/reviewers.
+
+Bootstrapping
+-------------
+
+At first, Mesos/Marathon/Zookeeper bootstrapping will be done by
+setting up Docker containers. Later, bootstrapping will be handled by Ironic/PXE
+(the aim is to be practical and do what is easiest for the AIO).
+
+Dependency management
+---------------------
+
+Instead of serialising the dependent steps, each container is
+started and only actually starts the service if the requirements are
+fulfilled.
+
+These dependencies will come in the form of:
+
+- service discovery (service X needs service Y running)
+  Note that Marathon DNS and LB can be self-configured based on service
+  registry information.
+  To achieve this, the container also needs to register itself once
+  it has started.
+- checking to see if service configuration is complete
+  (has Keystone got the service user that is required, is the DB
+  schema complete, etc.).
+  Use Zookeeper to watch for these configuration steps.
+
+One-time tasks
+--------------
+Ansible runs a number of scripts to set up the database, Keystone, etc.
+These can be run as a Mesos Executor (command line run in the
+container of choice).
+
+Security impact
+---------------
+
+Mesos and Marathon are mature products used by various companies in
+production. The central configuration storage will require careful
+security risk assessment. The deployed OpenStack’s security should not
+be affected by the deployment tool.
+
+Performance Impact
+------------------
+
+Given that the Mesos slaves are distributed and all containers will be
+started in parallel, the deployment *may* be faster, though this is
+not the main focus.
+
+Alternatives
+------------
+
+Kubernetes was evaluated by the Kolla team 6 months ago and found not
+to work at that time, as it did not support the net=host and pid=host
+features of Docker. It has since gained these features, so if
+Mesos/Marathon fails to produce results, going back to Kubernetes
+is an option. However, at the time of writing, Mesos/Marathon was
+deemed to be more mature and stable.
+
+Implementation
+==============
+
+Primary Assignee(s)
+-------------------
+  Angus Salkeld (asalkeld)
+  Kirill Proskurin (kproskurin)
+  Michal Rostecki (nihilifer)
+
+Other contributor(s):
+  Harm Weites (harmw)
+  Jeff Peeler (jpeeler)
+  Michal Jastrzebski (inc0)
+  Sam Yaple (SamYaple)
+  Steven Dake (sdake)
+
+
+Milestones
+----------
+
+Target Milestone for completion:
+  mitaka
+
+Work Items
+----------
+1. Allow a custom startup script to run (change in Kolla)
+2. Add startup scripts to kolla-mesos to read config from Zookeeper
+   instead of a bind-mounted directory.
+   Propose oslo.config changes to use this method (oslo work done in
+   parallel; initially this will be done in the startup script).
+3. Add startup scripts for service discovery so that services only
+   start once their needs are fulfilled.
+   a. register a service once a service is running
+   b. wait for dependent services if they are needed before starting
+      a service.
+   c. DNS and LB self-configuration based on service registry information
+4. Add bootstrapping code to install Marathon, Zookeeper,
+   Mesos master and slave.
+5. Add calls to Marathon to deploy containers.
+6. Add support for kolla-mesos to kolla-cli.
+
+Testing
+=======
+
+Functional tests will be implemented in the OpenStack check/gating system to
+automatically check that the Mesos/Marathon deployment works for an AIO environment.
+
+Documentation Impact
+====================
+A quick start guide will be written to explain how to deploy.
+A developer guide will be written on how to contribute and how the deployment works.
+
+References
+==========
+
+- [1] https://mesosphere.github.io/marathon/docs/constraints.html
+- [2] https://mesosphere.github.io/marathon/
+- http://radar.oreilly.com/2015/10/swarm-v-fleet-v-kubernetes-v-mesos.html
+- https://www.wehkamplabs.com/blog/2015/10/15/applying-consul-within-the-blaze-microservices-platform/
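The register/wait behaviour described under the dependency management section and in the service discovery work item (register a service once it is running, wait for dependent services before starting) could look roughly like the sketch below. It is an assumption-laden illustration, not the kolla-mesos implementation: ``ServiceRegistry`` is an in-memory stand-in for the Zookeeper-backed registry, and the service names and dependency graph are hypothetical.

```python
import threading


class ServiceRegistry:
    """In-memory stand-in for the service registry (Zookeeper in the spec)."""

    def __init__(self):
        self._running = set()
        self._changed = threading.Condition()

    def register(self, name):
        # Announce a service once it is running and wake up any waiters.
        with self._changed:
            self._running.add(name)
            self._changed.notify_all()

    def wait_for(self, names, timeout=None):
        # Block until every dependency is registered.
        # Returns False if the timeout expires first.
        with self._changed:
            return self._changed.wait_for(
                lambda: set(names) <= self._running, timeout
            )


def run_service(name, deps, registry, started, timeout=5.0):
    """Container entry point: start only once the dependencies are up."""
    if not registry.wait_for(deps, timeout):
        return  # dependencies never appeared; a real script might retry or exit
    started.append(name)  # placeholder for launching the real service
    registry.register(name)


# Hypothetical dependency graph, launched in parallel and in the "wrong" order.
deps = {"nova-api": ["keystone", "mariadb"], "keystone": ["mariadb"], "mariadb": []}
registry, started = ServiceRegistry(), []
threads = [
    threading.Thread(target=run_service, args=(n, d, registry, started))
    for n, d in deps.items()
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(started)
```

All three entry points are launched at once, yet each service only starts after the services it depends on have registered, so the deployment tool itself does not need to encode the ordering.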