Deploy cloud hypervisor type
Proposes a method for varying the hypervisor type used by Nova in the deploy cloud, so that overcloud services can be deployed not only to baremetal/ironic but to any other hypervisor type supported by Nova. In particular, containers and Docker make this deployment attractive due to their lightweight nature.

Change-Id: I7d153b0341e148274590d3d049cf0e2141369e58
parent 35c975a2b3
commit 75de7b236d
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

====================================
TripleO Deploy Cloud Hypervisor Type
====================================

# TODO: file the actual blueprint...
https://blueprints.launchpad.net/tripleo/+spec/tripleo-deploy-cloud-hypervisor-type

The goal of this spec is to detail how the TripleO deploy cloud type could be
varied from just baremetal to baremetal plus other hypervisors to deploy
Overcloud services.

Linux kernel containers make this approach attractive because they offer a
lightweight way to virtualize and isolate services and processes, so
libvirt+lxc and Docker are likely targets. However, we should aim to make this
approach as agnostic as possible for those deployers who may wish to use any
Nova driver, such as libvirt+kvm.

Problem Description
===================

The overcloud control plane is generally lightly loaded, and allocating
entire baremetal machines to it is wasteful. Also, when the Overcloud services
run entirely on baremetal, they take longer to upgrade and roll back.

Proposed Change
===============

We should support any Nova virtualization type as a target for Overcloud
services, as opposed to using only baremetal nodes to deploy overcloud images.
Containers are particularly attractive because they are lightweight, easy to
upgrade and roll back, and offer isolation and security similar to full VMs.
For the purpose of this spec, the alternate Nova virtualization target for the
Overcloud will be referred to as alt-hypervisor. alt-hypervisor could be
substituted with libvirt+lxc, Docker, libvirt+kvm, etc.

At a minimum, we should support running each Overcloud service in isolation in
its own alt-hypervisor instance in order to be as flexible as possible to
deployer needs. We should also support combining services.

In order to make alt-hypervisors available as deployment targets for the
Overcloud, we need additional Nova compute nodes/services that are configured
to use alt-hypervisors and registered with the undercloud Nova.

Additionally, the undercloud must still run a Nova compute with the ironic
driver so that it can scale itself out by adding undercloud compute nodes.

To accomplish this, we can run two Nova compute processes on each undercloud
node: one configured with Nova+Ironic and one configured with
Nova+alt-hypervisor. For the straight baremetal deployment, where an alternate
hypervisor is not desired, the additional Nova compute process would not be
included. This would be accomplished via the standard inclusion/exclusion of
elements during a diskimage-builder tripleo image build.

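As a rough illustration of how two compute processes on one node could be kept
apart, the sketch below renders one overlay config per process, each with its
own ``host`` and ``state_path`` as listed under the work items. The driver
class paths and the ``render_*`` helpers are assumptions made for this example
and are not part of any existing element; check the driver names against the
Nova release being deployed.

```python
import configparser
import io

# Assumed driver class paths, for illustration only.
COMPUTE_PROCESSES = {
    "ironic": "nova.virt.ironic.IronicDriver",
    "docker": "novadocker.virt.docker.DockerDriver",
}


def render_compute_conf(name, driver):
    """Return the text of an overlay config for one nova-compute process."""
    conf = configparser.ConfigParser()
    conf["DEFAULT"] = {
        "compute_driver": driver,
        # Distinct host and state_path keep the two processes from colliding.
        "host": f"undercloud-{name}",
        "state_path": f"/var/lib/nova-{name}",
    }
    buf = io.StringIO()
    conf.write(buf)
    return buf.getvalue()


def render_all(processes=COMPUTE_PROCESSES):
    """Map each per-process config file path to its rendered contents."""
    return {
        f"/etc/nova/compute/nova-{name}.conf": render_compute_conf(name, driver)
        for name, driver in processes.items()
    }
```

Leaving the ``docker`` entry out of the table would correspond to the straight
baremetal case, where only the ironic-backed process is configured.
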
It will also be possible to build and deploy just an alt-hypervisor compute
node that is registered with the Undercloud as an additional compute node.

To minimize the changes needed to the elements, we will aim to run a full init
stack, such as systemd, in each alt-hypervisor instance. This will allow all
the services that we need to run in the instance as well (cloud-init,
os-collect-config, etc.). It will also make troubleshooting similar to the
baremetal process, in that you'd be able to ssh to individual instances, read
logs, restart services, turn on debug mode, etc.

To handle Neutron network configuration for the Overcloud, the Overcloud
neutron L2 agent will have to be on a provider network that is shared between
the hypervisors. VLAN provider networks will have to be modeled in Neutron and
connected to alt-hypervisor instances.

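To make the VLAN provider network modeling more concrete, here is a minimal
sketch that builds the request body for creating such a network through the
Neutron API. The ``provider:*`` attributes come from Neutron's provider
network extension; the helper name and the example values in the usage note
are hypothetical.

```python
def provider_vlan_network(name, physical_network, vlan_id):
    """Build a request body for creating a shared VLAN provider network."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN id out of range: {vlan_id}")
    return {
        "network": {
            "name": name,
            # Shared so that instances on every hypervisor can attach to it.
            "shared": True,
            "provider:network_type": "vlan",
            "provider:physical_network": physical_network,
            "provider:segmentation_id": vlan_id,
        }
    }
```

For example, ``provider_vlan_network("overcloud-ctlplane", "physnet1", 25)``
would describe a network on VLAN 25 of an assumed physical network named
``physnet1``.
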
Overcloud compute nodes themselves would be deployed to baremetal nodes. These
images would be made up of:

* libvirt+kvm (assuming this is the hypervisor choice for the Overcloud)
* nova-compute + libvirt+kvm driver (registered to overcloud control)
* neutron-l2-agent (registered to overcloud control)

An image with those contents is deployed to a baremetal node via nova+ironic
from the undercloud.

Alternatives
------------

Deployment from the seed
^^^^^^^^^^^^^^^^^^^^^^^^

An alternative to having the undercloud deploy additional alt-hypervisor
compute nodes would be to register additional baremetal nodes with the seed VM,
and then describe an undercloud stack in a template that contains the
undercloud controller and its set of alt-hypervisor compute nodes. When the
undercloud is deployed via the seed, all of the nodes are set up initially.

The drawback of that approach is that the seed is meant to be short-lived, so
it becomes difficult to scale out the undercloud later if needed. We could
offer a hybrid of the two models: launch all nodes initially from the seed, but
still have the functionality in the undercloud to deploy more alt-hypervisor
compute nodes if needed.

The init process
^^^^^^^^^^^^^^^^

If running systemd in a container turns out to be problematic, it should be
possible to run a single process in the container that starts just the
OpenStack service that we care about. However, that process would also need to
do things like read Heat metadata. It's possible this process could be
os-collect-config. This approach would require more changes to the elements
themselves, however, since they currently depend heavily on an init process
for enabling and restarting services. It may be possible to replace os-svc-*
with other tools that don't use systemd or upstart when building images for
containers.

Security Impact
---------------

* We should aim for equivalent security when deploying to alt-hypervisor
  instances as we do when deploying to baremetal. To the best of our ability,
  it should not be possible to compromise the instance if an individual
  service is compromised.

* Since Overcloud services and Undercloud services would be co-located on the
  same baremetal machine, compromising the hypervisor and gaining access to the
  host is a risk to both the Undercloud and Overcloud. We should mitigate this
  risk to the best of our ability via things like SELinux, and by removing all
  unnecessary software/processes from the alt-hypervisor instances.

* Certain hypervisors are inherently more secure than others. libvirt+kvm uses
  virtualization and is much more secure than container-based hypervisors such
  as libvirt+lxc and Docker, which use namespacing.

Other End User Impact
---------------------

None. The impact of this change is limited to Deployers. End users should have
no visibility into the actual infrastructure of the Overcloud.

Performance Impact
------------------

Ideally, deploying an overcloud to containers should result in a faster
deployment than deploying to baremetal. Upgrading and downgrading the Overcloud
should also be faster.

However, more images will have to be built via diskimage-builder, which will
take more time.

Other Deployer Impact
---------------------

The main impact to deployers will be the ability to use alt-hypervisor
instances, such as containers, if they wish. They also must understand how to
use nova-baremetal/ironic on the undercloud to scale out the undercloud and add
additional alt-hypervisor compute nodes if needed.

Additional space in the configured glance backend would also likely be needed
to store additional images.

Developer Impact
----------------

* Developers working on TripleO will have the option of deploying to
  alt-hypervisor instances. This should make testing and developing some
  aspects of TripleO easier because fewer VMs are needed.

* More images will have to be built due to the greater potential variety with
  alt-hypervisor instances housing Overcloud services.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  james-slagle

Work Items
----------

tripleo-incubator
^^^^^^^^^^^^^^^^^

* Document how to use an alternate hypervisor for the overcloud deployment.

  * Eventually, this could possibly be the default.

* Document how to troubleshoot this type of deployment.
* Add a user option or JSON property that describes whether the devtest
  environment being set up should use an alternate hypervisor for the overcloud
  deployment. Consider using HEAT_ENV where appropriate.
* Update load-image to accept an additional optional argument that sets
  the hypervisor_type property on the loaded images in glance. The argument
  wouldn't need to be specified for some images, such as regular dib images
  that can run under KVM.
* Document commands to setup-neutron for modeling provider VLAN networks.

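The optional ``hypervisor_type`` handling proposed for load-image could look
roughly like the sketch below, which assembles a ``glance image-create``
command line and only adds the image property when a hypervisor type is given.
load-image itself is a shell script, so this Python helper and its parameter
names are purely illustrative assumptions, not its actual interface.

```python
def glance_create_args(name, disk_format, container_format="bare",
                       hypervisor_type=None):
    """Build the argument list for uploading an image to glance."""
    args = [
        "glance", "image-create",
        "--name", name,
        "--disk-format", disk_format,
        "--container-format", container_format,
    ]
    # Only tag the image when a specific hypervisor is required; untagged
    # images (for example regular dib images for KVM) are left as they are.
    if hypervisor_type is not None:
        args += ["--property", f"hypervisor_type={hypervisor_type}"]
    return args
```

Tagging images this way lets the Nova scheduler match an image to a compute
host whose driver supports it, provided image-property filtering is enabled in
the undercloud scheduler.
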
tripleo-image-elements
^^^^^^^^^^^^^^^^^^^^^^

* Add a new element for the nova docker driver.
* Add a new element for the docker registry (currently required by the nova
  docker driver).
* Add more hypervisor-specific configuration files for the different nova
  compute driver elements:

  * /etc/nova/compute/nova-kvm.conf
  * /etc/nova/compute/nova-baremetal.conf
  * /etc/nova/compute/nova-ironic.conf
  * /etc/nova/compute/nova-docker.conf

* Separate configuration options per compute process for:

  * host (undercloud-kvm, undercloud-baremetal, etc.)
  * state_path (/var/lib/nova-kvm, /var/lib/nova-baremetal, etc.)

* Maintain backwards compatibility in the elements by consulting both old and
  new heat metadata key namespaces.

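The backwards-compatibility item above amounts to a lookup that prefers the
new per-process namespace and falls back to the old location. A minimal sketch
follows; the metadata layout (``nova.compute.<process>`` for new keys,
``nova.<key>`` for old ones) is an assumption chosen for illustration, since
the spec does not fix the key names.

```python
def compute_setting(metadata, process, key, default=None):
    """Look up a setting for one compute process, newest namespace first."""
    nova = metadata.get("nova", {})
    # Assumed new layout: nova -> compute -> <process> -> <key>
    namespaced = nova.get("compute", {}).get(process, {})
    if key in namespaced:
        return namespaced[key]
    # Assumed old layout: nova -> <key>, shared by a single compute process.
    return nova.get(key, default)
```

With this shape, existing templates that only set the old keys keep working,
while templates that define per-process values override them for the named
process only.
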
tripleo-heat-templates
^^^^^^^^^^^^^^^^^^^^^^

* Split out heat metadata into separate namespaces for each compute process
  configuration.
* For the VLAN case, update templates for any network modeling for
  alt-hypervisor instances so that those instances have the correct interfaces
  attached to the VLAN network.

diskimage-builder
^^^^^^^^^^^^^^^^^

* Add the ability, where needed, to build new image types for alt-hypervisor:

  * Docker
  * libvirt+lxc

* Document how to build images for the new types.

Dependencies
============

For Docker support, this effort depends on continued development of the nova
Docker driver. We would need to drive any missing features or bug fixes that
are needed in that project.

For other drivers that may not be as well supported as libvirt+kvm, such as
libvirt+lxc or openvz, we will also have to drive missing features if we want
to support them.

This effort also depends on the provider resource templates spec (unwritten)
that will be done for the template backend for Tuskar. That work should be done
in such a way that the provider resource templates are reusable for this
effort as well, so that you will be able to create templates to match the
images that you intend to create for your Overcloud deployment.

Testing
=======

We would need a separate set of CI jobs configured to deploy an Overcloud to
each alternate hypervisor that TripleO intends to support well.

For Docker support specifically, CI jobs could be considered non-voting since
they'd rely on a stackforge project which isn't officially part of OpenStack.
We could potentially make this job voting if TripleO CI were enabled on the
stackforge/nova-docker repo so that changes there are less likely to break
TripleO deployments.

Documentation Impact
====================

We should update the TripleO specific docs in tripleo-incubator to document how
to use an alternate hypervisor for an Overcloud deployment.

References
==========

* Juno Design Summit etherpad: https://etherpad.openstack.org/p/juno-summit-tripleo-and-docker
* nova-docker driver: https://git.openstack.org/cgit/stackforge/nova-docker
* Docker: https://www.docker.io/
* Docker github: https://github.com/dotcloud/docker