Add split-controlplane spec

First phase of a multi-cycle spec explaining the requirements for
deploying a controlplane, then batches of compute/storage nodes
independently for scaleout.

Co-Authored-By: John Fulton <fulton@redhat.com>
Change-Id: Ib3511fc2f611e944143035f70e146234ed7a7204
..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

========================================================
TripleO Split Control Plane from Compute/Storage Support
========================================================

https://blueprints.launchpad.net/tripleo/+spec/split-controlplane

This spec introduces support for a mode of deployment where the
controlplane nodes are deployed first and batches of compute/storage
nodes can then be added independently.

Problem Description
===================

Currently TripleO deploys all services, for all roles (groups of nodes),
in a single Heat stack. This works quite well for small to medium size
deployments, but for very large environments there is considerable
benefit to dividing the batches of nodes, e.g. when deploying many
hundreds or thousands of compute nodes.

* Scalability can be improved by deploying a fairly static controlplane,
  then adding batches of e.g. compute nodes when demand requires scale
  out. The overhead of updating all the nodes in every role for any
  scale-out operation is non-trivial, and although this is somewhat
  mitigated by the move from Heat-deployed servers to config-download &
  ansible for configuration, making modular deployments easier is of
  benefit when scaling deployments to very large environments.

* Risk reduction - there are often requests to avoid any update to
  controlplane nodes when adding capacity for e.g. compute or storage,
  and modular deployments make this easier as no modification is
  required to the controlplane nodes to e.g. add compute nodes.

This spec is not intended to cover all the possible ways of achieving
modular deployments, but instead outlines the requirements and gives an
overview of the interfaces we need to consider to enable this
flexibility.

Proposed Change
===============

Overview
--------

To enable incremental changes, I'm assuming we could still deploy the
controlplane nodes via the existing architecture, e.g. Heat deploys the
nodes/networks and we then use config-download to configure those nodes
via ansible.

To deploy compute nodes, we have several options:

1. Deploy multiple "compute only" heat stacks, which would generate
   ansible playbooks via config download, and consume some output data
   from the controlplane stack.

2. Deploy additional nodes via mistral, then configure them via
   ansible (today this still requires heat to generate the
   playbooks/inventory even if it's a transient stack).

3. Deploy nodes via ansible, then configure them via ansible (again,
   with the config download mechanism we have available today we'd
   need heat to generate the configuration data).
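As a sketch of the first option, a compute-only stack could consume
output data from the controlplane stack via an ordinary environment
file. The structure below is illustrative only; the exact interface for
overriding the ``EndpointMap``, and the parameter names involved, are
work items of this spec:

.. code-block:: yaml

   # compute-only-env.yaml - hypothetical environment passed to an
   # "overcloud-computes" stack, carrying endpoint data copied from
   # the outputs of the "overcloud-controllers" stack.
   parameter_defaults:
     # Illustrative override so compute services can reach e.g.
     # Keystone and the Nova API hosted by the controlplane stack.
     EndpointMapOverride:
       KeystoneInternal:
         host: 192.168.24.10
         port: '5000'
         protocol: http
       NovaInternal:
         host: 192.168.24.10
         port: '8774'
         protocol: http

Each compute-only stack would then be deployed under its own stack name
while reusing an environment of this shape.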
The above doesn't consider a "pure ansible" solution, as we would first
have to make ansible role equivalents for all the composable service
templates available, and that effort is out of scope for this spec.

Scope and Phases
----------------

The three items listed in the overview cover an incremental approach,
and the first phase is to implement the first item. Though this item
adds an additional dependency on Heat, this is done only to allow the
desired functionality using what is available today. In future phases
any additional dependency on Heat will need to be addressed, and any
changes done during the first phase should be minimal and focus on
parameter exposure between Heat stacks. Implementation of the other
items in the overview could span multiple OpenStack development cycles,
and additional details may need to be addressed in future
specifications.

If a deployer is able to do the following simple scenario, then this
specification is implemented as phase 1 of the larger feature:

- Deploy a single undercloud with one control-plane network
- Create a Heat stack called overcloud-controllers with 0 compute nodes
- Create a Heat stack called overcloud-computes which may be used by
  the controllers
- Use the APIs of the controllers to boot an instance on the computes
  deployed from the overcloud-computes Heat stack

In the above scenario the majority of the work involves exposing the
correct parameters between Heat stacks so that a controller node is
able to use a compute node as if it were an external service. This is
analogous to how TripleO provides a template where properties of an
external Ceph cluster may be used by TripleO to configure a service
like Cinder which uses the external Ceph cluster.

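For comparison, an external Ceph cluster is consumed today through an
environment file along the following lines (parameter names follow the
existing ceph-external support in tripleo-heat-templates; the values
are placeholders):

.. code-block:: yaml

   # Hypothetical values; a split controlplane deployment would
   # similarly pass the controller stack's outputs into the
   # compute-only stack.
   parameter_defaults:
     CephClusterFSID: '4b5c8c0a-ff60-454b-a1b4-9747aa737d19'
     CephClientKey: 'AQC+vYNXgDAgAhAAc8UoYt+OTz5uhV7ItLdwUw=='
     CephExternalMonHost: '172.16.1.7, 172.16.1.8'
     CinderEnableRbdBackend: true
     NovaEnableRbdBackend: true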
The simple scenario above is possible without network isolation. In
the more complex workload site vs control site scenario, described
in the following section, network traffic will not be routed through
the controller. How the networking aspect of that deployment scenario
is managed will need to be addressed in a separate specification, and
the overall effort is likely to span multiple OpenStack development
cycles.

For the phase of implementation covered in this specification, the
compute nodes will be PXE booted by Ironic from the same provisioning
network as the controller nodes during deployment. Instances booted on
these compute nodes could connect to a provider network to which their
compute nodes have direct access. Alternatively these compute nodes
could be deployed with physical access to the network which hosts
the overlay networks. The resulting overcloud should look the same as
one in which the compute nodes were deployed as part of the overcloud
Heat stack. Thus, the controller and compute nodes will run the same
services they normally would, regardless of whether the deployment is
split between two undercloud Heat stacks. The services on the
controller and compute nodes could be composed onto multiple servers,
but determining the limits of composition is out of scope for the
first phase.

Example Use Case Scenario: Workload vs Control Sites
-----------------------------------------------------

One application of this feature is the ability to deploy separate
workload and control sites. A control site provides management and
OpenStack API services, e.g. the Nova API and Scheduler. A workload
site provides resources needed only by the workload, e.g. Nova compute
resources with local storage in availability zones which directly serve
workload network traffic without routing back to the control site.
Though there would be additional latency between the control site and
workload site with respect to managing instances, there is no reason
the workload itself could not perform adequately once running, and each
workload site would have a smaller footprint.

.. image:: ../../images/split-controlplane/ceph-details.png
   :height: 445px
   :width: 629px
   :alt: Diagram of an example control site with multiple workload sites
   :align: center

This scenario is included in this specification as an example
application of the feature. This specification does not aim to address
all of the details of operating separate control and workload sites,
but only to describe how the proposed feature, *deployment of
independent controlplane and compute nodes*, could be built upon to
simplify deployment of such sites in future versions of TripleO. For
example, the blueprint to make it possible to deploy multiple Ceph
clusters in the overcloud [1]_ could be applied to provide a separate
Ceph cluster per workload site, but its scope focuses only on changes
to roles in order to enable that feature; it is orthogonal to this
proposal.

Alternatives
------------

Alternatives to the incremental change outlined in the overview include
reimplementing service configuration in ansible, such that nodes can be
configured via playbooks without any dependency on the existing
heat+ansible architecture. Work is ongoing in this area, e.g. the
ansible roles to deploy services on k8s, but this spec is primarily
concerned with finding an interim solution that enables our current
architecture to scale to very large deployments.

Security Impact
---------------

Potentially sensitive data such as passwords will need to be shared
between the controlplane stack and the compute-only deployments. Given
the admin-only nature of the undercloud, this is considered an
acceptable risk.

Other End User Impact
---------------------

Users will have more flexibility and control with regard to how they
choose to scale their deployments. An example of this is the separate
control and workload sites described in the example use case scenario.

Performance Impact
------------------

Potentially better performance at scale, although the total deployment
time could increase, assuming each scale-out is serialized.

Other Deployer Impact
---------------------

None

Developer Impact
----------------

It is already possible to deploy multiple overcloud Heat stacks from
one undercloud, but if there are parts of the TripleO tool-chain which
assume a single Heat stack, they may need to be updated.

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  shardy

Other assignees:
  gfidente
  fultonj

Work Items
----------

* Proof of concept showing how to deploy independent controlplane and
  compute nodes, using already landed patches [2]_ and by overriding
  the EndpointMap

* If there are problems with overriding the EndpointMap, rework
  all-nodes-config to output the "all nodes" hieradata and vip details,
  such that they could span stacks

* Determine what data is missing in each stack and propose patches to
  expose the missing data to each stack that needs it

* Modify the proof of concept to support adding a separate and minimal
  Ceph cluster (mon, mgr, osd) through a Heat stack separate from the
  controller nodes' Heat stack

* Refine how the data is shared between each stack to improve the user
  experience

* Update the documentation to include an example of the new deployment
  method

* Retrospect and write a follow-up specification covering details
  necessary for the next phase

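For the minimal Ceph cluster work item, one possible shape is a custom
role deployed in its own stack. The role below is a hypothetical sketch
using existing composable service names; the exact service list would
be determined during the proof of concept:

.. code-block:: yaml

   # roles_data_ceph.yaml - hypothetical minimal role for a
   # standalone Ceph stack (mon, mgr, osd co-located).
   - name: CephAll
     ServicesDefault:
       - OS::TripleO::Services::CephMon
       - OS::TripleO::Services::CephMgr
       - OS::TripleO::Services::CephOSD
       - OS::TripleO::Services::Ntp
       - OS::TripleO::Services::TripleoPackages
       - OS::TripleO::Services::TripleoFirewall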
Dependencies
============

None.

Testing
=======

Ideally scale testing will be performed to validate the scalability
aspects of this work. For the first phase, any changes done to enable
the simple scenario described under Scope and Phases will be tested
manually, and the existing CI will ensure they do not break current
functionality. Changes implemented in the follow-up phases could have
CI scenarios added.

Documentation Impact
====================

The deployment documentation will need to be updated to cover the
configuration of split controlplane environments.

References
==========

.. [1] `Make it possible to deploy multiple Ceph clusters in the overcloud <https://blueprints.launchpad.net/tripleo/+spec/deploy-multiple-ceph-clusters>`_

.. [2] `Topic: compute_only_stack2 <https://review.openstack.org/#/q/topic:compute_only_stack2>`_