Add split-controlplane spec

First phase of a multi-cycle spec explaining the requirements for
deploying a controlplane, then batches of compute/storage nodes
independently for scaleout.

Co-Authored-By: John Fulton <fulton@redhat.com>
Change-Id: Ib3511fc2f611e944143035f70e146234ed7a7204
2017-11-28 15:55:56 +00:00
commit e3163c4823
parent ae41e2be2d
2 changed files with 248 additions and 0 deletions
--- a/images/split-controlplane/ceph-details.png
+++ b/images/split-controlplane/ceph-details.png
--- a/specs/rocky/split-controlplane.rst
+++ b/specs/rocky/split-controlplane.rst
@@ -0,0 +1,248 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+========================================================
+TripleO Split Control Plane from Compute/Storage Support
+========================================================
+
+https://blueprints.launchpad.net/tripleo/+spec/split-controlplane
+
+This spec introduces support for a mode of deployment where the controlplane
+nodes are deployed and then batches of compute/storage nodes can be added
+independently.
+
+Problem Description
+===================
+
+Currently TripleO deploys all services, for all roles (groups of nodes), in
+a single Heat stack.  This works quite well for small to medium size deployments,
+but for very large environments there is considerable benefit to dividing the
+nodes into batches, e.g. when deploying many hundreds or thousands of compute nodes.
+
+* Scalability can be improved by deploying a fairly static controlplane and then
+  adding batches of e.g. compute nodes when demand requires scale out.  The overhead
+  of updating all the nodes in every role for any scale out operation is non-trivial.
+  Although this is somewhat mitigated by the split from heat deployed servers
+  to config download & ansible for configuration, making modular deployments easier
+  is of benefit when scaling deployments to very large environments.
+
+* Risk reduction - there are often requests to avoid any update to controlplane
+  nodes when adding capacity for e.g. compute or storage, and modular deployments
+  make this easier as no modification is required to the controlplane nodes to
+  e.g. add compute nodes.
+
+This spec is not intended to cover all the possible ways of achieving modular deployments,
+but instead to outline the requirements and give an overview of the interfaces we need to
+consider to enable this flexibility.
+
+Proposed Change
+===============
+
+Overview
+--------
+
+To enable incremental changes, I'm assuming we could still deploy the controlplane
+nodes via the existing architecture, e.g. Heat deploys the nodes/networks and we
+then use config download to configure those nodes via ansible.
+
+To deploy compute nodes, we have several options:
+
+1. Deploy multiple "compute only" heat stacks, which would generate
+   ansible playbooks via config download, and consume some output data
+   from the controlplane stack.
+
+2. Deploy additional nodes via mistral, then configure them via
+   ansible (today this still requires heat to generate the
+   playbooks/inventory even if it's a transient stack).
+
+3. Deploy nodes via ansible, then configure them via ansible (again,
+   with the config download mechanism we have available today we'd
+   need heat to generate the configuration data).
+
+The above doesn't consider a "pure ansible" solution as we would have to first make ansible
+role equivalents for all the composable service templates available, and that effort
+is out of scope for this spec.
+
+Scope and Phases
+----------------
+
+The three items listed in the overview cover an incremental approach
+and the first phase is to implement the first item. Though this item
+adds an additional dependency on Heat, this is done only to allow the
+desired functionality using what is available today. In future phases
+any additional dependency on Heat will need to be addressed and any
+changes done during the first phase should be minimal and focus on
+parameter exposure between Heat stacks. Implementation of the other
+items in the overview could span multiple OpenStack development cycles
+and additional details may need to be addressed in future
+specifications.
+
+If a deployer is able to complete the following simple scenario, then
+phase 1 of the larger feature, as covered by this specification, is implemented:
+
+- Deploy a single undercloud with one control-plane network
+- Create a Heat stack called overcloud-controllers with 0 compute nodes
+- Create a Heat stack called overcloud-computes which may be used by the controllers
+- Use the APIs of the controllers to boot an instance on the computes deployed from the overcloud-computes Heat stack
+
+In the above scenario the majority of the work involves exposing the
+correct parameters between Heat stacks so that a controller node is
+able to use a compute node as if it were an external service. This is
+analogous to how TripleO provides a template where properties of an
+external Ceph cluster may be used by TripleO to configure a service
+like Cinder which uses the external Ceph cluster.
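+
+The parameter exposure described above can be sketched roughly as follows.
+This is a minimal illustration, not the TripleO implementation: it assumes the
+outputs of the controller stack are already available as a plain dictionary,
+and it builds a Heat-style environment (``parameter_defaults``) for a
+compute-only stack. ``EndpointMap`` is named in the work items of this spec;
+``AllNodesConfig``, ``VipMap`` and the function name are hypothetical
+placeholders chosen for the example.
+
```python
import json

# Values the compute-only stack is assumed to need from the controller stack.
# "EndpointMap" appears in this spec; the other two names are hypothetical.
REQUIRED_OUTPUTS = ("EndpointMap", "AllNodesConfig", "VipMap")


def build_compute_environment(controlplane_outputs, required=REQUIRED_OUTPUTS):
    """Return a Heat-style environment dict for a compute-only stack.

    Raises KeyError listing every required value that the controller
    stack does not expose, so missing data is reported in one pass.
    """
    missing = sorted(key for key in required if key not in controlplane_outputs)
    if missing:
        raise KeyError("controller stack does not expose: " + ", ".join(missing))
    # Copy only the required values; unrelated outputs are not passed on.
    return {"parameter_defaults": {key: controlplane_outputs[key] for key in required}}


if __name__ == "__main__":
    # Illustrative data only; real stack outputs would be far larger.
    example_outputs = {
        "EndpointMap": {"NovaPublic": {"host": "overcloud.example.com"}},
        "AllNodesConfig": {"controller_node_names": ["controller-0"]},
        "VipMap": {"internal_api": "192.0.2.10"},
        "UnrelatedOutput": "ignored",
    }
    environment = build_compute_environment(example_outputs)
    print(json.dumps(environment, indent=2, sort_keys=True))
```
+
+Failing fast on missing keys mirrors the work item about determining which
+data are missing in each stack; which values are actually required is left
+open by this spec.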
+
+The simple scenario above is possible without network isolation. In
+the more complex workload site vs control site scenario, described
+in the following section, network traffic will not be routed through
+the controller. How the networking aspect of that deployment scenario
+is managed will need to be addressed in a separate specification and
+the overall effort will likely span multiple OpenStack development
+cycles.
+
+For the phase of implementation covered in this specification, the
+compute nodes will be PXE booted by Ironic from the same provisioning
+network as the controller nodes during deployment. Instances booted on
+these compute nodes could connect to a provider network to which their
+compute nodes have direct access. Alternatively these compute nodes
+could be deployed with physical access to the network which hosts
+the overlay networks. The resulting overcloud should look the same as
+one in which the compute nodes were deployed as part of the overcloud
+Heat stack. Thus, the controller and compute nodes will run the same
+services they normally would regardless of whether the deployment is
+split between two undercloud Heat stacks. The services on the
+controller and compute nodes could be composed to multiple servers
+but determining the limits of composition is out of scope for the
+first phase.
+
+Example Usecase Scenario: Workload vs Control Sites
+---------------------------------------------------
+
+One application of this feature includes the ability to deploy
+separate workload and control sites. A control site provides
+management and OpenStack API services, e.g. the Nova API and
+Scheduler. A workload site provides resources needed only by the
+workload, e.g. Nova compute resources with local storage in
+availability zones which directly serve workload network traffic
+without routing back to the control site. Though there would be
+additional latency between the control site and workload site with
+respect to managing instances, there would be no reason that the
+workload itself could not perform adequately once running and each
+workload site would have a smaller footprint.
+
+.. image:: ../../images/split-controlplane/ceph-details.png
+   :height: 445px
+   :width: 629px
+   :alt: Diagram of an example control site with multiple workload sites
+   :align: center
+
+This scenario is included in this specification as an example
+application of the feature. This specification does not aim to address
+all of the details of operating separate control and workload sites
+but only to describe how the proposed feature, *deployment of
+independent controlplane and compute nodes*, for TripleO could be
+built upon to simplify deployment of such sites in future versions of
+TripleO. For example the blueprint to make it possible to deploy
+multiple Ceph clusters in the overcloud [1]_ could be applied to
+provide a separate Ceph cluster per workload site, but its scope only
+focuses on changes to roles in order to enable only that feature; it
+is orthogonal to this proposal.
+
+Alternatives
+------------
+
+Alternatives to the incremental change outlined in the overview include reimplementing service
+configuration in ansible, such that nodes can be configured via playbooks without dependency
+on the existing heat+ansible architecture.  Work is ongoing in this area, e.g. the ansible roles
+to deploy services on k8s, but this spec is primarily concerned with finding an interim
+solution that enables our current architecture to scale to very large deployments.
+
+Security Impact
+---------------
+
+Potentially sensitive data such as passwords will need to be shared between the controlplane
+stack and the compute-only deployments.  Given the admin-only nature of the undercloud I think
+this is OK.
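+
+If the shared data were ever written to disk on the undercloud, one cautious
+option would be to create the file with owner-only permissions. The sketch
+below is only an illustration under that assumption: the spec does not say
+how the data is stored, the helper name and file name are hypothetical, and
+POSIX file permissions are assumed.
+
```python
import json
import os
import stat
import tempfile


def write_private_json(path, data):
    """Write data as JSON to a new file that only its owner can read or write."""
    # O_EXCL makes the call fail if the file already exists, so a file
    # created earlier with looser permissions is never silently reused.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(data, handle)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        # Hypothetical file name and placeholder secret, for illustration only.
        target = os.path.join(workdir, "controlplane-export.json")
        write_private_json(target, {"AdminPassword": "example-only"})
        print(oct(stat.S_IMODE(os.stat(target).st_mode)))
```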
+
+Other End User Impact
+---------------------
+
+Users will have more flexibility and control with regard to how they
+choose to scale their deployments. An example of this includes
+separate control and workload sites as mentioned in the example use
+case scenario.
+
+Performance Impact
+------------------
+
+Potentially better performance at scale, although the total time could be increased assuming
+each scale out is serialized.
+
+Other Deployer Impact
+---------------------
+
+None
+
+
+Developer Impact
+----------------
+
+It is already possible to deploy multiple overcloud Heat stacks from
+one undercloud, but if there are parts of the TripleO tool-chain which
+assume a single Heat stack, they may need to be updated.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  shardy
+
+Other assignees:
+  gfidente
+  fultonj
+
+
+Work Items
+----------
+
+* Proof of concept showing how to deploy independent controlplane and compute nodes using already landed patches [2]_ and by overriding the EndpointMap
+* If there are problems with overriding the EndpointMap, rework all-nodes-config to output the "all nodes" hieradata and vip details, such that they could span stacks
+* Determine what data are missing in each stack and propose patches to expose the missing data to each stack that needs it
+* Modify the proof of concept to support adding a separate and minimal ceph cluster (mon, mgr, osd) through a heat stack separate from the controller node's heat stack
+* Refine how the data is shared between each stack to improve the user experience
+* Update the documentation to include an example of the new deployment method
+* Retrospect and write a follow up specification covering details necessary for the next phase
+
+
+Dependencies
+============
+
+None.
+
+Testing
+=======
+
+Ideally scale testing will be performed to validate the scalability
+aspects of this work. For the first phase, any changes done to enable
+the simple scenario described under Scope and Phases will be tested
+manually and the existing CI will ensure they do not break current
+functionality. Changes implemented in the follow up phases could have
+CI scenarios added.
+
+Documentation Impact
+====================
+
+The deployment documentation will need to be updated to cover the configuration of
+split controlplane environments.
+
+References
+==========
+
+.. [1] `Make it possible to deploy multiple Ceph clusters in the overcloud <https://blueprints.launchpad.net/tripleo/+spec/deploy-multiple-ceph-clusters>`_
+.. [2] `Topic: topic:compute_only_stack2 <https://review.openstack.org/#/q/topic:compute_only_stack2>`_