Moves the two blueprint/spec documents that existed in airship-in-a-bottle to the airship-specs repository. The implemented spec was not reformatted to the spec template. The other spec (in the approved folder) was minimally updated to the spec template.

Change-Id: I7468579e2fa3077ee1144e5294eba97d8e4ced05
parent 6e0a18e7fa
commit bfbfd56c81
@ -0,0 +1,620 @@
..
  Copyright 2018 AT&T Intellectual Property.
  All Rights Reserved.

  Licensed under the Apache License, Version 2.0 (the "License"); you may
  not use this file except in compliance with the License. You may obtain
  a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
  License for the specific language governing permissions and limitations
  under the License.

.. index::
   single: Teardown node
   single: workflow;redeploy_server
   single: Drydock
   single: Promenade
   single: Shipyard

.. _node-teardown:

=====================
Airship Node Teardown
=====================

Shipyard is the entrypoint for Airship actions, including the need to redeploy
a server. The first part of redeploying a server is the graceful teardown of
the software running on the server; specifically, Kubernetes and etcd are of
critical concern. It is the duty of Shipyard to orchestrate the teardown of the
server, followed by steps to deploy the desired new configuration. This design
covers only the first portion: node teardown.

Links
=====

None

Problem description
===================

When redeploying a physical host (server) using the Airship Platform,
it is necessary to trigger a sequence of steps to prevent undesired behaviors
when the server is redeployed. This blueprint intends to document the
interaction that must occur between Airship components to tear down a server.

Impacted components
===================

- Drydock
- Promenade
- Shipyard

Proposed change
===============

Shipyard Node Teardown Process
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#. (Existing) Shipyard receives a request to redeploy_server, specifying a
   target server.
#. (Existing) Shipyard performs preflight, design reference lookup, and
   validation steps.
#. (New) Shipyard invokes Promenade to decommission a node.
#. (New) Shipyard invokes Drydock to destroy the node - setting a node
   filter to restrict to a single server.
#. (New) Shipyard invokes Promenade to remove the node from the Kubernetes
   cluster.

Assumption:
  node_id is the hostname of the server, and is also the identifier that both
  Drydock and Promenade use to identify the appropriate parts - hosts and k8s
  nodes. This convention is set by the join script produced by Promenade.

Drydock Destroy Node
--------------------
The API/interface for destroy node already exists. The implementation within
Drydock needs to be developed. This interface will need to accept both the
specified node_id and the design_id to retrieve from Deckhand.

Using the provided node_id (hardware node) and the design_id, Drydock will
reset the hardware to a re-provisionable state.

By default, all local storage should be wiped (per datacenter policy for
wiping before re-use).

An option to allow for only the OS disk to be wiped should be supported, such
that other local storage is left intact, and could be remounted without data
loss. e.g.: --preserve-local-storage

The target node should be shut down.

The target node should be removed from the provisioner (e.g. MaaS).
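
A hypothetical invocation of this interface, following the pattern of Drydock's
existing asynchronous task API; the exact action name and node filter fields
shown here are assumptions for illustration, not the final contract:

.. code:: json

  POST /api/v1.0/tasks

  {
    "action": "destroy_nodes",
    "design_ref": "deckhand+https://{{deckhand_url}}/revisions/{{revision_id}}/rendered-documents",
    "node_filter": {
      "filter_set_type": "union",
      "filter_set": [
        {"filter_type": "union", "node_names": ["n1"]}
      ]
    }
  }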

Responses
~~~~~~~~~
The responses from this functionality should follow the pattern set by prepare
nodes and other Drydock functionality. The Drydock status responses used for
all async invocations will be utilized for this functionality.

Promenade Decommission Node
---------------------------
Performs steps that will result in the specified node being cleanly
disassociated from Kubernetes, and ready for the server to be destroyed.
Users of the decommission node API should be aware of the long timeout values
that may occur while awaiting Promenade to complete the appropriate steps.
At this time, Promenade is a stateless service and doesn't use any database
storage. As such, requests to Promenade are synchronous.

.. code:: json

  POST /nodes/{node_id}/decommission

  {
    rel : "design",
    href: "deckhand+https://{{deckhand_url}}/revisions/{{revision_id}}/rendered-documents",
    type: "application/x-yaml"
  }

Such that the design reference body is the design indicated when the
redeploy_server action is invoked through Shipyard.

Query Parameters:

- drain-node-timeout: A whole number timeout in seconds to be used for the
  drain node step (default: none). In the case of no value being provided,
  the drain node step will use its default.
- drain-node-grace-period: A whole number in seconds indicating the
  grace period that will be provided to the drain node step (default: none).
  If no value is specified, the drain node step will use its default.
- clear-labels-timeout: A whole number timeout in seconds to be used for the
  clear labels step (default: none). If no value is specified, clear labels
  will use its own default.
- remove-etcd-timeout: A whole number timeout in seconds to be used for the
  remove etcd from nodes step (default: none). If no value is specified,
  remove-etcd will use its own default.
- etcd-ready-timeout: A whole number in seconds indicating how long the
  decommission node request should allow for etcd clusters to become stable
  (default: 600).
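
An illustrative request combining several of these parameters (the node name
and values here are examples only)::

  POST /nodes/n1/decommission?drain-node-timeout=3600&drain-node-grace-period=1800&etcd-ready-timeout=600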

Process
~~~~~~~
Acting upon the node specified by the invocation and the design reference
details:

#. Drain the Kubernetes node.
#. Clear the Kubernetes labels on the node.
#. Remove etcd nodes from their clusters (if impacted).

   - If the node being decommissioned contains etcd nodes, Promenade will
     attempt to gracefully have those nodes leave the etcd cluster.

#. Ensure that etcd cluster(s) are in a stable state.

   - Polls for status every 30 seconds up to the etcd-ready-timeout, or until
     the cluster meets the defined minimum functionality for the site.
   - A new document, promenade/EtcdClusters/v1, will specify details about
     the etcd clusters deployed in the site, including: identifiers,
     credentials, and thresholds for minimum functionality (a sketch of such a
     document appears after this list).
   - This process should ignore the node being torn down in any calculation
     of health.

#. Shutdown the kubelet.

   - If this is not possible because the node is in a state of disarray such
     that it cannot schedule the daemonset to run, this step may fail, but
     should not hold up the process, as the Drydock dismantling of the node
     will shut the kubelet down.
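
A minimal sketch of what such a promenade/EtcdClusters/v1 document could look
like; the field names under ``data`` are assumptions for illustration only,
not a finalized schema:

.. code:: yaml

  ---
  schema: promenade/EtcdClusters/v1
  metadata:
    schema: metadata/Document/v1
    name: etcd-clusters
    layeringDefinition:
      abstract: false
      layer: site
    storagePolicy: cleartext
  data:
    clusters:
      - name: kubernetes-etcd
        # reference to the credentials (e.g. client certificates) needed to
        # query and manage membership of this cluster
        credentials: kubernetes-etcd-client
        # minimum number of healthy members for the cluster to be considered
        # minimally functional
        minimum_healthy_members: 2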

Responses
~~~~~~~~~
All responses will be in the form of the Airship Status response.

- Success: Code: 200, reason: Success

  Indicates that all steps are successful.

- Failure: Code: 404, reason: NotFound

  Indicates that the target node is not discoverable by Promenade.

- Failure: Code: 500, reason: DisassociateStepFailure

  The details section should detail the successes and failures further. Any
  4xx series errors from the individual steps would manifest as a 500 here.

Promenade Drain Node
--------------------
Drain the Kubernetes node for the target node. This will ensure that this node
is no longer the target of any pod scheduling, and evicts or deletes the
running pods. In the case of nodes running DaemonSet-managed pods, or pods
that would prevent a drain from occurring, Promenade may be required to provide
the `ignore-daemonsets` option or `force` option to attempt to drain the node
as fully as possible.

By default, the drain node will utilize a grace period for pods of 1800
seconds and a total timeout of 3600 seconds (1 hour). Clients of this
functionality should be prepared for a long timeout.

.. code:: json

  POST /nodes/{node_id}/drain

Query Parameters:

- timeout: a whole number in seconds (default = 3600). This value is the total
  timeout for the kubectl drain command.
- grace-period: a whole number in seconds (default = 1800). This value is the
  grace period used by kubectl drain. Grace period must be less than timeout.

.. note::

   This POST has no message body.

Example command used for drain (reference only)::

  kubectl drain --force --timeout 3600s --grace-period 1800 --ignore-daemonsets --delete-local-data n1

See also:
https://git.openstack.org/cgit/openstack/airship-promenade/tree/promenade/templates/roles/common/usr/local/bin/promenade-teardown

Responses
~~~~~~~~~
All responses will be in the form of the Airship Status response.

- Success: Code: 200, reason: Success

  Indicates that the drain node has successfully concluded, and that no pods
  are currently running.

- Failure: Status response, code: 400, reason: BadRequest

  A request was made with parameters that cannot work - e.g. grace-period is
  set to a value larger than the timeout value.

- Failure: Status response, code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Status response, code: 500, reason: DrainNodeError

  There was a processing exception raised while trying to drain a node. The
  details section should indicate the underlying cause if it can be
  determined.

Promenade Clear Labels
----------------------
Removes the labels that have been added to the target Kubernetes node.

.. code:: json

  POST /nodes/{node_id}/clear-labels

Query Parameters:

- timeout: A whole number in seconds allowed for the pods to settle/move
  following removal of labels (default = 1800).

.. note::

   This POST has no message body.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  All labels have been removed from the specified Kubernetes node.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 500, reason: ClearLabelsError

  There was a failure to clear labels that prevented completion. The details
  section should provide more information about the cause of this failure.

Promenade Remove etcd Node
--------------------------
Checks if the node specified contains any etcd nodes. If so, this API will
trigger that etcd node to leave the associated etcd cluster::

  POST /nodes/{node_id}/remove-etcd

  {
    rel : "design",
    href: "deckhand+https://{{deckhand_url}}/revisions/{{revision_id}}/rendered-documents",
    type: "application/x-yaml"
  }

Query Parameters:

- timeout: A whole number in seconds allowed for the removal of etcd nodes
  from the target node (default = 1800).

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  All etcd nodes have been removed from the specified node.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 500, reason: RemoveEtcdError

  There was a failure to remove etcd from the target node that prevented
  completion within the specified timeout, or etcd prevented removal of
  the node because it would result in the cluster being broken. The details
  section should provide more information about the cause of this failure.


Promenade Check etcd
--------------------
Retrieves the current interpreted state of etcd::

  GET /etcd-cluster-health-statuses?design_ref={the design ref}

Where the design_ref parameter is required for appropriate operation, and is in
the same format as used for the join-scripts API.

Query Parameters:

- design_ref: (Required) the design reference to be used to discover etcd
  instances.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  The status of each etcd in the site will be returned in the details section.
  Valid values for status are: Healthy, Unhealthy

  See:
  https://github.com/att-comdev/ucp-integration/blob/master/docs/source/api-conventions.rst#status-responses

  .. code:: json

    { "...": "... standard status response ...",
      "details": {
        "errorCount": {{n}},
        "messageList": [
          { "message": "Healthy",
            "error": false,
            "kind": "HealthMessage",
            "name": "{{the name of the etcd service}}"
          },
          { "message": "Unhealthy",
            "error": false,
            "kind": "HealthMessage",
            "name": "{{the name of the etcd service}}"
          },
          { "message": "Unable to access Etcd",
            "error": true,
            "kind": "HealthMessage",
            "name": "{{the name of the etcd service}}"
          }
        ]
      }
      ...
    }

- Failure: Code: 400, reason: MissingDesignRef

  Returned if the design_ref parameter is not specified.

- Failure: Code: 404, reason: NotFound

  Returned if the specified etcd could not be located.

- Failure: Code: 500, reason: EtcdNotAccessible

  Returned if the specified etcd responded with an invalid health response
  (not just simply unhealthy - that's a 200).


Promenade Shutdown Kubelet
--------------------------
Shuts down the kubelet on the specified node. This is accomplished by Promenade
setting the label `promenade-decomission: enabled` on the node, which will
trigger a newly-developed daemonset to run something like:
`systemctl disable kubelet && systemctl stop kubelet`.
This daemonset will effectively sit dormant until nodes have the appropriate
label added, and then perform the kubelet teardown.
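
A minimal sketch of such a daemonset, assuming a privileged container that uses
nsenter to reach the host's systemd; the image name and exact command below are
illustrative only (note the label key spelling matches the spec text):

.. code:: yaml

  apiVersion: apps/v1
  kind: DaemonSet
  metadata:
    name: promenade-shutdown-kubelet
  spec:
    selector:
      matchLabels:
        application: promenade-shutdown-kubelet
    template:
      metadata:
        labels:
          application: promenade-shutdown-kubelet
      spec:
        # only schedule onto nodes that Promenade has marked for teardown
        nodeSelector:
          promenade-decomission: enabled
        hostPID: true
        containers:
          - name: shutdown-kubelet
            image: some-registry/promenade-teardown:latest
            securityContext:
              privileged: true
            command:
              - /bin/sh
              - -c
              # enter the host's namespaces, stop the kubelet, then idle
              - >
                nsenter -t 1 -m -u -i -n -p --
                sh -c "systemctl disable kubelet && systemctl stop kubelet";
                sleep infinity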

.. code:: json

  POST /nodes/{node_id}/shutdown-kubelet

.. note::

   This POST has no message body.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  The kubelet has been successfully shut down.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 500, reason: ShutdownKubeletError

  The specified node's kubelet failed to shut down. The details section of the
  status response should contain reasonable information about the source of
  this failure.

Promenade Delete Node from Cluster
----------------------------------
Updates the Kubernetes cluster, removing the specified node. Promenade should
check that the node is drained/cordoned and has no labels other than
`promenade-decomission: enabled`. If either of these checks fails, the API
should respond with a 409 Conflict response.

.. code:: json

  POST /nodes/{node_id}/remove-from-cluster

.. note::

   This POST has no message body.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  The specified node has been removed from the Kubernetes cluster.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 409, reason: Conflict

  The specified node cannot be deleted due to failing the checks that the node
  is drained/cordoned and has no labels (other than possibly
  `promenade-decomission: enabled`).

- Failure: Code: 500, reason: DeleteNodeError

  The specified node cannot be removed from the cluster due to an error from
  Kubernetes. The details section of the status response should contain more
  information about the failure.


Shipyard Tag Releases
---------------------
Shipyard will need to mark Deckhand revisions with tags when there are
successful deploy_site or update_site actions to be able to determine the last
known good design. This is related to issue 16 for Shipyard, which shares the
same need.

.. note::

   Repeated from https://github.com/att-comdev/shipyard/issues/16

   When multiple configdocs commits have been done since the last deployment,
   there is no ready means to determine what's being done to the site. Shipyard
   should reject deploy site or update site requests that have had multiple
   commits since the last site true-up action. An option to override this guard
   should be allowed for the actions in the form of a parameter to the action.

   The configdocs API should provide a way to see what's been changed since the
   last site true-up, not just the last commit of configdocs. This might be
   accommodated by new deckhand tags like the 'commit' tag, but for
   'site true-up' or similar applied by the deploy and update site commands.

The design for issue 16 includes the bare-minimum marking of Deckhand
revisions. This design is as follows:

Scenario
~~~~~~~~
Multiple commits occur between site actions (deploy_site, update_site) - those
actions that attempt to bring a site into compliance with a site design.
When this occurs, the current system of being able to only see what has changed
between the committed and buffer versions (configdocs diff) is insufficient
to be able to investigate what has changed since the last successful (or
unsuccessful) site action.
To accommodate this, Shipyard needs several enhancements.

Enhancements
~~~~~~~~~~~~

#. Deckhand revision tags for site actions

   Using the tagging facility provided by Deckhand, Shipyard will tag the end
   of site actions.
   Upon completing a site action successfully, tag the revision being used with
   the tag site-action-success, and a body of dag_id:<dag_id>.

   Upon completing a site action unsuccessfully, tag the revision being used
   with the tag site-action-failure, and a body of dag_id:<dag_id>.

   The completion tags should only be applied upon failure if the site action
   gets past document validation successfully (i.e. gets to the point where it
   can start making changes via the other UCP components).

   This could result in a single revision having both site-action-success and
   site-action-failure if a later re-invocation of a site action is successful.

#. Check for intermediate committed revisions

   Upon running a site action, before tagging the revision with the site action
   tag(s), the dag needs to check to see if there are committed revisions that
   do not have an associated site-action tag. If there are any committed
   revisions since the last site action other than the current revision being
   used (between them), then the action should not be allowed to proceed (stop
   before triggering validations). For the calculation of intermediate
   committed revisions, assume revision 0 if there are no revisions with a
   site-action tag (null case).

   If the action is invoked with a parameter of
   allow-intermediate-commits=true, then this check should log that the
   intermediate committed revisions check is being skipped and not take any
   other action.

#. Support action parameter of allow-intermediate-commits=true|false

   In the CLI for create action, the --param option supports adding parameters
   to actions. The parameters passed should be relayed by the CLI to the API
   and ultimately to the invocation of the DAG. The DAG as noted above will
   check for the presence of allow-intermediate-commits=true. This needs to be
   tested to work.

#. Shipyard needs to support retrieving configdocs and rendered documents for
   the last successful site action, and the last site action (successful or
   not successful)::

     --successful-site-action
     --last-site-action

   These options would be mutually exclusive with --buffer and --committed.

#. Shipyard diff (shipyard get configdocs)

   Needs to support an option to do the diff of the buffer vs. the last
   successful site action and the last site action (successful or not
   successful).

   Currently there are no options to select which versions to diff (always
   buffer vs. committed).

   Support::

     --base-version=committed | successful-site-action | last-site-action (default = committed)
     --diff-version=buffer | committed | successful-site-action | last-site-action (default = buffer)

   Equivalent query parameters need to be implemented in the API.
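
For example, the diff options above could be exercised from the CLI as follows;
this is illustrative only, with option spellings taken from the proposal above
and "design" used as a placeholder collection name:

.. code:: bash

  # diff the buffer against the last successful site action
  shipyard get configdocs --base-version=successful-site-action --diff-version=buffer

  # retrieve a collection as it was at the last site action (successful or not)
  shipyard get configdocs design --last-site-action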

Because the implementation of this design will result in the tagging of
successful site-actions, Shipyard will be able to determine the correct
revision to use while attempting to tear down a node.

If the request to tear down a node indicates a revision that doesn't exist, the
command to do so (e.g. redeploy_server) should not continue, but rather fail
due to a missing precondition.

The invocation of the Promenade and Drydock steps in this design will utilize
the appropriate tag based on the request (default is successful-site-action) to
determine the revision of the Deckhand documents used as the design-ref.

Shipyard redeploy_server Action
-------------------------------
The redeploy_server action currently accepts a target node. Additional
supported parameters are needed:

#. preserve-local-storage=true, which will instruct Drydock to only wipe the
   OS drive; any other local storage will not be wiped. This would allow
   for the drives to be remounted to the server upon re-provisioning. The
   default behavior is that local storage is not preserved.

#. target-revision=committed | successful-site-action | last-site-action

   This will indicate which revision of the design will be used as the
   reference for what should be re-provisioned after the teardown.
   The default is successful-site-action, which is the closest representation
   to the last-known-good state.

These should be accepted as parameters to the action API/CLI and modify the
behavior of the redeploy_server DAG.
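
An illustrative CLI invocation showing how such parameters might be passed;
only the two parameters above are defined by this design, and the parameter
used to name the target server (``target_nodes``) is an assumption here:

.. code:: bash

  shipyard create action redeploy_server \
      --param="target_nodes=node01" \
      --param="preserve-local-storage=true" \
      --param="target-revision=last-site-action"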

Security impact
---------------

None. This change introduces no new security concerns outside of established
patterns for RBAC controls around API endpoints.

Performance impact
------------------

As this is an on-demand action, there is no expected performance impact to
existing processes, although tearing down a host may result in temporarily
degraded service capacity, whether from needing to move workloads to different
hosts or simply from reduced capacity.

Alternatives
------------

N/A

Implementation
==============

None at this time.

Dependencies
============

None.


References
==========

None
@ -0,0 +1,569 @@
..
  Copyright 2018 AT&T Intellectual Property.
  All Rights Reserved.

  Licensed under the Apache License, Version 2.0 (the "License"); you may
  not use this file except in compliance with the License. You may obtain
  a copy of the License at

      http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
  License for the specific language governing permissions and limitations
  under the License.

.. index::
   single: Deployment grouping
   single: workflow
   single: Shipyard
   single: Drydock

.. _deployment-grouping-baremetal:

=======================================
Deployment Grouping for Baremetal Nodes
=======================================
One of the primary functionalities of the Undercloud Platform is the deployment
of baremetal nodes as part of site deployment and upgrade. This blueprint aims
to define how deployment strategies can be applied to the workflow during these
actions.

.. note::

   This document has been moved from the airship-in-a-bottle project, and was
   previously implemented. The format of this document diverges from the
   standard template for airship-specs.

Overview
--------
When Shipyard is invoked for a deploy_site or update_site action, there are
three primary stages:

1. Preparation and Validation
2. Baremetal and Network Deployment
3. Software Deployment

During the Baremetal and Network Deployment stage, the deploy_site or
update_site workflow (and perhaps other workflows in the future) invokes
Drydock to verify the site, prepare the site, prepare the nodes, and deploy the
nodes. Each of these steps is described in the `Drydock Orchestrator Readme`_.

.. _Drydock Orchestrator Readme: https://git.openstack.org/cgit/openstack/airship-drydock/plain/drydock_provisioner/orchestrator/readme.md

The prepare nodes and deploy nodes steps each involve intensive and potentially
time-consuming operations on the target nodes, orchestrated by Drydock and
MAAS. These steps need to be approached and managed such that grouping,
ordering, and criticality of success of nodes can be managed in support of
fault-tolerant site deployments and updates.

For the purposes of this document, `phase of deployment` refers to the prepare
nodes and deploy nodes steps of the Baremetal and Network Deployment.

Some factors that inform this solution:

1. Limits to the amount of parallelization that can occur due to a centralized
   MAAS system.
2. Faults in the hardware, preventing operational nodes.
3. Miswiring or misconfiguration of network hardware.
4. Incorrect site design causing a mismatch against the hardware.
5. Criticality of particular nodes to the realization of the site design.
6. Desired configurability within the framework of the UCP declarative site
   design.
7. Improved visibility into the current state of node deployment.
8. A desire to begin the deployment of nodes before the finish of the
   preparation of nodes -- i.e. start deploying nodes as soon as they are ready
   to be deployed. Note: this design will not achieve new forms of
   task parallelization within Drydock; this is recognized as a desired
   functionality.

Solution
--------
Updates supporting this solution will require changes to Shipyard for the
changed workflows, and to Drydock for the desired node targeting and for
retrieval of diagnostic and result information.

.. index::
   single: Shipyard Documents; DeploymentStrategy

Deployment Strategy Document (Shipyard)
---------------------------------------
To accommodate the needed changes, this design introduces a new
DeploymentStrategy document into the site design to be read and utilized
by the workflows for update_site and deploy_site.

Groups
~~~~~~
Groups are named sets of nodes that will be deployed together. The fields of a
group are:

name
  Required. The identifying name of the group.

critical
  Required. Indicates if this group is required to continue to additional
  phases of deployment.

depends_on
  Required, may be an empty list. Group names that must be successful before
  this group can be processed.

selectors
  Required, may be an empty list. A list of identifying information to indicate
  the nodes that are members of this group.

success_criteria
  Optional. Criteria that must evaluate to true before a group is considered
  successfully complete with a phase of deployment.

Criticality
'''''''''''
- Field: critical
- Valid values: true | false

Each group is required to indicate true or false for the `critical` field.
This drives the behavior after the deployment of baremetal nodes. If any
groups that are marked as `critical: true` fail to meet that group's success
criteria, the workflow should halt after the deployment of baremetal nodes. A
group that cannot be processed due to a parent dependency failing will be
considered failed, regardless of the success criteria.

Dependencies
''''''''''''
- Field: depends_on
- Valid values: [] or a list of group names

Each group specifies a list of depends_on groups, or an empty list. All
identified groups must complete the phase of deployment successfully before
the current group is allowed to be processed by the current phase.

- A failure (based on success criteria) of a group prevents any groups
  dependent upon the failed group from being attempted.
- Circular dependencies will be rejected as invalid during document validation.
- There is no guarantee of ordering among groups that have their dependencies
  met. Any group that is ready for deployment based on declared dependencies
  will execute. Execution of groups is serialized; two groups will not deploy
  at the same time.
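
Since group ordering is a dependency-graph problem, the ordering and cycle
checks can be expressed with NetworkX_ (referenced at the end of this spec).
The following is a minimal sketch, not the Shipyard implementation; the group
names are taken from the example document later in this spec:

.. code:: python

  # Minimal sketch of ordering groups by their depends_on lists using NetworkX;
  # not the Shipyard implementation.
  import networkx as nx

  depends_on = {
      'ntp-node': [],
      'monitoring-nodes': [],
      'control-nodes': ['ntp-node'],
      'compute-nodes-1': ['control-nodes'],
      'compute-nodes-2': ['control-nodes'],
  }

  graph = nx.DiGraph()
  graph.add_nodes_from(depends_on)
  for group, deps in depends_on.items():
      for dep in deps:
          # an edge from the dependency to the group that waits on it
          graph.add_edge(dep, group)

  if not nx.is_directed_acyclic_graph(graph):
      # circular dependencies are rejected during document validation
      raise ValueError('circular dependency among groups')

  # one valid serialized processing order
  print(list(nx.topological_sort(graph)))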

Selectors
'''''''''
- Field: selectors
- Valid values: [] or a list of selectors

The list of selectors indicates the nodes that will be included in a group.
Each selector has four available filtering values: node_names, node_tags,
node_labels, and rack_names. Each selector is an intersection of these
criteria, while the list of selectors is a union of the individual selectors.

- Omitting a criterion from a selector, or using an empty list, means that
  criterion is ignored.
- Having a completely empty list of selectors, or a selector that has no
  criteria specified, indicates ALL nodes.
- A collection of selectors that results in no nodes being identified will be
  processed as if 100% of nodes successfully deployed (avoiding division by
  zero), but would fail the minimum or maximum nodes criteria (still counts as
  0 nodes).
- There is no validation against the same node being in multiple groups;
  however, the workflow will not resubmit to Drydock nodes that have already
  completed or failed in this deployment, since it keeps track of each node
  uniquely. The success or failure of those nodes excluded from submission to
  Drydock will still be used for the success criteria calculation.

E.g.::

  selectors:
    - node_names:
        - node01
        - node02
      rack_names:
        - rack01
      node_tags:
        - control
    - node_names:
        - node04
      node_labels:
        - ucp_control_plane: enabled

Will indicate (not really SQL, just for illustration)::

  SELECT nodes
  WHERE node_name in ('node01', 'node02')
    AND rack_name in ('rack01')
    AND node_tags in ('control')
  UNION
  SELECT nodes
  WHERE node_name in ('node04')
    AND node_label in ('ucp_control_plane: enabled')

Success Criteria
''''''''''''''''
- Field: success_criteria
- Valid values: see below

Each group optionally contains success criteria, which are used to indicate if
the deployment of that group is successful. The values that may be specified:

percent_successful_nodes
  The calculated success rate of nodes completing the deployment phase.

  E.g.: 75 would mean that 3 of 4 nodes must complete the phase successfully.

  This is useful for groups that have larger numbers of nodes, and do not
  have critical minimums or are not sensitive to an arbitrary number of nodes
  not working.

minimum_successful_nodes
  An integer indicating how many nodes must complete the phase to be considered
  successful.

maximum_failed_nodes
  An integer indicating a number of nodes that are allowed to have failed the
  deployment phase and still consider that group successful.

When no criteria are specified, it means that no checks are done - processing
continues as if nothing is wrong.

When more than one criterion is specified, each is evaluated separately - if
any fail, the group is considered failed.
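
As a minimal sketch (not the Shipyard implementation), evaluating these
criteria against the full set of nodes identified by a group's selectors could
look like the following:

.. code:: python

  # Minimal sketch of success criteria evaluation; not the Shipyard
  # implementation. `criteria` is the group's success_criteria mapping.
  def criteria_met(criteria, all_nodes, successful_nodes):
      total = len(all_nodes)
      succeeded = len(successful_nodes)
      failed = total - succeeded

      if not criteria:
          # no criteria: processing continues as if nothing is wrong
          return True

      checks = []
      if 'percent_successful_nodes' in criteria:
          # an empty group counts as 100% successful (avoids division by zero)
          percent = (100.0 * succeeded / total) if total else 100.0
          checks.append(percent >= criteria['percent_successful_nodes'])
      if 'minimum_successful_nodes' in criteria:
          checks.append(succeeded >= criteria['minimum_successful_nodes'])
      if 'maximum_failed_nodes' in criteria:
          checks.append(failed <= criteria['maximum_failed_nodes'])

      # each criterion is evaluated separately; if any fail, the group fails
      return all(checks)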


Example Deployment Strategy Document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This example shows a deployment strategy with 5 groups: control-nodes,
compute-nodes-1, compute-nodes-2, monitoring-nodes, and ntp-node.

::

  ---
  schema: shipyard/DeploymentStrategy/v1
  metadata:
    schema: metadata/Document/v1
    name: deployment-strategy
    layeringDefinition:
      abstract: false
      layer: global
    storagePolicy: cleartext
  data:
    groups:
      - name: control-nodes
        critical: true
        depends_on:
          - ntp-node
        selectors:
          - node_names: []
            node_labels: []
            node_tags:
              - control
            rack_names:
              - rack03
        success_criteria:
          percent_successful_nodes: 90
          minimum_successful_nodes: 3
          maximum_failed_nodes: 1
      - name: compute-nodes-1
        critical: false
        depends_on:
          - control-nodes
        selectors:
          - node_names: []
            node_labels: []
            rack_names:
              - rack01
            node_tags:
              - compute
        success_criteria:
          percent_successful_nodes: 50
      - name: compute-nodes-2
        critical: false
        depends_on:
          - control-nodes
        selectors:
          - node_names: []
            node_labels: []
            rack_names:
              - rack02
            node_tags:
              - compute
        success_criteria:
          percent_successful_nodes: 50
      - name: monitoring-nodes
        critical: false
        depends_on: []
        selectors:
          - node_names: []
            node_labels: []
            node_tags:
              - monitoring
            rack_names:
              - rack03
              - rack02
              - rack01
      - name: ntp-node
        critical: true
        depends_on: []
        selectors:
          - node_names:
              - ntp01
            node_labels: []
            node_tags: []
            rack_names: []
        success_criteria:
          minimum_successful_nodes: 1

The ordering of groups, as defined by the dependencies (``depends_on``
fields)::

    __________       __________________
   | ntp-node |     | monitoring-nodes |
    ----------       ------------------
        |
    ____V__________
   | control-nodes |
    ---------------
        |_________________________
        |                         |
    ____V____________      ______V__________
   | compute-nodes-1 |    | compute-nodes-2 |
    -----------------      -----------------

Given this, the order of execution could be:

- ntp-node > monitoring-nodes > control-nodes > compute-nodes-1 > compute-nodes-2
- ntp-node > control-nodes > compute-nodes-2 > compute-nodes-1 > monitoring-nodes
- monitoring-nodes > ntp-node > control-nodes > compute-nodes-1 > compute-nodes-2
- and many more ... the only guarantee is that ntp-node will run some time
  before control-nodes, which will run some time before both of the
  compute-nodes. Monitoring-nodes can run at any time.

Also of note are the various combinations of selectors and the varied use of
success criteria.

Deployment Configuration Document (Shipyard)
--------------------------------------------
The existing deployment-configuration document that is used by the workflows
will also be modified to use its existing deployment_strategy field to provide
the name of the DeploymentStrategy document that will be used.

The default value for the name of the DeploymentStrategy document will be
``deployment-strategy``.
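
For illustration, the relevant portion of a deployment-configuration document
could look like the following; the placement of the field under
``physical_provisioner`` mirrors current Shipyard documents, but should be
treated as a sketch rather than the authoritative schema:

.. code:: yaml

  ---
  schema: shipyard/DeploymentConfiguration/v1
  metadata:
    schema: metadata/Document/v1
    name: deployment-configuration
    layeringDefinition:
      abstract: false
      layer: global
    storagePolicy: cleartext
  data:
    physical_provisioner:
      # name of the DeploymentStrategy document to use for this site
      deployment_strategy: deployment-strategy
    # other deployment-configuration fields omitted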

Drydock Changes
---------------

API and CLI
~~~~~~~~~~~
- A new API needs to be provided that accepts a node filter (i.e. a selector,
  above) and returns the list of node names that result from analysis of the
  design. Input to this API will also need to include a design reference. (A
  possible shape for this API is sketched after this list.)

- Drydock needs to provide a "tree" output of tasks rooted at the requested
  parent task. This will provide the needed success/failure status for nodes
  that have been prepared/deployed.
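
One possible shape for the node filter API, shown only to make the intent
concrete; the endpoint name and request fields here are assumptions, not a
settled contract:

.. code:: json

  POST /api/v1.0/nodefilter

  {
    "design_ref": "deckhand+https://{{deckhand_url}}/revisions/{{revision_id}}/rendered-documents",
    "node_filter": {
      "filter_set_type": "intersection",
      "filter_set": [
        {"filter_type": "intersection", "node_tags": ["control"], "rack_names": ["rack03"]}
      ]
    }
  }

The response body could simply be the list of matching node names, e.g.
``["node01", "node02", "node03"]``.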

Documentation
~~~~~~~~~~~~~
Drydock documentation will be updated to match the introduction of the new
APIs.


Shipyard Changes
----------------

API and CLI
~~~~~~~~~~~
- The commit configdocs API will need to be enhanced to look up the
  DeploymentStrategy by using the DeploymentConfiguration.
- The DeploymentStrategy document will need to be validated to ensure there are
  no circular dependencies in the groups' declared dependencies (perhaps
  NetworkX_).
- A new API endpoint (and matching CLI) is desired to retrieve the status of
  nodes as known to Drydock/MAAS and their MAAS status. The existing node list
  API in Drydock provides a JSON output that can be utilized for this purpose.

Workflow
~~~~~~~~
The deploy_site and update_site workflows will be modified to utilize the
DeploymentStrategy.

- The deployment configuration step will be enhanced to also read the
  deployment strategy and pass the information on a new xcom for use by the
  baremetal nodes step (see below).
- The prepare nodes and deploy nodes steps will be combined to perform both as
  part of the resolution of an overall ``baremetal nodes`` step.
  The baremetal nodes step will introduce functionality that reads in the
  deployment strategy (from the prior xcom), and can orchestrate the calls to
  Drydock to enact the grouping, ordering, and success evaluation.
  Note that Drydock will serialize tasks; there is no parallelization of
  prepare/deploy at this time.

Needed Functionality
''''''''''''''''''''

- function to formulate the ordered groups based on dependencies (perhaps
  NetworkX_)
- function to evaluate success/failure against the success criteria for a group
  based on the result list of succeeded or failed nodes.
- function to mark groups as success or failure (including failed due to
  dependency failure), as well as keep track of the (if any) successful and
  failed nodes.
- function to get a group that is ready to execute, or 'Done' when all groups
  are either complete or failed.
- function to formulate the node filter for Drydock based on a group's
  selectors
- function to orchestrate processing groups, moving to the next group (or being
  done) when a prior group completes or fails.
- function to summarize the success/failed nodes for a group (primarily for
  reporting to the logs at this time).

Process
'''''''
The baremetal nodes step (preparation and deployment of nodes) will proceed as
follows:

1. Each group's selector will be sent to Drydock to determine the list of
   nodes that are a part of that group.

   - An overall status will be kept for each unique node (not started |
     prepared | success | failure).
   - When sending a task to Drydock for processing, the nodes associated with
     that group will be sent as a simple `node_name` node filter. This will
     allow for this list to exclude nodes that have a status that is not
     congruent for the task being performed.

     - prepare nodes valid status: not started
     - deploy nodes valid status: prepared

2. In a processing loop, groups that are ready to be processed based on their
   dependencies (and the success criteria of groups they are dependent upon)
   will be selected for processing until there are no more groups that can be
   processed. The processing will consist of preparing and then deploying the
   group.

   - The selected group will be prepared and then deployed before selecting
     another group for processing.
   - Any nodes that failed as part of that group will be excluded from
     subsequent deployment or preparation of that node for this deployment.

     - Excluding nodes that are already processed addresses groups that have
       overlapping lists of nodes due to the group's selectors, and prevents
       sending them to Drydock for re-processing.
     - Evaluation of the success criteria will use the full set of nodes
       identified by the selector. This means that if a node was previously
       successfully deployed, that same node will count as "successful" when
       evaluating the success criteria.

   - The success criteria will be evaluated after the group's prepare step and
     the deploy step. A failure to meet the success criteria in a prepare step
     will cause the deploy step for that group to be skipped (and marked as
     failed).
   - Any nodes that fail during the prepare step will not be used in the
     corresponding deploy step.
   - Upon completion (success, partial success, or failure) of a prepare step,
     the nodes that were sent for preparation will be marked in the unique list
     of nodes (above) with their appropriate status: prepared or failure.
   - Upon completion of a group's deployment step, the nodes' status will be
     updated to their current status: success or failure.

3. Before the end of the baremetal nodes step, following all eligible group
   processing, a report will be logged to indicate the success/failure of
   groups and the status of the individual nodes. Note that it is possible for
   individual nodes to be left in the `not started` state if they were only
   part of groups that were never allowed to process due to dependencies and
   success criteria.

4. At the end of the baremetal nodes step, any critical groups that have failed
   due to timeout, dependency failure, or success criteria failure will trigger
   an Airflow Exception, resulting in a failed deployment.

Notes:

- The timeout values specified for the prepare nodes and deploy nodes steps
  will be used to put bounds on the individual calls to Drydock. A failure
  based on these values will be treated as a failure for the group; we need to
  be vigilant about whether this will lead to indeterminate states for nodes
  that interfere with further processing (e.g. the call timed out, but the
  requested work still continued to completion).

Example Processing
''''''''''''''''''
Using the defined deployment strategy in the above example, the following is
an example of how it may process::

  Start
    |
    | prepare ntp-node         <SUCCESS>
    | deploy ntp-node          <SUCCESS>
    V
    | prepare control-nodes    <SUCCESS>
    | deploy control-nodes     <SUCCESS>
    V
    | prepare monitoring-nodes <SUCCESS>
    | deploy monitoring-nodes  <SUCCESS>
    V
    | prepare compute-nodes-2  <SUCCESS>
    | deploy compute-nodes-2   <SUCCESS>
    V
    | prepare compute-nodes-1  <SUCCESS>
    | deploy compute-nodes-1   <SUCCESS>
    |
  Finish (success)

If there were a failure in preparing the ntp-node, the following would be the
result::

  Start
    |
    | prepare ntp-node         <FAILED>
    | deploy ntp-node          <FAILED, due to prepare failure>
    V
    | prepare control-nodes    <FAILED, due to dependency>
    | deploy control-nodes     <FAILED, due to dependency>
    V
    | prepare monitoring-nodes <SUCCESS>
    | deploy monitoring-nodes  <SUCCESS>
    V
    | prepare compute-nodes-2  <FAILED, due to dependency>
    | deploy compute-nodes-2   <FAILED, due to dependency>
    V
    | prepare compute-nodes-1  <FAILED, due to dependency>
    | deploy compute-nodes-1   <FAILED, due to dependency>
    |
  Finish (failed due to critical group failed)

If a failure occurred during the deploy of compute-nodes-2, the following would
result::

  Start
    |
    | prepare ntp-node         <SUCCESS>
    | deploy ntp-node          <SUCCESS>
    V
    | prepare control-nodes    <SUCCESS>
    | deploy control-nodes     <SUCCESS>
    V
    | prepare monitoring-nodes <SUCCESS>
    | deploy monitoring-nodes  <SUCCESS>
    V
    | prepare compute-nodes-2  <SUCCESS>
    | deploy compute-nodes-2   <FAILED>
    V
    | prepare compute-nodes-1  <SUCCESS>
    | deploy compute-nodes-1   <SUCCESS>
    |
  Finish (success with some nodes/groups failed)

Schemas
~~~~~~~
A new schema will need to be provided by Shipyard to validate the
DeploymentStrategy document.

Documentation
~~~~~~~~~~~~~
The Shipyard action documentation will need to include details defining the
DeploymentStrategy document (mostly as defined here), as well as the update to
the DeploymentConfiguration document to contain the name of the
DeploymentStrategy document.


.. _NetworkX: https://networkx.github.io/documentation/networkx-1.9/reference/generated/networkx.algorithms.dag.topological_sort.html