Move specs: airship-in-a-bottle to airship-specs

Moves the two blueprint/spec documents that existed in airship-in-a-bottle to
airship-specs. The implemented spec was not reformatted to the spec template.
The other spec (in the approved folder) was minimally updated to the spec
template.

Change-Id: I7468579e2fa3077ee1144e5294eba97d8e4ced05

..
  Copyright 2018 AT&T Intellectual Property.
  All Rights Reserved.

  Licensed under the Apache License, Version 2.0 (the "License"); you may
  not use this file except in compliance with the License. You may obtain
  a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
  License for the specific language governing permissions and limitations
  under the License.

.. index::
   single: Teardown node
   single: workflow;redeploy_server
   single: Drydock
   single: Promenade
   single: Shipyard

.. _node-teardown:

=====================
Airship Node Teardown
=====================

Shipyard is the entrypoint for Airship actions, including the need to redeploy a
server. The first part of redeploying a server is the graceful teardown of the
software running on the server; specifically, Kubernetes and etcd are of
critical concern. It is the duty of Shipyard to orchestrate the teardown of the
server, followed by steps to deploy the desired new configuration. This design
covers only the first portion: node teardown.

Links
=====

None

Problem description
===================

When redeploying a physical host (server) using the Airship Platform,
it is necessary to trigger a sequence of steps to prevent undesired behaviors
when the server is redeployed. This blueprint intends to document the
interaction that must occur between Airship components to tear down a server.

Impacted components
===================

- Drydock
- Promenade
- Shipyard

Proposed change
===============

Shipyard Node Teardown Process
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#. (Existing) Shipyard receives a request to redeploy_server, specifying a
   target server.
#. (Existing) Shipyard performs preflight, design reference lookup, and
   validation steps.
#. (New) Shipyard invokes Promenade to decommission the node.
#. (New) Shipyard invokes Drydock to destroy the node, setting a node
   filter to restrict the action to a single server.
#. (New) Shipyard invokes Promenade to remove the node from the Kubernetes
   cluster.

Assumption:
  node_id is the hostname of the server, and is also the identifier that both
  Drydock and Promenade use to identify the appropriate parts - hosts and k8s
  nodes. This convention is set by the join script produced by Promenade.

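The sequence above can be sketched as a simple ordered driver. This is an
illustrative sketch only, under stated assumptions: the real workflow is a
Shipyard (Airflow) DAG, and the step names and the `invoke` callback here are
hypothetical stand-ins for the actual operators.

```python
# Hypothetical sketch of the redeploy_server teardown ordering described
# above. The real implementation is an Airflow DAG in Shipyard; these step
# names and the injected 'invoke' callable are illustrative only.
def redeploy_server(node_id, invoke):
    """Run the teardown steps in order; 'invoke' performs one named step."""
    steps = [
        ("preflight", None),
        ("promenade_decommission_node", node_id),
        ("drydock_destroy_node", node_id),        # node filter: single server
        ("promenade_remove_from_cluster", node_id),
    ]
    results = []
    for name, target in steps:
        # Each step must complete before the next begins; ordering matters
        # because Drydock destroys the host Promenade just decommissioned.
        results.append((name, invoke(name, target)))
    return results
```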
Drydock Destroy Node
--------------------
The API/interface for destroy node already exists; the implementation within
Drydock needs to be developed. This interface will need to accept both the
specified node_id and the design_id to retrieve from Deckhand.

Using the provided node_id (hardware node) and the design_id, Drydock will
reset the hardware to a re-provisionable state.

By default, all local storage should be wiped (per datacenter policy for
wiping before re-use).

An option to allow for only the OS disk to be wiped should be supported, such
that other local storage is left intact and could be remounted without data
loss, e.g.: --preserve-local-storage

The target node should be shut down.

The target node should be removed from the provisioner (e.g. MaaS).

Responses
~~~~~~~~~
The responses from this functionality should follow the pattern set by prepare
nodes and other Drydock functionality. The Drydock status responses used for
all async invocations will be utilized for this functionality.

Promenade Decommission Node
---------------------------
Performs steps that will result in the specified node being cleanly
disassociated from Kubernetes, and ready for the server to be destroyed.
Users of the decommission node API should be aware of the long timeout values
that may occur while awaiting Promenade to complete the appropriate steps.
At this time, Promenade is a stateless service and doesn't use any database
storage. As such, requests to Promenade are synchronous.

.. code:: json

  POST /nodes/{node_id}/decommission

  {
    "rel": "design",
    "href": "deckhand+https://{{deckhand_url}}/revisions/{{revision_id}}/rendered-documents",
    "type": "application/x-yaml"
  }

The design reference body is the design indicated when the redeploy_server
action is invoked through Shipyard.

Query Parameters:

- drain-node-timeout: A whole number timeout in seconds to be used for the
  drain node step (default: none). In the case of no value being provided,
  the drain node step will use its default.
- drain-node-grace-period: A whole number in seconds indicating the
  grace period that will be provided to the drain node step (default: none).
  If no value is specified, the drain node step will use its default.
- clear-labels-timeout: A whole number timeout in seconds to be used for the
  clear labels step (default: none). If no value is specified, clear labels
  will use its own default.
- remove-etcd-timeout: A whole number timeout in seconds to be used for the
  remove etcd from nodes step (default: none). If no value is specified,
  remove-etcd will use its own default.
- etcd-ready-timeout: A whole number in seconds indicating how long the
  decommission node request should allow for etcd clusters to become stable
  (default: 600).

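A client composing this request might assemble the URL and design reference
body as below. The path, query parameter names, and body fields come from this
spec; the helper function, its signature, and the base URL handling are
assumptions for illustration only.

```python
# Sketch of composing the decommission request described above. Only
# parameters named in this spec are passed through; anything else is dropped.
from urllib.parse import urlencode

ALLOWED_PARAMS = ("drain-node-timeout", "drain-node-grace-period",
                  "clear-labels-timeout", "remove-etcd-timeout",
                  "etcd-ready-timeout")

def build_decommission_request(base_url, node_id, deckhand_url, revision_id,
                               params=None):
    """Return (url, body) for POST /nodes/{node_id}/decommission."""
    query = {k: v for k, v in (params or {}).items() if k in ALLOWED_PARAMS}
    url = "%s/nodes/%s/decommission" % (base_url, node_id)
    if query:
        url += "?" + urlencode(sorted(query.items()))
    # The body is the design reference pointing at Deckhand rendered documents.
    body = {
        "rel": "design",
        "href": "deckhand+https://%s/revisions/%s/rendered-documents"
                % (deckhand_url, revision_id),
        "type": "application/x-yaml",
    }
    return url, body
```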
Process
~~~~~~~
Acting upon the node specified by the invocation and the design reference
details:

#. Drain the Kubernetes node.
#. Clear the Kubernetes labels on the node.
#. Remove etcd nodes from their clusters (if impacted).

   - If the node being decommissioned contains etcd nodes, Promenade will
     attempt to gracefully have those nodes leave the etcd cluster.

#. Ensure that etcd cluster(s) are in a stable state.

   - Polls for status every 30 seconds up to the etcd-ready-timeout, or until
     the cluster meets the defined minimum functionality for the site.
   - A new document, promenade/EtcdClusters/v1, will specify details about
     the etcd clusters deployed in the site, including: identifiers,
     credentials, and thresholds for minimum functionality.
   - This process should ignore the node being torn down in any calculation
     of health.

#. Shut down the kubelet.

   - If this is not possible because the node is in a state of disarray such
     that it cannot schedule the daemonset to run, this step may fail, but
     should not hold up the process, as the Drydock dismantling of the node
     will shut the kubelet down.

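The stability wait in step 4 can be sketched as a polling loop: check every 30
seconds up to the etcd-ready-timeout, excluding the node being torn down from
the health calculation. The function name, injected health check, and sleep
hook are assumptions, not Promenade's actual API.

```python
# Illustrative sketch of the etcd stability wait: poll every 'interval'
# seconds up to 'timeout', ignoring the node being decommissioned. The
# cluster_health callable (node -> healthy bool) and sleep hook are injected
# so the loop is testable; they are not real Promenade interfaces.
def wait_for_etcd_stable(cluster_health, ignored_node, timeout=600,
                         interval=30, sleep=None):
    """Return True once every counted member reports healthy, else False."""
    elapsed = 0
    while True:
        # Drop the node being torn down from the health calculation.
        statuses = {node: ok for node, ok in cluster_health().items()
                    if node != ignored_node}
        if statuses and all(statuses.values()):
            return True
        if elapsed >= timeout:
            return False
        if sleep is not None:
            sleep(interval)
        elapsed += interval
```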
Responses
~~~~~~~~~
All responses will be in the form of the Airship Status response.

- Success: Code: 200, reason: Success

  Indicates that all steps were successful.

- Failure: Code: 404, reason: NotFound

  Indicates that the target node is not discoverable by Promenade.

- Failure: Code: 500, reason: DisassociateStepFailure

  The details section should detail the successes and failures further. Any
  4xx series errors from the individual steps would manifest as a 500 here.

Promenade Drain Node
--------------------
Drains the Kubernetes node for the target node. This ensures that the node
is no longer the target of any pod scheduling, and evicts or deletes the
running pods. In the case of nodes running DaemonSet-managed pods, or pods
that would prevent a drain from occurring, Promenade may be required to provide
the `ignore-daemonsets` option or `force` option to attempt to drain the node
as fully as possible.

By default, the drain node will utilize a grace period for pods of 1800
seconds and a total timeout of 3600 seconds (1 hour). Clients of this
functionality should be prepared for a long timeout.

.. code:: json

  POST /nodes/{node_id}/drain

Query Parameters:

- timeout: a whole number in seconds (default = 3600). This value is the total
  timeout for the kubectl drain command.
- grace-period: a whole number in seconds (default = 1800). This value is the
  grace period used by kubectl drain. The grace period must be less than the
  timeout.

.. note::

   This POST has no message body.

Example command being used for drain (reference only):
`kubectl drain --force --timeout 3600s --grace-period 1800 --ignore-daemonsets --delete-local-data n1`

https://git.openstack.org/cgit/openstack/airship-promenade/tree/promenade/templates/roles/common/usr/local/bin/promenade-teardown

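The parameter constraint above (grace-period must be less than timeout,
otherwise a 400/BadRequest) can be captured in a small check. This is a
minimal sketch with an assumed helper name; the defaults are the ones stated
in this spec.

```python
# Minimal sketch of the drain parameter validation implied above: the grace
# period must be strictly less than the total timeout, otherwise the API
# answers 400/BadRequest. Defaults (3600/1800) come from this spec.
def validate_drain_params(timeout=3600, grace_period=1800):
    """Return (status_code, reason) for the given drain request parameters."""
    if timeout <= 0 or grace_period < 0:
        return 400, "BadRequest"
    if grace_period >= timeout:
        # e.g. grace-period set to a value larger than (or equal to) timeout
        return 400, "BadRequest"
    return 200, "Success"
```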
Responses
~~~~~~~~~
All responses will be in the form of the Airship Status response.

- Success: Code: 200, reason: Success

  Indicates that the drain node has successfully concluded, and that no pods
  are currently running.

- Failure: Status response, code: 400, reason: BadRequest

  A request was made with parameters that cannot work - e.g. grace-period is
  set to a value larger than the timeout value.

- Failure: Status response, code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Status response, code: 500, reason: DrainNodeError

  There was a processing exception raised while trying to drain a node. The
  details section should indicate the underlying cause if it can be
  determined.

Promenade Clear Labels
----------------------
Removes the labels that have been added to the target Kubernetes node.

.. code:: json

  POST /nodes/{node_id}/clear-labels

Query Parameters:

- timeout: A whole number in seconds allowed for the pods to settle/move
  following removal of labels. (Default = 1800)

.. note::

   This POST has no message body.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  All labels have been removed from the specified Kubernetes node.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 500, reason: ClearLabelsError

  There was a failure to clear labels that prevented completion. The details
  section should provide more information about the cause of this failure.

Promenade Remove etcd Node
--------------------------
Checks if the specified node contains any etcd nodes. If so, this API will
trigger those etcd nodes to leave the associated etcd cluster::

  POST /nodes/{node_id}/remove-etcd

  {
    "rel": "design",
    "href": "deckhand+https://{{deckhand_url}}/revisions/{{revision_id}}/rendered-documents",
    "type": "application/x-yaml"
  }

Query Parameters:

- timeout: A whole number in seconds allowed for the removal of etcd nodes
  from the target node. (Default = 1800)

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  All etcd nodes have been removed from the specified node.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 500, reason: RemoveEtcdError

  There was a failure to remove etcd from the target node that prevented
  completion within the specified timeout, or etcd prevented removal of
  the node because it would result in the cluster being broken. The details
  section should provide more information about the cause of this failure.

Promenade Check etcd
--------------------
Retrieves the current interpreted state of etcd::

  GET /etcd-cluster-health-statuses?design_ref={the design ref}

Where the design_ref parameter is required for appropriate operation, and is in
the same format as used for the join-scripts API.

Query Parameters:

- design_ref: (Required) the design reference to be used to discover etcd
  instances.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  The status of each etcd in the site will be returned in the details section.
  Valid values for status are: Healthy, Unhealthy.

  See https://github.com/att-comdev/ucp-integration/blob/master/docs/source/api-conventions.rst#status-responses

  .. code:: json

    { "...": "... standard status response ...",
      "details": {
        "errorCount": {{n}},
        "messageList": [
          { "message": "Healthy",
            "error": false,
            "kind": "HealthMessage",
            "name": "{{the name of the etcd service}}"
          },
          { "message": "Unhealthy",
            "error": false,
            "kind": "HealthMessage",
            "name": "{{the name of the etcd service}}"
          },
          { "message": "Unable to access Etcd",
            "error": true,
            "kind": "HealthMessage",
            "name": "{{the name of the etcd service}}"
          }
        ]
      },
      "...": "..."
    }

- Failure: Code: 400, reason: MissingDesignRef

  Returned if the design_ref parameter is not specified.

- Failure: Code: 404, reason: NotFound

  Returned if the specified etcd could not be located.

- Failure: Code: 500, reason: EtcdNotAccessible

  Returned if the specified etcd responded with an invalid health response
  (not just simply unhealthy - that's a 200).

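A consumer of the health response above might summarize it as follows. The
field names (`messageList`, `message`, `error`, `name`) follow the example
payload in this spec; the helper itself is an illustrative sketch, not part of
any Promenade client library.

```python
# Sketch of consuming the health status payload shown above: count messages
# that report errors and split services by health. Field names follow the
# example response; the helper is illustrative only.
def summarize_etcd_health(details):
    """Return (error_count, healthy_names, unhealthy_names)."""
    healthy, unhealthy = [], []
    for msg in details.get("messageList", []):
        if msg.get("error"):
            # e.g. "Unable to access Etcd" - an error, not merely unhealthy
            unhealthy.append(msg["name"])
        elif msg.get("message") == "Healthy":
            healthy.append(msg["name"])
        else:
            # "Unhealthy" without error=true still counts against health
            unhealthy.append(msg["name"])
    errors = sum(1 for m in details.get("messageList", []) if m.get("error"))
    return errors, healthy, unhealthy
```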
Promenade Shutdown Kubelet
--------------------------
Shuts down the kubelet on the specified node. This is accomplished by Promenade
setting the label `promenade-decomission: enabled` on the node, which will
trigger a newly-developed daemonset to run something like
`systemctl disable kubelet && systemctl stop kubelet`.
This daemonset will effectively sit dormant until nodes have the appropriate
label added, and then perform the kubelet teardown.

.. code:: json

  POST /nodes/{node_id}/shutdown-kubelet

.. note::

   This POST has no message body.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  The kubelet has been successfully shut down.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 500, reason: ShutdownKubeletError

  The specified node's kubelet failed to shut down. The details section of the
  status response should contain reasonable information about the source of
  this failure.

Promenade Delete Node from Cluster
----------------------------------
Updates the Kubernetes cluster, removing the specified node. Promenade should
check that the node is drained/cordoned and has no labels other than
`promenade-decomission: enabled`. If either of these checks fails, the API
should respond with a 409 Conflict response.

.. code:: json

  POST /nodes/{node_id}/remove-from-cluster

.. note::

   This POST has no message body.

Responses
~~~~~~~~~
All responses will be in the form of the UCP Status response.

- Success: Code: 200, reason: Success

  The specified node has been removed from the Kubernetes cluster.

- Failure: Code: 404, reason: NotFound

  The specified node is not discoverable by Promenade.

- Failure: Code: 409, reason: Conflict

  The specified node cannot be deleted because it failed the checks that the
  node is drained/cordoned and has no labels (other than possibly
  `promenade-decomission: enabled`).

- Failure: Code: 500, reason: DeleteNodeError

  The specified node cannot be removed from the cluster due to an error from
  Kubernetes. The details section of the status response should contain more
  information about the failure.

Shipyard Tag Releases
---------------------
Shipyard will need to mark Deckhand revisions with tags when there are
successful deploy_site or update_site actions, to be able to determine the
last known good design. This is related to issue 16 for Shipyard, which shares
the same need.

.. note::

   Repeated from https://github.com/att-comdev/shipyard/issues/16

   When multiple configdocs commits have been done since the last deployment,
   there is no ready means to determine what's being done to the site. Shipyard
   should reject deploy site or update site requests that have had multiple
   commits since the last site true-up action. An option to override this guard
   should be allowed for the actions in the form of a parameter to the action.

   The configdocs API should provide a way to see what's been changed since the
   last site true-up, not just the last commit of configdocs. This might be
   accommodated by new Deckhand tags like the 'commit' tag, but for
   'site true-up' or similar applied by the deploy and update site commands.

The design for issue 16 includes the bare-minimum marking of Deckhand
revisions. This design is as follows:

Scenario
~~~~~~~~
Multiple commits occur between site actions (deploy_site, update_site) - those
actions that attempt to bring a site into compliance with a site design.
When this occurs, the current system of only being able to see what has changed
between the committed and buffer versions (configdocs diff) is insufficient
for investigating what has changed since the last successful (or unsuccessful)
site action.
To accommodate this, Shipyard needs several enhancements.

Enhancements
~~~~~~~~~~~~

#. Deckhand revision tags for site actions

   Using the tagging facility provided by Deckhand, Shipyard will tag the end
   of site actions.

   Upon completing a site action successfully, tag the revision being used
   with the tag site-action-success, and a body of dag_id:<dag_id>.

   Upon completing a site action unsuccessfully, tag the revision being used
   with the tag site-action-failure, and a body of dag_id:<dag_id>.

   The completion tags should only be applied upon failure if the site action
   gets past document validation successfully (i.e. gets to the point where it
   can start making changes via the other UCP components).

   This could result in a single revision having both site-action-success and
   site-action-failure if a later re-invocation of a site action is successful.

#. Check for intermediate committed revisions

   Upon running a site action, before tagging the revision with the site
   action tag(s), the DAG needs to check whether there are committed revisions
   that do not have an associated site-action tag. If there are any committed
   revisions between the last site action and the current revision being used,
   then the action should not be allowed to proceed (stop before triggering
   validations). For the calculation of intermediate committed revisions,
   assume revision 0 if there are no revisions with a site-action tag (the
   null case).

   If the action is invoked with a parameter of
   allow-intermediate-commits=true, then this check should log that the
   intermediate committed revisions check is being skipped and not take any
   other action.

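The guard described above can be sketched as a pure decision function: given
the committed revision ids, the set of revisions carrying a site-action tag,
and the revision the action intends to use, decide whether to proceed. The
function name and its inputs are illustrative assumptions; the actual check
lives inside the Shipyard DAG.

```python
# Sketch of the intermediate-commit guard described above. Revision 0 is
# assumed when no site-action tag exists (the null case stated in the spec).
def may_proceed(committed_revisions, site_action_tagged, current_revision,
                allow_intermediate_commits=False):
    """Return True if the site action may proceed."""
    if allow_intermediate_commits:
        # Guard explicitly skipped; the real DAG would log this.
        return True
    last_tagged = max(site_action_tagged) if site_action_tagged else 0
    # Committed revisions strictly between the last tagged revision and the
    # revision about to be used block the action.
    intermediate = [r for r in committed_revisions
                    if last_tagged < r < current_revision]
    return not intermediate
```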
#. Support an action parameter of allow-intermediate-commits=true|false

   In the CLI for create action, the --param option supports adding parameters
   to actions. The parameters passed should be relayed by the CLI to the API
   and ultimately to the invocation of the DAG. The DAG, as noted above, will
   check for the presence of allow-intermediate-commits=true. This needs to be
   tested to work.

#. Shipyard needs to support retrieving configdocs and rendered documents for
   the last successful site action, and the last site action (successful or
   not)::

     --successful-site-action
     --last-site-action

   These options would be mutually exclusive of --buffer or --committed.

#. Shipyard diff (shipyard get configdocs)

   Needs to support an option to do the diff of the buffer vs. the last
   successful site action and the last site action (successful or not).

   Currently there are no options to select which versions to diff (always
   buffer vs. committed).

   Support::

     --base-version=committed | successful-site-action | last-site-action (Default = committed)
     --diff-version=buffer | committed | successful-site-action | last-site-action (Default = buffer)

   Equivalent query parameters need to be implemented in the API.

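Resolving the option pair above might look like the following. The valid
values and defaults are the ones listed in this spec; the helper name and the
ValueError behavior are assumptions for illustration.

```python
# Sketch of resolving the diff version pair from the options above; defaults
# are committed vs. buffer as the spec states.
def resolve_diff_versions(base_version="committed", diff_version="buffer"):
    """Validate and return (base_version, diff_version)."""
    valid_base = {"committed", "successful-site-action", "last-site-action"}
    valid_diff = valid_base | {"buffer"}  # buffer is only valid as diff side
    if base_version not in valid_base:
        raise ValueError("invalid base-version: %s" % base_version)
    if diff_version not in valid_diff:
        raise ValueError("invalid diff-version: %s" % diff_version)
    return base_version, diff_version
```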
Because the implementation of this design will result in the tagging of
successful site-actions, Shipyard will be able to determine the correct
revision to use while attempting to tear down a node.

If the request to tear down a node indicates a revision that doesn't exist, the
command to do so (e.g. redeploy_server) should not continue, but rather fail
due to a missing precondition.

The invocation of the Promenade and Drydock steps in this design will utilize
the appropriate tag based on the request (default is successful-site-action) to
determine the revision of the Deckhand documents used as the design-ref.

Shipyard redeploy_server Action
-------------------------------
The redeploy_server action currently accepts a target node. Additional
supported parameters are needed:

#. preserve-local-storage=true, which will instruct Drydock to only wipe the
   OS drive; any other local storage will not be wiped. This would allow
   for the drives to be remounted to the server upon re-provisioning. The
   default behavior is that local storage is not preserved.

#. target-revision=committed | successful-site-action | last-site-action

   This will indicate which revision of the design will be used as the
   reference for what should be re-provisioned after the teardown.
   The default is successful-site-action, which is the closest representation
   to the last-known-good state.

These should be accepted as parameters to the action API/CLI and modify the
behavior of the redeploy_server DAG.

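Normalizing these action parameters might look like the sketch below. The
parameter names, valid values, and defaults come from this spec; the helper
name, the string-to-bool handling, and the error behavior are assumptions.

```python
# Sketch of validating the redeploy_server action parameters listed above;
# defaults follow the spec: local storage is not preserved, and
# target-revision defaults to successful-site-action.
VALID_TARGET_REVISIONS = {"committed", "successful-site-action",
                          "last-site-action"}

def parse_redeploy_params(params):
    """Normalize action parameters, raising ValueError on bad input."""
    preserve = str(params.get("preserve-local-storage", "false")).lower()
    if preserve not in ("true", "false"):
        raise ValueError("preserve-local-storage must be true|false")
    target = params.get("target-revision", "successful-site-action")
    if target not in VALID_TARGET_REVISIONS:
        raise ValueError("invalid target-revision: %s" % target)
    return {"preserve_local_storage": preserve == "true",
            "target_revision": target}
```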
Security impact
---------------

None. This change introduces no new security concerns outside of established
patterns for RBAC controls around API endpoints.

Performance impact
------------------

As this is an on-demand action, there is no expected performance impact to
existing processes, although tearing down a host may result in temporarily
degraded service capacity if workloads need to move to different hosts, or
more simply in reduced capacity.

Alternatives
------------

N/A

Implementation
==============

None at this time.

Dependencies
============

None.

References
==========

None

..
  Copyright 2018 AT&T Intellectual Property.
  All Rights Reserved.

  Licensed under the Apache License, Version 2.0 (the "License"); you may
  not use this file except in compliance with the License. You may obtain
  a copy of the License at

  http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
  WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
  License for the specific language governing permissions and limitations
  under the License.

.. index::
   single: Deployment grouping
   single: workflow
   single: Shipyard
   single: Drydock

.. _deployment-grouping-baremetal:

=======================================
Deployment Grouping for Baremetal Nodes
=======================================
One of the primary functionalities of the Undercloud Platform is the deployment
of baremetal nodes as part of site deployment and upgrade. This blueprint aims
to define how deployment strategies can be applied to the workflow during these
actions.

.. note::

   This document has been moved from the airship-in-a-bottle project, and was
   previously implemented. The format of this document diverges from the
   standard template for airship-specs.

Overview
--------
When Shipyard is invoked for a deploy_site or update_site action, there are
three primary stages:

1. Preparation and Validation
2. Baremetal and Network Deployment
3. Software Deployment

During the Baremetal and Network Deployment stage, the deploy_site or
update_site workflow (and perhaps other workflows in the future) invokes
Drydock to verify the site, prepare the site, prepare the nodes, and deploy the
nodes. Each of these steps is described in the `Drydock Orchestrator Readme`_.

.. _Drydock Orchestrator Readme: https://git.openstack.org/cgit/openstack/airship-drydock/plain/drydock_provisioner/orchestrator/readme.md

The prepare nodes and deploy nodes steps each involve intensive and potentially
time-consuming operations on the target nodes, orchestrated by Drydock and
MAAS. These steps need to be approached and managed such that grouping,
ordering, and criticality of success of nodes can be managed in support of
fault-tolerant site deployments and updates.

For the purposes of this document, `phase of deployment` refers to the prepare
nodes and deploy nodes steps of the Baremetal and Network Deployment.

Some factors that inform this solution:

1. Limits to the amount of parallelization that can occur due to a centralized
   MAAS system.
2. Faults in the hardware, preventing operational nodes.
3. Miswiring or misconfiguration of network hardware.
4. Incorrect site design causing a mismatch against the hardware.
5. Criticality of particular nodes to the realization of the site design.
6. Desired configurability within the framework of the UCP declarative site
   design.
7. Improved visibility into the current state of node deployment.
8. A desire to begin the deployment of nodes before the finish of the
   preparation of nodes -- i.e. start deploying nodes as soon as they are
   ready to be deployed. Note: this design will not achieve new forms of
   task parallelization within Drydock; this is recognized as desired
   functionality.

Solution
|
||||||
|
--------
|
||||||
|
Updates supporting this solution will require changes to Shipyard for changed
|
||||||
|
workflows and Drydock for the desired node targeting, and for retrieval of
|
||||||
|
diagnostic and result information.
|
||||||
|
|
||||||
|
.. index::
|
||||||
|
single: Shipyard Documents; DeploymentStrategy
|
||||||
|
|
||||||
|
Deployment Strategy Document (Shipyard)
|
||||||
|
---------------------------------------
|
||||||
|
To accommodate the needed changes, this design introduces a new
|
||||||
|
DeploymentStrategy document into the site design to be read and utilized
|
||||||
|
by the workflows for update_site and deploy_site.
|
||||||
|
|
||||||
|
Groups
|
||||||
|
~~~~~~
|
||||||
|
Groups are named sets of nodes that will be deployed together. The fields of a
|
||||||
|
group are:
|
||||||
|
|
||||||
|
name
|
||||||
|
Required. The identifying name of the group.
|
||||||
|
|
||||||
|
critical
|
||||||
|
Required. Indicates if this group is required to continue to additional
|
||||||
|
phases of deployment.
|
||||||
|
|
||||||
|
depends_on
|
||||||
|
Required, may be empty list. Group names that must be successful before this
|
||||||
|
group can be processed.
|
||||||
|
|
||||||
|
selectors
|
||||||
|
Required, may be empty list. A list of identifying information to indicate
|
||||||
|
the nodes that are members of this group.
|
||||||
|
|
||||||
|
success_criteria
|
||||||
|
Optional. Criteria that must evaluate to be true before a group is considered
|
||||||
|
successfully complete with a phase of deployment.
|
||||||
|
|
||||||
|
Criticality
|
||||||
|
'''''''''''
|
||||||
|
- Field: critical
|
||||||
|
- Valid values: true | false
|
||||||
|
|
||||||
|
Each group is required to indicate true or false for the `critical` field.
|
||||||
|
This drives the behavior after the deployment of baremetal nodes. If any
|
||||||
|
groups that are marked as `critical: true` fail to meet that group's success
|
||||||
|
criteria, the workflow should halt after the deployment of baremetal nodes. A
|
||||||
|
group that cannot be processed due to a parent dependency failing will be
|
||||||
|
considered failed, regardless of the success criteria.
|
||||||
|
|
||||||
|
Dependencies
|
||||||
|
''''''''''''
|
||||||
|
- Field: depends_on
|
||||||
|
- Valid values: [] or a list of group names
|
||||||
|
|
||||||
|
Each group specifies a list of depends_on groups, or an empty list. All
|
||||||
|
identified groups must complete successfully for the phase of deployment before
|
||||||
|
the current group is allowed to be processed by the current phase.
|
||||||
|
|
||||||
|
- A failure (based on success criteria) of a group prevents any groups
|
||||||
|
dependent upon the failed group from being attempted.
|
||||||
|
- Circular dependencies will be rejected as invalid during document validation.
|
||||||
|
- There is no guarantee of ordering among groups that have their dependencies
|
||||||
|
met. Any group that is ready for deployment based on declared dependencies
|
||||||
|
will execute. Execution of groups is serialized - two groups will not deploy
|
||||||
|
at the same time.
|
||||||
|
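The dependency rules above amount to cycle rejection plus a topological
ordering of groups. A minimal sketch, using Python's standard-library
``graphlib`` purely for illustration (the real implementation might use
NetworkX or similar; the group names are taken from the example document
later in this spec):

```python
from graphlib import CycleError, TopologicalSorter


def order_groups(depends_on):
    """Return group names in a dependency-respecting order.

    ``depends_on`` maps each group name to the list of groups it depends
    on. Raises ValueError for circular dependencies, which document
    validation must reject.
    """
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError as err:
        raise ValueError("circular group dependencies: {}".format(err.args[1]))


# Dependencies matching the example document in this spec.
deps = {
    "ntp-node": [],
    "monitoring-nodes": [],
    "control-nodes": ["ntp-node"],
    "compute-nodes-1": ["control-nodes"],
    "compute-nodes-2": ["control-nodes"],
}
order = order_groups(deps)
# Parents always precede children; beyond that, ordering is unspecified.
assert order.index("ntp-node") < order.index("control-nodes")
assert order.index("control-nodes") < order.index("compute-nodes-2")
```

Any ordering satisfying the constraints is acceptable, which mirrors the
"no guarantee of ordering among ready groups" rule above.
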
Selectors
'''''''''
- Field: selectors
- Valid values: [] or a list of selectors

The list of selectors indicates the nodes that will be included in a group.
Each selector has four available filtering values: node_names, node_tags,
node_labels, and rack_names. Each selector is an intersection of these
criteria, while the list of selectors is a union of the individual selectors.

- Omitting a criterion from a selector, or using an empty list, means that
  criterion is ignored.
- Having a completely empty list of selectors, or a selector that has no
  criteria specified, indicates ALL nodes.
- A collection of selectors that results in no nodes being identified will
  be processed as if 100% of nodes successfully deployed (avoiding division
  by zero), but would fail any minimum or maximum nodes criteria (it still
  counts as 0 nodes).
- There is no validation against the same node being in multiple groups;
  however, the workflow will not resubmit nodes that have already completed
  or failed in this deployment to Drydock twice, since it keeps track of
  each node uniquely. The success or failure of those nodes excluded from
  submission to Drydock will still be used for the success criteria
  calculation.

E.g.::

  selectors:
    - node_names:
        - node01
        - node02
      rack_names:
        - rack01
      node_tags:
        - control
    - node_names:
        - node04
      node_labels:
        - ucp_control_plane: enabled

Will indicate (not really SQL, just for illustration)::

  SELECT nodes
  WHERE node_name in ('node01', 'node02')
    AND rack_name in ('rack01')
    AND node_tags in ('control')
  UNION
  SELECT nodes
  WHERE node_name in ('node04')
    AND node_label in ('ucp_control_plane: enabled')

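The union-of-intersections behavior can be sketched as follows. This is an
illustrative stand-in, not Drydock's implementation; the node dictionary
shape and the exact tag/label matching semantics are assumptions:

```python
def selector_matches(node, selector):
    """AND of the criteria present in one selector; empty or omitted
    criteria are ignored (match everything)."""
    if selector.get("node_names") and node["name"] not in selector["node_names"]:
        return False
    if selector.get("rack_names") and node["rack"] not in selector["rack_names"]:
        return False
    if selector.get("node_tags") and not set(selector["node_tags"]) & set(node["tags"]):
        return False
    for label in selector.get("node_labels") or []:
        for key, value in label.items():
            if node["labels"].get(key) != value:
                return False
    return True


def node_in_group(node, selectors):
    """OR across selectors; an empty selector list matches all nodes."""
    return not selectors or any(selector_matches(node, s) for s in selectors)


# The two selectors from the example above.
selectors = [
    {"node_names": ["node01", "node02"], "rack_names": ["rack01"],
     "node_tags": ["control"]},
    {"node_names": ["node04"], "node_labels": [{"ucp_control_plane": "enabled"}]},
]
node01 = {"name": "node01", "rack": "rack01", "tags": ["control"], "labels": {}}
node03 = {"name": "node03", "rack": "rack01", "tags": ["control"], "labels": {}}
node04 = {"name": "node04", "rack": "rack02", "tags": [],
          "labels": {"ucp_control_plane": "enabled"}}
assert node_in_group(node01, selectors)
assert not node_in_group(node03, selectors)  # name not in either selector
assert node_in_group(node04, selectors)      # matched by the second selector
```
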
Success Criteria
''''''''''''''''
- Field: success_criteria
- Valid values: see below

Each group optionally contains success criteria, which are used to indicate
if the deployment of that group is successful. The values that may be
specified:

percent_successful_nodes
  The calculated success rate of nodes completing the deployment phase.

  E.g.: 75 would mean that 3 of 4 nodes must complete the phase
  successfully.

  This is useful for groups that have larger numbers of nodes, and do not
  have critical minimums or are not sensitive to an arbitrary number of
  nodes not working.

minimum_successful_nodes
  An integer indicating how many nodes must complete the phase to be
  considered successful.

maximum_failed_nodes
  An integer indicating a number of nodes that are allowed to have failed
  the deployment phase and still consider that group successful.

When no criteria are specified, no checks are done -- processing continues
as if nothing is wrong.

When more than one criterion is specified, each is evaluated separately; if
any fail, the group is considered failed.

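Combined evaluation of the criteria can be sketched as below: every
criterion that is present must pass, an absent criterion is not checked, and
an empty node set counts as 100% successful, per the selector rules above.
This is illustrative only, not the workflow code:

```python
def meets_success_criteria(criteria, succeeded, failed):
    """Evaluate a group's success_criteria against node counts for a phase."""
    total = succeeded + failed
    if "percent_successful_nodes" in criteria:
        # An empty node set avoids division by zero and counts as 100%.
        percent = 100.0 if total == 0 else 100.0 * succeeded / total
        if percent < criteria["percent_successful_nodes"]:
            return False
    if "minimum_successful_nodes" in criteria:
        if succeeded < criteria["minimum_successful_nodes"]:
            return False
    if "maximum_failed_nodes" in criteria:
        if failed > criteria["maximum_failed_nodes"]:
            return False
    return True


# Criteria as used by control-nodes in the example document: 90%
# successful, at least 3 succeeded, at most 1 failed.
criteria = {"percent_successful_nodes": 90,
            "minimum_successful_nodes": 3,
            "maximum_failed_nodes": 1}
assert meets_success_criteria(criteria, succeeded=10, failed=1)
assert not meets_success_criteria(criteria, succeeded=3, failed=1)  # 75% < 90
assert meets_success_criteria({}, succeeded=0, failed=7)  # no criteria, no checks
```
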
Example Deployment Strategy Document
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This example shows a deployment strategy with five groups: control-nodes,
compute-nodes-1, compute-nodes-2, monitoring-nodes, and ntp-node.

::

  ---
  schema: shipyard/DeploymentStrategy/v1
  metadata:
    schema: metadata/Document/v1
    name: deployment-strategy
    layeringDefinition:
      abstract: false
      layer: global
    storagePolicy: cleartext
  data:
    groups:
      - name: control-nodes
        critical: true
        depends_on:
          - ntp-node
        selectors:
          - node_names: []
            node_labels: []
            node_tags:
              - control
            rack_names:
              - rack03
        success_criteria:
          percent_successful_nodes: 90
          minimum_successful_nodes: 3
          maximum_failed_nodes: 1
      - name: compute-nodes-1
        critical: false
        depends_on:
          - control-nodes
        selectors:
          - node_names: []
            node_labels: []
            rack_names:
              - rack01
            node_tags:
              - compute
        success_criteria:
          percent_successful_nodes: 50
      - name: compute-nodes-2
        critical: false
        depends_on:
          - control-nodes
        selectors:
          - node_names: []
            node_labels: []
            rack_names:
              - rack02
            node_tags:
              - compute
        success_criteria:
          percent_successful_nodes: 50
      - name: monitoring-nodes
        critical: false
        depends_on: []
        selectors:
          - node_names: []
            node_labels: []
            node_tags:
              - monitoring
            rack_names:
              - rack03
              - rack02
              - rack01
      - name: ntp-node
        critical: true
        depends_on: []
        selectors:
          - node_names:
              - ntp01
            node_labels: []
            node_tags: []
            rack_names: []
        success_criteria:
          minimum_successful_nodes: 1

The ordering of groups, as defined by the dependencies (``depends_on``
fields)::

      __________        __________________
     | ntp-node |      | monitoring-nodes |
      ----------        ------------------
          |
      ____V__________
     | control-nodes |
      ---------------
          |_________________________
          |                         |
      ____V____________        _____V___________
     | compute-nodes-1 |      | compute-nodes-2 |
      -----------------        -----------------

Given this, the order of execution could be:

- ntp-node > monitoring-nodes > control-nodes > compute-nodes-1 > compute-nodes-2
- ntp-node > control-nodes > compute-nodes-2 > compute-nodes-1 > monitoring-nodes
- monitoring-nodes > ntp-node > control-nodes > compute-nodes-1 > compute-nodes-2
- and many more... the only guarantee is that ntp-node will run some time
  before control-nodes, which will run some time before both of the
  compute-nodes groups. Monitoring-nodes can run at any time.

Also of note are the various combinations of selectors and the varied use of
success criteria.

Deployment Configuration Document (Shipyard)
--------------------------------------------
The existing deployment-configuration document that is used by the workflows
will also be modified to use the existing deployment_strategy field to
provide the name of the DeploymentStrategy document that will be used.

The default value for the name of the DeploymentStrategy document will be
``deployment-strategy``.

Drydock Changes
---------------

API and CLI
~~~~~~~~~~~
- A new API needs to be provided that accepts a node filter (i.e. a
  selector, above) and returns the list of node names that results from
  analysis of the design. Input to this API will also need to include a
  design reference.

- Drydock needs to provide a "tree" output of tasks rooted at the requested
  parent task. This will provide the needed success/failure status for nodes
  that have been prepared/deployed.

Documentation
~~~~~~~~~~~~~
Drydock documentation will be updated to match the introduction of the new
APIs.


Shipyard Changes
----------------

API and CLI
~~~~~~~~~~~
- The commit configdocs API will need to be enhanced to look up the
  DeploymentStrategy by using the DeploymentConfiguration.
- The DeploymentStrategy document will need to be validated to ensure there
  are no circular dependencies in the groups' declared dependencies (perhaps
  using NetworkX_).
- A new API endpoint (and matching CLI) is desired to retrieve the status of
  nodes as known to Drydock/MAAS, including their MAAS status. The existing
  node list API in Drydock provides a JSON output that can be utilized for
  this purpose.

Workflow
~~~~~~~~
The deploy_site and update_site workflows will be modified to utilize the
DeploymentStrategy.

- The deployment configuration step will be enhanced to also read the
  deployment strategy and pass the information on a new XCom for use by the
  baremetal nodes step (see below).
- The prepare nodes and deploy nodes steps will be combined to perform both
  as part of the resolution of an overall ``baremetal nodes`` step.
  The baremetal nodes step will introduce functionality that reads in the
  deployment strategy (from the prior XCom), and can orchestrate the calls
  to Drydock to enact the grouping, ordering, and success evaluation.
  Note that Drydock will serialize tasks; there is no parallelization of
  prepare/deploy at this time.

Needed Functionality
''''''''''''''''''''

- function to formulate the ordered groups based on dependencies (perhaps
  using NetworkX_)
- function to evaluate success/failure against the success criteria for a
  group, based on the resulting list of succeeded or failed nodes
- function to mark groups as success or failure (including failed due to
  dependency failure), as well as keep track of the (if any) successful and
  failed nodes
- function to get a group that is ready to execute, or 'Done' when all
  groups are either complete or failed
- function to formulate the node filter for Drydock based on a group's
  selectors
- function to orchestrate processing of groups, moving to the next group (or
  being done) when a prior group completes or fails
- function to summarize the succeeded/failed nodes for a group (primarily
  for reporting to the logs at this time)

Process
'''''''
The baremetal nodes step (preparation and deployment of nodes) will proceed
as follows:

1. Each group's selector will be sent to Drydock to determine the list of
   nodes that are a part of that group.

   - An overall status will be kept for each unique node (not started |
     prepared | success | failure).
   - When sending a task to Drydock for processing, the nodes associated
     with that group will be sent as a simple `node_name` node filter. This
     allows the list to exclude nodes whose status is not congruent with the
     task being performed:

     - prepare nodes valid status: not started
     - deploy nodes valid status: prepared

2. In a processing loop, groups that are ready to be processed based on
   their dependencies (and the success criteria of the groups they depend
   upon) will be selected for processing until there are no more groups that
   can be processed. The processing will consist of preparing and then
   deploying the group.

   - The selected group will be prepared and then deployed before selecting
     another group for processing.
   - Any nodes that failed as part of that group will be excluded from
     subsequent preparation or deployment in this deployment.

     - Excluding nodes that are already processed addresses groups that have
       overlapping lists of nodes due to the groups' selectors, and prevents
       sending them to Drydock for re-processing.
     - Evaluation of the success criteria will use the full set of nodes
       identified by the selector. This means that if a node was previously
       successfully deployed, that same node will count as "successful" when
       evaluating the success criteria.

   - The success criteria will be evaluated after the group's prepare step
     and after its deploy step. A failure to meet the success criteria in a
     prepare step will cause the deploy step for that group to be skipped
     (and marked as failed).
   - Any nodes that fail during the prepare step will not be used in the
     corresponding deploy step.
   - Upon completion (success, partial success, or failure) of a prepare
     step, the nodes that were sent for preparation will be marked in the
     unique list of nodes (above) with their appropriate status: prepared or
     failure.
   - Upon completion of a group's deploy step, the nodes' statuses will be
     updated to their current status: success or failure.

3. Before the end of the baremetal nodes step, following all eligible group
   processing, a report will be logged to indicate the success/failure of
   groups and the status of the individual nodes. Note that it is possible
   for individual nodes to be left in the `not started` state if they were
   only part of groups that were never allowed to process due to
   dependencies and success criteria.

4. At the end of the baremetal nodes step, any groups that have failed due
   to timeout, dependency failure, or success criteria failure and are
   marked as critical will trigger an Airflow Exception, resulting in a
   failed deployment.

Notes:

- The timeout values specified for the prepare nodes and deploy nodes steps
  will be used to put bounds on the individual calls to Drydock. A failure
  based on these values will be treated as a failure for the group; we need
  to be vigilant as to whether this will lead to indeterminate states for
  nodes that interfere with further processing (e.g. the call timed out, but
  the requested work still continued to completion).

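The processing loop described above can be sketched end to end. Here
``execute`` is a stand-in for the prepare-plus-deploy calls to Drydock and
the success criteria check, returning True on group success; a group whose
parent failed is marked failed without execution, per the dependency rules.
This is a simplified illustration, not the workflow code:

```python
def process_groups(depends_on, execute):
    """Serially process groups whose dependencies are all resolved.

    Returns a map of group name -> True (succeeded) or False (failed,
    either directly or due to a failed parent group).
    """
    status = {}
    remaining = set(depends_on)
    while remaining:
        ready = sorted(g for g in remaining
                       if all(p in status for p in depends_on[g]))
        if not ready:  # only possible if dependencies are circular
            raise ValueError("circular group dependencies")
        group = ready[0]  # any ready group may run; ordering is unspecified
        remaining.remove(group)
        if all(status[p] for p in depends_on[group]):
            status[group] = execute(group)
        else:
            status[group] = False  # failed due to dependency
    return status


# Reproduce the ntp-node prepare-failure scenario from the example
# processing walkthrough: only ntp-node fails directly.
deps = {
    "ntp-node": [],
    "monitoring-nodes": [],
    "control-nodes": ["ntp-node"],
    "compute-nodes-1": ["control-nodes"],
    "compute-nodes-2": ["control-nodes"],
}
result = process_groups(deps, execute=lambda g: g != "ntp-node")
assert result["monitoring-nodes"] is True
assert result["control-nodes"] is False    # failed due to dependency
assert result["compute-nodes-1"] is False  # cascaded dependency failure
```

A real implementation would additionally track per-node status and feed the
per-group node lists back into the node filter sent to Drydock.
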
Example Processing
''''''''''''''''''
Using the deployment strategy defined in the example above, the following is
an example of how it may process::

  Start
    |
    | prepare ntp-node          <SUCCESS>
    | deploy ntp-node           <SUCCESS>
    V
    | prepare control-nodes     <SUCCESS>
    | deploy control-nodes      <SUCCESS>
    V
    | prepare monitoring-nodes  <SUCCESS>
    | deploy monitoring-nodes   <SUCCESS>
    V
    | prepare compute-nodes-2   <SUCCESS>
    | deploy compute-nodes-2    <SUCCESS>
    V
    | prepare compute-nodes-1   <SUCCESS>
    | deploy compute-nodes-1    <SUCCESS>
    |
  Finish (success)

If there were a failure in preparing the ntp-node, the following would be
the result::

  Start
    |
    | prepare ntp-node          <FAILED>
    | deploy ntp-node           <FAILED, due to prepare failure>
    V
    | prepare control-nodes     <FAILED, due to dependency>
    | deploy control-nodes      <FAILED, due to dependency>
    V
    | prepare monitoring-nodes  <SUCCESS>
    | deploy monitoring-nodes   <SUCCESS>
    V
    | prepare compute-nodes-2   <FAILED, due to dependency>
    | deploy compute-nodes-2    <FAILED, due to dependency>
    V
    | prepare compute-nodes-1   <FAILED, due to dependency>
    | deploy compute-nodes-1    <FAILED, due to dependency>
    |
  Finish (failed due to a critical group failing)

If a failure occurred during the deploy of compute-nodes-2, the following
would result::

  Start
    |
    | prepare ntp-node          <SUCCESS>
    | deploy ntp-node           <SUCCESS>
    V
    | prepare control-nodes     <SUCCESS>
    | deploy control-nodes      <SUCCESS>
    V
    | prepare monitoring-nodes  <SUCCESS>
    | deploy monitoring-nodes   <SUCCESS>
    V
    | prepare compute-nodes-2   <SUCCESS>
    | deploy compute-nodes-2    <FAILED>
    V
    | prepare compute-nodes-1   <SUCCESS>
    | deploy compute-nodes-1    <SUCCESS>
    |
  Finish (success, with some nodes/groups failed)

Schemas
~~~~~~~
A new schema will need to be provided by Shipyard to validate the
DeploymentStrategy document.

Documentation
~~~~~~~~~~~~~
The Shipyard action documentation will need to include details defining the
DeploymentStrategy document (mostly as defined here), as well as the update
to the DeploymentConfiguration document to contain the name of the
DeploymentStrategy document.


.. _NetworkX: https://networkx.github.io/documentation/networkx-1.9/reference/generated/networkx.algorithms.dag.topological_sort.html