diff --git a/doc/source/specs/stx-5.0/approved/_placeholder.rst b/doc/source/specs/stx-5.0/approved/_placeholder.rst
deleted file mode 100644
index 635d6d3..0000000
--- a/doc/source/specs/stx-5.0/approved/_placeholder.rst
+++ /dev/null
@@ -1,8 +0,0 @@
-.. placeholder:
-
-===========
-Placeholder
-===========
-
-This file is a placeholder and should be deleted when the first spec is moved
-to this directory.
\ No newline at end of file
diff --git a/doc/source/specs/stx-5.0/approved/security-2007718-vault.rst b/doc/source/specs/stx-5.0/approved/security-2007718-vault.rst
new file mode 100644
index 0000000..9a3111e
--- /dev/null
+++ b/doc/source/specs/stx-5.0/approved/security-2007718-vault.rst
@@ -0,0 +1,303 @@
+..
+  This work is licensed under a Creative Commons Attribution 3.0 Unported
+  License. http://creativecommons.org/licenses/by/3.0/legalcode
+
+======================================
+Secret Management with HashiCorp Vault
+======================================
+
+Storyboard: https://storyboard.openstack.org/#!/story/2007718
+
+This story introduces Vault into the StarlingX solution in order to
+support secret management.
+
+Problem description
+===================
+
+Users want the ability to store and access secrets securely. These secrets can
+include credentials, encryption keys, API tokens and other data that should
+not be stored in plain text on a system. Vault provides the ability to encrypt
+and store secrets with access control via a range of authorization and access
+policy configurations.
+
+Vault's features include dynamic secret generation, data encryption, leasing
+and renewal, revocation and audit/logging. Vault is an open source project
+and is supported for deployment in a Kubernetes cluster.
+More information is
+available at: https://www.vaultproject.io/docs
+
+Use Cases
+---------
+
+* Developer wants to store secrets to be consumed by an application
+* Developer wants to submit data for encryption before storing it elsewhere
+* End user wants granular access control for secrets with detailed logs
+* End user wants to dynamically generate API keys for scripts
+
+Proposed change
+===============
+
+Packaging & installation
+------------------------
+
+StarlingX will contain the Armada manifest and Vault helm charts, which will
+be uploaded and applied to pull the container images from a public registry,
+then configure and launch the Vault pods.
+
+The Vault application will initially be packaged as a tarball that can be
+transferred to the system and activated with ``system application-upload``
+and ``system application-apply``.
+
+In the future, the app will be auto-installed during platform deployment in
+order to provide secret management for other StarlingX applications.
+
+The Vault application will provide system overrides and support auto-apply on
+system configuration changes. This specifically relates to the Vault HA
+configuration parameters, which will be set based on the StarlingX deployment
+type.
+
+Users will be able to override the application's default configuration
+parameters through user overrides.
+
+Vault integration for platform secrets
+--------------------------------------
+
+StarlingX platform apps currently using Barbican for secrets storage will be
+migrated to use Vault. This will include the existing IPMI credentials and
+external Docker registry credentials currently stored in Barbican. Vault is
+being added as a cloud-native/container-based solution that can more easily
+provide a robust feature set to hosted container applications.
+
+Vault storage backend
+---------------------
+
+
+Vault supports several storage backends where encrypted data is persisted.
+The suitable options for StarlingX are the Consul backend and the integrated
+storage backend with the Raft consensus protocol. Integrated storage with Raft
+is preferred for the following reasons:
+
+- Does not require the integration and deployment of another separate app.
+  Consul is a service mesh that offers many other unrelated features.
+  Deploying it would significantly broaden the maintenance scope for the
+  Vault app.
+
+- Supports HA and provides backup/restore workflows.
+
+- Simple to deploy and understand. The official Vault helm charts automatically
+  create and mount a data storage PVC for each Vault pod, and the Raft
+  consensus protocol is used to replicate Vault data to each pod.
+
+- The Raft consensus protocol is at the heart of etcd in Kubernetes, so
+  knowledge is transferable between the solutions.
+
+This table illustrates the failure tolerance for various cluster sizes.
+A quorum is a majority of the servers, floor(N/2) + 1, so growing a cluster
+from an odd size to the next even size adds no failure tolerance.
+The documentation recommends cluster sizes of 3 or 5.
+
++---------+-------------+-------------------+
+| Servers | Quorum Size | Failure Tolerance |
++---------+-------------+-------------------+
+| 1       | 1           | 0                 |
++---------+-------------+-------------------+
+| 2       | 2           | 0                 |
++---------+-------------+-------------------+
+| 3       | 2           | 1                 |
++---------+-------------+-------------------+
+| 4       | 3           | 1                 |
++---------+-------------+-------------------+
+| 5       | 3           | 2                 |
++---------+-------------+-------------------+
+| 6       | 4           | 2                 |
++---------+-------------+-------------------+
+| 7       | 4           | 3                 |
++---------+-------------+-------------------+
+
+For additional information on storage backends, see:
+https://www.vaultproject.io/docs/configuration/storage
+
+Alternatives
+------------
+
+In the secret management ecosystem, Vault offers what can be described as a
+'secret service', meaning that it provides secrets as a service to consumers.
+This differs from other secret management tools that are focused on file
+encryption.
+Many of the other products in the secret service category are cloud
+provider products and cannot be directly integrated into a provider-agnostic
+platform.
+
+Vault is the most fully featured and documented of the available self-hosted
+options. Vault also provides official helm charts and will continue to
+extend its cloud integration in the future.
+
+For a helpful comparison of secret management tools, see:
+https://amido.com/app/uploads/2019/04/Secret-Management-tooling-Table-Tools.png
+
+Data model impact
+-----------------
+
+Applications currently consuming secrets from Barbican will require code
+changes to make the appropriate REST API calls to access Vault. This will also
+require defining a policy schema for accessing the secrets in Vault along with
+selecting a method of authorizing the applications to access Vault. Barbican
+relies on Keystone as its authorization backend, while Vault supports a range
+of authorization backends. Vault's Kubernetes auth method will be used to
+authenticate apps running within the Kubernetes cluster. The AppRole auth
+method will be used for apps running on the underlying OS.
+
+REST API impact
+---------------
+
+Vault provides a new REST API for setting and consuming secrets and
+encryption services.
+
+For API documentation, see:
+https://www.vaultproject.io/api-docs
+
+Security impact
+---------------
+
+Vault is a security-focused application which aims to enhance the security
+posture of the apps that consume its services. Concerns raised by this app
+integration are discussed here.
+
+Handling of the root token and master key shards
+________________________________________________
+
+Initialization of a Vault server generates a root token used for logging into
+the vault, as well as a configurable number of master key shards used for
+unsealing the vault and rekey/recovery operations.
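As a rough illustration of the initialization step, the sketch below builds the request body for Vault's ``PUT /v1/sys/init`` endpoint, where ``secret_shares`` is the number of master key shards to generate and ``secret_threshold`` is the quorum needed to unseal. The helper name ``build_init_payload`` and the 5/3 split are illustrative assumptions, not choices made by this spec.

```python
import json


def build_init_payload(secret_shares: int = 5, secret_threshold: int = 3) -> dict:
    """Return the JSON body for Vault's PUT /v1/sys/init request.

    The 5/3 defaults are assumptions for illustration only. These are basic
    sanity checks; Vault applies its own validation on the server side.
    """
    if secret_shares < 1:
        raise ValueError("secret_shares must be at least 1")
    if not 1 <= secret_threshold <= secret_shares:
        raise ValueError("secret_threshold must be between 1 and secret_shares")
    return {"secret_shares": secret_shares, "secret_threshold": secret_threshold}


if __name__ == "__main__":
    # Print the body that a client would send when initializing Vault.
    print(json.dumps(build_init_payload(), sort_keys=True))
```

The response to this call carries the key shards and the initial root token, which is why the handling concerns in this section apply from the very first request.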
+
+Vault documentation recommends revoking the initial root token after initial
+configuration is completed and other authorization methods have been made
+available. Root tokens can be regenerated with a quorum of key shards present.
+
+The key shards are the linchpin of Vault's security and are used to decrypt
+the master key in Vault's memory during unsealing. These keys must be
+preserved, but they invalidate Vault's security if they are compromised.
+
+Unsealing Vault server pods
+___________________________
+
+Vault servers automatically come up in a sealed state where they are unable to
+decrypt any of their stored data until a quorum of key shards is provided
+during an unseal operation. This means that Vault servers running in a cluster
+will be sealed if their pods are restarted for any reason.
+
+This challenge can be handled by configuring Vault to auto-unseal by querying
+an external Key Management System (i.e. from a cloud provider) for unseal keys,
+querying another Vault server for unseal keys, or by storing the unseal keys
+somewhere and configuring Kubernetes to unseal Vault via the REST API when a
+pod restarts.
+
+In the short term, this integration will store the key shards locally in a
+keyring with root-only permissions. This will require creating a custom script
+allowing Kubernetes to access these keys and provide them to pods via the Vault
+REST API upon restart.
+
+In the longer term, support will be extended to configure either an external
+KMS or a second external Vault instance for auto-unseal operations.
+
+Storing StarlingX system secrets alongside user secrets
+________________________________________________________
+
+The strict policy enforcement and authorization structure of Vault is intended
+for use cases where secrets must be isolated and their access must be finely
+controlled.
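One way to picture this isolation is a pair of Vault policies scoped to separate path prefixes. The sketch below renders HCL policy text; the mount name ``secret``, the KV version 2 ``data/`` path segment, and the ``platform`` and ``users`` prefixes are all hypothetical, chosen only for illustration.

```python
from typing import Sequence


def render_policy(path_prefix: str, capabilities: Sequence[str]) -> str:
    """Render a single-path Vault policy in HCL.

    path_prefix is a hypothetical KV v2 path such as "secret/data/platform".
    """
    caps = ", ".join(f'"{cap}"' for cap in capabilities)
    return f'path "{path_prefix}/*" {{\n  capabilities = [{caps}]\n}}\n'


# Hypothetical split: platform apps may only read platform secrets, while
# end users get full control under their own, separate prefix.
PLATFORM_POLICY = render_policy("secret/data/platform", ["read"])
USER_POLICY = render_policy(
    "secret/data/users", ["create", "read", "update", "delete"]
)

if __name__ == "__main__":
    print(PLATFORM_POLICY)
    print(USER_POLICY)
```

Because neither policy names the other's prefix, a token bound to the user policy would have no access to the platform paths, assuming no other policy attached to that token grants it.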
+
+
+Other end user impact
+---------------------
+
+None
+
+Developer impact
+----------------
+
+Developers working on StarlingX may wish to store secrets in Vault for use with
+their applications. This will require considering how to set the secret, what
+authorization method the application will use and if the secret will be
+required during platform deployment. Vault cannot be made available until late
+in the deployment because it runs in the Kubernetes cluster.
+
+Developers who are working on Kubernetes applications may want to leverage the
+Vault injection feature, which injects secrets into a pod, makes them available
+at a specified path and does not require the application to be Vault-aware.
+
+Upgrade impact
+--------------
+
+An upgrade path needs to be determined for migrating secrets from Barbican to
+Vault.
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+
+* Cole Walker
+
+Repos Impacted
+--------------
+
+* vault-armada-app
+* ansible-playbooks
+
+Work Items
+----------
+
+* Create new repo for the new application 'Vault', define required Armada
+  manifests and import helm charts for app
+
+* Implement system overrides and define Vault as a mandatory platform app
+
+* Auto-apply HA configuration based on system deployment type (AIO, Standard,
+  etc.)
+
+* Extend Ansible role 'bringup-bootstrap-applications' to include Vault
+
+* Write automation to auto-unseal restarted Vault pods using master key shards
+
+
+Dependencies
+============
+
+* Kubernetes cluster
+
+Testing
+=======
+
+* Vault pods should return to a ready state after being restarted
+  as indicated by ``kubectl get pods``
+
+* User overrides should be available for various parameters including HA
+  configuration, initialization parameters and audit configuration
+
+* The deployment should auto-apply HA settings based on the system type
+
+* Users should be able to set and consume secrets
+
+* Activities should be logged to the audit storage backend
+
+Documentation Impact
+====================
+
+Documentation to be updated with user override configuration parameters and
+usability of Vault in StarlingX.
+
+References
+==========
+
+* Feature storyboard: https://storyboard.openstack.org/#!/story/2007718
+* Vault: https://www.vaultproject.io/
+
+History
+=======
+
+.. list-table:: Revisions
+   :header-rows: 1
+
+   * - Release Name
+     - Description
+   * - STX 5.0
+     - Introduced