Secret Management with Hashicorp Vault

Storyboard: https://storyboard.openstack.org/#!/story/2007718

This story introduces Vault into the StarlingX solution in order to support secret management.

Problem description

Users want the ability to store and access secrets securely. These secrets can include credentials, encryption keys, API tokens and other data that should not be stored in plain text on a system. Vault provides the ability to encrypt and store secrets with access control via a range of authorization and access policy configurations.

Vault's features include dynamic secret generation, data encryption, leasing and renewal, revocation and audit/logging. Vault is an open source project and is supported for deployment in a Kubernetes cluster. More information is available at: https://www.vaultproject.io/docs

Use Cases

  • Developer wants to store secrets to be consumed by an application
  • Developer wants to submit data for encryption before storing it elsewhere
  • End user wants granular access control for secrets with detailed logs
  • End user wants to dynamically generate API keys for scripts

Proposed change

Packaging & installation

StarlingX will contain the Armada manifest and Vault helm charts. Once uploaded and applied, these pull the container images from a public registry and then configure and launch the Vault pods.

The Vault application will initially be packaged as a tarball that can be transferred to the system and activated with system application-upload & system application-apply.

In the future, the app will be auto-installed during platform deployment in order to provide secret management for other StarlingX applications.

The Vault application will provide system-overrides and support auto-apply on system configuration changes. This specifically relates to the Vault HA configuration parameters which will be set based on the StarlingX deployment type.

The user will be provided the ability to override the default configuration parameters of the application via override parameters.
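
As a rough illustration of how the HA parameters might be derived from the deployment type, the following Python sketch builds a helm override dictionary. The function name, the deployment-type strings and the replica counts are assumptions made for this example only. The `server.ha` keys follow the values layout of the upstream Vault helm chart; the actual defaults would be defined by the application's system overrides.

```python
def vault_ha_overrides(deployment_type: str) -> dict:
    """Return hypothetical helm overrides for the Vault server pods."""
    # Assumption: a single-node (simplex) system runs one Vault replica,
    # while every other deployment type runs three, the smallest cluster
    # size that tolerates the loss of one server.
    replicas = 1 if deployment_type == "aio-simplex" else 3
    return {
        "server": {
            "ha": {
                "enabled": True,
                "replicas": replicas,
                # Integrated storage with Raft, as preferred by this spec.
                "raft": {"enabled": True},
            }
        }
    }


if __name__ == "__main__":
    print(vault_ha_overrides("standard"))
```

A user-supplied override would then replace individual keys in this dictionary before the application is applied.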

Vault integration for platform secrets

StarlingX platform apps currently using Barbican for secrets storage will be migrated to use Vault. This will include the existing IPMI credentials and external Docker registry credentials currently stored in Barbican. Vault is being added as a cloud-native/container-based solution that can more easily provide a robust feature set to hosted container applications.

Vault storage backend

Vault supports several storage backends where encrypted data is persisted. The suitable options for StarlingX are the Consul backend and the integrated storage backend with the Raft consensus protocol. Integrated storage with Raft is preferred for the following reasons:

  • Does not require the integration and deployment of another separate app. Consul is a service mesh that offers many other unrelated features. This would significantly broaden the maintenance scope for the Vault app.
  • Supports HA and provides backup/restore workflows.
  • Simple to deploy and understand. The official Vault helm charts automatically create and mount a data storage PVC for each Vault pod and the Raft consensus protocol is used to replicate Vault data to each pod.
  • The Raft consensus protocol is at the heart of etcd in Kubernetes, so knowledge is transferable between the solutions.

This table illustrates the failure tolerance for various cluster sizes. The documentation recommends cluster sizes of 3 or 5.

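=======  ===========  =================
Servers  Quorum Size  Failure Tolerance
=======  ===========  =================
1        1            0
2        2            0
3        2            1
4        3            1
5        3            2
6        4            2
7        4            3
=======  ===========  =================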
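
The values in the table follow from the majority rule used by Raft: a quorum is more than half of the servers, and the failure tolerance is whatever remains. A minimal Python sketch of that arithmetic (the function names are illustrative):

```python
def raft_quorum(servers: int) -> int:
    """Smallest number of servers that forms a strict majority."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """Number of servers that can fail while a quorum remains."""
    return servers - raft_quorum(servers)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, raft_quorum(n), failure_tolerance(n))
```

This also shows why even cluster sizes add little: going from 3 to 4 servers raises the quorum size without raising the failure tolerance.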

For additional information on storage backends, see: https://www.vaultproject.io/docs/configuration/storage

Alternatives

In the secret management ecosystem, Vault offers what can be described as a 'secret service', meaning that it provides secrets as a service to consumers. This differs from other secret management tools that are focused on file encryption. Many of the other products in the secret service category are cloud provider products and cannot be directly integrated into a provider-agnostic platform.

Vault is the most fully featured and documented of the available self-hosted options. Vault also provides official helm charts and will continue to extend its cloud integration in the future.

For a helpful comparison of secret management tools, see: https://amido.com/app/uploads/2019/04/Secret-Management-tooling-Table-Tools.png

Data model impact

Applications currently consuming secrets from Barbican will require code changes to make the appropriate REST API calls to access Vault. This will also require defining a policy schema for accessing the secrets in Vault along with selecting a method of authorizing the applications to access Vault. Barbican relies on Keystone as its authorization backend, while Vault supports a range of authorization backends. Vault's Kubernetes auth method will be used to authenticate apps running within the k8s cluster. The AppRole auth method will be used for apps running on the underlying OS.
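
To give a sense of the code changes involved, the sketch below builds the login requests for the two auth methods named above, using only the Python standard library. It assumes both methods are enabled at their default mount paths (`auth/kubernetes` and `auth/approle`); the Vault address, role names and helper names are hypothetical. A real client would send the request with `urllib.request.urlopen` and use the returned token in the `X-Vault-Token` header of later calls.

```python
import json
import urllib.request

VAULT_ADDR = "http://vault.example:8200"  # hypothetical address


def _login_request(mount: str, payload: dict) -> urllib.request.Request:
    """Build a POST request against an auth method's login endpoint."""
    return urllib.request.Request(
        f"{VAULT_ADDR}/v1/auth/{mount}/login",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def kubernetes_login_request(role: str, jwt: str) -> urllib.request.Request:
    # For apps in the cluster: jwt is the pod's service account token,
    # normally mounted under
    # /var/run/secrets/kubernetes.io/serviceaccount/token.
    return _login_request("kubernetes", {"role": role, "jwt": jwt})


def approle_login_request(role_id: str, secret_id: str) -> urllib.request.Request:
    # For apps running on the underlying OS.
    return _login_request("approle", {"role_id": role_id, "secret_id": secret_id})


def client_token(login_response: dict) -> str:
    """Extract the Vault token from a decoded login response."""
    return login_response["auth"]["client_token"]
```

The policies attached to the role used at login then determine which secret paths the resulting token may read or write.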

REST API impact

Provides a new REST API for setting and consuming secrets and encryption services.

For API documentation, see: https://www.vaultproject.io/api-docs
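
As an illustration of the kinds of calls this API exposes, the following Python sketch builds requests for writing and reading a secret and for encrypting data. It assumes a version 2 key/value secrets engine mounted at `secret` and a transit secrets engine mounted at `transit`; those mount points, the helper names and the example paths are assumptions rather than part of this spec.

```python
import base64
import json
import urllib.request
from typing import Optional


def vault_request(addr: str, token: str, method: str, path: str,
                  payload: Optional[dict] = None) -> urllib.request.Request:
    """Build an authenticated request for a path under /v1/."""
    data = json.dumps(payload).encode() if payload is not None else None
    return urllib.request.Request(
        f"{addr}/v1/{path}",
        data=data,
        headers={"X-Vault-Token": token},
        method=method,
    )


def write_secret_request(addr, token, path, secret: dict):
    # KV version 2 wraps the secret's key/value pairs in a "data" field.
    return vault_request(addr, token, "POST", f"secret/data/{path}",
                         {"data": secret})


def read_secret_request(addr, token, path):
    return vault_request(addr, token, "GET", f"secret/data/{path}")


def encrypt_request(addr, token, key_name: str, plaintext: bytes):
    # The transit engine expects base64-encoded plaintext and returns
    # ciphertext that the caller stores elsewhere.
    encoded = base64.b64encode(plaintext).decode()
    return vault_request(addr, token, "POST", f"transit/encrypt/{key_name}",
                         {"plaintext": encoded})
```

The first two helpers correspond to the "store secrets" use case and the third to the "submit data for encryption" use case listed earlier.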

Security impact

Vault is a security-focused application which aims to enhance the security posture of the apps that consume its services. Concerns raised by this app integration are discussed here.

Handling of the root token and master key shards

Initialization of a Vault server generates a root token used for logging into the vault, as well as a configurable number of master key shards used for unsealing the vault and rekey/recovery operations.

Vault documentation recommends revoking the initial root token after initial configuration is completed and other authorization methods have been made available. Root tokens can be regenerated with a quorum of key shards present.

The key shards are the linchpin of Vault's security and are used to decrypt the master key in Vault's memory during unsealing. The shards must be preserved so that Vault can be unsealed, and they must be protected because their compromise would invalidate Vault's security.

Unsealing Vault server pods

Vault servers automatically come up in a sealed state where they are unable to decrypt any of their stored data until a quorum of key shards is provided during an unseal operation. This means that Vault servers running in a cluster will be sealed if their pods are restarted for any reason.

This challenge can be handled by configuring Vault to auto-unseal by querying an external Key Management System (e.g. from a cloud provider) for unseal keys, by querying another Vault server for unseal keys, or by storing the unseal keys somewhere and configuring Kubernetes to unseal Vault via the REST API when a pod restarts.

In the short term, this integration will store the key shards locally in a keyring with root-only permissions. This will require creating a custom script allowing Kubernetes to access these keys and provide them to pods via the Vault REST API upon restart.
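
The core of such a script could look like the Python sketch below, which checks the seal status of one Vault server and submits key shards until the server reports that it is unsealed. It uses Vault's `sys/seal-status` and `sys/unseal` endpoints; how the shards are read from the keyring and how restarted pods are discovered are left out, and all function names are hypothetical.

```python
import json
import urllib.request
from typing import Callable, Iterable, Optional

# A transport takes (method, path, payload) and returns the decoded JSON reply.
Transport = Callable[[str, str, Optional[dict]], dict]


def http_transport(addr: str) -> Transport:
    """Return a transport that talks to one Vault server over HTTP."""
    def call(method: str, path: str, payload: Optional[dict] = None) -> dict:
        data = json.dumps(payload).encode() if payload is not None else None
        req = urllib.request.Request(f"{addr}/v1/{path}", data=data,
                                     method=method)
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.load(resp)
    return call


def unseal(call: Transport, key_shards: Iterable[str]) -> bool:
    """Submit shards until Vault is unsealed; return True on success."""
    status = call("GET", "sys/seal-status", None)
    if not status["sealed"]:
        return True  # nothing to do
    for shard in key_shards:
        # Each call submits one shard; Vault unseals once the
        # threshold number of valid shards has been provided.
        status = call("POST", "sys/unseal", {"key": shard})
        if not status["sealed"]:
            return True
    return False
```

Passing the transport in as a parameter keeps the unseal logic testable without a running Vault server; a deployment would call `unseal(http_transport(pod_address), shards)` for each restarted pod, where `pod_address` and `shards` come from the surrounding automation.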

In the longer term, support will be extended to configure either an external KMS or a second external Vault instance for auto-unseal operations.

Storing StarlingX system secrets alongside user secrets

The strict policy enforcement and authorization structure of Vault is intended for use cases where secrets must be isolated and their access must be finely controlled.

Other end user impact

None

Developer impact

Developers working on StarlingX may wish to store secrets in Vault for use with their applications. This will require considering how to set the secret, which authorization method the application will use and whether the secret will be required during platform deployment. Vault cannot be made available until late in the deployment because it runs in the Kubernetes cluster.

Developers working on Kubernetes applications may want to leverage the Vault injection feature, which injects secrets into a pod and makes them available at a specified path without requiring the application to be Vault-aware.
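
As a sketch of what this looks like from the application side, the Python example below builds the pod annotations that request injection and then reads the injected secret as an ordinary file. The annotation keys and the default `/vault/secrets` directory are those used by the Vault agent injector; the role name, secret name and secret path are hypothetical.

```python
from pathlib import Path


def injector_annotations(role: str, name: str, secret_path: str) -> dict:
    """Pod annotations asking the injector to render one secret."""
    return {
        "vault.hashicorp.com/agent-inject": "true",
        "vault.hashicorp.com/role": role,
        # The suffix after "agent-inject-secret-" becomes the file name.
        f"vault.hashicorp.com/agent-inject-secret-{name}": secret_path,
    }


def read_injected_secret(name: str, base_dir: str = "/vault/secrets") -> str:
    """Read a secret the injector rendered into the pod's filesystem."""
    return (Path(base_dir) / name).read_text()


if __name__ == "__main__":
    # Hypothetical role and path, for illustration only.
    print(injector_annotations("my-app", "db-creds", "secret/data/apps/db"))
```

The application only performs a file read, so it needs no Vault client library or token handling of its own.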

Upgrade impact

An upgrade path needs to be determined for migrating secrets from Barbican to Vault.

Implementation

Assignee(s)

Primary assignee:

  • Cole Walker

Repos Impacted

  • vault-armada-app
  • ansible-playbooks

Work Items

  • Create a new repo for the new application 'Vault', define the required Armada manifests and import helm charts for the app
  • Implement system overrides and define Vault as a mandatory platform app
  • Auto-apply HA configuration based on system deployment type (AIO, Standard, etc.)
  • Extend ansible role 'bringup-bootstrap-applications' to include Vault
  • Write automation to auto unseal restarted Vault pods using master key shards

Dependencies

  • Kubernetes cluster

Testing

  • Vault pods should return to a ready state after being restarted as indicated by 'kubectl get pods'
  • User overrides should be available for various parameters including HA configuration, initialization parameters and audit configuration
  • The deployment should auto apply HA settings based on the system type
  • Users should be able to set and consume secrets
  • Activities should be logged to the audit storage backend

Documentation Impact

Documentation is to be updated with user override configuration parameters and usability of Vault in StarlingX.

References

History

Revisions
============  ===========
Release Name  Description
============  ===========
STX 5.0       Introduced
============  ===========