Add Vault as managed application

This story will add Vault as a managed app on StarlingX. This
will provide the ability to manage secrets for both platform
applications and user applications.

Story: 2007718
Task: 40107

Depends-On: https://review.opendev.org/736251
Change-Id: I420649216ef575dedc6f90b486e8dcd78981760a
Signed-off-by: Cole Walker <cole.walker@windriver.com>
Cole Walker 2020-06-17 10:45:05 -04:00
parent e2e5c442b8
commit 5b3e10cb1c
2 changed files with 303 additions and 8 deletions


@ -1,8 +0,0 @@
.. placeholder:
===========
Placeholder
===========
This file is a placeholder and should be deleted when the first spec is moved
to this directory.


@ -0,0 +1,303 @@
..
  This work is licensed under a Creative Commons Attribution 3.0 Unported
  License. http://creativecommons.org/licenses/by/3.0/legalcode

======================================
Secret Management with HashiCorp Vault
======================================
Storyboard: https://storyboard.openstack.org/#!/story/2007718
This story introduces Vault into the StarlingX solution in order to
support secret management.
Problem description
===================
Users want the ability to store and access secrets securely. These secrets can
include credentials, encryption keys, API tokens and other data that should
not be stored in plain text on a system. Vault provides the ability to encrypt
and store secrets with access control via a range of authorization and access
policy configurations.
Vault's features include dynamic secret generation, data encryption, leasing
and renewal, revocation and audit/logging. Vault is an open source project
and is supported for deployment in a Kubernetes cluster. More information is
available at: https://www.vaultproject.io/docs
Use Cases
---------
* Developer wants to store secrets to be consumed by an application
* Developer wants to submit data for encryption before storing it elsewhere
* End user wants granular access control for secrets with detailed logs
* End user wants to dynamically generate API keys for scripts
Proposed change
===============
Packaging & installation
------------------------
StarlingX will contain the Armada manifest and Vault helm charts. Uploading
and applying them pulls the container images from a public registry, then
configures and launches the Vault pods.
The Vault application will initially be packaged as a tarball that can be
transferred to the system and activated with 'system application-upload' and
'system application-apply'.
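As a rough illustration of this manual workflow, the two commands could be
scripted as shown below. This is only a sketch: the tarball name
'vault-app.tgz', the application name 'vault' and the helper names are
assumptions made for the example and are not defined by this spec.

```python
import subprocess
from typing import Callable, List, Sequence


def vault_app_commands(tarball: str, app_name: str = "vault") -> List[List[str]]:
    """Build the upload and apply invocations, in the order they must run."""
    return [
        ["system", "application-upload", tarball],
        ["system", "application-apply", app_name],
    ]


def deploy(
    tarball: str,
    app_name: str = "vault",
    run: Callable[[Sequence[str]], object] = subprocess.check_call,
) -> None:
    """Run each command in turn; check_call raises if a command fails."""
    for command in vault_app_commands(tarball, app_name):
        run(command)


if __name__ == "__main__":
    # Hypothetical tarball name; print the commands instead of running them.
    deploy("vault-app.tgz", run=print)
```

Passing a different 'run' callable keeps the sketch usable on a machine where
the 'system' CLI is not installed.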
In the future, the app will be auto-installed during platform deployment in
order to provide secret management for other StarlingX applications.
The Vault application will provide system-overrides and support auto-apply on
system configuration changes. This specifically relates to the Vault HA
configuration parameters which will be set based on the StarlingX deployment
type.
The user will be provided the ability to override the default configuration
parameters of the application via override parameters.
Vault integration for platform secrets
--------------------------------------
StarlingX platform apps currently using Barbican for secrets storage will be
migrated to use Vault. This will include the existing IPMI credentials and
external Docker registry credentials currently stored in Barbican. Vault is
being added as a cloud-native/container-based solution that can more easily
provide a robust feature set to hosted container applications.
Vault storage backend
---------------------
Vault supports several storage backends where encrypted data is persisted.
The suitable options for StarlingX are the Consul backend and the integrated
storage backend with the Raft consensus protocol. Integrated storage with Raft
is preferred for the following reasons:

- Does not require the integration and deployment of another separate app.
  Consul is a service mesh that offers many other unrelated features.
  Integrating it would significantly broaden the maintenance scope for the
  Vault app.
- Supports HA and provides backup/restore workflows.
- Simple to deploy and understand. The official Vault helm charts automatically
  create and mount a data storage PVC for each Vault pod and the Raft consensus
  protocol is used to replicate Vault data to each pod.
- Raft consensus protocol is at the heart of etcd in Kubernetes, so knowledge
  is transferable between the solutions.

This table illustrates the failure tolerance for various cluster sizes.
The documentation recommends cluster sizes of 3 or 5.

+---------+-------------+-------------------+
| Servers | Quorum Size | Failure Tolerance |
+---------+-------------+-------------------+
| 1       | 1           | 0                 |
+---------+-------------+-------------------+
| 2       | 2           | 0                 |
+---------+-------------+-------------------+
| 3       | 2           | 1                 |
+---------+-------------+-------------------+
| 4       | 3           | 1                 |
+---------+-------------+-------------------+
| 5       | 3           | 2                 |
+---------+-------------+-------------------+
| 6       | 4           | 2                 |
+---------+-------------+-------------------+
| 7       | 4           | 3                 |
+---------+-------------+-------------------+
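
The values in the table follow from the majority rule used by Raft: a cluster
of n servers needs floor(n / 2) + 1 of them to form a quorum, and the failure
tolerance is the number of servers left over. A small sketch (the function
names are illustrative only) reproduces the rows:

```python
def quorum_size(servers: int) -> int:
    """Smallest strict majority of a cluster with the given server count."""
    if servers < 1:
        raise ValueError("a cluster needs at least one server")
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """Servers that can be lost while a quorum can still be formed."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    # Print one line per cluster size: servers, quorum size, failure tolerance.
    for n in range(1, 8):
        print(n, quorum_size(n), failure_tolerance(n))
```

The rule also shows why even cluster sizes add no benefit: moving from 3 to 4
servers raises the quorum from 2 to 3 while the failure tolerance stays at 1.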
For additional information on storage backends, see:
https://www.vaultproject.io/docs/configuration/storage
Alternatives
------------
In the secret management ecosystem, Vault offers what can be described as a
'secret service', meaning that it provides secrets as a service to consumers.
This differs from other secret management tools that are focused on file
encryption. Many of the other products in the secret service category are cloud
provider products and cannot be directly integrated into a provider-agnostic
platform.
Vault is the most fully featured and documented of the available self-hosted
options. Vault also provides official helm charts and will continue to
extend its cloud integration in the future.
For a helpful comparison of secret management tools, see:
https://amido.com/app/uploads/2019/04/Secret-Management-tooling-Table-Tools.png
Data model impact
-----------------
Applications currently consuming secrets from Barbican will require code
changes to make the appropriate REST API calls to access Vault. This will also
require defining a policy schema for accessing the secrets in Vault along with
selecting a method of authorizing the applications to access Vault. Barbican
relies on Keystone as its authorization backend, while Vault supports a range
of authorization backends. Vault's Kubernetes auth method will be used to
authenticate apps running within the k8s cluster. The AppRole auth method will
be used for apps running on the underlying OS.
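
To give a feel for the code changes involved, the sketch below builds the
REST requests an application would send to log in with either auth method and
then read a secret. It is an illustration under assumptions, not the planned
implementation: it assumes the auth methods are mounted at their default
paths ('kubernetes' and 'approle'), that secrets live in a KV version 2
engine mounted at 'secret', and the role name and secret path used in the
example are hypothetical.

```python
from typing import Dict, Tuple

Request = Tuple[str, Dict[str, str]]


def kubernetes_login(role: str, jwt: str, mount: str = "kubernetes") -> Request:
    """Path and JSON body for a login using a pod's service account token."""
    return f"/v1/auth/{mount}/login", {"role": role, "jwt": jwt}


def approle_login(role_id: str, secret_id: str, mount: str = "approle") -> Request:
    """Path and JSON body for an AppRole login from the underlying OS."""
    return f"/v1/auth/{mount}/login", {"role_id": role_id, "secret_id": secret_id}


def client_token(login_response: Dict) -> str:
    """Extract the Vault token from a successful login response."""
    return login_response["auth"]["client_token"]


def kv2_read(token: str, secret_path: str, mount: str = "secret") -> Request:
    """Path and headers for reading a secret from a KV version 2 engine."""
    return f"/v1/{mount}/data/{secret_path}", {"X-Vault-Token": token}


if __name__ == "__main__":
    # Hypothetical role and secret path, shown only to print the request shape.
    print(kubernetes_login("platform-app", "<service-account-jwt>"))
    print(kv2_read("<token>", "ipmi/host1"))
```

Both login calls are sent as POST requests and return the same response
shape, so the code that reads secrets does not need to know which auth method
was used.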
REST API impact
---------------
Vault provides a new REST API for setting and consuming secrets and for
encryption services.
For API documentation, see:
https://www.vaultproject.io/api-docs
Security impact
---------------
Vault is a security-focused application which aims to enhance the security
posture of the apps that consume its services. Concerns raised by this app
integration are discussed here.
Handling of the root token and master key shards
________________________________________________
Initialization of a Vault server generates a root token used for logging into
the vault, as well as a configurable number of master key shards used for
unsealing the vault and rekey/recovery operations.
Vault documentation recommends revoking the initial root token after initial
configuration is completed and other authorization methods have been made
available. Root tokens can be regenerated with a quorum of key shards present.
The key shards are the linchpin of Vault's security and are used to decrypt
the master key in Vault's memory during unsealing. These keys must be
preserved, but if they are compromised they invalidate Vault's security.
Unsealing Vault server pods
___________________________
Vault servers automatically come up in a sealed state where they are unable to
decrypt any of their stored data until a quorum of key shards is provided
during an unseal operation. This means that Vault servers running in a cluster
will be sealed if their pods are restarted for any reason.

This challenge can be handled by configuring Vault to auto-unseal by querying
an external Key Management System (e.g. from a cloud provider) for unseal keys,
querying another Vault server for unseal keys, or by storing the unseal keys
somewhere and configuring Kubernetes to unseal Vault via the REST API when a
pod restarts.
In the short term, this integration will store the key shards locally in a
keyring with root-only permissions. This will require creating a custom script
allowing Kubernetes to access these keys and provide them to pods via the Vault
REST API upon restart.
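
A minimal sketch of what such a script might do is shown below. It submits
key shards one at a time to Vault's 'sys/unseal' endpoint and stops as soon
as the server reports that it is no longer sealed. How the shards are read
from the keyring and how the restarted pod's address is discovered are left
out; the function names and the address in the example are assumptions.

```python
import json
import urllib.request
from typing import Callable, Dict, Iterable


def http_put_json(url: str, payload: Dict) -> Dict:
    """Send a JSON body with PUT and decode the JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def unseal(
    vault_addr: str,
    key_shards: Iterable[str],
    put: Callable[[str, Dict], Dict] = http_put_json,
) -> bool:
    """Submit shards in order; return True once Vault reports it is unsealed."""
    url = vault_addr.rstrip("/") + "/v1/sys/unseal"
    for shard in key_shards:
        status = put(url, {"key": shard})
        # Treat a reply without a 'sealed' field as still sealed.
        if not status.get("sealed", True):
            return True
    return False
```

A caller would invoke something like 'unseal("http://<pod-address>:8200",
shards)' for each restarted pod, where the address and the source of 'shards'
are placeholders. The 'put' parameter exists so the logic can be exercised
without a running Vault server.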
In the longer term, support will be extended to configure either an external
KMS or a second external Vault instance for auto-unseal operations.
Storing StarlingX system secrets alongside user secrets
________________________________________________________
The strict policy enforcement and authorization structure of Vault is intended
for use cases where secrets must be isolated and their access must be finely
controlled.
Other end user impact
---------------------
None
Developer impact
----------------
Developers working on StarlingX may wish to store secrets in Vault for use with
their applications. This will require considering how to set the secret, what
authorization method the application will use and if the secret will be
required during platform deployment. Vault cannot be made available until late
in the deployment because it runs in the Kubernetes cluster.

Developers working on Kubernetes applications may want to leverage the Vault
injection feature, which injects secrets into a pod and makes them available
at a specified path without requiring the application to be Vault-aware.
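
Injection is requested through annotations on the pod. As a hedged example,
the sketch below builds the annotation set understood by the Vault Agent
injector; the role name 'my-app' and the secret path are hypothetical, and
the helper name is an assumption made for illustration.

```python
from typing import Dict


def injector_annotations(role: str, secrets: Dict[str, str]) -> Dict[str, str]:
    """Pod annotations asking the injector to render each secret as a file.

    'secrets' maps a file name to the Vault path of the secret to inject.
    """
    annotations = {
        "vault.hashicorp.com/agent-inject": "true",
        "vault.hashicorp.com/role": role,
    }
    for name, vault_path in secrets.items():
        annotations["vault.hashicorp.com/agent-inject-secret-" + name] = vault_path
    return annotations


if __name__ == "__main__":
    # Hypothetical role and secret path.
    for key, value in injector_annotations(
        "my-app", {"db-creds": "secret/data/my-app/db"}
    ).items():
        print(f"{key}: {value}")
```

With the injector's default settings, each secret is rendered inside the pod
under '/vault/secrets/' using the file name from the annotation, so the
application only needs to read a file.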
Upgrade impact
--------------
An upgrade path needs to be determined for migrating secrets from Barbican to
Vault.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
* Cole Walker
Repos Impacted
--------------
* vault-armada-app
* ansible-playbooks
Work Items
----------
* Create new repo for the new application 'Vault', define required armada
  manifests and import helm charts for app
* Implement system overrides and define Vault as a mandatory platform app
* Auto-apply HA configuration based on system deployment type (AIO, Standard,
  etc.)
* Extend ansible role 'bringup-bootstrap-applications' to include Vault
* Write automation to auto unseal restarted Vault pods using master key shards
Dependencies
============
* Kubernetes cluster
Testing
=======
* Vault pods should return to a ready state after being restarted
  as indicated by 'kubectl get pods'
* User overrides should be available for various parameters including HA
  configuration, initialization parameters and audit configuration
* The deployment should auto apply HA settings based on the system type
* Users should be able to set and consume secrets
* Activities should be logged to the audit storage backend
Documentation Impact
====================
Documentation will be updated with the user override configuration parameters
and guidance on using Vault in StarlingX.
References
==========
* Feature storyboard: https://storyboard.openstack.org/#!/story/2007718
* Vault: https://www.vaultproject.io/
History
=======
.. list-table:: Revisions
   :header-rows: 1

   * - Release Name
     - Description
   * - STX 5.0
     - Introduced