Add proposal for enabling TLS on all internal communications
Change-Id: I4b9d28c1e70d2aba39432f27d550b97691493cc2
This commit is contained in:
parent
30bbdf82c9
commit
8bc6eb20bc
@ -21,6 +21,15 @@ Antelope Specifications
|
||||
|
||||
specs/antelope/*
|
||||
|
||||
Zed Specifications
|
||||
------------------
|
||||
|
||||
.. toctree::
|
||||
:glob:
|
||||
:maxdepth: 1
|
||||
|
||||
specs/zed/*
|
||||
|
||||
Xena Specifications
|
||||
-------------------
|
||||
|
||||
|
286
specs/zed/internal-tls.rst
Normal file
286
specs/zed/internal-tls.rst
Normal file
@ -0,0 +1,286 @@
|
||||
Enabling TLS on Internal Communications
|
||||
#######################################
|
||||
:date: 2022-11-16 21:00
|
||||
:tags: ssl, tls, certificates, https, security
|
||||
|
||||
To improve the security of an OpenStack-Ansible deployments all traffic,
|
||||
both internal and external should be encrypted. There is already
|
||||
support for encrypting external traffic from all public endpoints that
|
||||
reside behind haproxy, but this is not the case for all internal traffic.
|
||||
|
||||
|
||||
Problem description
|
||||
===================
|
||||
|
||||
This problem can broadly be split into 3 sections:
|
||||
|
||||
* Securing internal communications to the internal haproxy VIP
|
||||
|
||||
* Securing internal communications from haproxy to backends
|
||||
|
||||
* Securing internal communications between services such as rabbitmq, galera,
|
||||
nova live migration and noVNC
|
||||
|
||||
|
||||
Securing internal communications to the internal haproxy VIP
|
||||
------------------------------------------------------------
|
||||
|
||||
Support for using TLS on in the internal haproxy VIP is already present in
|
||||
haproxy role and is enabled for the AIO deployment, but not enabled for new or
|
||||
upgrades of existing deployments.
|
||||
|
||||
There are no issues with enabling TLS on the internal haproxy VIP for new
|
||||
deployments, but for existing deployments an upgrade process needs to be
|
||||
implemented. The reason an upgrade process is required is because currently
|
||||
if you enabled TLS on the internal haproxy VIP it would cause downtime, until
|
||||
each client is configured to use HTTPS instead of HTTP.
|
||||
|
||||
Problems to resolve:
|
||||
|
||||
* Haproxy configuration to allow TLS to be enabled without downtime of API's on
|
||||
existing deployments
|
||||
|
||||
* OpenStack-Ansible upgrade process and upgrade scripts to enable TLS without
|
||||
downtime of API's on existing deployments
|
||||
|
||||
|
||||
Securing internal communications from haproxy to backends
|
||||
---------------------------------------------------------
|
||||
|
||||
Securing the communications from haproxy to the services backends is as
|
||||
important as securing communication to the internal haproxy VIP.
|
||||
|
||||
A large number of the services used with haproxy use UWSGI, meaning once TLS
|
||||
support is added to the UWSGI role there is only configuration to enable TLS
|
||||
and the generation of certificates required for each of the services.
|
||||
|
||||
For services that do not use USWGI, such a noVNC Proxy further investigation is
|
||||
required.
|
||||
|
||||
As with enabling TLS on the internal haproxy VIP for new deployments, there is
|
||||
no issue with enabling TLS from haproxy to backends, but an upgrade process for
|
||||
existing deployments is required. The reason an upgrade process is required is
|
||||
because if haproxy expects TLS backends, but TLS has not been enabled on the
|
||||
service yet the connection will fail and if you enable TLS on the service the
|
||||
connection will fail as haproxy is not configured for TLS.
|
||||
|
||||
Problems to resolve:
|
||||
|
||||
* Add TLS support to UWSGI
|
||||
|
||||
* Add configuration to role for each service that use UWSGI to enable TLS
|
||||
|
||||
* Add configuration to role for remaining services that do not use UWSGI
|
||||
|
||||
* Add configuration to OpenStack-Ansible to enable TLS on backend of each
|
||||
service
|
||||
|
||||
* OpenStack-Ansible upgrade process and upgrade scripts to enable TLS on
|
||||
backends without downtime of API's on existing deployments
|
||||
|
||||
Securing internal communications between services
|
||||
-------------------------------------------------
|
||||
|
||||
Many OpenStack services communicate directly with each other and do not use
|
||||
haproxy, these communications should also be secured. The work to secure these
|
||||
communications is already complete and enabled in the Yoga release of
|
||||
OpenStack-Ansible, for the following services:
|
||||
|
||||
* RabbitMQ
|
||||
|
||||
* Galera
|
||||
|
||||
* Nova live migrations
|
||||
|
||||
* noVNC (noVNC to compute nodes).
|
||||
|
||||
Problems to resolve:
|
||||
|
||||
* Secure the following services:
|
||||
|
||||
- Memcached
|
||||
|
||||
- etcd
|
||||
|
||||
- OVN/OVS
|
||||
|
||||
* Are there any services missing from the list that do not go via haproxy that
|
||||
need their communications securing?
|
||||
|
||||
Proposed change
|
||||
===============
|
||||
|
||||
Enable TLS on all internal communications.
|
||||
|
||||
Internal communications could be encrypted using a self-signed certificate,
|
||||
but as OpenStack-Ansible has support for issuing certificates from a
|
||||
self-signed private certificate authority using the ansible-role-pki, this
|
||||
should be used instead as it both encrypts the data and allows a client to
|
||||
trust the server.
|
||||
|
||||
In all cases a user should be able to override the certificates issued by a
|
||||
self-signed private certificate authority, allowing them to provide their own
|
||||
certificate which may have been issued by a publicly trusted certificate
|
||||
authority.
|
||||
|
||||
|
||||
Alternatives
|
||||
------------
|
||||
|
||||
None, internal communications should be protected and TLS is an appropriate
|
||||
and well used solution.
|
||||
|
||||
|
||||
Playbook/Role impact
|
||||
--------------------
|
||||
|
||||
Roles:
|
||||
|
||||
* Support for generating certificates using the ansible-role-pki role will be
|
||||
added to each service
|
||||
|
||||
* Configuring to enable/disable TLS will be added
|
||||
|
||||
|
||||
Upgrade impact
|
||||
--------------
|
||||
|
||||
Enabling TLS could be performed during or post upgrade.
|
||||
|
||||
As discussed in the problem description section, enabling TLS on the internal
|
||||
haproxy VIP and service backends for existing deployments will cause downtime
|
||||
during an upgrade if enabled. The reason it will cause downtime is that for both
|
||||
communications from internal client => internal haproxy VIP (server) and
|
||||
haproxy (client) => openstack service backend (server), both the client and
|
||||
server need to be updated to use TLS at the same time.
|
||||
|
||||
To mitigate this issue I propose an intermediate step during an upgrade, where
|
||||
haproxy frontend will accept both HTTP and HTTPS communications.
|
||||
This would be achieved by adding a new TCP frontend to haproxy that accepts
|
||||
both HTTP and HTTPS traffic and redirects to correct frontend for each,
|
||||
and means that openstack clients can carry on using the same well known port
|
||||
and haproxy looks after redirecting them to the correct frontend; HTTP or HTTPS.
|
||||
|
||||
To mitigate issues with haproxy<>backend communication, I suggest implementing
|
||||
"Separated Haproxy Service Config" feature[1] that configures openstack service
|
||||
and its haproxy service in the same playbook.
|
||||
|
||||
The other issue to be aware of is that when user wants to use predefined
|
||||
certificate, this certificate will be used on all VIPs, both internal and
|
||||
external.
|
||||
This means that if TLS is enabled on haproxy's internal VIP, internal clients
|
||||
must be able to trust the presented certificate if it is the same as the
|
||||
external certificate.
|
||||
This limitation does not apply to:
|
||||
- certbot, which can present a separate certificate on external interfaces.
|
||||
- PKI role which installs different certificates for external and internal
|
||||
VIPs by default
|
||||
|
||||
|
||||
Security impact
|
||||
---------------
|
||||
|
||||
This change will encrypt all internal communications, securing any sensitive
|
||||
data being sent, therefore security is improved.
|
||||
|
||||
|
||||
Performance impact
|
||||
------------------
|
||||
|
||||
Implementing TLS on all internal communications will lead to a small increase
|
||||
in the processing requirements and latency of servers and clients, but the
|
||||
increased security outweighs these.
|
||||
|
||||
|
||||
End user impact
|
||||
---------------
|
||||
|
||||
None, if the deployment is done correctly.
|
||||
|
||||
|
||||
Deployer impact
|
||||
---------------
|
||||
|
||||
* Deployer's will need to add monitoring of certificate expiry dates and renew
|
||||
is necessary, if a certificates expires connections between services will be
|
||||
dropped.
|
||||
|
||||
* This change should have no impact to deployer's of new deployments,
|
||||
OpenStack-Ansible will create the certificates, deploy them and
|
||||
configure all services to use them.
|
||||
|
||||
* This change will impact existing deployments and an upgrade process will be
|
||||
implemented to help minimise and possibly prevent this.
|
||||
|
||||
|
||||
Developer impact
|
||||
----------------
|
||||
|
||||
No impact, other that traffic will be encrypted meaning tools like tcpdump
|
||||
may provide less useful as they will not be able to the see the contents of
|
||||
packets.
|
||||
|
||||
|
||||
Dependencies
|
||||
------------
|
||||
|
||||
None.
|
||||
|
||||
|
||||
Implementation
|
||||
==============
|
||||
|
||||
Assignee(s)
|
||||
-----------
|
||||
|
||||
Primary assignee:
|
||||
Damian Dabrowski
|
||||
<damian@dabrowski.cloud>
|
||||
|
||||
|
||||
Work items
|
||||
----------
|
||||
|
||||
* Enable TLS support to UWSGI role
|
||||
|
||||
* Enable TLS backend support to haproxy role
|
||||
|
||||
* Add configuration to openstack services that use UWSGI to create TLS
|
||||
certificate and enable TLS on UWSGI
|
||||
|
||||
* Add configuration to remaining openstack services that do not use USWGI to
|
||||
enable TLS support
|
||||
|
||||
* Add configuration in OpenStack-Ansible to allow TLS for all service to be
|
||||
enabled on both the server and haproxy
|
||||
|
||||
* Update documentation on TLS configuration options
|
||||
|
||||
* Add documentation for upgrade procedure
|
||||
|
||||
* Add script to automate as much as possible of the upgrade
|
||||
|
||||
|
||||
Testing
|
||||
=======
|
||||
|
||||
These changes can be tested using the existing setup, but manual testing of
|
||||
upgrade procedure will be required to make this is does not cause any downtime,
|
||||
as the automated testing only confirms a working upgrade at the end.
|
||||
|
||||
|
||||
Documentation impact
|
||||
====================
|
||||
|
||||
As this change will add extra configuration options these will need to be
|
||||
documented.
|
||||
|
||||
The upgrade procedure for existing deployments will also have be documented,
|
||||
as if this functionality is not deployed correctly it may cause system
|
||||
distribution.
|
||||
|
||||
|
||||
References
|
||||
==========
|
||||
|
||||
[1] https://specs.openstack.org/openstack/openstack-ansible-specs/specs/antelope/separated-haproxy-service-config.html
|
Loading…
x
Reference in New Issue
Block a user