From 9976555a36d103005bfd35a4814f966215e4bf6b Mon Sep 17 00:00:00 2001
From: James Gibson
Date: Thu, 23 Dec 2021 16:39:58 +0000
Subject: [PATCH] Add proposal for enabling TLS on all internal communications

Change-Id: I4b9d28c1e70d2aba39432f27d550b97691493cc2
---
 doc/source/index.rst       |   9 ++
 specs/zed/internal-tls.rst | 283 +++++++++++++++++++++++++++++++++++++
 2 files changed, 292 insertions(+)
 create mode 100644 specs/zed/internal-tls.rst

diff --git a/doc/source/index.rst b/doc/source/index.rst
index 8746d99..b7dbc65 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -21,6 +21,15 @@ Antelope Specifications
 
    specs/antelope/*
 
+Zed Specifications
+------------------
+
+.. toctree::
+   :glob:
+   :maxdepth: 1
+
+   specs/zed/*
+
 Xena Specifications
 -------------------
 
diff --git a/specs/zed/internal-tls.rst b/specs/zed/internal-tls.rst
new file mode 100644
index 0000000..b477daa
--- /dev/null
+++ b/specs/zed/internal-tls.rst
@@ -0,0 +1,283 @@
+Enabling TLS on Internal Communications
+#######################################
+:date: 2022-11-16 21:00
+:tags: ssl, tls, certificates, https, security
+
+To improve the security of OpenStack-Ansible deployments, all traffic,
+both internal and external, should be encrypted. There is already
+support for encrypting external traffic from all public endpoints that
+reside behind haproxy, but this is not the case for all internal traffic.
+
+
+Problem description
+===================
+
+This problem can broadly be split into three parts:
+
+* Securing internal communications to the internal haproxy VIP
+
+* Securing internal communications from haproxy to backends
+
+* Securing internal communications between services such as rabbitmq, galera,
+  nova live migration and noVNC
+
+Securing internal communications to the internal haproxy VIP
+------------------------------------------------------------
+
+Support for using TLS on the internal haproxy VIP is already present in the
+haproxy role and is enabled for the AIO deployment, but it is not enabled for
+new deployments or for upgrades of existing deployments.
+
+There are no issues with enabling TLS on the internal haproxy VIP for new
+deployments, but for existing deployments an upgrade process needs to be
+implemented. An upgrade process is required because enabling TLS on the
+internal haproxy VIP today would cause downtime until each client is
+configured to use HTTPS instead of HTTP when communicating with
+keystone.
+
+Problems to resolve:
+
+* Haproxy configuration to allow TLS to be enabled without downtime of APIs on
+  existing deployments
+
+* OpenStack-Ansible upgrade process and upgrade scripts to enable TLS without
+  downtime of APIs on existing deployments
+
+
+Securing internal communications from haproxy to backends
+---------------------------------------------------------
+
+Securing the communications from haproxy to the service backends is as
+important as securing communication to the internal haproxy VIP.
+
+Many of the services behind haproxy use UWSGI, meaning that once TLS support
+is added to the UWSGI role, each service only needs configuration to enable
+TLS and the generation of its certificates.
+
+For services that do not use UWSGI, such as the noVNC proxy, further
+investigation is required.
+
+As with enabling TLS on the internal haproxy VIP, there is no issue with
+enabling TLS from haproxy to backends for new deployments, but an upgrade
+process is required for existing deployments. This is because if haproxy
+expects TLS backends but TLS has not yet been enabled on the service, the
+connection will fail, and if TLS is enabled on the service first, the
+connection will fail because haproxy is not yet configured for TLS.
+
+Problems to resolve:
+
+* Add TLS support to UWSGI
+
+* Add configuration to the role of each service that uses UWSGI to enable TLS
+
+* Add configuration to the roles of remaining services that do not use UWSGI
+
+* Add configuration to OpenStack-Ansible to enable TLS on the backend of each
+  service
+
+* OpenStack-Ansible upgrade process and upgrade scripts to enable TLS on
+  backends without downtime of APIs on existing deployments
+
+Securing internal communications between services
+-------------------------------------------------
+
+Many OpenStack services communicate directly with each other and do not use
+haproxy; these communications should also be secured. The work to secure these
+communications is already complete and enabled in the Yoga release of
+OpenStack-Ansible for the following services:
+
+* RabbitMQ
+* Galera
+* Nova live migrations
+* noVNC (noVNC to compute nodes)
+
+Problems to resolve:
+
+* Secure the following services:
+
+  - Memcached
+
+  - etcd
+
+  - OVN/OVS
+
+* Are there any services missing from the list that do not go via haproxy and
+  need their communications secured?
+
+Proposed change
+===============
+
+Enable TLS on all internal communications.
+
+Internal communications could be encrypted using a self-signed certificate,
+but as OpenStack-Ansible has support for issuing certificates from a
+self-signed private certificate authority using the ansible-role-pki role,
+this should be used instead, as it both encrypts the data and allows a client
+to trust the server.
+
+In all cases, a user should be able to override the certificates issued by
+the self-signed private certificate authority and provide their own
+certificate, which may have been issued by a publicly trusted certificate
+authority.
+
+
+Alternatives
+------------
+
+None. Internal communications should be protected, and TLS is an appropriate
+and well-established solution.
+
+
+Playbook/Role impact
+--------------------
+
+Roles:
+
+* Support for generating certificates using the ansible-role-pki role will be
+  added to each service
+
+* Configuration to enable or disable TLS will be added
+
+
+Upgrade impact
+--------------
+
+TLS could be enabled during or after an upgrade.
+
+As discussed in the problem description section, enabling TLS on the internal
+haproxy VIP and service backends of an existing deployment will cause downtime
+during the upgrade. This is because, for both the internal client =>
+internal haproxy VIP (server) connection and the haproxy (client) =>
+OpenStack service backend (server) connection, the client and the server
+need to be switched to TLS at the same time.
+
+To mitigate this issue I propose an intermediate step during an upgrade, in
+which the keystone frontend accepts both HTTP and HTTPS connections.
+This would be achieved by adding a new TCP frontend to haproxy that accepts
+both HTTP and HTTPS traffic and routes each to the correct frontend.
+OpenStack clients can then carry on using the same well-known port, while
+haproxy takes care of routing them to the correct frontend, HTTP or HTTPS.
+
+To mitigate issues with haproxy<>backend communication, I suggest implementing
+the "Separated Haproxy Service Config" feature [1], which configures an
+OpenStack service and its haproxy service in the same playbook.
+
+The other issue to be aware of is that when a user wants to use a predefined
+certificate, this certificate will be used on all VIPs, both internal and
+external.
+This means that if TLS is enabled on haproxy's internal VIP, internal clients
+must be able to trust the presented certificate if it is the same as the
+external certificate.
+This limitation does not apply to:
+
+- certbot, which can present a separate certificate on external interfaces
+- the PKI role, which installs different certificates for the external and
+  internal VIPs by default
+
+Security impact
+---------------
+
+This change will encrypt all internal communications, protecting any sensitive
+data in transit, and therefore improves security.
+
+
+Performance impact
+------------------
+
+Implementing TLS on all internal communications will lead to a small increase
+in the processing requirements and latency of servers and clients, but the
+security benefit outweighs this cost.
+
+
+End user impact
+---------------
+
+None, if the deployment is done correctly.
+
+
+Deployer impact
+---------------
+
+* Deployers will need to monitor certificate expiry dates and renew
+  certificates as necessary; if a certificate expires, connections between
+  services will fail.
+
+* This change should have no impact on deployers of new deployments;
+  OpenStack-Ansible will create the certificates, deploy them and
+  configure all services to use them.
+
+* This change will impact existing deployments, and an upgrade process will be
+  implemented to minimise and, where possible, prevent downtime.
+
+
+Developer impact
+----------------
+
+No impact, other than that traffic will be encrypted, meaning tools like
+tcpdump may prove less useful as they will not be able to see the contents
+of packets.
+
+
+Dependencies
+------------
+
+None.
+
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  Damian Dabrowski
+
+
+
+Work items
+----------
+
+* Add TLS support to the UWSGI role
+
+* Add TLS backend support to the haproxy role
+
+* Add configuration to OpenStack services that use UWSGI to create a TLS
+  certificate and enable TLS on UWSGI
+
+* Add configuration to the remaining OpenStack services that do not use UWSGI
+  to enable TLS support
+
+* Add configuration in OpenStack-Ansible to allow TLS for all services to be
+  enabled on both the server and haproxy
+
+* Update documentation on TLS configuration options
+
+* Add documentation for the upgrade procedure
+
+* Add a script to automate as much of the upgrade as possible
+
+
+Testing
+=======
+
+These changes can be tested using the existing setup, but manual testing of
+the upgrade procedure will be required to make sure it does not cause any
+downtime, as the automated testing only confirms a working upgrade at the end.
+
+
+Documentation impact
+====================
+
+As this change will add extra configuration options, these will need to be
+documented.
+
+The upgrade procedure for existing deployments will also have to be
+documented, because if this functionality is not deployed correctly it may
+cause system disruption.
+
+
+References
+==========
+
+[1] https://specs.openstack.org/openstack/openstack-ansible-specs/specs/antelope/separated-haproxy-service-config.html