Separated Haproxy Service Config
#################################
:date: 2023-01-19 22:00
:tags: separated haproxy service config, internal tls
Currently, all haproxy services are configured during execution of the
haproxy_server role.
This may cause issues with variable scope and may completely break a service
until the corresponding service role is executed.
These issues can be avoided if the current behavior is changed so that haproxy
services are configured separately at the beginning of each service playbook.
Problem description
===================
Preconfiguring all haproxy services may lead to several issues.
There are two examples:
1. Variables scope
Currently, the service definitions for the haproxy role stored in
inventory/group_vars/haproxy.yml contain variables like `neutron_plugin_type`.
This is a problem because it is neutron's variable.
If someone wants to change its default value, they will probably set an
override in the neutron group variables. That is problematic because haproxy
is not in the neutron group, so neutron's group variables have no effect on
the haproxy role. To make haproxy respect the change, the variable needs to
be defined for all hosts so that both haproxy and neutron have access to it.
Additionally, we are currently working on encrypting traffic between haproxy
and service backends. The proposed PoC [1] follows the same pattern described
above: it makes use of the `glance_backend_https` variable, which belongs to
the glance role.
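For illustration, the neutron case above could look roughly as sketched below.
The service keys, the condition and the override path are assumptions used
only to show where each variable is evaluated, not the exact file contents.

.. code-block:: yaml

   # inventory/group_vars/haproxy.yml -- current layout (abridged; keys and
   # condition are illustrative)
   haproxy_service_configs:
     - service:
         haproxy_service_name: neutron_server
         # neutron's variable is referenced from haproxy group vars, so only
         # its default value is in scope for the haproxy role
         haproxy_backend_nodes: "{{ [] if neutron_plugin_type == 'ml2.ovn' else groups['neutron_server'] | default([]) }}"

   # /etc/openstack_deploy/group_vars/neutron_all.yml -- a typical operator
   # override; it applies to the neutron group only, so hosts in the haproxy
   # group never pick it up
   neutron_plugin_type: ml2.ovn
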
2. Strong dependency between haproxy role and service roles
Some changes to a haproxy service require an immediate reaction from the
service role.
For example, a user enables TLS for communication between haproxy and glance
backends. To do that, the haproxy role needs to be executed.
It will configure the glance service in haproxy to communicate with its
backends over TLS, but at this point the backends are not yet ready to handle
TLS connections.
To fix this, the glance role needs to be executed, but that takes time and
increases downtime. Removing dependencies like this between roles would make
the configuration process more reliable.
Please note that downtime will still occur. It will start after the haproxy
service config is applied and finish after the first backend host is
configured.
To provide a zero-downtime transition to TLS, further work related to the
"internal TLS" project is required.
Proposed change
===============
Add an extra step at the beginning of each service playbook to configure the
haproxy service(s) for it.
This way haproxy services will be configured separately, so the nova playbook
will configure the nova haproxy services, the glance playbook will configure
its own haproxy services, and so on.
The haproxy playbook will only be responsible for configuring services not
related to any role (letsencrypt, ceph-rgw, custom user-defined services etc.).
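A minimal sketch of what this extra step could look like is shown below. The
play layout and the `glance_haproxy_services` variable name are assumptions
for illustration; the exact interface will be settled during implementation.

.. code-block:: yaml

   # os-glance-install.yml -- sketch of the proposed extra step
   - name: Configure haproxy service for glance
     hosts: haproxy
     roles:
       - role: haproxy_server
         # limited to glance's own service definition(s); how these are
         # sourced so that glance variables stay in scope is an
         # implementation detail
         haproxy_service_configs: "{{ glance_haproxy_services | default([]) }}"

   - name: Install glance server
     hosts: glance_all
     roles:
       - role: os_glance
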
Alternatives
------------
No alternatives.
Playbook/Role impact
--------------------
Each service playbook will contain an extra step responsible for configuring
the haproxy service(s) for it.
Upgrade impact
--------------
Some variables may become deprecated or their behavior may change.
Security impact
---------------
No impact.
Performance impact
------------------
No impact.
End user impact
---------------
No impact.
Deployer impact
---------------
From now on, haproxy services will be configured separately when running
service playbooks (like os-nova-install.yml).
Developer impact
----------------
No impact.
Dependencies
------------
No dependencies.
Implementation
==============
Assignee(s)
-----------
Primary assignee:
Damian Dabrowski
<damian@dabrowski.cloud>
Work items
----------
- configure the haproxy_server role to support separated service config
- configure service playbooks to include an extra step that configures the
  haproxy service(s) (a possible layout for the per-service definitions is
  sketched below)
- solve all corner cases (like the dependency between letsencrypt and horizon)
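One possible layout for those per-service definitions is to keep them next to
the owning service's variables, so that variables like `glance_backend_https`
are naturally in scope. The file path, variable name and keys below are
illustrative assumptions, not a final decision.

.. code-block:: yaml

   # e.g. inventory/group_vars/glance_all/haproxy_service.yml (illustrative
   # path, variable name and keys)
   glance_haproxy_services:
     - service:
         haproxy_service_name: glance_api
         haproxy_port: 9292
         # glance's own variable can be referenced here, unlike in the
         # current haproxy group vars
         haproxy_backend_ssl: "{{ glance_backend_https | default(False) }}"
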
Testing
=======
Special attention is required for gating. Merging this change for all
roles may be complicated.
Documentation impact
====================
Documentation needs to be double-checked. It will certainly need to be updated
in a few places.
References
==========
[1] https://review.opendev.org/c/openstack/openstack-ansible/+/821090