We've noticed that some web and AI crawlers are crawling each backend
directly. This is undesirable for two reasons. First, search indexes
can end up returning non-canonical URLs (an alternative approach to
address this can be found in
https://review.opendev.org/c/opendev/system-config/+/962826). Second,
specific backends may be targeted and overloaded without the load
balancer being aware. Forcing all communication through the load
balancer should help ensure that load is more evenly distributed across
all backends.

We do lose the ability to trivially test individual backends, which
has been particularly helpful during backend upgrades for verifying
early that the first upgraded backend is healthy. Instead we'll need to
use ssh -L and /etc/hosts overrides so that HTTPS certs match for
proxied connections.
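
As a rough illustration of the ssh -L plus /etc/hosts approach, here is
a minimal dry-run sketch. It only prints the commands. The address
203.0.113.10, the name gitea99.example.org, and port 3081 are
hypothetical placeholders, and it assumes the backend still accepts
connections on localhost.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands rather than running them.
# All hostnames, addresses, and the port below are hypothetical.

# Print an ssh command that forwards a local port to the same port on
# the backend. $1 = ssh target (using an IP address keeps ssh itself
# unaffected by the /etc/hosts override), $2 = backend service port.
tunnel_cmd() {
    printf 'ssh -N -L %s:localhost:%s %s\n' "$2" "$2" "$1"
}

# Print the /etc/hosts line that points a certificate name at the local
# end of the tunnel. $1 = a name covered by the backend's certificate.
hosts_line() {
    printf '127.0.0.1 %s\n' "$1"
}

tunnel_cmd 203.0.113.10 3081
hosts_line gitea99.example.org
```

With the tunnel running and the override in place, a request to
https://gitea99.example.org:3081/ would travel through the tunnel while
the certificate still matches the requested name. The override should
be removed once testing is done.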

Given these tradeoffs, consider this change a request for comment. I
would appreciate any feedback on this proposal.

Note that the haproxy configuration for the test gitea lb is updated to
use the same IP addresses as the iptables rules. In a system-config
Ansible context that is host.public_v4, which run-base.yaml sets to
nodepool.private_ipv4 in the Zuul Ansible context. This is necessary
now that these ports only accept traffic from the addresses in the
iptables rules.

Change-Id: Ib910f2d5c70c4462363efc4c7ed3a8e7e44b36bc