octavia/releasenotes/notes/fix-multi-amp-down-failover-952618fb8d3d8ae6.yaml
Michael Johnson 0139f12c2e Fix failover when multiple amphora have failed
If a load balancer loses more than one amphora at the same time
the failover process will fail and leave the load balancer in
provisioning status ERROR.

This patch resolves this by failing over one amphora at a time
marking any amphora that are also failed in status ERROR. The health
manager will then failover the other failed amphora in subsequent checks.

This patch will update multiple healthy amphora in parallel and will
timeout failed amphroa using the new "active_connection_max_retries"
configuration setting used for "fail-fast" connections.

The patch also updates the amphora failover flow documentation to
show the full flow and not just the spares failover flow.

It updates the amphora driver "get_diagnostics" method to pass instead
of error.

It also adds a AmphoraComputeConnectivityWait task to explicitly wait
for a compute instance to come up and be reachable. This allows a longer
timeout and clarifies this may fail due to compute (nova) failures.
Previously the first plug vip task would do this wait.

Change-Id: Ief97ddda8261b5bbc54c6824f90ae9c7a2d81701
Story: 2001481
Task: 6202
2018-07-22 16:08:45 -07:00

6 lines
166 B
YAML

---
fixes:
- |
Fixes an issue where if more than one amphora fails at the same time,
failover might not fully complete, leaving the load balancer in ERROR.