Commit Graph

277 Commits (65e132a734f005f090a384bfa129482d195c6d6e)

Author SHA1 Message Date
Michael Johnson 06ce4777c3 Fix multi-listener load balancers
Load balancers with multiple listeners, running on an amphora image
with HAProxy 1.8 or newer can experience excessive memory usage that
may lead to an ERROR provisioning_status.
This patch resolves this issue by consolidating the listeners into
a single haproxy process inside the amphora.

Story: 2005412
Task: 34744
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: Idaccbcfa0126f1e26fbb3ad770c65c9266cfad5b
4 years ago
Zuul 954025cbc3 Merge "Fix a python3 issue in the amphora-agent" 4 years ago
Michael Johnson dc459e2213 Fix a python3 issue in the amphora-agent
An exception handler in the amphora-agent has a python3 string
comparison bug that will cause a TypeError.
This patch fixes that bug and adds test coverage for the
start_stop_listener.

Change-Id: I6f5d95c5f875edda530f54ae72386d6495235ca6
Story: 2005898
Task: 33760
4 years ago
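The class of python3 string comparison bug named in dc459e2213 can be illustrated with a minimal sketch. The message text and the helper name `already_running` are hypothetical and are not taken from the amphora-agent; the sketch only assumes that the compared value arrives as bytes, as subprocess output does on Python 3.

```python
# Hypothetical illustration: a str-in-bytes membership test raises TypeError
# on Python 3, while decoding first makes the comparison safe.
output = b"Job is already running"  # assumed example value, not from the agent


def already_running(raw):
    """Decode bytes before comparing so the check works on Python 3."""
    text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
    return "already running" in text


try:
    _ = "already running" in output  # str compared against bytes
    raised = False
except TypeError:
    raised = True
```

Decoding at the boundary keeps the rest of the handler working with str only.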
German Eichberger 686303e79d Amphora logging
Configure rsyslog to forward logs to a target host

Co-Authored-By: Michael Johnson <johnsomor@gmail.com>
Story: 1665069
Task: 33646

Change-Id: I00703f86555cbb574b943794b14a36fbc644f1b2
4 years ago
Michael Johnson 80ddbaeef4 Align logging in the amphora
This patch configures the primary components of the amphora to log
to syslog using consistent logging facilities.
By default, user traffic logs will go to LOG_LOCAL0 and the amphora
processes (haproxy, keepalived, etc.) will log to LOG_LOCAL1.

This is a patch supporting log offloading.

Change-Id: Ifda91e0310e812e34f1e398dd3176af8a9c58f89
Story: 1665069
Task: 5486
4 years ago
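The facility split described in 80ddbaeef4 can be sketched with the standard library syslog handler. This is only an illustration: the dictionary keys, the `build_handler` helper and the target address are assumptions, and the amphora itself configures haproxy, keepalived and rsyslog rather than Python handlers.

```python
from logging.handlers import SysLogHandler

# Facilities named in the commit message; the keys are hypothetical labels.
FACILITIES = {
    "user_traffic": SysLogHandler.LOG_LOCAL0,
    "amphora_processes": SysLogHandler.LOG_LOCAL1,
}


def build_handler(kind, address=("127.0.0.1", 514)):
    """Return a UDP syslog handler tagged with the facility for `kind`."""
    return SysLogHandler(address=address, facility=FACILITIES[kind])
```

Keeping the two streams on separate facilities lets a syslog daemon route them to different targets.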
Zuul 59660fb365 Merge "Force amp-agent communication to TLSv1.2" 4 years ago
Zuul 09020b6bfc Merge "Add Python 3.7 support" 4 years ago
Zuul 6fda9ff945 Merge "Make sure amphora logging works the same on py2 and py3" 4 years ago
zhulingjie ff50886d79 Update hacking version to latest
This resolves extraneous "improper escape sequence" warnings on
python 3.6+[1].

Note, this does not resolve those warnings from pylint. There
is already another proposed patch to address pylint[2].

[1] https://review.opendev.org/494322
[2] https://review.opendev.org/635236

Change-Id: Ie160436913e4d935bab118d31ba10193ac38bd8f
4 years ago
Adam Harwell 5b831f2a5b Force amp-agent communication to TLSv1.2
Also allow configuration of this minimum.
The previous default of SSLv2/3 is very insecure.

Change-Id: If34c7c34d9a6a77685fb177976dc2070760c7b37
4 years ago
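A configurable TLS minimum, as described in 5b831f2a5b, might look like the following sketch. It uses the Python 3.7+ `ssl.TLSVersion` API, which is an assumption made for illustration and not necessarily how the patch enforces the version; `make_client_context` is a hypothetical name.

```python
import ssl


def make_client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Build a client context that refuses anything older than `minimum`."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
    context.minimum_version = minimum  # the configurable floor
    return context
```

Passing the minimum as a parameter mirrors the commit's point that the floor should be configurable rather than hard-coded.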
Carlos Goncalves c4faac25de Add Python 3.7 support
In order to support Python 3.7, pylint has to be updated to 2.0.0
minimum. Newer versions of Pylint enforce additional checkers which can
be addressed with some code refactoring rather than silently ignoring
them in pylintrc; except useless-object-inheritance, which must be
silenced so that we stay compatible with Python 2.x.

Story: 2004073
Task: 27434

Change-Id: I52301d763797d619f195bd8a1c32bc47f1e68420
4 years ago
Erik Olof Gunnar Andersson 0000412cf4 Make sure amphora logging works the same on py2 and py3
Since Python 3.3, IOError is just an alias of OSError. This
causes logging in a very specific scenario to not log
the appropriate message, as one code path is unreachable.
This patch fixes this by merging the two exception paths.

Story: 2005576
Task: 30765

Change-Id: Ie81de8e85753fde1516aea0b084df6a0c513ad7b
4 years ago
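The aliasing behind 0000412cf4 is easy to check on Python 3. The sketch below shows a single merged exception path; `read_config` and its message are hypothetical and stand in for the amphora logging code.

```python
def read_config(path):
    """Read a file, reporting any I/O failure through one exception path."""
    try:
        with open(path) as handle:
            return handle.read()
    except OSError as exc:  # IOError is the same class on Python 3.3+
        return "could not read {}: {}".format(path, exc.strerror)
```

With separate `except IOError` and `except OSError` clauses, the second clause could never run on Python 3, which is the unreachable path the commit describes.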
Zuul fd816a72e7 Merge "Fix IPv6 in Active/Standby topology on CentOS" 4 years ago
Michael Johnson 41ff43131f Fix the amphora base port coming up
A recent change[1] broke the base port IP address coming up in the
amphora. This would cause active/standby and single topology amphora
with members on the VIP subnet to fail.

This patch resolves this issue by not flushing the eth1:0 address.

Story: 2005383
Task: 30368

[1] https://review.openstack.org/#/c/648504/

Change-Id: I52e7e9f172b7783bae09be76cc137f4e7198165f
4 years ago
Gregory Thiemonge 951afb9a0a Fix IPv6 in Active/Standby topology on CentOS
Avoid duplicated IPv6 VIP addresses on CentOS in Active/Standby mode;
Keepalived will set/unset the VIP address.

Story: 2005365
Task: 30340

Change-Id: I05b31ba628bafeefec36cc5c000dae1aefb63d67
4 years ago
Carlos Goncalves 95a872fcd9 Fix VIP plugging on CentOS-based amphorae
ifup does not provide option -v in CentOS-based amphorae.
This option was added in I0dbb145ab9a0bb8f831c1db28cabd262f9394e7e.

Story: 2005341
Task: 30288

Change-Id: I56947e0d2bb207b59b0b3928efc96546d6410f43
4 years ago
Michael Johnson 23a411413f Fix ifup failures on member interfaces with IPv6
When an older version of ifup is used, there are cases where bringing
up an IPv6 address on an interface will fail with "RTNETLINK answers:
File exists".
This patch corrects this issue by bringing the interface up and
flushing the existing addresses prior to the ifup.
This restores a previous behavior of the ifdown/ifup commands.

Change-Id: I0dbb145ab9a0bb8f831c1db28cabd262f9394e7e
Story: 2005320
Task: 30248
4 years ago
Zuul 93baf20b7d Merge "Resolve amphora agent read timeout issue" 4 years ago
ZhaoBo 7aa115a553 Add 2 new fields into Pool API for support re-encryption
Add tls_ca_container_id and crl_container_id into Pool API.

Story: 2003858
Task: 26672
Co-Authored-By: Michael Johnson <johnsomor@gmail.com>
Change-Id: I6cd6e2ca8e48a5df707a70d22505dec9d752c7eb
4 years ago
ZhaoBo aa7ac7ab73 Pool support sni cert for backend re-encryption
Add one field, 'tls_container_ref', as Listener already has. This
field is introduced into Pool to store the client certificate that the
pool presents to the backend servers when traffic must carry a cert to
the servers for the TLS connection check.

Story: 2003859
Task: 26685
Change-Id: I29b7c7116e6087c942179ed9efdead494ef277a3
4 years ago
ZhaoBo 20509e2337 Add crl-file option for certification
Add the crl-file option on the Listener side.

Story: 2002165
Co-Authored-By: Michael Johnson <johnsomor@gmail.com>
Change-Id: I9e2ec06719fbbfd19482c2b8d39220e7e4ed81e3
4 years ago
ZhaoBo 0cc546a7c7 Add client_ca_tls_container_ref to listener API
This patch adds 'client_ca_tls_container_ref' to the listener API for
front-end client authentication.

Story: 2002165
Task: 20018
Co-Authored-By: Michael Johnson <johnsomor@gmail.com>
Change-Id: I8a96d6fdfe53a16d1abcfd09bc6afedd6c490de2
4 years ago
zhulingjie 2a057474a8 Update json module to jsonutils
json is deprecated; oslo_serialization.jsonutils should be used
instead.

Change-Id: I1392004e32cc835e803c9a953b4581c75049b950
4 years ago
Michael Johnson 0f0aa02161 Resolve amphora agent read timeout issue
Occasionally the test jobs[1] will fail with:
octavia.amphorae.drivers.haproxy.rest_api_driver [-]
Could not connect to instance. Read timed out. (read timeout=120.0)

This patch increases the default read timeout to 180 and changes the
directory copy that would subsequently fail to be more idempotent.

[1] http://logs.openstack.org/09/613709/14/check/ \
octavia-v2-dsvm-scenario-two-node/d83db12/controller2/logs/ \
screen-o-cw.txt.gz#_Feb_08_21_58_23_919928

Change-Id: Ia0bd6762c2605ce240a549b3e90e5c44b65897a5
4 years ago
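One way to make a directory copy idempotent, in the spirit of 0f0aa02161, is sketched below. It relies on `shutil.copytree(..., dirs_exist_ok=True)`, which needs Python 3.8+ and is an assumption for illustration, not the change the patch made; the helper name and the demo paths are hypothetical.

```python
import os
import shutil
import tempfile


def copy_tree_idempotent(src, dst):
    """Copy src into dst; succeeds even when dst already exists."""
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return sorted(os.listdir(dst))


# Demo with throwaway directories so the sketch is self-contained.
source = tempfile.mkdtemp()
with open(os.path.join(source, "example.cfg"), "w") as handle:
    handle.write("placeholder\n")
target = os.path.join(tempfile.mkdtemp(), "copy")

first = copy_tree_idempotent(source, target)
second = copy_tree_idempotent(source, target)  # repeat call does not raise
```

A retry after a read timeout can then repeat the copy without failing on the already-present directory.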
Michael Johnson 5d7f10f6b8 Fix flavors support when using spares pool
This patch validates that a flavor is compatible with using spares
pool amphora. It will also update the amphora-agent config after
a spares pool amphora has been allocated.

This patch enables the ability to update a running amphora's agent
configuration and have the mutable options be adopted.

The following amphora agent configuration options can be updated:
heartbeat_key
controller_ip_port_list
heartbeat_interval
loadbalancer_topology

This patch adds the support to the amphora-agent and the amphora
driver. A follow-on patch will expose this capability via the
amphora admin API.

Change-Id: I97bdf5188808193516509f20767e82c0f8d2f5a5
4 years ago
Michael Johnson 987d14406c Fix the amphora noop driver
The dual-amp-down fix added an amphora parameter to the amphora driver
interface, but failed to update the driver base and the noop driver.
This patch corrects that oversight.

Depends-On: https://review.openstack.org/634992
Change-Id: I7bd63c933f8e7cd10ff5c89fafbbb09e8cc9e3e1
4 years ago
Michael Johnson 8ec61f4f23 Fix a topology bug in flavors support
This patch fixes an oversight in the addition of flavors support in the
amphora driver[1]. The amphora-agent configuration file was still getting
the topology selected in the configuration file as opposed to the
topology selected in the flavor.

This is an additional patch at the end of the chain as it leverages
changes that were made in later flavor patches that pass the flavor
into the taskflow flow.

A follow-on patch will address spares pool amphora.

[1] https://review.openstack.org/#/c/621323

Change-Id: I4c2b398b562970f128e06794690ffd7c2977db08
4 years ago
Hang Yang b2162c39a2 Fix prefix for vip_ipv6
Currently we calculate prefix based on netmask when writing the vip
interface file. Since netmask has been converted to prefix in ipv6,
this patch will avoid converting it to prefix twice which could
result in a wrong prefix length.

Also fix a bug in another test that relies on osutils, but wasn't
mocking correctly.

Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: I9ee0cce12a975f4ab8f3df2707b355aab35c6cb3
4 years ago
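The double-conversion problem described in b2162c39a2 can be avoided by converting only values that are still netmasks. The helper below is a hypothetical sketch, not the osutils code: it assumes the value is either an IPv4 dotted netmask or an already-computed prefix length.

```python
import ipaddress


def to_prefix_length(value):
    """Return a prefix length from a netmask or an existing prefix."""
    text = str(value)
    if text.isdigit():
        return int(text)  # already a prefix (the IPv6 case); do not convert
    # Dotted IPv4 netmask: let ipaddress derive the prefix length.
    return ipaddress.ip_network("0.0.0.0/" + text).prefixlen
```

Checking the input form first means a value such as 64 passes through unchanged instead of being treated as a netmask a second time.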
Zuul 24f40a80a1 Merge "Add missing ws separator between words" 5 years ago
Michael Johnson b616cd5a33 Bring up secondary IPs on member networks
Octavia is plugging member networks, but only bringing up the first
fixed IP address on that network. This can mean that a secondary
fixed IP on the network, such as the IPv6 address, is not brought
up at member creation time.

Change-Id: Ic5b19a303e53ab62875c4fc4be6ac03f926a6832
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Story: 2004113
Task: 27535
5 years ago
Nir Magnezi fbb9397979 Fix IPv6 in Active/Standby topology
Load balancers with IPv6 VIP addresses would fail to create due to
a duplicate address detection issue. The keepalived process would also
crash with a segfault due to a known bug[1].

This patch resolves both issues and allows load balancers with IPv6
VIP addresses to be created in active/standby topology.

[1] https://github.com/acassen/keepalived/issues/457

Story: 2003451
Task: 24657
Co-Authored-By: Michael Johnson <johnsomor@gmail.com>

Change-Id: I15a4be05740e2657f998902d468e57763c3ed52e
5 years ago
zhufl 5bdc67c1f9 Add missing ws separator between words
This is to add missing ws separator between words.

Change-Id: Ib1f553b401092c97fef9cc3c48a6c1c80d7c4f4b
5 years ago
Michael Johnson 63b5cfc14e Fix VIP plug failure if netns directory exists
In rare cases, the network namespace may already exist in the amphora.
VIP plug should not fail if the directory is already present.

Change-Id: I33c2e1740bff1313ba6b8d3ef2ea4fe494263751
Story: 2004300
Task: 27856
5 years ago
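Tolerating an existing directory, as 63b5cfc14e requires for VIP plug, can be sketched with `os.makedirs(..., exist_ok=True)`. The helper name and the temporary demo path are hypothetical; the real netns directory location is not shown here.

```python
import os
import tempfile


def ensure_directory(path):
    """Create path if needed; an existing directory is not an error."""
    os.makedirs(path, exist_ok=True)
    return os.path.isdir(path)


# Demo under a throwaway root instead of a real netns location.
demo_root = tempfile.mkdtemp()
target = os.path.join(demo_root, "netns", "example-namespace")
first = ensure_directory(target)
second = ensure_directory(target)  # second call must not raise
```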
Zuul 3cf910baa8 Merge "Simplify keepalived lvsquery parsing for UDP" 5 years ago
Zuul 687a0e8472 Merge "Separate the thread pool for health and stats update" 5 years ago
Michael Johnson 9b6aa47c03 Fix an upgrade issue for CentOS 7 amphora
A recent patch[1] (stein master) added the http-reuse option to the
haproxy template for pools. This feature is not available in the HAProxy
version included with CentOS 7, 1.5.x. This could cause an upgrade issue
if the control plane was upgraded to Stein, but the cloud still had older
CentOS based amphora.

This patch corrects that issue by checking the HAProxy version in the
amphora and adjusting the template if it finds an older HAProxy.

This patch also updates the test_health_check_stale_amphora test to
not wait (sleep) for the full heartbeat_timeout.

[1] https://review.openstack.org/#/c/598379/

Change-Id: I3d990d1d3cd93dbeced9edc53f9c166610dafcd0
Story: 2003901
Task: 26775
5 years ago
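The version check described in 9b6aa47c03 can be sketched as a simple comparison. The function names are hypothetical, and the (1, 6) threshold is an assumption for illustration: the commit only states that http-reuse is unavailable in HAProxy 1.5.x.

```python
def parse_version(text):
    """Turn '1.5.18' into (1, 5); only major and minor are compared."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def supports_http_reuse(haproxy_version, minimum=(1, 6)):
    """Return True when the template may emit the http-reuse option."""
    return parse_version(haproxy_version) >= minimum
```

A template renderer could consult such a flag and leave the option out for amphorae running the older HAProxy.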
Adam Harwell 8e886959c0 Simplify keepalived lvsquery parsing for UDP
The healthcheck parsing can be simplified a lot, which also fixes some
testing issues on OSX.
Fix a missing mock in the amphora server tests too, which was causing
the centos distro tests to run improperly / miss.

Change-Id: I7606822b820b501cd983e8975bcb3b7d5c58d904
5 years ago
Michael Johnson 2170cc6c45 Update amphora-agent to report UDP listener health
Currently the amphora-agent is not reporting UDP listener health
when the UDP listener does not have a pool and members.
This patch changes that behavior to report the listener as healthy
if the keepalived process is started and running in the amphora.

This patch also introduces message versioning for the health
heartbeat messages.

It also corrects a few assertEqual tests that had the reference and
actual values backwards.

Change-Id: Ifc28b4991852e59c0d27b4ab3d1afc4e9965e88b
Story: 2003592
Task: 24911
5 years ago
Michael Johnson c8074cd18a Fix the amphora noop driver
The amphora no-op driver did not get updated properly for the multi-amphora
failover fix.
This patch fixes that issue and corrects the doc strings for the
haproxy amphora driver update_amphora_listeners method.

Change-Id: Ib0d63da7c5599069f5ea50f0dfbc59eefba58c84
5 years ago
Tatsuma Matsuki ad69363fc7 Separate the thread pool for health and stats update
When the queue_event_streamer driver is used and RabbitMQ
is down, stats update processes occupy the thread pool
that is shared with health update processes. As a result,
a RabbitMQ outage unexpectedly leads to the deletion of all
existing amphorae. This commit separates the thread pools and
aims to keep the existing amphorae working even when RabbitMQ
is down.

Change-Id: I576687f5b646496ff3a00787cf5e8c27f36b9448
Task: 22929
Story: 2002937
5 years ago
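The separation described in ad69363fc7 can be sketched with two independent executors. Everything here is illustrative: the pool sizes, function names and message shape are assumptions, not the health manager's actual code.

```python
from concurrent.futures import ThreadPoolExecutor

# One pool per concern, so slow stats updates cannot starve health updates.
health_pool = ThreadPoolExecutor(max_workers=2)
stats_pool = ThreadPoolExecutor(max_workers=2)


def update_health(message):
    """Placeholder for the health update work."""
    return ("health", message["id"])


def update_stats(message):
    """Placeholder for the stats update work (may block on a queue)."""
    return ("stats", message["id"])


def handle_heartbeat(message):
    """Dispatch each kind of update to its own pool."""
    health_future = health_pool.submit(update_health, message)
    stats_future = stats_pool.submit(update_stats, message)
    return health_future, stats_future
```

If every stats worker blocks, the health pool still has free threads, so heartbeats keep being recorded.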
Zuul 4d867f623d Merge "Remove user_group option" 5 years ago
Michael Johnson 1f73119b7c Fix Octavia for host host routes
If the subnet attached to an Octavia load balancer had a host route defined
that was actually a host, the load balancer would go into ERROR.
This patch fixes that issue by checking the host route and handling the
netns route additions properly.

Change-Id: I95e8ed377d4ed12aab4ecb2142896b13a9b21079
Story: 2003441
Task: 24637
5 years ago
Nir Magnezi 100858fa79 Remove user_group option
In Pike[1], we introduced a user_group auto detection for haproxy.
The default user group name is auto-detected for any OS distribution
we support as a base for Amphorae.

user_group remained as an option for admins but was also
marked deprecated in Pike[2].

This patch removes that option altogether.

Story: 2003323
Task: 24357

[1] Ia8fede9d7da4709a48661d1fc595a16d04fcbfa9
[2] https://review.openstack.org/#/c/429398/45/octavia/common/config.py@175

Change-Id: Iddd4162674f116705d2b47062cbf7ca88f2677a6
5 years ago
Michael Johnson cc97397d1c Followup patch for UDP support
1. Removes the misc_dynamic setting from the UDP-CONNECT health monitor
   as our script does not use it.
2. Adds a release note for the UDP features.
3. Updates the API reference for UDP support.
4. Adds a comment to the keepalived config with the LB ID.
5. Updates the status message type to be the correct UDP protocol.
6. Fixes an error when deleting a listener if there are multiple amphorae.
7. Refactors systemd service script handling.

Story: 2003306
Task: 24258
Change-Id: I09240023d066ac5a71836d01045cda6ce5678712
5 years ago
ZhaoBo a890f2ba35 UDP for [2]
These files are kept separate from the rest of the current Octavia repo
until the other parts are ready.

Patch List:

[1] Finish keepalived LVS jinja template for UDP support
[2] Extend the amp agent to upload/refresh the keepalived
process
[3] Extend the db model and db table with the fields needed for the new
UDP backend
[4] Add logic/workflow element processing for UDP cases
[5] Extend the existing API to access UDP parameters in the Listener API
[6] Extend the existing pool API to access the new option in the
session_persistence fields

Change-Id: Ib4924e602d450b1feadb29e830d715ae77f5bbfe
5 years ago
ZhaoBo 008ccb652d UDP jinja template
This is the jinja template[1] for keepalived to enable LVS configuration.
It also includes some functions that transform objects into the rendered
configuration.

These files are kept separate from the rest of the current Octavia repo
until the other parts are ready.

Patch List:

[1] Finish keepalived LVS jinja template for UDP support
[2] Extend the amp agent to upload/refresh the keepalived
process
[3] Extend the db model and db table with the fields needed for the new
UDP backend
[4] Add logic/workflow element processing for UDP cases
[5] Extend the existing API to access UDP parameters in the Listener API
[6] Extend the existing pool API to access the new option in the
session_persistence fields

Story: 1657091
Task: 23208
Change-Id: Ib23edb7190ffb777e4a95f45a253e8a632beb046
5 years ago
Zuul fbefbcd843 Merge "Fix failover when multiple amphora have failed" 5 years ago
Zuul b483af1ebb Merge "Automatically set Barbican ACLs" 5 years ago
Adam Harwell c3813d9313 Automatically set Barbican ACLs
Story: 2002973
Task: 22981

Co-Authored-By: Carlos Goncalves <cgoncalves@redhat.com>

Change-Id: I51121c599f19a91a6755571abf1c6bd854e7d50f
5 years ago
Michael Johnson 0139f12c2e Fix failover when multiple amphora have failed
If a load balancer loses more than one amphora at the same time
the failover process will fail and leave the load balancer in
provisioning status ERROR.

This patch resolves this by failing over one amphora at a time and
marking any other failed amphora with status ERROR. The health
manager will then failover the other failed amphora in subsequent checks.

This patch will update multiple healthy amphora in parallel and will
timeout failed amphora using the new "active_connection_max_retries"
configuration setting used for "fail-fast" connections.

The patch also updates the amphora failover flow documentation to
show the full flow and not just the spares failover flow.

It updates the amphora driver "get_diagnostics" method to pass instead
of error.

It also adds an AmphoraComputeConnectivityWait task to explicitly wait
for a compute instance to come up and be reachable. This allows a longer
timeout and clarifies this may fail due to compute (nova) failures.
Previously the first plug vip task would do this wait.

Change-Id: Ief97ddda8261b5bbc54c6824f90ae9c7a2d81701
Story: 2001481
Task: 6202
5 years ago