This patch refactors the failover flows to improve the performance
and reliability of failovers in Octavia.
Specific improvements are:
* More tasks and flows will retry when other OpenStack services are
failing.
* Failover can now succeed even when all of the amphora are missing
for a given load balancer.
* It will check and repair the load balancer VIP should the VIP
port(s) become corrupted in neutron.
* It will clean up extra resources that may be associated with a
load balancer in the event of a cloud service failure.
This patch also removes some dead code.
Change-Id: I04cb2f1f10ec566298834f81df0cf8b100ca916c
Story: 2003084
Task: 23166
Story: 2004440
Task: 28108
Now that we are python3 only, we should move to using the built-in
version of mock, which supports all of our testing needs, and
remove the dependency on the "mock" package.
This patch changes all occurrences of "import mock" to
"from unittest import mock". It also cleans up some newline
inconsistencies.
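A minimal sketch of what the import change looks like in a test module. The fetch_status helper and check_fetch_status function are hypothetical and exist only for this illustration; they are not part of Octavia.

```python
from unittest import mock  # replaces the third-party "import mock"


def fetch_status(client):
    # Hypothetical helper, used only to have something to test.
    return client.get_status().upper()


def check_fetch_status():
    # The standard library mock offers the same MagicMock API that the
    # external "mock" package provided.
    client = mock.MagicMock()
    client.get_status.return_value = "active"
    result = fetch_status(client)
    client.get_status.assert_called_once_with()
    return result
```

Only the import line differs from a test written against the external package; the mock objects are used the same way.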
Change-Id: I72520a2ca010c2c27315d9dff839a4f9d7540b6b
The amphora no-op driver had the wrong method signature for the
update_amphora_agent_config method.
This patch corrects that issue.
Change-Id: Ib1b0df3b7227d8a8dd68276e279cae1c4974ded2
Load balancers with multiple listeners running on an amphora image
with HAProxy 1.8 or newer can experience excessive memory usage that
may lead to an ERROR provisioning_status.
This patch resolves this issue by consolidating the listeners into
a single haproxy process inside the amphora.
Story: 2005412
Task: 34744
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: Idaccbcfa0126f1e26fbb3ad770c65c9266cfad5b
This patch validates that a flavor is compatible with using spares
pool amphora. It will also update the amphora-agent config after
a spares pool amphora has been allocated.
This patch makes it possible to update a running amphora's agent
configuration and have the mutable options take effect.
The following amphora agent configuration options can be updated:
* heartbeat_key
* controller_ip_port_list
* heartbeat_interval
* loadbalancer_topology
This patch adds support to the amphora-agent and the amphora
driver. A follow-on patch will expose this capability via the
amphora admin API.
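As a rough illustration of the idea, the sketch below applies only the four options named above to a running configuration and ignores everything else. The apply_mutable_options function and the plain-dict configuration are assumptions made for this example; they do not show how the amphora-agent actually reloads its configuration.

```python
# Option names taken from the list in this commit message.
MUTABLE_OPTIONS = frozenset([
    "heartbeat_key",
    "controller_ip_port_list",
    "heartbeat_interval",
    "loadbalancer_topology",
])


def apply_mutable_options(running_config, new_config):
    """Return (updated_config, changed_names); hypothetical helper."""
    updated = dict(running_config)
    changed = []
    for name in sorted(MUTABLE_OPTIONS):
        if name in new_config and new_config[name] != running_config.get(name):
            updated[name] = new_config[name]
            changed.append(name)
    # Options outside MUTABLE_OPTIONS keep their running values.
    return updated, changed
```

For example, a new heartbeat_interval is adopted while any option outside the list keeps its running value.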
Change-Id: I97bdf5188808193516509f20767e82c0f8d2f5a5
The dual-amp-down fix added an amphora parameter to the amphora driver
interface, but failed to update the driver base and the noop driver.
This patch corrects that oversight.
Depends-On: https://review.openstack.org/634992
Change-Id: I7bd63c933f8e7cd10ff5c89fafbbb09e8cc9e3e1
The amphora no-op driver did not get updated properly for the multi-amphora
failover fix.
This patch fixes that issue and corrects the docstrings for the
haproxy amphora driver update_amphora_listeners method.
Change-Id: Ib0d63da7c5599069f5ea50f0dfbc59eefba58c84
If a load balancer loses more than one amphora at the same time,
the failover process will fail and leave the load balancer in
provisioning status ERROR.
This patch resolves this by failing over one amphora at a time and
marking any other failed amphora with status ERROR. The health
manager will then fail over the other failed amphora in subsequent checks.
This patch will update multiple healthy amphora in parallel and will
time out failed amphora using the new "active_connection_max_retries"
configuration setting used for "fail-fast" connections.
The patch also updates the amphora failover flow documentation to
show the full flow and not just the spares failover flow.
It updates the amphora driver "get_diagnostics" method to pass instead
of raising an error.
It also adds an AmphoraComputeConnectivityWait task to explicitly wait
for a compute instance to come up and be reachable. This allows a longer
timeout and makes it clear that a failure here may be due to compute
(nova) failures. Previously the first plug vip task would do this wait.
Change-Id: Ief97ddda8261b5bbc54c6824f90ae9c7a2d81701
Story: 2001481
Task: 6202
This patch updates the haproxy service scripts to handle the case
where the network interfaces have not yet been plugged. This can
occur in a failover situation.
This patch also makes sure we don't move the management LAN interface
into the network namespace.
Closes-Bug: #1509706
Closes-Bug: #1577963
Change-Id: I04d267bd3cdedca11f0350c5255086233cba14ec
EventStream will be used to serialize messages from the octavia
database to the neutron-lbaas database via oslo_messaging. This patch
also renames the update mixin class since it is not really a mixin.
The health manager changes the octavia database when members and
other objects are marked down or up, which would leave the
neutron-lbaas and octavia databases out of sync. A mechanism to
communicate database changes from octavia back to neutron is
required, so this CR uses oslo_messaging to communicate those
changes.
DocImpact - In /etc/octavia.conf the user can set the option
event_streamer_driver = neutron_event_streamer
to set up a queue to connect to neutron-lbaas.
If this option is left blank it will default to
the noop_event_streamer, which does nothing,
effectively turning the queue off.
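The fallback behaviour described above can be sketched as follows. The [health_manager] section name and the get_event_streamer_driver helper are assumptions made for this illustration; the commit message names only the option and its two values, and Octavia itself reads its configuration through oslo.config rather than configparser.

```python
import configparser

# Sample configuration text; the section name is an assumption.
SAMPLE_CONF = """
[health_manager]
event_streamer_driver = neutron_event_streamer
"""


def get_event_streamer_driver(conf_text, section="health_manager"):
    """Return the configured driver, or the no-op driver when blank."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    value = parser.get(section, "event_streamer_driver", fallback="")
    # A blank or missing option falls back to the no-op streamer,
    # which effectively turns the queue off.
    return value.strip() or "noop_event_streamer"
```

Calling get_event_streamer_driver(SAMPLE_CONF) selects the neutron streamer, while a configuration with the option left blank selects noop_event_streamer.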
Co-Authored-By: Brandon Logan <brandon.logan@rackspace.com>
Change-Id: I77a049dcc21e3ee6287e661e82365ab7b9c44562
The goal of this patch is to rotate an amphora's cert once we detect
that it will expire within 2 weeks of utcnow, replacing it with a new
one and updating its db information at the same time.
To achieve this, I made the following changes:
* Add 2 new columns, cert_busy and cert_expiration, to the amphora table
* Add methods to get the cert expiration date from the PEM server_pem
  and update the db info
* Use the new REST agent method to perform cycling
* Add a process in housekeeping to facilitate rotation
* Add unit tests
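The two-week check described above can be sketched as a simple date comparison. The cert_needs_rotation function is hypothetical and uses naive UTC datetimes; it only illustrates the expiry test, not the housekeeping process or the database update.

```python
import datetime

# Rotation window taken from the commit message: 2 weeks.
ROTATION_WINDOW = datetime.timedelta(weeks=2)


def cert_needs_rotation(cert_expiration, now=None):
    """Return True when the cert expires within the rotation window.

    Both arguments are naive datetimes expressed in UTC.
    """
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc).replace(tzinfo=None)
    return cert_expiration - now <= ROTATION_WINDOW
```

A cert that has already expired also satisfies the check, so it would be picked up for rotation as well.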
Change-Id: I28578a3e560ee09ba300788a5423863c893b8638
The noop drivers had not been kept up to date and were never tested by being
called within flows the way the real drivers are. This patch gets the noop
drivers to succeed when called like other drivers. They do not do anything
and will return fake information whenever they are required to return data.
This can be improved later so that they maintain their own data store, but
that would require a much larger update and I'm not sure there's much value
in it.
Change-Id: I2627ed35c0f576f8cfa258b542e5bdb4be03dac8
Closes-Bug: #1501190
The amphora driver's post_vip_plug method requires network specific information
about each amphora to do its work, but it accepted nothing other than a
loadbalancer object. Now it takes a dictionary of
AmphoraNetworkConfig objects that is keyed by amphora id. Each instance
contains network specific information required to set up the correct routing
on the amphora.
This patch sets up the routing correctly to solve the VIP blackholing issue but
only for the ssh_driver. The rest_api_driver's post_vip_plug method now accepts
the argument, but it still needs to be updated to use this new information and
fix the VIP blackholing problem there as well.
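A minimal sketch of the data shape described above: a dictionary keyed by amphora id whose values carry per-amphora network details. The dataclass below is a simplified stand-in, and its field names, along with the key_by_amphora_id and describe_routes helpers, are illustrative assumptions rather than Octavia's actual data model or driver code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmphoraNetworkConfig:
    """Simplified stand-in; the field names are assumptions."""
    amphora_id: str
    vip_subnet_cidr: str
    gateway_ip: str


def key_by_amphora_id(configs):
    # Build the dictionary that would be handed to post_vip_plug.
    return {config.amphora_id: config for config in configs}


def describe_routes(amphorae_network_config):
    # A driver can look up each amphora's own network details by id.
    return {
        amphora_id: "%s via %s" % (config.vip_subnet_cidr, config.gateway_ip)
        for amphora_id, config in amphorae_network_config.items()
    }
```

Keying by amphora id lets a driver that iterates over a load balancer's amphorae fetch the matching network details directly.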
Change-Id: I17ce89b6c050a2a36e0a802920e2dedb063f615b
Closes-Bug: #1453951
Co-Authored-By: German german.eichberger@hp.com
Create data models and repositories for healthmanager
Change the health manager's lastupdated property to be of datetime type
Update the code based on the previous review comment
Design a new class healthmanager and implement the update health method
and the unit test class test_health_mixin
Add try/except handling for get member in update_health_mixin
Delete the pool part when the member gets offline status
Add a get_all method to AmphoraHealthRepository so we can pass non-equality
comparisons to it, and add a test for it in test_repositories.py
Rename test_all and get_all to test_all_filter and get_all_filter
Change-Id: Ic356dee139e743a9617d401f9658cfefcb49d15f
Add unit tests, subclasses, and exceptions for the amphora driver base class
Delete unneeded files added in the previous commit, such as .idea/encodings.xml
Update the code based on the review comment
Rename directories and files to use lowercase words separated by _
Co-Authored-By: German Eichberger (german.eichberger@hp.com)
Co-Authored-By: Min Wang (swiftwangster@gmail.com, min.wang6@hp.com)
Change-Id: I1384d771b6b8dfa743bbaf18304a4cd994fe8dba