Commit Graph

315 Commits (d15cccff2fd382475f971a646f5579bd43ef190b)

Author SHA1 Message Date
Adam Harwell d15cccff2f Change amphora statistics to use deltas
Amphora statistics packets should report deltas instead of absolutes for
all relevant metrics.

Change-Id: I5cf6f1f20f2c6f1da39982b2d88e036eefe48b2f
Co-Authored-By: Anushka Singh <anushka.singh.2511@gmail.com>
Co-Authored-By: Stephanie Djajadi <stephanie.djajadi@gmail.com>
2020-07-30 23:13:18 +00:00
Carlos Goncalves 41c628a084 Fix missing params in amphora base and noop driver
Running amphora failover against the amphora noop driver was raising a
TypeError (reload() takes from 2 to 3 positional arguments but 4 were
given).

Change-Id: I64172d6995959cf377364584ad9a2395f9ec0605
2020-06-24 12:05:05 +02:00
Carlos Goncalves 89123c0fc1 Add missing reload method in amphora noop driver
The reload method was also missing in the abstract class.

Task: 40140
Story: 2007847

Change-Id: I2328b3dc4d5b95c8771a305d3d4bb1dee6019117
2020-06-23 10:58:22 +02:00
Michael Johnson 955bb88406 Refactor the failover flows
This patch refactors the failover flows to improve the performance
and reliability of failovers in Octavia.

Specific improvements are:
* More tasks and flows will retry when other OpenStack services are
  failing.
* Failover can now succeed even when all of the amphora are missing
  for a given load balancer.
* It will check and repair the load balancer VIP should the VIP
  port(s) become corrupted in neutron.
* It will clean up extra resources that may be associated with a
  load balancer in the event of a cloud service failure.

This patch also removes some dead code.

Change-Id: I04cb2f1f10ec566298834f81df0cf8b100ca916c
Story: 2003084
Task: 23166
Story: 2004440
Task: 28108
2020-06-18 16:25:21 -07:00
Gregory Thiemonge 6354f92ecc Fix netcat option in udp_check.sh for CentOS/RHEL
The -w (timeout) option has no effect in nmap-ncat (the default netcat in
CentOS/RHEL) for UDP datagrams, and nmap-ncat has a default idle timeout
set to 2 seconds.
We can get the same behavior as netcat-openbsd (Debian/Ubuntu) by
setting that idle timeout (-i) option to 1 second.

This commit detects the flavor of the netcat binary (nmap vs other) and
uses it to adapt the parameters.

Story: 2007688
Task: 39800

Change-Id: I0100aaa428477f011bd39a90dd4ec98199b4bebc
2020-05-19 13:55:08 +02:00
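The flavor detection described in the commit above can be sketched in Python as follows. This is only an illustration, not the logic in udp_check.sh (which is a shell script): the function name `timeout_args` is hypothetical, and the sketch assumes that the nmap flavor can be recognized by the word "nmap" in the binary's version output.

```python
def timeout_args(version_output, seconds=1):
    """Pick the netcat timeout option for the detected flavor.

    Hypothetical helper: `version_output` is whatever text the netcat
    binary printed when asked for its version.
    """
    # Assumption: the nmap flavor (nmap-ncat) mentions "nmap" in its
    # version output; any other flavor is treated as netcat-openbsd-like.
    if "nmap" in version_output.lower():
        # nmap-ncat ignores -w for UDP datagrams, so set the idle timeout.
        return ["-i", str(seconds)]
    # Other flavors honor the -w timeout option.
    return ["-w", str(seconds)]
```

The returned list would be spliced into the netcat command line by the caller.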
Brian Haley 9a1d6d3585 Fix E741 pep8 errors
E741 ambiguous variable name 'l'

Change 'l' to another variable in affected code.

Also had to set the latex_engine to 'xelatex' in doc/source/conf.py
in order to get past an openstackdocstheme change that broke the pdf
doc build.

Change-Id: Idd176e40ccf2a79832a5c99140bd30e5e1f9c0d8
2020-05-15 10:58:22 -04:00
Zuul 1b52ccd20f Merge "Support HTTP and TCP checks in UDP healthmonitor" 2020-04-17 23:11:06 +00:00
ZhaoBo 6e61991833 Support HTTP and TCP checks in UDP healthmonitor
This patch introduces 2 macros in lvs.

1. Support HTTP GET, allowing users to create an HTTP healthmonitor for a UDP pool.
2. Support TCP check, allowing users to create a TCP healthmonitor for a UDP pool.

Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: I61c7d8d4df54710a92b8c055be84bba29bf3d7e6
Story: 2003200
Task: 23356
Story: 2003199
Task: 23355
2020-04-15 16:18:35 +00:00
Adam Harwell 96a4482dff Fix py3 amphora-agent cert-rotation type bug
Flask's stream always returns bytes, file write always takes string.
This causes py3 amps to return 500 on cert rotation AND wipe out the
certificate, so the amphora are no longer controllable and go to ERROR
state. Anyone running py3 amps prior to this patch will experience
amphorae breaking on a timer due to housekeeping cert rotation!

Change-Id: I831b0b48d719397c14d80f8ebcbad997c50c7795
2020-04-14 06:48:43 -07:00
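The bytes/string mismatch described in the commit above can be illustrated with a minimal sketch. The helper name `write_stream_to_file` is hypothetical, and writing in binary mode is only one possible way to avoid the mismatch; the actual patch may handle it differently.

```python
def write_stream_to_file(stream, path, chunk_size=4096):
    """Copy a bytes stream to disk without decoding it (illustrative only)."""
    # Binary mode ("wb") accepts the bytes chunks as they come.
    # In text mode ("w"), Python 3 truncates the file on open and then
    # raises TypeError on write(bytes), leaving an empty file behind.
    with open(path, "wb") as cert_file:
        chunk = stream.read(chunk_size)
        while chunk:
            cert_file.write(chunk)
            chunk = stream.read(chunk_size)
```

Any object with a `read(size)` method that returns bytes, such as `io.BytesIO`, can stand in for the request stream when trying this out.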
Dawson Coleman cd176e55c5 Add ability to set TLS cipher list for listeners
Listeners will now be able to each be assigned their own OpenSSL
cipher string with a new field: tls_ciphers.  There is also a new
configuration option, default_listener_ciphers, which specifies the
cipher string to assign to new listeners when one is not explicitly
specified.

Change-Id: I77da6f14063877af0077f2c12df1aab5d5ead187
Depends-On: Id5f4c20abd40dd092558a711987953012d4ae67f
Story: 2006627
Task: 36839
2020-04-06 17:06:32 -07:00
Adam Harwell d27ee3f0ee Fix padding logic for UDP health daemon
Should have done "pad to 8 characters" on the hex conversion, but it was
instead hardcoded to pad a single `0`, which is right in a lot of cases
but not all.

For example:
>>> ip1 = ipaddress.ip_address('98.136.140.23')
>>> ip2 = ipaddress.ip_address('10.1.1.1')
>>> "%X" % ip1._ip
'62888C17'
>>> "%X" % ip2._ip
'A010101'

Change-Id: Ia9fec4e72c00f7086489b245d9dc50ed9c27f12a
2020-03-20 16:51:44 -07:00
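The padding fix described in the commit above can be sketched as follows. The helper name `ip_to_hex` is hypothetical and not taken from the Octavia code; it only shows zero-padding the hex form of an IPv4 address to 8 characters instead of prepending a single `0`.

```python
import ipaddress


def ip_to_hex(address):
    """Return an IPv4 address as 8 upper-case hex characters (hypothetical)."""
    # int() on an IPv4Address yields its 32-bit integer value.
    # "%08X" pads with as many leading zeros as needed, not just one.
    return "%08X" % int(ipaddress.ip_address(address))
```

With this format, '10.1.1.1' becomes '0A010101', and an address such as '0.1.1.1', which needs three leading zeros, is padded correctly as well.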
Brian Haley f6b957e8ee Remove all usage of six library
Convert all code to not require six library and instead
use python 3.x logic.

Created one helper method in common.utils for binary
representation to limit code changes.

Change-Id: I2716ce93691d11100ee951a3a3f491329a4073f0
2020-03-18 17:15:26 -04:00
Ann Kamyshnikova 6bca6bef1b [Amphorav2] Fix noop driver case
Fix amphora creation with amphora_noop_driver
and network_noop_driver.

Change-Id: I5d3c4d5280916916e95120cfba6fc076a1650cf4
Story: 2005072
2020-03-16 10:50:28 +00:00
Zuul 9f89da5f22 Merge "Support haproxy development snapshot version parsing" 2020-02-01 11:38:17 +00:00
Gregory Thiemonge 7dc54eb9c9 Support haproxy development snapshot version parsing
Fix the haproxy version parser in amphora-agent for beta versions (e.g.
2.2-dev0)

Change-Id: I5baf7d18105494259361be1f0f411412071f1d36
2020-01-27 15:07:05 +01:00
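A parser that tolerates development snapshot versions might look like the sketch below. It is not the amphora-agent code: the function name `parse_haproxy_version` is hypothetical, and the sketch assumes the version token (for example "2.2-dev0") has already been isolated from the haproxy output.

```python
import re

# Match the leading "major.minor" and ignore any suffix such as
# ".19" or "-dev0".
_VERSION_RE = re.compile(r"(\d+)\.(\d+)")


def parse_haproxy_version(version):
    """Return (major, minor) as ints from a version token (hypothetical)."""
    match = _VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized haproxy version: %r" % version)
    return int(match.group(1)), int(match.group(2))
```

Anchoring on the numeric prefix means both release and snapshot version strings yield a comparable tuple.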
Zuul 741a47763d Merge "Fix the interface filenames for Red Hat amphora images" 2020-01-24 17:08:13 +00:00
Ann Kamyshnikova dad38f61f8 Transition amphora flows to dicts
Fixed endpoints logs for listener, pool and member as well.

Rework _get_create_amp_for_lb_subflow due to an issue with the taskflow
decider and retry subflow.
The retry subflow was not ignored properly for the spare amphorae case,
so _get_create_amp_for_lb_subflow has been split into
3 separate subflows, each of them linked to the graph flow.
This is a workaround and can be removed when a proper mechanism is
implemented in the taskflow library (several TODOs were added about it).

Change-Id: Ibd114fa14123e6de6c5d6f260e32cf7f2b28805a
Story: 2005072
Task: 30814
2020-01-17 12:18:08 +04:00
Paul Peereboom 93cd9fc075 Fix the interface filenames for Red Hat amphora images
Code was not using the correct filenames for the 'route',
'route6', 'rule' and 'rule6' files on Red Hat images.
Changed to use config option 'agent_server_network_file'
if it's specified, else the file of the correct name, and
added unit tests for each.

Change-Id: I335287da66524d026f0c42086d885b478c568bbd
Task: 37881
Story: 2007051
2020-01-10 16:09:27 +01:00
Michael Johnson cccd47e05a Fix multi-listener LB client auth/re-encryption
This patch corrects a bug with multi-listener load balancers that
are using either TLS client authentication and/or backend
re-encryption.

Change-Id: Ib7b083e1dfbfd7afcca870ed6f60a871b2e19253
Story: 2006822
Task: 37394
2019-12-09 15:49:03 +00:00
Michael Johnson 7d23a711dd Fix multi-listener LB with missing certificate
This patch allows listeners on a load balancer to continue to
operate should one listener fail to access secret content in
barbican. Previously if one listener failed to access barbican
content, all of the listeners would be impacted.
This patch also cleans up some unused code and unnecessary comments.

Change-Id: I300839fe7cf88763e1e0b8c484029662beb64f0a
Story: 2006676
Task: 36951
2019-12-09 07:48:49 -08:00
Carlos Goncalves 3740b67854 Add support for CentOS 8 amphora images
Change-Id: Ic3b1dab418cfd95fe261ca19528ec969ee57610e
2019-12-06 09:24:33 +00:00
Ann Taraday 314b43af9a Use retry for AmphoraComputeConnectivityWait
Use taskflow retry for connectivity wait. [1]

This is required for the redis jobboard implementation, as each retry
expands the claim for the job on the worker. This means that the worker is
processing the job and it should not be released for other workers to work on.

Adopted for v2 flows.

[1] - https://docs.openstack.org/taskflow/latest/user/atoms.html#retry

Story: 2005072
Task: 33477

Change-Id: I2cf241ea965ad56ed70ebde83632ab855f5d859e
2019-11-29 06:51:47 +04:00
Zuul 01a3ed55db Merge "Fix listeners with SNI certificates" 2019-11-14 20:43:30 +00:00
Colin Gibbons 0682fb977a ipvsadm '--exact' arg to ensure outputs are ints
Currently the keepalivedlvs_query script calls ipvsadm -Ln --stats
to query the local lvs for connection information. If any of these
values grow large enough they will be abbreviated with human-
friendly suffixes (K, M, G) and cause the get_ipvsadm_info func
to raise an exception when it receives a non-integer value from
its command output. By using the --exact argument in addition to
the existing arguments, we can ensure the output is always expanded
numbers, per the ipvsadm man page, and will only ever offer integer
outputs to the get_ipvsadm_info command.

Change-Id: I2e8c0be2221c0c23b752fdf2cdff065cddf830a5
Story: 2006791
Task: 37331
2019-11-06 09:24:30 -08:00
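The failure mode described in the commit above comes down to integer parsing. A minimal sketch, with a hypothetical helper name `parse_counters` that is not the real get_ipvsadm_info function:

```python
def parse_counters(fields):
    """Convert counter columns from command output to ints (hypothetical)."""
    # int() accepts plain digit strings, which is what --exact guarantees.
    # A value carrying a K, M, or G suffix makes int() raise ValueError.
    return [int(field) for field in fields]
```

Passing `--exact` avoids the exception at the source, so no suffix handling is needed in the parser.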
Michael Johnson 3c05ce1297 Fix listeners with SNI certificates
The single process patch changed the way listeners and load balancers
are deployed inside the amphora. This caused listeners with SNI
enabled to load all of the certificates for all of the TLS enabled
listeners on a load balancer.
This patch corrects that by configuring each listener with a
specific list of certificates.

Change-Id: I2f3c7ab4137dbd84d77a6a6b675975af406249d0
Story: 2006758
Task: 37252
2019-10-25 14:15:48 -07:00
Zuul cb214ad13e Merge "Fix healthmonitor message v2 for UDP listeners" 2019-10-01 08:51:55 +00:00
Gregory Thiemonge d5ffd2ca40 Fix new pep8/pylint errors
With new pylint release (2.4.1), new warnings were triggered:
- unnecessary-comprehension
- no-else-break
- no-else-continue
- import-outside-toplevel

Change-Id: I301cc9fc6b41e9e97f051df29d768b172cade636
2019-09-25 15:36:55 +02:00
Zuul 3ae9acb4f5 Merge "Fix building configs for multiple listeners" 2019-09-19 19:51:21 +00:00
Gregory Thiemonge cad80a6c7d Fix healthmonitor message v2 for UDP listeners
Multi-listener LB commit (Idaccbcfa0126f1e26fbb3ad770c65c9266cfad5b)
introduced a v2 message for octavia healthmonitor.

This commit fixes an issue with healthmonitor messages for UDP
listeners: they didn't follow the v2 message specification. Pool
dictionaries were stored in listener objects (v1 format) instead of
being stored in the root dictionary of the message.

Story: 2005736
Task: 33394

Change-Id: I93e5eb5bc69fe4de4c450c09367b319769ef07db
2019-09-17 11:57:03 +02:00
Michael Johnson 5defc1e8a4 Fix the amphora no-op driver
The amphora no-op driver had the wrong method signature for the
update_amphora_agent_config method.

This patch corrects that issue.

Change-Id: Ib1b0df3b7227d8a8dd68276e279cae1c4974ded2
2019-09-17 00:13:09 +00:00
Ann Kamyshnikova 42df031e89 Fix building configs for multiple listeners
Currently the jinja_combo.build_config method expects a single
TLS cert, though with multiple listeners there could be multiple
certs. Also, with HTTP and TERMINATED_HTTPS listeners on the
same load balancer, creation of the second listener will fail.

Change-Id: Iad3b55e5add4283256f7836c3d4a501aa57ffc2f
Story: 2006513
Task: 36510
2019-09-10 22:22:56 +00:00
Rene Luria 905499162b Fix template that generates vrrp check script
Correct the inline comment to not include an empty new line at the start
of generated /var/lib/octavia/vrrp/check_script.sh that leads to this
kind of error:

>  Aug 26 11:49:32 amphora-12184e15-1ec3-4d80-98a7-c7d1ddb6716f
> Keepalived_vrrp[15265]: Error exec-ing command
> '/var/lib/octavia/vrrp/check_script.sh', error 8: Exec format error

Change-Id: Icddd2873abeb56a389a35356995df6dde70872b2
2019-08-26 13:50:42 +02:00
Michael Johnson 2529fa33ab Lookup interfaces by MAC directly
Currently the amphora agent will lookup interfaces using the
interface name determined earlier in the plug method. This can
lead to a race condition with the udev interface renaming rule.
This patch changes the interface lookup to use the MAC address
directly and not rely on the interface name.

Story: 2006300
Task: 36013

Change-Id: I5bc21d5abdeb67a3a8ae88456735643463f15694
2019-08-09 23:08:33 +00:00
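Looking an interface up by MAC address rather than by name can be sketched as below. This is an assumption-laden illustration, not the amphora agent's implementation: the function name `find_interface_by_mac` is hypothetical, and it reads the Linux sysfs `address` files, whereas the agent may use a different mechanism.

```python
import os


def find_interface_by_mac(mac, sysfs_net="/sys/class/net"):
    """Return the name of the interface whose MAC matches, or None."""
    wanted = mac.strip().lower()
    for name in sorted(os.listdir(sysfs_net)):
        address_file = os.path.join(sysfs_net, name, "address")
        try:
            with open(address_file) as handle:
                if handle.read().strip().lower() == wanted:
                    return name
        except OSError:
            # The entry vanished or has no address file; skip it.
            continue
    return None
```

Because the name is resolved at lookup time, a rename by udev between plug and configuration does not matter.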
Zuul c65329391a Merge "Fixed down server issue after reloading keepalived" 2019-08-07 17:59:23 +00:00
Zuul 7b7e6deb64 Merge "Fixed pool and members status with UDP loadbalancers" 2019-08-07 17:52:36 +00:00
Gregory Thiemonge b1a4758f58 Fix listener deletion in ACTIVE/STANDBY topology
When removing listeners, listeners are removed from the load balancer's
listener list just before reconfiguring each amphora.
In case of ACTIVE/STANDBY topology, the code is performed on both
amphorae, so the listener is removed twice from the list.
This commit ensures that we don't remove an already removed listener.

Story: 2006329
Task: 36065

Change-Id: I426255f587f36b415eb999a9eb28cf0f91de94b0
2019-08-06 15:38:30 +00:00
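The guard described in the commit above amounts to an idempotent removal. A minimal sketch with a hypothetical helper name, not the driver code itself:

```python
def remove_listener(listeners, listener):
    """Remove `listener` from `listeners` if present; return True if removed."""
    # list.remove() raises ValueError for a missing item, so check first.
    # This makes a second call (for the second amphora) a harmless no-op.
    if listener in listeners:
        listeners.remove(listener)
        return True
    return False
```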
Gregory Thiemonge 0a9f587015 Fixed down server issue after reloading keepalived
When removing a UDP health monitor, keepalived is reloaded with a
configuration without any checkers.
But if keepalived has previously detected a down server, the state of
the server is unchanged and it will never be added to the list of IPVS
servers.

Restarting keepalived on configuration change works around this issue.

This issue is fixed in keepalived (>=2.0.14):
https://github.com/acassen/keepalived/issues/1163

Story: 2005774
Task: 33491

Change-Id: Iaa34db6cb1dfed98e96a585c5d105e263c7efa65
2019-07-30 12:19:08 +02:00
Gregory Thiemonge 4decb6d53c Fixed pool and members status with UDP loadbalancers
This commit fixes pool and members status when using UDP loadbalancers.

Story: 2005736
Task: 33394

Change-Id: I75cde3ff820f085aebbdffd1e40c5ff40f16835d
2019-07-30 12:19:08 +02:00
Michael Johnson 06ce4777c3 Fix multi-listener load balancers
Load balancers with multiple listeners, running on an amphora image
with HAProxy 1.8 or newer can experience excessive memory usage that
may lead to an ERROR provisioning_status.
This patch resolves this issue by consolidating the listeners into
a single haproxy process inside the amphora.

Story: 2005412
Task: 34744
Co-Authored-By: Adam Harwell <flux.adam@gmail.com>
Change-Id: Idaccbcfa0126f1e26fbb3ad770c65c9266cfad5b
2019-07-23 14:28:49 -07:00
Zuul 954025cbc3 Merge "Fix a python3 issue in the amphora-agent" 2019-06-18 07:42:58 +00:00
Michael Johnson dc459e2213 Fix a python3 issue in the amphora-agent
An exception handler in the amphora-agent has a python3 string
comparison bug that will cause a TypeError.
This patch fixes that bug and adds test coverage for the
start_stop_listener.

Change-Id: I6f5d95c5f875edda530f54ae72386d6495235ca6
Story: 2005898
Task: 33760
2019-06-17 15:21:35 +03:00
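The commit above does not spell out the exact comparison that failed. One common cause of such a TypeError on Python 3 is testing a str against bytes command output; the sketch below assumes that case, and the helper name `output_contains` is hypothetical.

```python
def output_contains(output, needle):
    """Return True if `needle` (str) occurs in `output` (bytes or str)."""
    # On Python 3, `"text" in b"bytes"` raises TypeError, so decode first.
    if isinstance(output, bytes):
        output = output.decode("utf-8", errors="replace")
    return needle in output
```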
German Eichberger 686303e79d Amphora logging
Configure rsyslog to forward logs to a target host

Co-Authored-By: Michael Johnson <johnsomor@gmail.com>
Story: 1665069
Task: 33646

Change-Id: I00703f86555cbb574b943794b14a36fbc644f1b2
2019-06-14 09:02:26 -07:00
Michael Johnson 80ddbaeef4 Align logging in the amphora
This patch configures the primary components of the amphora to log
to syslog using consistent logging facilities.
By default, user traffic logs will go to LOG_LOCAL0 and the amphora
processes (haproxy, keepalived, etc.) will log to LOG_LOCAL1.

This is a patch supporting log offloading.

Change-Id: Ifda91e0310e812e34f1e398dd3176af8a9c58f89
Story: 1665069
Task: 5486
2019-06-13 12:42:18 -07:00
Zuul 59660fb365 Merge "Force amp-agent communication to TLSv1.2" 2019-06-03 13:56:23 +00:00
Zuul 09020b6bfc Merge "Add Python 3.7 support" 2019-05-16 06:30:28 +00:00
Zuul 6fda9ff945 Merge "Make sure amphora logging works the same on py2 and py3" 2019-05-15 19:46:54 +00:00
zhulingjie ff50886d79 Update hacking version to latest
This resolves extraneous "improper escape sequence" warnings on
python 3.6+[1].

Note, this does not resolve those warnings from pylint. There
is already another proposed patch to address pylint[2].

[1] https://review.opendev.org/494322
[2] https://review.opendev.org/635236

Change-Id: Ie160436913e4d935bab118d31ba10193ac38bd8f
2019-05-14 17:38:58 -07:00
Adam Harwell 5b831f2a5b Force amp-agent communication to TLSv1.2
Also allow configuration of this minimum.
The previous default of SSLv2/3 is very insecure.

Change-Id: If34c7c34d9a6a77685fb177976dc2070760c7b37
2019-05-14 14:02:57 -07:00
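Enforcing a minimum TLS version with Python's standard `ssl` module can be sketched as follows. This is not how the Octavia patch is written (the name `make_client_context` is hypothetical, and `minimum_version` requires Python 3.7 or newer); it only illustrates a configurable floor that defaults to TLSv1.2.

```python
import ssl


def make_client_context(minimum=ssl.TLSVersion.TLSv1_2):
    """Build a client SSLContext that refuses anything below `minimum`."""
    context = ssl.create_default_context()
    # Handshakes that negotiate an older protocol version will fail.
    context.minimum_version = minimum
    return context
```

Keeping the minimum as a parameter mirrors the commit's point that the floor is configurable rather than hardcoded.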
Carlos Goncalves c4faac25de Add Python 3.7 support
In order to support Python 3.7, pylint has to be updated to 2.0.0
minimum. Newer versions of Pylint enforce additional checkers which can
be addressed with some code refactoring rather than silently ignoring
them in pylintrc; except useless-object-inheritance, which must be
silenced so that we stay compatible with Python 2.x.

Story: 2004073
Task: 27434

Change-Id: I52301d763797d619f195bd8a1c32bc47f1e68420
2019-05-14 17:11:22 +00:00
Erik Olof Gunnar Andersson 0000412cf4 Make sure amphora logging works the same on py2 and py3
Since Python 3.3, IOError is just an alias of OSError. This
causes logging in a very specific scenario to not log
the appropriate message, as one code path is unreachable.
This is fixed in this patch by merging the two exception paths.

Story: 2005576
Task: 30765

Change-Id: Ie81de8e85753fde1516aea0b084df6a0c513ad7b
2019-05-04 19:19:33 +00:00
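The unreachable code path mentioned in the commit above can be reproduced with a short sketch. The function name and return labels are hypothetical, and the order of the handlers in the original code is an assumption.

```python
def describe_error(exc):
    """Return a label showing which handler catches `exc`."""
    try:
        raise exc
    except OSError:
        return "handled as OSError"
    except IOError:
        # Never reached on Python 3.3+, where IOError is OSError.
        return "handled as IOError"
```

On Python 3, `IOError is OSError` evaluates to True, so the first matching handler always wins and the two branches can be merged into one.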