When an error occurs in a flow, the provisioning status of the load
balancer should be set to ERROR in the revert method of the first task
of the flow. This update acts as an unlock of the LB object and cannot
occur in any other revert method because the API might consider the LB
as mutable before finishing a task/flow.
Remove all mark_loadbalancer_prov_status_error calls from the revert
methods of tasks that are not specifically designed to unlock the load
balancer. Add a LoadBalancerToErrorOnRevertTask task to the amphora
failover flow to prevent an LB from being stuck in an immutable state.
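The ordering argument can be illustrated with a small sketch. This is a
simplified stand-in for a linear flow engine, not taskflow or Octavia code;
the Task class, run_linear_flow and the task names are hypothetical. It shows
that reverts run in reverse order, so the first task's revert is the last one
to run and is the safe place to unlock the LB.

```python
class Task:
    """Hypothetical task with an execute and a revert step."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def execute(self, log):
        if self.fail:
            raise RuntimeError(self.name + " failed")
        log.append("execute " + self.name)

    def revert(self, log):
        log.append("revert " + self.name)


def run_linear_flow(tasks):
    """Run tasks in order; on failure, revert started tasks in reverse."""
    log, started = [], []
    try:
        for task in tasks:
            started.append(task)  # the failing task is reverted as well
            task.execute(log)
    except RuntimeError:
        for task in reversed(started):
            task.revert(log)
    return log


flow = [
    Task("lb_to_error_on_revert"),  # first task: its revert unlocks the LB
    Task("plug_network"),
    Task("update_amphora", fail=True),
]
log = run_linear_flow(flow)
print(log[-1])
```

If any later task set the ERROR status in its own revert, the LB would look
mutable to the API while earlier reverts were still running.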
Story 2009651
Task 43810
Story 2009652
Task 43811
Note for stable/train: the amphorav2 code is not updated in this
backport. The source files exist in train, but the feature was added in
ussuri. Backporting this patch creates many merge conflicts and doesn't
provide anything for train users.
Conflicts:
octavia/controller/worker/v1/tasks/database_tasks.py
octavia/controller/worker/v2/flows/amphora_flows.py
octavia/controller/worker/v2/tasks/amphora_driver_tasks.py
octavia/controller/worker/v2/tasks/database_tasks.py
octavia/tests/unit/controller/worker/v2/flows/test_amphora_flows.py
octavia/tests/unit/controller/worker/v2/tasks/test_amphora_driver_tasks.py
octavia/tests/unit/controller/worker/v2/tasks/test_database_tasks.py
Change-Id: I48b0f5a773209b1c1b056d71c0da05d6fd82ca73
(cherry picked from commit 4b8b198fec)
(cherry picked from commit 4039d35ce2)
(cherry picked from commit 844f1348ea)
(cherry picked from commit cf2a8bdf88)
(cherry picked from commit 95690e251d)
In the database, timeout_client_data, timeout_member_connect,
timeout_member_data and timeout_tcp_inspect have an integer type with
a maximum value of 2147483647. MAX_TIMEOUT in the API now stays below
this value and is equal to 24 days (2073600000 milliseconds).
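The arithmetic behind this limit can be checked with a short sketch. It
assumes the timeouts are expressed in milliseconds, which is what makes
2073600000 correspond to 24 days; the constant names are illustrative and
not taken from the Octavia source.

```python
# Illustrative names only; not the Octavia constants.
INT32_MAX = 2**31 - 1              # 2147483647, limit of a signed 32-bit column
MS_PER_DAY = 24 * 60 * 60 * 1000   # 86400000 milliseconds in one day

max_timeout_ms = 24 * MS_PER_DAY   # 24 days expressed in milliseconds

# 24 days fits in the column, 25 days would not.
fits_24_days = max_timeout_ms <= INT32_MAX
fits_25_days = 25 * MS_PER_DAY <= INT32_MAX
print(max_timeout_ms, fits_24_days, fits_25_days)
```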
Story: 2009193
Task: 43248
Change-Id: I990b6af1ff880b25e54f6c41ae0c966007f3f098
(cherry picked from commit 3a8f056306)
(cherry picked from commit 9a25a643e2)
In cases where the listener protocol port is the same as the peer port
and allowed_cidr is explicitly set to 0.0.0.0/0, the listener is not
provisioned because a duplicate security group rule is created for the
peer port with None as remote_ip_prefix. Neutron defaults
remote_ip_prefix to 0.0.0.0/0 if it is unspecified or None, hence the
"SG rule already exists" error.
Remove the duplicate entry from the updated_ports.
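A minimal sketch of the idea, assuming rules are plain (port, cidr) tuples.
The helper name and data shape are hypothetical and do not reflect the
actual updated_ports handling in Octavia.

```python
DEFAULT_CIDR = "0.0.0.0/0"  # what Neutron uses when remote_ip_prefix is None


def dedupe_rules(rules):
    """Drop (port, cidr) pairs that Neutron would treat as the same rule."""
    seen = set()
    unique = []
    for port, cidr in rules:
        key = (port, cidr or DEFAULT_CIDR)  # normalize None to the default
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# Listener port equals the peer port: one explicit CIDR, one implicit None.
requested = [(1025, "0.0.0.0/0"), (1025, None), (443, None)]
result = dedupe_rules(requested)
print(result)
```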
Story: #2009117
Change-Id: I9dbdb71e9b94bbcc75766a8687a996d5358f3381
(cherry picked from commit 151a943210)
The publish-openstack-octavia-amphora-image* jobs started failing because
Ubuntu no longer provides the yum-utils package.
Dependencies have now been cleaned up for the Ubuntu job, and the CentOS
job uses a CentOS node. The zuul playbook now works on Ubuntu and
RedHat/CentOS nodes.
Change-Id: Ifca01d91d8eb92115d56744f4963e91ac537dd8e
(cherry picked from commit d81a0556f5)
(cherry picked from commit 80ff90e34e)
(cherry picked from commit 41b2aad609)
Our periodic amphora image build jobs started failing around mid June
with:
"The conditional check 'install_packages|success' failed. The error was:
template error while templating string: no filter named 'success'.
String: {% if install_packages|success %} True {% else %} False {% endif
%}"
Filters have changed in Ansible 2.9 after a deprecation period.
Additionally, install python3-venv and set virtualenv_command (defaults
to Python 2 "virtualenv") as it seems to be required now too.
Change-Id: I3efa89992cc4a8e2645803dd867d7d2f6e39b966
(cherry picked from commit 9df5f75d49)
(cherry picked from commit 20c0a88bf4)
Using haproxy 2.x, the ideal rlimit value for nr_open is close to
connection_limit * 2.5 (see compute_ideal_* in src/haproxy.c).
Set this limit to 2,600,000 in the amphora to support a loadbalancer
with maxconn 1M.
This prevents the following warning messages when launching/reloading
haproxy:
* "Cannot raise FD limit to 2375058, limit is 2097152."
* "FD limit (2097152) too low for maxconn=950000/maxsock=2375058. Please
raise 'ulimit-n' to 2375058 or more to avoid any trouble."
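The sizing can be approximated with a short sketch. The 2.5 factor is the
rough ratio quoted above, not the exact formula in src/haproxy.c (the
warning shows the real value is slightly higher), and the function name is
hypothetical.

```python
import math


def approx_ideal_nofile(maxconn, factor=2.5):
    """Rough estimate of the FD limit haproxy 2.x wants for a given maxconn."""
    return math.ceil(maxconn * factor)


OLD_NR_OPEN = 2_097_152  # limit reported in the warnings
NEW_NR_OPEN = 2_600_000  # value set by this change

# maxconn=950000 already exceeds the old limit.
too_low_before = approx_ideal_nofile(950_000) > OLD_NR_OPEN
# maxconn=1M fits under the new limit.
fits_now = approx_ideal_nofile(1_000_000) <= NEW_NR_OPEN
print(too_low_before, fits_now)
```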
Change-Id: I6251cd17bd6fa9faf5109e50c2190dda3614908d
(cherry picked from commit 4174f4a5a4)
(cherry picked from commit f55376bf5d)
(cherry picked from commit c177987f81)
(cherry picked from commit e4300558e1)
Configure the lo interface in the amphora-haproxy namespace.
It fixes issues with the synchronization of haproxy peer/stick tables
when updating a load balancer configuration.
Story 2009005
Task 42682
Change-Id: I15997acaf12258ec483286dad676efdea8963611
(cherry picked from commit 4443596e29)
(cherry picked from commit 9ba1182579)
(cherry picked from commit 74c0ff2a40)
(cherry picked from commit e84d947876)
Fix the member list in the haproxy configuration files so that DELETED
members are no longer included in the list.
This patch is specific to the stable/train branch. The fix for other
releases was introduced in stable/ussuri but it is part of a patch
(Ic44019b8877f008e6d7a75ceed1b7fd958e051d0) that is not backportable to
stable/train. Only the diffs related to the jinja templates have been
included in this new commit.
One part of the fix was already backported to stable/train in
Ib7b083e1dfbfd7afcca870ed6f60a871b2e19253
Story 2008871
Task 42402
Change-Id: I4c85425a774594c52a0bb743fd5b787706201425
The default nodeset has been updated to Ubuntu Focal in [1]. Train CI
jobs need to be explicitly set to a Bionic-based nodeset or else Devstack
will fail to deploy with the following message:
"WARNING: this script has not been tested on focal"
[1] https://review.opendev.org/c/openstack/octavia-tempest-plugin/+/788954
Change-Id: I1281e74a398d03eac8db7a468e71ea4889dd8229
The directive [certificates]/ca_certificates_file currently has a
confusing comment. This tries to fix it and make it easier for
Octavia operators to configure the directive.
Change-Id: I99ce408ec886820c056b69696b26be9521740f1c
(cherry picked from commit ee0da827b1)
(cherry picked from commit 20db85fd52)
(cherry picked from commit d69d066606)
(cherry picked from commit a8977b01ad)
Fix an invalid rsyslog configuration when disabling both log offloading
and local log storage.
Story 2008782
Task 42173
Change-Id: I3125cfc7b3bd12eed139b058fc2bb7f23f12abc7
(cherry picked from commit e26664de59)
(cherry picked from commit 2195a45a1b)
(cherry picked from commit 130bfcd4ae)
A user was able to create a LB using a vip_subnet_id from another user
(by passing the UUID).
Now, the vip_subnet_id parameter is validated using the user context,
so an error is returned if the subnet doesn't belong to the user.
I479019a911b5a1acfc1951d1cbbc2a351089cb4d was a previous attempt to fix
that bug but the vip_subnet_id check was missing.
Story: 2008586
Task: 41741
Depends-On: https://review.opendev.org/774157
Change-Id: I602418264e171a2b1a926eff0b1f9e6dc186295a
(cherry picked from commit 8d86187c0a)
(cherry picked from commit 7d1b81d78f)
(cherry picked from commit 73db7b0762)
'lb_algo rr' in keepalived does not work correctly with weights
for UDP listeners; it should be 'lb_algo wrr'.
'wrr' is a superset of the 'rr' algorithm, as round-robin is
a special instance of weighted round-robin scheduling
in which all the weights are equal. [1]
The algorithm in HAProxy is set to 'roundrobin', which also supports
weights, but in keepalived it must be set to 'wrr' to use
weighted round-robin scheduling, as it differs from 'rr'.
[1] https://www.keepalived.org/doc/scheduling_algorithms.html
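The relationship between the two algorithms can be sketched as follows.
This is a deliberately naive weighted round-robin, not the IPVS scheduler
(which interleaves members more evenly); the function name and member names
are hypothetical.

```python
from itertools import cycle, islice


def weighted_round_robin(members):
    """Cycle over member names, repeating each one `weight` times per pass."""
    expanded = [name for name, weight in members for _ in range(weight)]
    return cycle(expanded)


# Equal weights: wrr degenerates to plain round-robin.
equal = list(islice(weighted_round_robin([("m1", 1), ("m2", 1)]), 4))

# Unequal weights: m1 receives three times as many picks as m2.
weighted = list(islice(weighted_round_robin([("m1", 3), ("m2", 1)]), 8))
print(equal, weighted)
```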
Story 2008462
Task 41491
Conflicts:
octavia/tests/unit/common/jinja/lvs/test_jinja_cfg.py
octavia/tests/unit/common/sample_configs/sample_configs_combined.py
octavia/tests/unit/common/jinja/lvs/test_lvs_jinja_cfg.py
NOTE(Xing Zhang): The first file does not exist in stable/victoria
because the unit test structure was fixed and the file was renamed from
test_lvs_jinja_cfg.py (patch I6d84047b3481a2bf6bf9bd17d482fb504dbc752b,
introduced in stable/wallaby).
The second file conflicts because SCTP support was added to the amphora
provider in the Wallaby release (patch
I30997ae6cc6b8ec724f0e9dcfdfe49356b320ff4).
The third file conflicts because HTTP and TCP checks in UDP health
monitors (patch I61c7d8d4df54710a92b8c055be84bba29bf3d7e6) were
introduced in stable/ussuri.
This backport removes SCTP from the commit message and release note
because SCTP was introduced in the Wallaby release.
Change-Id: Ic63929d8864e5285baf70dd85e6362988bf2863f
(cherry picked from commit 5352a10f62)
(cherry picked from commit d9603b3d21)
(cherry picked from commit c7640b90ad)
(cherry picked from commit a8266a9e4f)
Some IPv6 UDP members were incorrectly marked in ERROR status because of
a formatting issue between the keepalived configuration file and the
ipvsadm output. Both are used to compute the state of the members and
when a member's address contained '*:0:*', parsing was incorrect. Now
the health message generation function uses only the compressed IPv6
notation instead of mixing notations.
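Normalizing to the compressed notation can be sketched with the standard
library. The helper name is hypothetical and this is not the code in
keepalivedlvs_query.py; it only shows how differently formatted spellings
of the same address compare equal once compressed.

```python
import ipaddress


def normalize(addr):
    """Return the canonical compressed form of an IPv4 or IPv6 address."""
    return ipaddress.ip_address(addr).compressed


# Two spellings of the same IPv6 address, as two tools might print them.
from_config = normalize("fd00:0:0:0:0:0:0:10")
from_ipvsadm = normalize("FD00:0000::0010")
print(from_config, from_ipvsadm)
```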
Story: 2008604
Task: 41783
Conflicts:
octavia/amphorae/backends/utils/keepalivedlvs_query.py
Change-Id: I2fe94cd4c000f143c59c69e82d03c690acf5e0c3
(cherry picked from commit e5f9f6708c)
(cherry picked from commit 8b210bb52a)
(cherry picked from commit 4b8de86cf3)
When provider drivers registered a load balancer object delete,
the driver agent was not decrementing the project quota.
This patch corrects that by decrementing the proper quota
when a DELETED status is received from the provider driver.
Conflicts:
octavia/tests/unit/api/drivers/driver_agent/test_driver_updater.py
Change-Id: I7d705c9f4f0217c6fbe332f45b15892bf1d4a90b
Story: 2008268
Task: 41133
(cherry picked from commit fc8ee42dfc)
(cherry picked from commit 0a8254e04e)
(cherry picked from commit dcce548ed6)
When a load balancer failover was performed on a load balancer where
the VIP address is on a subnet that has no IP addresses available,
the VIP address may be deactivated.
This patch corrects the failover flow to not deallocate the VIP
address on a failover revert flow due to the subnet being out of
IP addresses.
Story: 2008625
Task: 41827
Change-Id: I1fe342d2bdf1301dd89ab7dfaa8e6a23e69c252b
(cherry picked from commit e3ab49c60a)
This topic was discussed on the ML and the QA team proposed
to drop lower-constraints testing [1].
Proposing to drop this test because the complexity and
recurring pain of maintaining it now exceed the
benefits it provides on the stable branches.
[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-December/019390.html
Change-Id: Ia419b0ca986735f24a8ee48c315d9cb74620eff3
The default value for timeout parameters in the BaseListenerType was
not correctly set because the class was defined before reading the
config file.
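The underlying pitfall can be reproduced in a few lines. This is a generic
illustration with a plain dict standing in for the configuration object;
the class names are hypothetical and the actual fix in Octavia may be
structured differently.

```python
# Stand-in for a configuration object that is populated later.
CONF = {"timeout_client_data": None}


class EagerListener:
    # Evaluated once, when the class body runs, before the config is loaded.
    timeout_client_data = CONF["timeout_client_data"]


class LazyListener:
    @property
    def timeout_client_data(self):
        # Looked up on each access, so it sees the loaded configuration.
        return CONF["timeout_client_data"]


CONF["timeout_client_data"] = 50000  # simulates reading the config file

eager_value = EagerListener.timeout_client_data
lazy_value = LazyListener().timeout_client_data
print(eager_value, lazy_value)
```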
Conflicts:
octavia/db/prepare.py
Story 2008666
Task 41953
Change-Id: Ia4aa2047a79ad6fc3e33c7ebe2da9438914f7a88
(cherry picked from commit b95fbe9ed4)
(cherry picked from commit bf7632e65b)
(cherry picked from commit 27230ba862)
Some network parameters can be validated in the API, which avoids having
to handle exceptions in the worker when plugging networking resources.
This commit validates that port_security_enabled is True on the VIP
network when using the amphora driver.
Story: 2008449
Task: 41422
Change-Id: I1236d3c6231a657b2aa53b1e488a4d0fe3215070
(cherry picked from commit dda1d8665c)
(cherry picked from commit 250302bf03)
(cherry picked from commit a0cb2a0df5)
The {admin,tenant}_log_targets options are configured with
MGMT_PORT_IP in devstack, which contains the IP address
of the local management interface. In a multinode setup,
this means that the second node should run an rsyslog
service to receive logs from amphorae that have been
spawned by its worker.
Change-Id: If2841720009c2e402127e2e0080efdd56b68f6c9
(cherry picked from commit f45092a876)
(cherry picked from commit 0e593f4d26)
(cherry picked from commit e9bc5d9f71)
There was a bug that would cause a pool to go into ERROR if you attempted
to update the CRL or client certificate on the pool.
Conflicts:
octavia/api/drivers/amphora_driver/v2/driver.py
Change-Id: I736816247131715f5c385b4680614ec3218a2ad7
Story: 2008295
Task: 41180
(cherry picked from commit 370aa4e61c)
(cherry picked from commit 24e3a4d93a)
(cherry picked from commit ebe234eb25)
NetworkManager in CentOS images configures new network devices as soon
as they appear in the default namespace. This means that the management
interface's routes and addresses might conflict with those of the new
VIP or member interfaces for a short period of time before they are
moved to the amphora-haproxy namespace.
The "no-auto-default=*" option is now enabled in NetworkManager. It
disables the configuration of new interfaces, while the management
interface is still enabled/configured through cloud-init.
Story 2008599
Task 41773
Change-Id: I6dd8e99b07ff557674871cb503dece96a9df3ada
(cherry picked from commit a518cefda1)
(cherry picked from commit b9f622b335)
(cherry picked from commit 7b2424685e)
Use bash instead of sh to avoid the error "shopt: not found"
Change-Id: Ib089affa229531cd72f6853105d74b446687ae86
Story: 2008437
Task: 41399
(cherry picked from commit 1a86e454e7)
(cherry picked from commit b339d00975)
DIB dropped Python 2 support in its 3.0.0 major release, so we had to
pin it to 2.30.0 in the stable/stein and stable/train branches.
However, upstream get-pip.sh recently introduced a regression where it
does not work well with Python <=3.5. Sadly, DIB 2.30.0 points to master
get-pip.sh [1] and DIB is branchless so the fix cannot be backported and
we are in a deadlock scenario where Python <=3.5 amphora images can't be
built.
To work around this issue on stable/train, we have to make a compromise:
either build amphora images using master DIB and Python 3 only or be
stuck (unless someone else can come up with another approach, all ears).
Ussuri and newer releases are not impacted by this issue as they are
Python >=3.6 only releases.
The same issue also applies to Stein and that has an impact on the
octavia-grenade CI job in stable/train. Since Stein Octavia is EOL [2],
we no longer have to test upgrades.
[1] https://review.opendev.org/c/openstack/diskimage-builder/+/772254
[2] https://review.opendev.org/c/openstack/releases/+/772847
Change-Id: I13da9a49ace803e957dc52e4882b78141bc7bef3
Add support for removing all tags by PUTing an empty tags array.
Also move the assignment after the initial session query for the
object in the listener update path.
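One common way this kind of bug arises is a truthiness check that treats an
empty list like a missing field. The sketch below illustrates that general
pattern only; the function names are hypothetical and it is an assumption
that the original code failed in this way.

```python
def apply_tags_truthy(current, requested):
    """Treats [] like 'not provided', so tags can never be cleared."""
    if requested:
        return list(requested)
    return current


def apply_tags_explicit(current, requested):
    """Only None means 'not provided'; [] clears all tags."""
    if requested is not None:
        return list(requested)
    return current


current_tags = ["prod", "web"]
kept = apply_tags_truthy(current_tags, [])         # tags survive the PUT
cleared = apply_tags_explicit(current_tags, [])    # tags are removed
untouched = apply_tags_explicit(current_tags, None)
print(kept, cleared, untouched)
```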
Conflicts:
octavia/db/repositories.py
Task: #41009
Story: #2008220
Change-Id: I7488f2fae61917f6d4a56cedd05bace7c5e2bc70
Signed-off-by: Andrew Karpow <andrew.karpow@sap.com>
(cherry picked from commit 7ad022379f)
(cherry picked from commit adb4e78e44)
(cherry picked from commit 95a97cf1eb)
Setting nf_conntrack_buckets in the amphora namespace fails because this
sysctl can only be set in the initial namespace (cf kernel doc at
https://www.kernel.org/doc/Documentation/networking/nf_conntrack-sysctl.txt)
This commit sets nf_conntrack_buckets in the initial namespace instead;
the value is then inherited by other namespaces.
Conntrack is not enabled in the main namespace, so the new default value
doesn't affect the behavior of that namespace.
Conflicts:
elements/haproxy-octavia/post-install.d/20-haproxy-tune-kernel
Story: 2008028
Task: 40682
Change-Id: Ie6ccc4bf0017587df8e8e29d8ee3bf5c19e6d615
(cherry picked from commit 64a301d4ec)
(cherry picked from commit dd1580d332)
(cherry picked from commit 9f1307affc)
When adding a new UDP member or a UDP-CONNECT health-monitor to a UDP
pool, there can be a race condition in the first heartbeat message
sent to the health-manager service.
This message might contain a DOWN status for a working member that
hasn't been checked yet.
This commit introduces a new member status between the amphora-agent and
the health-manager: it indicates that the UDP pool has been updated and
that the member is in a transitional state, preventing an
incorrect ERROR status.
Story: 2007792
Task: 40042
Change-Id: Id9e19375ebca6a720e6a85006f5e8948d3aed760
(cherry picked from commit 9fb58eb9f4)
(cherry picked from commit ea47a0efad)