The hook starts a DB transaction and should be covered by the
DB retry decorator.
For the Rocky backport, neutron_lib.db had to be imported for
retry_db_errors.
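As an illustration of the pattern (not the actual neutron_lib
implementation), a retry decorator that re-runs a DB transaction on a
deadlock can be sketched roughly as below; `DBDeadlockError` and
`create_port` are hypothetical names:

```python
import functools
import time


class DBDeadlockError(Exception):
    """Stand-in for a driver-level retryable DB error."""


def retry_db_errors(max_retries=3, delay=0):
    """Minimal retry decorator: re-run the transaction on deadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlockError:
                    if attempt == max_retries:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


attempts = {"count": 0}


@retry_db_errors(max_retries=3)
def create_port():
    # Hypothetical hook body: fails twice, then succeeds.
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise DBDeadlockError()
    return "port-created"
```

The real decorator additionally rolls back the session and uses
exponential backoff; this sketch only shows the retry loop itself.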
Closes-Bug: #1777965
Closes-Bug: #1771293
Change-Id: I044980a98845edc7b0a02e3323a1e62eb54c10c7
(cherry picked from commit ab286bcdac)
(cherry picked from commit 3ec7aed8a3)
(cherry picked from commit 22250e783b)
During e.g. migration or shelve of a VM it may happen that
a port update event is sent to the ovs agent and, at almost
the same time, the port is removed from br-int.
In such a case, in the update_port_filter method, the openvswitch
firewall driver will not find the port in br-int and will do nothing
with it.
That leads to leftover rules for this port in br-int.
So this patch adds a call to the remove_port_filter() method if the
port was not found in br-int, to make sure that there are no
leftovers from the port in br-int anymore.
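The resulting behaviour can be sketched as follows; `FirewallSketch`
and its attributes are hypothetical stand-ins, not the real
OVSFirewallDriver code:

```python
class FirewallSketch:
    """Illustrative driver: fall back to cleanup when the port is gone."""

    def __init__(self, bridge_ports):
        self.bridge_ports = set(bridge_ports)   # ports present in br-int
        self.filtered_ports = set()             # ports with rules installed

    def update_port_filter(self, port_id):
        if port_id not in self.bridge_ports:
            # Port already removed from br-int: clean up leftover rules
            # instead of silently doing nothing.
            self.remove_port_filter(port_id)
            return
        self.filtered_ports.add(port_id)

    def remove_port_filter(self, port_id):
        self.filtered_ports.discard(port_id)


fw = FirewallSketch(bridge_ports=["port-a"])
fw.filtered_ports.add("port-b")      # stale rules for a removed port
fw.update_port_filter("port-b")      # port-b is gone from br-int
```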
Conflicts:
neutron/agent/linux/openvswitch_firewall/firewall.py
Change-Id: I06036ce5fe15d91aa440dc340a70dd27ae078c53
Closes-Bug: #1850557
(cherry picked from commit b01e0c2aa9)
Checking only that the sg object exists is not enough; we should
also check whether the sg's ports dict is {}. Otherwise the old
conjunction will still exist.
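The intended check can be sketched like this; `conjunction_is_stale`
and the `SG` class are illustrative names, not Neutron code:

```python
class SG:
    """Minimal stand-in for a security group tracked by the firewall."""

    def __init__(self, ports):
        self.ports = ports  # mapping of port id -> rules


def conjunction_is_stale(sg):
    # A conjunction should be removed when the SG object is gone OR
    # when it exists but no longer has any ports.
    return sg is None or not sg.ports
```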
Change-Id: I10588e73a9da7fdd43677f9247c176811dd68c62
Closes-Bug: #1854131
(cherry picked from commit 5cb0ff418a)
Refactored RevisionPlugin to operate upon sets of objects at
once. In the case of "related" objects, the bump operation
no longer makes use of compare-and-swap and instead updates
version numbers directly without testing for their previous value.
This removes the issue of StaleDataErrors being
prevalent within the related object update phase, given the assumption
that the compare-and-swap logic is only desirable for the primary object
being updated.
Change-Id: I2fef298041c59a03dfd06912764973995b80690c
(cherry picked from commit d841ce72bf)
When the DHCP agent reports to the neutron-server which ports are
ready, it was using the call() method of the rpc client.
That blocked the dhcp agent's dhcp_ready_ports_loop thread, which
delayed sending info about other ready ports if there were any.
The call() method should be used when the RPC callee returns a value
to the caller, but that's not the case here. On the neutron-server
side this RPC method only calls the provisioning_complete() method to
finish provisioning of ports, and it does not return anything.
So, to make sending dhcp ready ports to the neutron-server much
faster, this patch switches to the cast() method of the rpc client.
This method does not block waiting for a return value from the RPC
callee.
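A toy model of the call()/cast() semantics (not the oslo.messaging
API) illustrates why cast() unblocks the loop:

```python
import queue
import threading
import time


class ToyRpcClient:
    """Toy model: call() blocks for a reply, cast() is fire-and-forget."""

    def __init__(self, server_delay=0.2):
        self.inbox = queue.Queue()
        self.server_delay = server_delay
        threading.Thread(target=self._server, daemon=True).start()

    def _server(self):
        # Simulated server loop: each message takes server_delay to handle.
        while True:
            method, reply = self.inbox.get()
            time.sleep(self.server_delay)
            if reply is not None:
                reply.put("done")

    def call(self, method):
        reply = queue.Queue()
        self.inbox.put((method, reply))
        return reply.get()              # blocks until the server replies

    def cast(self, method):
        self.inbox.put((method, None))  # returns immediately


client = ToyRpcClient()
start = time.time()
client.cast("dhcp_ready_on_ports")
cast_elapsed = time.time() - start      # near zero: the loop is not blocked
```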
Change-Id: Ie119693854aa283b863a1eac2bdae3330c2b6a9d
Closes-Bug: #1850864
(cherry picked from commit 1a686fb401)
In the reported bug, a regression was introduced in [1] when limiting
the time allowed for a "dnsmasq" process to come up. It has been
reported, as documented in the bug, that in older versions (Queens),
using Python 2 [2] and older versions of "ip_lib" (which did not
implement most of the commands using Pyroute2), the time needed to
spawn a "dnsmasq" process exceeds the default 60 seconds defined in
"common_utils.wait_until_true".
This patch increases this timeout to a reasonable 300 seconds.
[1] https://review.opendev.org/#/c/643732
[2] https://bugs.python.org/issue35757
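For reference, the helper's behaviour can be sketched as a minimal
reimplementation (the real `common_utils.wait_until_true` lives in
Neutron and differs in details):

```python
import time


class WaitTimeout(Exception):
    """Raised when the predicate does not become true within the timeout."""


def wait_until_true(predicate, timeout=60, sleep=1):
    """Poll `predicate` until it returns True, or raise on timeout."""
    deadline = time.time() + timeout
    while not predicate():
        if time.time() > deadline:
            raise WaitTimeout("timed out after %s seconds" % timeout)
        time.sleep(sleep)
```

With a slow-to-spawn process, passing e.g. `timeout=300` at the call
site avoids a false timeout while still bounding the wait.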
Change-Id: I2d8693145da72825876b951f2d10afe9ca28ff6e
Closes-Bug: #1849676
(cherry picked from commit aedc099176)
In order to assist debugging of OVS flows involving conjunctions, log
the conjunction ID and other pertinent details. Without this, there is
no good way to verify the port was added to the correct conjunction.
Change-Id: Ie9c3eaa9c828ef5a0a68a286bc0465f2bcd00a4f
(cherry picked from commit 4b67a06403)
This will prevent the ovs agent from an endless failure loop when
dealing with an unbound port: e.g. when the port was created in
neutron before the agent came alive, then the agent comes online and
starts processing devices.
This patch adds exception handling to prepare_port_filter() -
the same as is done in update_port_filter().
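The shape of the fix can be sketched as follows; the exception and
class names here are hypothetical stand-ins for the real driver code:

```python
class OVSFWPortNotFound(Exception):
    """Stand-in for the 'port not found in br-int' error."""


class FakeFirewall:
    """Raises for a port that has no VIF in br-int yet."""

    def prepare_port_filter(self, device):
        if device == "unbound-port":
            raise OVSFWPortNotFound()


class AgentLoopSketch:
    """Illustrative loop: one bad port must not wedge device processing."""

    def __init__(self, firewall):
        self.firewall = firewall
        self.skipped = []
        self.prepared = []

    def process_devices(self, devices):
        for device in devices:
            try:
                self.firewall.prepare_port_filter(device)
                self.prepared.append(device)
            except OVSFWPortNotFound:
                # Same handling as update_port_filter(): log and skip
                # instead of failing the whole iteration forever.
                self.skipped.append(device)


agent = AgentLoopSketch(FakeFirewall())
agent.process_devices(["unbound-port", "bound-port"])
```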
Change-Id: I1137eb18efaf51c67fab145e645f58cbd3772e40
Closes-Bug: #1849098
(cherry picked from commit e801159003)
In some cases (I don't know exactly how) it may happen that
when a new subnet, e.g. IPv6, is added to the network, subnets
can change their order based on uuid.
Before this patch, we were using tags like "tagN" for subnets in the
dnsmasq options (where N was just a number based on the position of
the subnet in the sorted list), so it could sometimes happen that
dnsmasq ended up with a mismatch between the tags configured in the
"dhcp-range" cmd option and those set in the "opts" file. That caused
problems with serving the proper DHCP options to the vms.
This patch fixes this issue by using tags of the format
"subnet-<uuid>", where uuid is the id of the subnet. That way the tag
is not based on the order of subnets in the list and will always
match the tag configured in the opts file for the specific subnet.
As we were using the port id as the tag for "per port" DHCP options,
this patch changes that to use tags like "port-<uuid>", to make it
consistent with the options configured "per subnet" and to make it
easier to debug where each option comes from.
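A minimal sketch of the tag scheme (function names are illustrative,
not the real dnsmasq driver helpers):

```python
def subnet_tag(subnet_id):
    """Tag derived from the subnet UUID: stable regardless of order."""
    return "subnet-%s" % subnet_id


def port_tag(port_id):
    """Matching scheme for per-port DHCP options."""
    return "port-%s" % port_id


# The tag no longer depends on where the subnet sits in the sorted list:
subnets = ["aaaa-1111", "bbbb-2222"]
tags_before = [subnet_tag(s) for s in subnets]
tags_after = [subnet_tag(s) for s in reversed(subnets)]
```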
Conflicts:
neutron/agent/linux/dhcp.py
neutron/tests/unit/agent/linux/test_dhcp.py
Change-Id: Idaea33d62fa31edd7149ec916ec314438375724a
Partial-Bug: #1848738
(cherry picked from commit 88f2073526)
(cherry picked from commit a0730e684d)
(cherry picked from commit ba12b9e369)
- This change adds a max priority flow to drop
all traffic that is associated with the
DEAD VLAN 4095.
- This change is part of a partial mitigation of
bug 1734320. Without this change vlan 4095 traffic
will be dropped via a low priority flow after being
processed by part or all of the openflow pipeline.
By raising the priority and dropping in table 0
we drop invalid packets as soon as they enter
the pipeline.
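A rough sketch of the flow being installed, using a fake bridge
object instead of the real br-int wrapper:

```python
class FakeBridge:
    """Records add_flow() calls; stands in for the integration bridge."""

    def __init__(self):
        self.flows = []

    def add_flow(self, **kwargs):
        self.flows.append(kwargs)


DEAD_VLAN_TAG = 4095
MAX_PRIORITY = 65535  # maximum OpenFlow priority


def install_dead_vlan_drop(bridge):
    # Drop dead-VLAN traffic in table 0, before the rest of the
    # pipeline ever sees it.
    bridge.add_flow(table=0, priority=MAX_PRIORITY,
                    dl_vlan=DEAD_VLAN_TAG, actions='drop')
```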
Conflicts:
neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/openflow/native/test_br_int.py
Change-Id: I3482c7c4f00942828cc9396cd2f3d646c9e8c9d1
Partial-Bug: #1734320
(cherry picked from commit e3dc447b90)
When updating a port with the fixed_ips request, the
fixed_configured argument should be set to true when
calling _ipam_get_subnets() so that all subnets are
returned if host is not set.
Otherwise the IP allocation will be deferred and an
empty list of possible subnets for the port is
returned, which in turn led to raising an error that
the network requires subnets to allocate an IP
address.
Closes-Bug: #1844124
Change-Id: I2e690ea0cf5fa0614e39be2b0e83afad3daa7f48
(cherry picked from commit def8e95aad)
The test test_wsgi.TestWSGIServer.test_start_random_port_with_ipv6
tries to spawn a server and bind it to a local IPv6 address.
When this test is run on a host with IPv6 support disabled,
it shouldn't fail but should be skipped.
This patch adds such a skip if IPv6 is disabled on the host.
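A possible shape of such a check (a best-effort sketch, not the exact
helper used by the patch):

```python
import socket
import unittest


def ipv6_enabled():
    """Best-effort check: can an IPv6 socket be bound on this host?"""
    if not socket.has_ipv6:
        return False
    try:
        with socket.socket(socket.AF_INET6, socket.SOCK_STREAM) as s:
            s.bind(("::1", 0))
        return True
    except OSError:
        return False


class TestWSGIServerSketch(unittest.TestCase):
    @unittest.skipUnless(ipv6_enabled(), "IPv6 is disabled on this host")
    def test_start_random_port_with_ipv6(self):
        # The real test spawns a server bound to an IPv6 address.
        pass
```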
Change-Id: I3f7fc20a063757db78fa44c62312a9fcf96a3071
(cherry picked from commit 6906c4017f)
Some traffic does not work if the OVS flows to permit custom ethertypes
are not set on the base egress table. If the rule is added to the base
egress table then both ingress and egress work properly. Also move
the initialization code into the egress initialization function.
Related-Bug: #1832758
Change-Id: Ia312fe75df58723bf41804eec4bd918d223bd60c
(cherry picked from commit fb859966f7)
The type of lvm.vlan is int and other_config.get('tag') is a string,
so they can never be equal. We should do a type conversion before
comparing, to avoid unnecessary ovsdb and flow operations.
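The comparison fix can be sketched like this; `tag_needs_update` is
an illustrative name, not the agent's actual helper:

```python
def tag_needs_update(lvm_vlan, other_config):
    """Compare as ints: ovsdb returns the 'tag' value as a string."""
    current = other_config.get('tag')
    # Without int(), '5' != 5 would always trigger a pointless update.
    return current is None or int(current) != lvm_vlan
```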
Change-Id: Ib84da6296ddf3c95be9e9f370eb574bf92ceec15
Closes-Bug: #1843425
(cherry picked from commit 0550c0e1f6)
When a vlan network was created with segmentation_id=0 and without a
physical_network given, it passed validation of the provider segment
and the first available segmentation_id was chosen for the network.
The problem was that in such a case all available segmentation ids
were allocated and no other vlan network could be created later.
This patch fixes validation of segmentation_id when it is set to the
value 0.
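A minimal sketch of the tightened validation, assuming the standard
usable VLAN range of 1-4094 (names are illustrative, not the ML2 type
driver's actual functions):

```python
MIN_VLAN_TAG = 1
MAX_VLAN_TAG = 4094


def validate_vlan_segmentation_id(segmentation_id):
    """Reject out-of-range VLAN tags; 0 is not a usable VLAN id."""
    if segmentation_id is not None:
        if not MIN_VLAN_TAG <= segmentation_id <= MAX_VLAN_TAG:
            raise ValueError(
                "segmentation_id %s is outside %s..%s"
                % (segmentation_id, MIN_VLAN_TAG, MAX_VLAN_TAG))
```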
Change-Id: Ic768deb84d544db832367f9a4b84a92729eee620
Closes-bug: #1840895
(cherry picked from commit f01f3ae5dd)
The DHCP agent sends to the neutron-server information about
ports for which DHCP configuration is finished.
There was no log with information about the ports whose
DHCP configuration has been finished.
This patch adds such a log at INFO level, the same as is
currently done in e.g. neutron-ovs-agent.
Change-Id: I9506f855af118bbbd45b55a711504d6ad0f863cc
(cherry picked from commit 6367141155)
Increased timeouts for OVSDB connection:
- ovsdb_timeout = 30
This patch will mitigate the intermittent timeouts the CI is
experiencing while running the functional tests.
Change-Id: I97a1d170926bb8a69dc6f7bb78a785bdea80936a
Closes-Bug: #1815142
(cherry picked from commit 30e901242f)
We are sometimes hitting a problem in "test_port_ip_update_revises" [1].
This happens because the port created doesn't belong to the previously
created subnet. We need to enforce that the port is created in the
subnet specifically created in this test.
[1]http://logs.openstack.org/69/650269/12/check/openstack-tox-lower-constraints/7adf36e/testr_results.html.gz
Conflicts:
neutron/tests/unit/services/revisions/test_revision_plugin.py
Change-Id: I399f100fe30b6a03248cef5e6026204d4d1ffb2e
Closes-Bug: #1828865
(cherry picked from commit 872dd7f484)
Currently there is a delay (around 20 seconds) between the agent
update call and the server reply, due to the load on the testing
servers. This time should be higher than the agent-server
communication delay but still short enough to detect when the DHCP
agent is dead during the active wait of the DHCP agent network
rescheduling.
"log_agent_heartbeats" is activated to add information about when the
server has processed the agent report status call. This log will
allow checking the difference between the server update time and the
previous agent heartbeat timestamp.
Conflicts:
neutron/tests/fullstack/resources/config.py
Change-Id: Icf9a8802585c908fd4a70d0508139a81d5ac90ee
Related-Bug: #1799555
(cherry picked from commit d7c5ae8a03)
In patch [1], as a partial fix for bug 1828375, a retry mechanism
was proposed.
We noticed that in heavily loaded environments the 3 retries
defined in [1] can sometimes be not enough.
So this patch switches to using the neutron_lib.db.api.MAX_RETRIES
constant as the number of retries when processing trunk subport
bindings.
This MAX_RETRIES constant is set to 20, and in our case it "fixed"
the problem.
[1] https://review.opendev.org/#/c/662236/
Conflicts:
neutron/tests/unit/services/trunk/rpc/test_server.py
Change-Id: I016ef3d7ccbb89b68d4a3d509162b3046a9c2f98
Related-Bug: #1828375
(cherry picked from commit d1f8888843)
The openSUSE 42.3 distribution is EOL; remove this experimental job
so that it can be removed from Zuul.
Note that master has a job for newer openSUSE running.
Change-Id: I0d26d1b1d1c4ca64c7a1dd077752d191fd3a28fb
Neutron-ovs-agent configures physical bridges so that they work
in fail_mode=secure. This means that only packets which match some
OpenFlow rule in the bridge can be processed.
This may cause a problem on hosts with only one physical NIC,
where the same bridge is used to provide control plane connectivity,
like the connection to rabbitmq, and data plane connectivity for VMs.
After e.g. a host reboot the bridge will still be in fail_mode=secure
but there will be no OpenFlow rules on it, thus there will be
no communication to rabbitmq.
With the current order of actions in the __init__ method of the
OVSNeutronAgent class, it first tries to establish a connection to
rabbitmq and only later configures physical bridges with some initial
OpenFlow rules.
In the case described above this will fail, as there is no
connectivity to rabbitmq through the physical bridge.
So this patch changes the order of actions in the __init__ method so
that it first sets up the physical bridges and then configures the
rpc connection.
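The reordering can be illustrated with a small sketch (not the real
OVSNeutronAgent, whose __init__ performs many more steps):

```python
class OVSAgentSketch:
    """Illustrates the fixed ordering: bridges first, then RPC."""

    def __init__(self):
        self.order = []
        # Physical bridges must be able to carry traffic before we can
        # reach rabbitmq on single-NIC hosts, so set them up first.
        self.setup_physical_bridges()
        self.setup_rpc()

    def setup_physical_bridges(self):
        self.order.append("bridges")

    def setup_rpc(self):
        self.order.append("rpc")
```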
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
Closes-Bug: #1840443
(cherry picked from commit d41bd58f31)
(cherry picked from commit 3a2842bdd8)
Concurrent calls to _bind_port_if_needed may lead to a missing RPC
notification which can cause a port stuck in a DOWN state. If the only
caller that succeeds in the concurrency does not specify that an RPC
notification is allowed then no RPC would be sent to the agent. The
other caller which needs to send an RPC notification will fail since the
resulting PortContext instance will not have any binding levels set.
The failure has negative effects on consumers of the L2Population
functionality because the L2Population mechanism driver will not be
triggered to publish that a port is UP on a given compute node. Manual
intervention is required in this case.
This patch proposes to handle this by populating the PortContext with
the current binding levels so that the caller can continue on and have
an RPC notification sent out.
Closes-Bug: #1755810
Story: 2003922
Change-Id: Ie2b813b2bdf181fb3c24743dbd13487ace6ee76a
(cherry picked from commit 0dc730c7c0)
Looks like by default OVS tunnels inherit skb marks from
tunneled packets. As a result Neutron IPTables marks set in
qrouter namespace are inherited by VXLAN encapsulating packets.
These marks may conflict with marks used by underlying networking
(like Calico) and lead to VXLAN tunneled packets being dropped.
This patch ensures that skb marks are cleared by OVS before entering
a tunnel to avoid conflicts with IPTables rules in default namespace.
Closes-Bug: #1839252
Change-Id: Id029be51bffe4188dd7f2155db16b21d19da1698
(cherry picked from commit 7627735252)
In TestOVSAgent, there are two tests where the OVS agent is
configured and started twice per test. Before the second call,
the agent should be stopped first.
Depends-On: https://review.opendev.org/667216/
Change-Id: I30c2bd4ce3715cde60bc0cd3736bd9c75edc1df3
Closes-Bug: #1830895
(cherry picked from commit b77c79e5e8)
(cherry picked from commit ff66205081)
The current code will remove the port from sg_port_map, but then it
won't be added back into the map. When we resize/migrate this
instance, the related openflow rules won't be deleted, which will
cause vm connectivity problems.
Closes-Bug: #1825295
Change-Id: I94ddddda3c1960d43893c7a367a81279d429e469
(cherry picked from commit 82782d3763)
When a new external network is set as the gateway network for a
dvr router, neutron tries to create a floating IP agent gateway port.
There should always be at most 1 such port per network per L3 agent,
but sometimes, when 2 requests to set the external gateway for 2
different routers are executed at almost the same time, it may happen
that 2 such ports are created.
That will cause an error in the configuration of one of the routers
on the L3 agent, and this will cause e.g. problems with access from
VMs to the metadata service.
Such issues are visible in DVR CI jobs from time to time. Please
check the related bug for details.
This patch adds a lock mechanism around the creation of such FIP
gateway ports.
This solution doesn't fully solve the existing race condition: if the
2 requests are processed by api workers running on 2 different nodes,
the race can still happen.
But this should mitigate the issue a bit and solve the problem in U/S
gates at least.
For a proper fix we should probably add some constraint at the
database level to prevent the creation of 2 such ports for one
network and one host, but such a solution would not be easy to
backport to stable branches, so I would prefer to first go with this
easy workaround.
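The in-process lock idea can be sketched with a plain threading.Lock
(the real patch uses Neutron's synchronization helpers; all names
here are hypothetical):

```python
import threading


class L3PluginSketch:
    """One FIP agent gateway port per (network, host), under a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ports = {}

    def create_fip_agent_gw_port_if_not_exists(self, network_id, host):
        key = (network_id, host)
        # Serialize concurrent requests within this api worker so the
        # check-then-create sequence cannot interleave.
        with self._lock:
            if key not in self._ports:
                self._ports[key] = "port-for-%s-%s" % (network_id, host)
            return self._ports[key]


plugin = L3PluginSketch()
threads = [
    threading.Thread(
        target=plugin.create_fip_agent_gw_port_if_not_exists,
        args=("net-1", "host-1"))
    for _ in range(2)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

As the commit notes, this only guards a single process; two api
workers on different nodes would still need a DB-level constraint.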
Conflicts:
neutron/db/l3_dvr_db.py
Change-Id: Iabab7e4d36c7d6a876b2b74423efd7106a5f63f6
Related-Bug: #1830763
(cherry picked from commit 7b81c1bc67)
(cherry picked from commit f7532f0c92)
(cherry picked from commit 5c1afcaf2b)