When a related DVR router is configured by the L3 agent, it is first
added to the tasks queue and then processed like any other router hosted
on the L3 agent.
But when the L3 agent asked the Neutron server for details of such a
router, nothing was returned, because the router wasn't actually scheduled
to the compute node which was asking for it. It was "only" related to some
other router scheduled to that compute node. Because of that the router's
info wasn't found in the reply from the neutron-server and the L3 agent
removed it from the compute node.
Now the _get_router_ids_for_agent method in the l3_dvrscheduler_db module
checks the DVR serviceable ports for each DVR router hosted on the
compute node and then finds all routers related to it. Thanks to that
it also returns routers which are on the compute node only because of
other, related routers scheduled to this host, and such routers will not
be deleted anymore.
Change-Id: I689d5135b7194475c846731d846ccf6b25b80b4a
Closes-Bug: #1884527
(cherry picked from commit 38286dbd2e)
When both vlan and vxlan networks exist in the environment, and
l2population and arp_responder are enabled, updating the IP address of a
port on a vlan network results in ARP responder related flows being added
to br-tun. This causes multiple ARP replies for a single ARP request, and
VM connections become unreliable.
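A hedged sketch of the guard this implies; the network-type check and the
bridge method name are assumptions, not the literal l2population code:

    # Only program ARP responders for overlay (tunnel) networks; constants
    # and the install_arp_responder() call are illustrative here.
    TUNNEL_NETWORK_TYPES = ('vxlan', 'gre', 'geneve')

    def maybe_install_arp_responder(br_tun, network_type, local_vid, ip, mac):
        if network_type not in TUNNEL_NETWORK_TYPES:
            # VLAN traffic is not switched through br-tun, so an ARP
            # responder there would only produce duplicate ARP replies.
            return
        br_tun.install_arp_responder(local_vid, ip, mac)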
Closes-Bug: #1824504
Change-Id: I1b6154b9433a9442d3e0118dedfa01c4a9b4740b
(cherry picked from commit 5301ecf41b)
This option allows configuring the number of times the nova or ironic
client should retry a failed HTTP call.
The default value for this new option is "3".
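A hedged sketch of how such an option is typically declared with
oslo.config (the option name here is illustrative, not necessarily the
one added by this patch):

    from oslo_config import cfg

    # Illustrative declaration; the real option lives in Neutron's
    # nova/ironic notifier configuration and its exact name may differ.
    http_retries_opt = cfg.IntOpt(
        'http_retries',
        default=3,
        help='Number of times the nova or ironic client should retry a '
             'failed HTTP call.')

    cfg.CONF.register_opts([http_retries_opt])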
Conflicts:
neutron/notifiers/ironic.py
neutron/notifiers/nova.py
neutron/tests/unit/notifiers/test_nova.py
Change-Id: I795ee7ca729646be0411a1232bf218015c65010f
Closes-Bug: #1883712
(cherry picked from commit e94511cd25)
During the CI meeting we agreed that non-voting jobs in the branches
which are in the Extended Maintenance (EM) phase should be moved from
the check queue to the experimental queue.
This patch is doing exactly that.
Change-Id: Ie8a63eacf479ac6871af448a1741598584de8de8
We observed an excessive number of routers created on compute nodes
on which some virtual machines got a fixed IP on the floating network.
The RPC servers should filter out those unnecessary routers during
syncing.
Change-Id: I299031a505f05cd0469e2476b867b9dbca59c5bf
Partial-Bug: #1840579
(cherry picked from commit 480b04ce04)
When neutron-ovs-agent notices that any of the physical bridges was
"re-created", we should also ensure that stale OpenFlow rules (with an
old cookie id) are cleaned up.
This patch is doing exactly that.
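A hedged sketch of that cleanup step; stale_cookies() and the
delete_flows(cookie=...) call stand in for whatever the agent bridge
actually exposes:

    # Illustrative only: after a physical bridge is detected as re-created,
    # drop every flow whose cookie does not match the current agent cookie.
    def purge_stale_flows(phys_br, current_cookie):
        for cookie in phys_br.stale_cookies(current_cookie):  # hypothetical
            phys_br.delete_flows(cookie=cookie)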
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I7c7c8a4c371d6f4afdaab51ed50950e2b20db30f
Related-Bug: #1864822
(cherry picked from commit 63c45b3766)
Blocking traffic between br-int and br-physical is overkill
and will at least
1. interrupt the vlan flows during startup, particularly so
if DVR is enabled;
2. if, say, rabbitmq is not stable, possibly affect the data
plane so that vlan traffic never works.
Using OpenStack on k8s particularly amplifies the problem
because a pod can be killed quite easily by liveness
probes.
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I51050c600ba7090fea71213687d94340bac0674a
Closes-Bug: #1869808
(cherry picked from commit 90212b12cd)
In some cases it may be useful to log the new vlan tag found on a port
when it loses the old vlan tag that is expected to be there.
So this patch adds that value to the log message.
TrivialFix
Depends-On: https://review.opendev.org/735615
Change-Id: I231e624f460510decc6d2237040c8bef207e2e8e
(cherry picked from commit 3ac63422ea)
When a physical bridge is removed and created again, it is
reinitialized by neutron-ovs-agent.
But if the agent has distributed routing enabled, the DVR related
flows were not configured again, which led to connectivity issues
with DVR routers.
This patch fixes it by also configuring the DVR related flows
if distributed routing is enabled in the agent's configuration.
It also resets the list of phys_brs in dvr_agent. Without that,
different bridge objects were used in the ovs agent and dvr_agent
classes, so e.g. 2 different cookie ids were set on flows in the
physical bridge.
The same issue occurred when openvswitch was restarted and
all bridges were reconfigured.
Now in such cases the new cookie_id is correctly configured for all
flows.
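A hedged sketch of the re-initialization path described above; the method
names are simplified stand-ins for the agent code:

    # Simplified stand-in for the bridge re-setup path, not the literal
    # ovs_neutron_agent code.
    def reinitialize_physical_bridges(agent, bridge_mappings):
        agent.setup_physical_bridges(bridge_mappings)
        if agent.enable_distributed_routing:
            # Share the same bridge objects (and therefore the same cookie)
            # between the OVS agent and its DVR helper, then reinstall the
            # DVR flows.
            agent.dvr_agent.reset_dvr_parameters()
            agent.dvr_agent.setup_dvr_flows()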
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I710f00f0f542bcf7fa2fc60800797b90f9f77e14
Closes-Bug: #1864822
(cherry picked from commit 91f0bf3c85)
1. Make grenade jobs experimental for EM branches
As discussed in the ML thread [1], we are going to
make the grenade jobs non-voting for all EM stable branches and
the oldest stable branch. The grenade jobs are failing now and it
might take time to fix them, if we are able to fix them at all.
Once the jobs are working again, it is up to the project team to
bring them back to voting or keep them non-voting.
If those jobs keep failing consistently and no one is fixing them,
then removing those n-v jobs in the future is also fine.
Additionally, it was proposed in the neutron CI meeting [2] that non-voting
jobs would be moved to experimental, so move the grenade jobs there instead
of keeping them non-voting.
[1] http://lists.openstack.org/pipermail/openstack-discuss/2020-June/015499.html
[2] http://eavesdrop.openstack.org/meetings/neutron_ci/2020/neutron_ci.2020-07-01-15.00.log.html#l-101
StableOnly
Conflicts:
.zuul.yaml
(cherry picked from commit 9313dce459)
2. Install pip2 for functional/fullstack/neutron-tempest-iptables_hybrid
Otherwise both jobs fail with "sudo: pip: command not found"
3. Add ensure-tox for functional/fullstack/neutron-tempest-iptables_hybrid
Otherwise they fail with a similar error message for tox
4. Disable OVS compilation for fullstack and move job to experimental
Compilation fails similarly to recent master failures:
/opt/stack/new/ovs/datapath/linux/geneve.c:943:15: error: ‘const struct ipv6_stub’ has no member named ‘ipv6_dst_lookup’
But the 2.9 branch is not updated anymore, so use the official package
instead.
This triggers a few test failures, so move the job to experimental (instead
of marking it non-voting), same as the grenade jobs
Change-Id: Ie846a8cb481da65999b12f5547b407cc7bdc3138
When a Port is deleted, the QoS extension resets any rule (QoS
and Queue registers) applied on this port or resets the
related Interface policing parameters.
If the Port and the related Interface are deleted while the QoS
extension is operating on them, those commands fail. This patch makes
those operations more resilient by not checking for errors when writing
to the Port or the Interface register.
NOTE: this patch is squashed with [1], which fixes the problem
with empty "vsctl" transactions when using this OVS DB implementation.
[1]https://review.opendev.org/#/c/738574/
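A hedged sketch of the "do not fail on a vanished row" pattern, written
in an ovsdbapp style; the 'ovsdb' object and the column choices are
illustrative:

    # Illustrative sketch: ignore errors when the Port/Interface row may
    # already be gone because the port was deleted concurrently.
    def clear_port_qos(ovsdb, port_name):
        with ovsdb.transaction(check_error=False) as txn:
            # If the row was removed in the meantime these commands fail,
            # but the transaction does not raise: nothing is left to clean.
            txn.add(ovsdb.db_clear('Port', port_name, 'qos'))
            txn.add(ovsdb.db_set('Interface', port_name,
                                 ('ingress_policing_rate', 0)))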
Change-Id: I2cc4cdf5be25fab6adbc64acabb3fffebb693fa6
Closes-Bug: #1884512
(cherry picked from commit e2d1c2869a)
(cherry picked from commit 84ac8cf9ff)
(cherry picked from commit 3785868bfb)
(cherry picked from commit 7edfb0ef4a)
Neutron-ovs-agent can now enable IGMP snooping on the integration bridge
if the config option "igmp_snooping_enable" in the OVS section of the
config is set to True.
It will also set mcast-snooping-disable-flood-unregistered=true
so that flooding of multicast packets to all unregistered ports is
disabled as well.
Both changes are applied to the integration bridge.
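A hedged sketch of the corresponding OVSDB settings; 'br_int' stands in
for the integration bridge wrapper and set_db_attribute mirrors the usual
"set column" helper:

    # Illustrative only; both writes target the Bridge row of br-int.
    def enable_igmp_snooping(br_int):
        br_int.set_db_attribute('Bridge', br_int.br_name,
                                'mcast_snooping_enable', True)
        br_int.set_db_attribute(
            'Bridge', br_int.br_name, 'other_config',
            {'mcast-snooping-disable-flood-unregistered': 'true'})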
Change-Id: I12f4030a35d10d1715d3b4bfb3ed5efb9aa28f2b
Closes-Bug: #1840136
(cherry picked from commit 5b341150e2)
There is a race condition between nova-compute booting an instance and
the l3-agent processing the DVR (local) router on the compute node. This
issue can be seen when a large number of instances are booted on the
same host under different DVR routers, so the l3-agent has to process
all of these DVR routers on that host concurrently.
For now we have a green pool for the router ResourceProcessingQueue
with 8 greenlets, but some of these routers can still be left waiting;
even worse, there are time-consuming actions during the router
processing procedure, for instance installing arp entries, iptables
rules, route rules etc.
So when the VM is up, it tries to get the metadata via the local proxy
hosted by the DVR router, but the router is not ready yet on that
host. As a result those instances are not able to set up some
config in the guest OS.
This patch adds a new measurement based on the router quantity to
determine the L3 router process queue green pool size. The pool size
is bounded between 8 (the original value) and 32, because we do not want
the L3 agent to cost too much host resource on processing routers on the
compute node.
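A hedged sketch of that sizing rule; the constants mirror the 8-32 range
described above, while the exact formula in the agent may differ:

    import eventlet

    ROUTER_POOL_MIN = 8   # original fixed pool size
    ROUTER_POOL_MAX = 32  # upper bound, to spare compute node resources

    def resize_router_pool(pool, num_routers):
        """Grow the processing GreenPool with the number of hosted routers."""
        size = max(ROUTER_POOL_MIN, min(ROUTER_POOL_MAX, num_routers))
        pool.resize(size)
        return size

    # Usage sketch:
    # pool = eventlet.GreenPool(ROUTER_POOL_MIN)
    # resize_router_pool(pool, len(router_ids))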
Conflicts:
neutron/tests/functional/agent/l3/test_legacy_router.py
Related-Bug: #1813787
Change-Id: I62393864a103d666d5d9d379073f5fc23ac7d114
(cherry picked from commit 837c9283ab)
In patch [1] we changed the definition of the abstract method
"plug" in the LinuxInterfaceDriver class.
That broke e.g. 3rd-party drivers which still don't accept this
new parameter called "link_up" in the plug_new method.
So this patch fixes this to make such legacy drivers still work
with the new base interface driver class.
This commit also marks such a definition of the plug_new method as
deprecated. The possibility of using it without accepting the link_up
parameter will be removed in the "W" release of OpenStack.
[1] https://review.opendev.org/#/c/707406/
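A hedged sketch of the compatibility shim; the signature inspection is
the general pattern, and the parameter handling shown is simplified
rather than copied from the driver interface:

    import inspect

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)

    def call_plug_new(driver, *args, link_up=True, **kwargs):
        """Call plug_new, tolerating legacy drivers without 'link_up'."""
        if 'link_up' in inspect.signature(driver.plug_new).parameters:
            kwargs['link_up'] = link_up
        else:
            # Legacy signature: warn that it is deprecated and call without
            # the new parameter so the 3rd-party driver keeps working.
            LOG.warning("Interface driver %s does not accept 'link_up'; "
                        "this signature is deprecated.",
                        driver.__class__.__name__)
        driver.plug_new(*args, **kwargs)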
Change-Id: Icd555987a1a57ca0b31fa7e4e830583d6c69c861
Closes-Bug: #1879307
(cherry picked from commit 30d573d5ab)
(cherry picked from commit 9c242a0329)
(cherry picked from commit bc8c38bda8)
Although notify_nova_on_port_status_changes defaults to true, it
could be set to false, making the nova_notifier attribute unsafe to
use without checking.
This patch checks both the config option and that the attribute
exists, since the config could be changed after the plugin has
already been initialized without the nova_notifier attribute being set.
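A hedged sketch of that guard; the notifier call at the end is only
illustrative:

    from oslo_config import cfg

    def maybe_notify_nova(plugin, port):
        # Guard both the option and the attribute: the option may have been
        # flipped to true after the plugin was initialized without a
        # nova_notifier.
        if (cfg.CONF.notify_nova_on_port_status_changes and
                hasattr(plugin, 'nova_notifier')):
            plugin.nova_notifier.notify_port_active_direct(port)  # illustrative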
Change-Id: Ide0f93275e60dffda10b7da59f6d81c5582c3849
Closes-bug: #1843269
(cherry picked from commit ab4320edb4)
The 2.6.0 version introduces some checks that cause failures
with the current code. To avoid that, cap pycodestyle to a
version that has been tested without errors.
In Rocky we already had tests with pycodestyle, but the proper entry in
test-requirements.txt was only added in
I33be4f5d4ae48c6bd48d80e3f1185ef8307a2a0c
Conflicts:
test-requirements.txt
Change-Id: I00a35884b14af3e2cf751c04312c847ecfe658c7
(cherry picked from commit 719cae183a)
[0] introduced the concept of connected routers: routers that are
connected to the same subnets. When an L3 agent is syncing a router
with connected routers, the data of the entire set should be returned
to the agent by the Neutron server.
However, if an agent tries to sync a router with
no connected routers while the same agent hosts other routers that are
connected among themselves, the Neutron server returns both the former
and the latter. For details of how this bug can manifest itself, please see [1].
This change prevents this situation: only the synced router is
returned.
[0] https://review.opendev.org/#/c/597567
[1] https://bugs.launchpad.net/neutron/+bug/1838449/comments/15
Change-Id: Ibbf35d0f4a0bf9281f0bc8c411e8527eed75361d
Closes-Bug: #1838449
(cherry picked from commit 48ea7da6c5)
In order to reduce the number of elements retrieved from the DB, this
patch, before processing the VLAN allocations per physical network,
deletes those registers belonging to any unconfigured physical network.
The VLAN registers per physical network are deleted using a bulk delete
operation, to speed up the process.
The missing VLAN registers per network are now created using a bulk
insert operation, available in the ORM. This bulk operation speeds up
the sync process.
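A hedged sketch of the bulk pattern with SQLAlchemy; the model and the
shape of the input data are illustrative:

    # Illustrative SQLAlchemy sketch; VlanAllocation stands in for the real
    # type driver model.
    def sync_vlan_allocations(session, VlanAllocation, configured_physnets,
                              missing_allocations):
        # One bulk DELETE for every register of an unconfigured physnet.
        session.query(VlanAllocation).filter(
            ~VlanAllocation.physical_network.in_(configured_physnets)
        ).delete(synchronize_session=False)
        # One bulk INSERT for all missing (physical_network, vlan_id) rows.
        session.bulk_insert_mappings(
            VlanAllocation,
            [{'physical_network': net, 'vlan_id': vid, 'allocated': False}
             for net, vid in missing_allocations])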
Conflicts:
neutron/plugins/ml2/drivers/type_vlan.py
Change-Id: I8568e2277e157754aaff87a059a40e34e6a43e2b
Partial-Bug: #1862178
(cherry picked from commit 016e7826f1)
(cherry picked from commit 651eb12bec)
(cherry picked from commit 4fff732b76)
Patch [1] introduced a new mechanism which only brings UP interfaces
on the master node of an HA router. It works fine with keepalived 1.x
but it is broken when keepalived 2.x is used (e.g. on CentOS 8), as
in this new version of keepalived all interfaces of VIPs
and routes are tracked by default, and if one of them is DOWN,
keepalived goes into the FAULT state. Because of that the router will
never be transitioned to MASTER on any node.
This patch fixes it by adding the "no_track" option to all VIPs
and routes in keepalived's config file.
This "no_track" option isn't added to the HA interface, so that one
is still tracked by keepalived.
[1] https://review.opendev.org/#/c/707406/
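A hedged sketch of how a generated virtual_ipaddress entry changes; the
rendering helper is illustrative, not the keepalived config module itself:

    # Produces e.g. "10.0.0.5/24 dev qr-xxx no_track" for the
    # virtual_ipaddress block; the HA (VRRP) interface itself is rendered
    # without "no_track" so keepalived keeps tracking it.
    def render_vip(ip_cidr, device, track=False):
        line = '%s dev %s' % (ip_cidr, device)
        if not track:
            line += ' no_track'  # keepalived 2.x: do not track this VIP
        return line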
Closes-bug: #1874211
Change-Id: Ic16cf83fe1d1576d91047adb2d4f9e07d57185b6
(cherry picked from commit dc9084a8ec)
Operators may want to see how long the port processing
procedure takes, since DEBUG logging is usually not enabled
in production environments.
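A hedged sketch of the kind of timing log this adds; the processing call
is a hypothetical stand-in:

    import time

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)

    def timed_port_processing(agent, port_info):
        start = time.time()
        agent.handle_ports(port_info)  # hypothetical stand-in call
        # INFO level so operators see the duration even with DEBUG disabled.
        LOG.info("Processing ports %s took %.3f seconds",
                 port_info, time.time() - start)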
Related-Bug: #1813703
Related-Bug: #1813707
Related-Bug: #1813706
Related-Bug: #1813709
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I43733546abf5421d0e3f4cd5a959d279e1b89d1e
(cherry picked from commit 8e73de8bc4)