In case when neutron-ovs-agent will notice that any of physical
bridges was "re-created", we should also ensure that stale Open
Flow rules (with old cookie id) are cleaned.
This patch is doing exactly that.
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I7c7c8a4c371d6f4afdaab51ed50950e2b20db30f
Related-Bug: #1864822
(cherry picked from commit 63c45b3766)
Operators may want to see how long it takes in the port
processing procedure since DEBUG log does not enable
basically in the production envrionment.
Related-Bug: #1813703
Related-Bug: #1813707
Related-Bug: #1813706
Related-Bug: #1813709
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I43733546abf5421d0e3f4cd5a959d279e1b89d1e
(cherry picked from commit 8e73de8bc4)
Do not flood the packets to bridge, since we have the
bridge port list, we can add a simple direct flow to
the right port only.
Conflicts:
neutron/agent/linux/openvswitch_firewall/firewall.py
neutron/conf/plugins/ml2/drivers/ovs_conf.py
Closes-Bug: #1732067
Related-Bug: #1841622
Change-Id: I14fefe289a19b718b247bf0740ca9bc47f8903f4
(cherry picked from commit efa8dd0895)
Patch https://review.opendev.org/#/c/697655/ cannot be backported
because it includes an RPC version change. This patch is for the
stable branches.
Currently the ovs agent calls update_device_list with the
agent_restarted flag set only on the first loop iteration. Then the
server knows to send the l2pop flooding entries for the network to
the agent. But when a compute node with many instances on many
networks reboots, it takes time to readd all the active devices and
some may be readded after the first loop iteration. Then the server
can fail to send the flooding entries which means there will be no
flood_to_tuns flow and broadcasts like dhcp will fail.
This patch fixes that by also setting the agent_restarted flag if
the agent has not received the flooding entries for a network.
Change-Id: Iccc4fe4a785ee042fd76a663d0e76a27facd1809
Closes-Bug: #1853613
(cherry picked from commit bc0ab0fcd7)
(cherry picked from commit aee87e72b1)
The OVS agent processes the port events in a polling loop. It could
happen (and more frequently in a loaded OVS agent) that the "removed"
and "added" events can happen in the same polling iteration. Because
of this, the same port is detected as "removed" and "added".
When the virtual machine is restarted, the port event sequence is
"removed" and then "added". When both events are captured in the same
iteration, the port is already present in the bridge and the port is
discharted from the "removed" list.
Because the port was removed first and the added, the QoS policies do
not apply anymore (QoS and Queue registers, OF rules). If the QoS
policy does not change, the QoS agent driver will detect it and won't
call the QoS driver methods (based on the OVS agent QoS cache, storing
port and QoS rules). This will lead to an unconfigured port.
This patch solves this issue by detecting this double event and
registering it as "removed_and_added". When the "added" port is
handled, the QoS deletion method is called first (if needed) to remove
the unneded artifacts (OVS registers, OF rules) and remove the QoS
cache (port/QoS policy). Then the QoS policy is applied again on the
port.
NOTE: this is going to be quite difficult to be tested in a fullstack
test.
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I51eef168fa8c18a3e4cee57c9ff86046ea9203fd
Closes-Bug: #1845161
(cherry picked from commit 50ffa5173d)
(cherry picked from commit 3eceb6d2ae)
(cherry picked from commit 6376391b45)
- This change updates _set_bridge_name to set
the bridge name field in the vif binding details.
- This change adds the integration_bridge name
to the agent configuration report.
Closes-Bug: #1788009
Closes-Bug: #1856152
(cherry picked from commit 995744c576)
Change-Id: I454efcb226745c585935d5bd1b3d378f69a55ca2
Type of lvm.vlan is int and other_config.get('tag') is a string,
they can never be equal. We should do type conversion before
comparing to avoid unnecessary operation of ovsdb and flows.
Change-Id: Ib84da6296ddf3c95be9e9f370eb574bf92ceec15
Closes-Bug: #1843425
(cherry picked from commit 0550c0e1f6)
Neutron-ovs-agent configures physical bridges that they works
in fail_mode=secure. This means that only packets which match some
OpenFlow rule in the bridge can be processed.
This may cause problem on hosts with only one physical NIC
where same bridge is used to provide control plane connectivity
like connection to rabbitmq and data plane connectivity for VM.
After e.g. host reboot bridge will still be in fail_mode=secure
but there will be no any OpenFlow rule on it thus there will be
no communication to rabbitmq.
With current order of actions in __init__ method of OVSNeutronAgent
class it first tries to establish connection to rabbitmq and later
configure physical bridges with some initial OpenFlow rules.
And in case described above it will fail as there is no connectivity
to rabbitmq through physical bridge.
So this patch changes order of actions in __init__ method that it first
setup physical bridges and than configure rpc connection.
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I41c02b0164537c5b1c766feab8117cc88487bc77
Closes-Bug: #1840443
(cherry picked from commit d41bd58f31)
(cherry picked from commit 3a2842bdd8)
In case when physical bridge was recreated on host, ovs agent
is trying to reconfigure it.
If there will be e.g. timeout while getting bridge's datapath_id,
RuntimeError will be raised and that caused crash of whole agent.
This patch changes that to not crash agent in such case but try to
reconfigure everything in next rpc_loop iteration once again.
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/test_ovs_neutron_agent.py
Change-Id: Ic9b17a420068c0c76748e2c24d97be1ed7c460c7
Closes-Bug: #1837380
(cherry picked from commit b63809715a)
Ovs-agent will scan and process the ports during the
first rpc_loop, and a local port update notification
will be sent out. This will cause these ports to
be processed again in the ovs-agent next (second)
rpc_loop.
This patch passes the restart flag (iteration num 0)
to the local port_update call trace. After this patch,
the local port_update notification will be ignored in
the first RPC loop.
Related-Bug: #1813703
Change-Id: Ic5bf718cfd056f805741892a91a8d45f7a6e0db3
(cherry picked from commit eaf3ff5786)
In the OVS agent, when setting up the ancillary bridges, the parameter
external_id:bridge-id is retrieved. If this parameter is not defined
(e.g.: manually created bridges), ovsdbapp writes an error in the logs.
This information is irrelevant and can cause confusion during debugging time.
Change-Id: Ic85db65f651eb67fcb56b937ebe5850ec1e8f29f
Closes-Bug: #1815912
(cherry picked from commit 769e971293)
In order to avoid inaccurate agent_boot_time setting,
this patch suggests to consider agent as "started" only
after completion of initial sync with server.
Change-Id: Icba05288889219e8a606c3809efd88b2c234bef3
Closes-Bug: #1799178
(cherry picked from commit 8f20963c5b)
In one specific compute node, the security group rules
can be enormous quantity. This patch adds a step-by-step
processing method to deal with the large number of the
security group rules. And also changes or adds some LOG.
Related-Bug: #1813703
Related-Bug: #1813704
Related-Bug: #1813707
Conflicts:
neutron/common/constants.py
Change-Id: I57bf27ec75cf848271c5a28b22beee12b8bd5faa
(cherry picked from commit 6ac420df7e)
Ovs-agent can be very time-consuming in handling a large number
of ports. At this point, the ovs-agent status report may have
exceeded the set timeout value. Some flows updating operations
will not be triggerred. This results in flows loss during agent
restart, especially for hosts to hosts of vxlan tunnel flow.
This fix will let the ovs-agent explicitly, in the first rpc loop,
indicate that the status is restarted. Then l2pop will be required
to update fdb entries.
Conflicts:
neutron/plugins/ml2/rpc.py
Closes-Bug: #1813703
Closes-Bug: #1813714
Closes-Bug: #1813715
Closes-Bug: #1794991
Closes-Bug: #1799178
Change-Id: I8edc2deb509216add1fb21e1893f1c17dda80961
(cherry picked from commit a5244d6d44)
The dump-flows action will get a very large sets of flow information
if there are enormous ports or openflow security group rules. For now
we can meet some known exception during such action, for instance,
memory issue, timeout issue.
So after this patch, the cleanup action of the bridge stale flows
will be done one table by one table. But note, this only supports
for 'native' OpenFlow interface driver.
Related-Bug: #1813703
Related-Bug: #1813712
Related-Bug: #1813709
Related-Bug: #1813708
Change-Id: Ie06d1bebe83ffeaf7130dcbb8ca21e5e59a220fb
(cherry picked from commit f898ffd71f)
The native OVS/ofctl controllers talk to the bridges using a
datapath-id, instead of the bridge name. The datapath ID is
auto-generated based on the MAC address of the bridge's NIC.
In the case where bridges are on VLAN interfaces, they would
have the same MACs, therefore the same datapath-id, causing
flows for one physical bridge to be programmed on each other.
The datapath-id is a 64-bit field, with lower 48 bits being
the MAC. We set the upper 12 unused bits to identify each
unique physical bridge
This could also be fixed manually using ovs-vsctl set, but
it might be beneficial to automate this in the code.
ovs-vsctl set bridge <mybr> other-config:datapath-id=<datapathid>
You can change this yourself using above command.
You can view/verify current datapath-id via
ovs-vsctl get Bridge br-vlan datapath-id
"00006ea5a4b38a4a"
(please note that other-config is needed in the set, but not get)
Closes-Bug: #1697243
Co-Authored-By: Rodolfo Alonso Hernandez <ralonsoh@redhat.com>
Change-Id: I575ddf0a66e2cfe745af3874728809cf54e37745
(cherry picked from commit 379a9faf62)
This fixes race condition leading to lack of fdb entries
on agent after OVS restart, if agent managed to handle all ports
before sending state report with start_flag set to True.
Change-Id: I943f8d805630cdfbefff9cff1fb4bce89210618b
Closes-Bug: #1808136
(cherry picked from commit 3995abefb1)
When ovs-vswitchd process is restarted neutron-ovs-agent will
handle it and reconfigure all ports and openflows in bridges.
Unfortunatelly when tunnel networks are used together with
L2pop mechanism driver, this driver will not notice that agent
lost all openflow config and will not send all fdb entries which
should be added on host.
In such case L2pop mechanism driver should behave in same way like
when neutron-ovs-agent is restarted and send all fdb_entries to
agent.
This patch adds "simulate" of agent start flag when ovs_restart is
handled thus neutron-server will send all fdb_entries to agent and
tunnels openflow rules can be reconfigured properly.
Change-Id: I5f1471e20bbad90c4cdcbc6c06d3a4412db55b2a
Closes-bug: #1804842
(cherry picked from commit ae031d1886)
The Neutron OVS agent logs can get flooded with KeyErrors as the
'_get_port_info' method skips the added/removed dict items if no
ports have been added/removed, which are expected to be present,
even if those are just empty sets.
This change ensures that those port info dict fields are always set.
Closes-Bug: #1783556
Change-Id: I9e5325aa2d8525231353ba451e8ea895be51b1ca
(cherry picked from commit da5b13df2b)
As part of the implementation of multiple port bindings [1], add binding
activation support to the OVS agent. This will enable the execution in
OVS agents of the complete sequence of steps outlined in [1] during an
instance migration:
1) Create inactive port bindings for destination host
2) Migrate the instance to the destination host and plug its VIFs
3) Activate the port bindings in the destination host
4) Delete the port bindings for the source host
[1] https://review.openstack.org/#/c/309416/
Change-Id: Iabca39364ec95633b2a8891fc295b3ada5f4f5e0
Partial-Bug: #1580880
To support multiple port bindings, the L2 agents have to have the
capability to handle binding deactivation notifications from the
Neutron server. This patch adds the necessary code to the OVS agent.
After receiving the notification, the agent un-plugs the corresponding
VIF from the integration bridge.
Change-Id: I78178de2039ccabc649558de4f6549a38de90418
Partial-Bug: #1580880
In case when external bridge configured in OVS agent's bridge_mappings
will be destroyed and created again (for example by running ifup-ovs
script on Centos) bridge wasn't configured by OVS agent.
That might cause broken connectivity for OpenStack's dataplane if
dataplane network also uses same bridge.
This patch adds additional ovsdb-monitor to monitor if any
of physical bridges configured in bridge_mappings was created.
If so, agent will reconfigure it to restore proper openflow rules
on it.
Change-Id: I9c0dc587e70327e03be5a64522d0c679665f79bd
Closes-Bug: #1768990
Fixed all pep8 E265 errors and changed tox.ini to no longer
ignore them. Also removed an N536 comment missed from a
previous change.
Change-Id: Ie6db8406c3b884c95b2a54a7598ea83476b8dba1
A bulk of the public APIs that are part of neutron.plugins.common.utils
were rehomed into neutron-lib with [1] and the remaining with [2].
This patch consumes [1] by:
- Removing the rehomed code from neutron.
- Removing the UTs that are no longer applicable.
- Leaving the functions in [2][3] in neutron.plugins.common.utils until
we release [2][3] and can consume it at which point we should be able to
remove the utils module.
NeutronLibImpact
[1] Iabb155b5d2d0ec6104ebee5dd42cf292bdf3ec61
[2] I2c0e4ef03425ba0bb2651ae3e68d6c8cde7b8f90
[3] I73f5e8ad7a1a83392094db846d18964d811b8bb2
Change-Id: I1d63cbea463e92e1d2e053f8e1a564ed52cb84f8
Fix W503 (line break before binary operator) pep8 warnings
and no longer ignore new failures.
Trivialfix
Change-Id: I7539f3b7187f2ad40681781f74b6e05a01bac474
Instead of allowing an error to bubble up and exit from rpc_loop, catch
it and assume the switch is dead which will make the agent to wait until
the switch is back without failing the service.
Change-Id: Ic3095dd42b386f56b1f75ebb6a125606f295551b
Closes-Bug: #1731494
New releases of oslo.config support a 'mutable' parameter to Opts.
This is only respected when the new method mutate_config_files is
called instead of reload_config_files. Neutron delegates making this
call to oslo.service. This was provided in patchset
Icec3e664f3fe72614e373b2938e8dee53cf8bc5e
Further patches will be needed to make select config options be
marked as mutable. This change enables support for oslo provided
config options to be updated via SIGHUP such as log level.
Task: 6389
Story: 2001545
Change-Id: I9442965607f3248706464643c6d87a04edcae24e
The neutron.common.topics module was rehomed into neutron-lib with
commit Ie88b84949cbd55a4e7ad06341aab77b286cdc485
This patch consumes it by removing the rehomed module from neutron
and using the module from neutron-lib instead.
NeutronLibImpact
Change-Id: Ia4a4604c259ce862597de80c6deeb3d408bf0e95
The oslo.messaging bug that triggered adding the hack is fixed in
Queens, so we don't need it anymore.
Change-Id: I775b466dce19c4168985b19e1ddb938118f48dcb
Related-Bug: #1705351
Adding ability to set DSCP field in OVS tunnels outer header, or
inherit it from the inner header's DSCP value for OVS and linuxbridge.
Change-Id: Ia59753ded73cd23019605668e60cfbc8841e803d
Closes-Bug: #1692951
When Openvswitch agent will get "port_update" event
(e.g. to set port as unbound) and port is already removed
from br-int when agent tries to get vif port in
treat_devices_added_updated() method (port is removed
because e.g. nova-compute removes it) then resources set
for port by L2 agent extension drivers (like qos) are not
cleaned properly.
In such case port is added to skipped_ports and is set
as DOWN in neutron-db but ext_manager is not called then
for such port so it will not clear stuff like bandwidth
limit's QoS and queue records and also DSCP marking
open flow rules for this port.
This patch fixes this issue by adding call of
ext_manager.delete_port() method for all skipped ports.
Change-Id: I3cf5c57c7f232deaa190ab6b0129e398fdabe592
Closes-Bug: #1737892
neutron-lib contains a number of the plugin related constants from
neutron.plugins.common.constants. This patch consumes those constants
from neutron-lib and removes them from neutron. In addition the notion
of the dummy plugin service type is moved strictly into the test
package of neutron since it's not a real service plugin.
NeutronLibImpact
Change-Id: I767c626f3fe6159ab3abd6a7ae3cb9893b79bf66
The neutron-lib commit I360545b6ee4291547e0c5c8e668ad03d3efa4725 moved
the externally consumed globals from neutron.common.constants into lib.
With the exception of PROVISIONAL_IPV6_PD_PREFIX all other constants
in neutron.common.constants should only be used in neutron, and will
hopefully remain that way. External consumers needing access to other
common constants should move them into lib first.
NeutronLibImpact
Change-Id: Ie4bcffccf626a6e1de84af01f3487feb825f8b65
When the OVS agent skips processing a port because it was
not found on the integration bridge, it doesn't send back
any status to the server to notify it. This can cause the
port to get stuck in the BUILD state indefinitely, since
that is the default state it gets before the server tells
the agent to update it.
The OVS agent will now notify the server that any skipped
device should be considered DOWN if it did not exist.
Change-Id: I15dc55951cdb75c6d87d7c645f8e2cbf82b2f3e4
Closes-bug: #1719011
Otherwise we don't see some of them for the agent, for example,
AGENT.root_helper is missing.
To make sure the logging is as early as possible, and to make sure that
options that may be registered by extensions are also logged, some
refactoring was applied to the code to move the extension manager
loading as early as possible, even before agent's __init__ is called.
Related-Bug: #1718767
Change-Id: I823150cf6406f709d1e4ffa74897d598e80f5329