A subnet may be connected to a DVR router using an IP address
different from the subnet's gateway_ip.
So in br-tun, ARP to the DVR router's port should be dropped instead of
dropping ARP to the subnet's gateway_ip (or MAC in case of IPv6).
Conflicts:
neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/test_ovs_neutron_agent.py
Change-Id: Ida6b7ae53f3fc76f54e389c5f7131b5a66f533ce
Closes-bug: #1831575
(cherry picked from commit ae3aa28f5a)
In order to avoid an inaccurate agent_boot_time setting,
this patch considers the agent as "started" only
after completion of the initial sync with the server.
Change-Id: Icba05288889219e8a606c3809efd88b2c234bef3
Closes-Bug: #1799178
(cherry picked from commit 8f20963c5b)
In the OVS agent, when setting up the ancillary bridges, the parameter
external_id:bridge-id is retrieved. If this parameter is not defined
(e.g.: manually created bridges), ovsdbapp writes an error in the logs.
This information is irrelevant and can cause confusion during debugging.
Change-Id: Ic85db65f651eb67fcb56b937ebe5850ec1e8f29f
Closes-Bug: #1815912
(cherry picked from commit 769e971293)
The native OVS/ofctl controllers talk to the bridges using a
datapath-id, instead of the bridge name. The datapath ID is
auto-generated based on the MAC address of the bridge's NIC.
In the case where bridges are on VLAN interfaces, they would
have the same MACs, therefore the same datapath-id, causing
flows for one physical bridge to be programmed on each other.
The datapath-id is a 64-bit field, with the lower 48 bits being
the MAC. We set the upper 12 unused bits to identify each
unique physical bridge.
This could also be fixed manually using ovs-vsctl set, but
it might be beneficial to automate this in the code.
You can change the datapath-id yourself using:
  ovs-vsctl set bridge <mybr> other-config:datapath-id=<datapathid>
You can view/verify the current datapath-id via:
  ovs-vsctl get Bridge br-vlan datapath-id
  "00006ea5a4b38a4a"
(please note that other-config is needed in the set, but not in the get)
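The idea of using the bits above the 48-bit MAC to tell bridges apart can
be sketched as follows. This is a minimal illustration, not the code from
the patch: the helper name make_unique_dpids and the choice to place a
simple counter directly above bit 48 are assumptions made for the example.

```python
MAC_BITS = 48
MAC_MASK = (1 << MAC_BITS) - 1


def make_unique_dpids(dpids):
    """Return 16-hex-digit datapath IDs with duplicates disambiguated.

    Hypothetical helper: the first bridge seen with a given ID keeps it;
    later duplicates get a counter placed in the bits above the MAC.
    """
    seen = set()
    result = []
    for dpid_hex in dpids:
        value = int(dpid_hex, 16)
        mac = value & MAC_MASK
        # OVS expects exactly 16 hex digits, so always zero-pad.
        candidate = format(value, "016x")
        counter = 0
        while candidate in seen:
            counter += 1
            candidate = format((counter << MAC_BITS) | mac, "016x")
        seen.add(candidate)
        result.append(candidate)
    return result
```

Each returned value could then be applied with the ovs-vsctl set command
shown above.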
Closes-Bug: #1697243
Co-Authored-By: Rodolfo Alonso Hernandez <ralonsoh@redhat.com>
Change-Id: I575ddf0a66e2cfe745af3874728809cf54e37745
(cherry picked from commit 379a9faf62)
(cherry picked from commit c02b1148db)
(cherry picked from commit c7031e2cd3)
The ovs-agent can take a long time to handle a large number
of ports. In that case, the ovs-agent status report may
exceed the configured timeout value, and some flow update operations
will not be triggered. This results in flow loss during agent
restart, especially for host-to-host vxlan tunnel flows.
This fix lets the ovs-agent explicitly indicate, in the first rpc loop,
that its status is restarted. l2pop will then be required
to update fdb entries.
Conflicts:
neutron/plugins/ml2/rpc.py
Conflicts:
neutron/plugins/ml2/drivers/l2pop/mech_driver.py
Closes-Bug: #1813703
Closes-Bug: #1813714
Closes-Bug: #1813715
Closes-Bug: #1794991
Closes-Bug: #1799178
Change-Id: I8edc2deb509216add1fb21e1893f1c17dda80961
(cherry picked from commit a5244d6d44)
(cherry picked from commit cc49ab5501)
(cherry picked from commit 5ffca49668)
The dump-flows action returns a very large set of flow information
if there is a large number of ports or openflow security group rules.
Known failures during this action include, for instance,
memory issues and timeouts.
So after this patch, the cleanup of the bridge's stale flows
is done one table at a time. Note that this is only supported
for the 'native' OpenFlow interface driver.
Related-Bug: #1813703
Related-Bug: #1813712
Related-Bug: #1813709
Related-Bug: #1813708
Change-Id: Ie06d1bebe83ffeaf7130dcbb8ca21e5e59a220fb
(cherry picked from commit f898ffd71f)
Instead of allowing an error to bubble up and exit from rpc_loop, catch
it and assume the switch is dead, which makes the agent wait until
the switch is back without failing the service.
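The behaviour described above can be sketched roughly as follows. This is
only an illustration under assumptions: poll_switch, OVS_NORMAL and
OVS_DEAD are names chosen for the example and are not taken from the
patch, and the real loop does much more than record a status.

```python
OVS_NORMAL = "normal"
OVS_DEAD = "dead"


def poll_switch(check_status, iterations):
    """Poll the switch repeatedly without letting errors escape.

    Hypothetical stand-in for the agent loop: any exception raised by
    check_status() is treated as "the switch is dead", so the loop keeps
    running and can notice when the switch comes back.
    """
    history = []
    for _ in range(iterations):
        try:
            status = check_status()
        except Exception:
            # Do not exit the loop; assume the switch is dead for now.
            status = OVS_DEAD
        history.append(status)
    return history
```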
Change-Id: Ic3095dd42b386f56b1f75ebb6a125606f295551b
Closes-Bug: #1731494
(cherry picked from commit 544597c6ef)
If the switch misbehaves, we may receive None from db_get_val. In this
case, int() on the return value will raise TypeError which is not
expected by callers and may result in ovs agent crash.
Instead of bubbling up the TypeError exception, we raise RuntimeError if
datapath id is None.
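A minimal sketch of the check described above, assuming the datapath id
arrives as a hexadecimal string or None. The function name
parse_datapath_id and the error message are illustrative and not taken
from the patch.

```python
def parse_datapath_id(raw_value):
    """Convert a hex datapath id string to an integer.

    Raises RuntimeError instead of letting int(None, 16) raise TypeError,
    so callers only have to handle one kind of failure.
    """
    if raw_value is None:
        raise RuntimeError("Unknown datapath id.")
    return int(raw_value, 16)
```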
Change-Id: I53bea00b9a7302d694b8066e969c894bf64cb2d4
Closes-Bug: #1731494
(cherry picked from commit 38d0b2b52d)
This fixes a race condition leading to missing fdb entries
on the agent after an OVS restart, if the agent managed to handle all
ports before sending the state report with start_flag set to True.
Change-Id: I943f8d805630cdfbefff9cff1fb4bce89210618b
Closes-Bug: #1808136
(cherry picked from commit 3995abefb1)
When the ovs-vswitchd process is restarted, neutron-ovs-agent will
handle it and reconfigure all ports and openflow rules in the bridges.
Unfortunately, when tunnel networks are used together with the
L2pop mechanism driver, this driver will not notice that the agent
lost all openflow config and will not send all the fdb entries which
should be added on the host.
In such a case the L2pop mechanism driver should behave in the same way
as when neutron-ovs-agent is restarted and send all fdb_entries to the
agent.
This patch simulates the agent start flag when ovs_restart is
handled, so neutron-server will send all fdb_entries to the agent and
tunnel openflow rules can be reconfigured properly.
Change-Id: I5f1471e20bbad90c4cdcbc6c06d3a4412db55b2a
Closes-bug: #1804842
(cherry picked from commit ae031d1886)
Fix the mac address format for backward compatibility with
vsctl ovs api
Closes-Bug: #1756406
Change-Id: I3ba11fae433b437d9d3a0b12dd8a11fe1b35046a
(cherry picked from commit 6b13cf0bee)
On changing the port-admin-state to false, the port goes down.
Change-Id: Ica46e39d8858f4235a8a1b9caeb696346a86f38b
Closes-bug: #1672629
(cherry picked from commit ce01b70ef8)
The Neutron OVS agent logs can get flooded with KeyErrors as the
'_get_port_info' method skips the added/removed dict items if no
ports have been added/removed, which are expected to be present,
even if those are just empty sets.
This change ensures that those port info dict fields are always set.
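As an illustration of the fix, the sketch below always populates the
'added' and 'removed' keys, using empty sets when nothing changed. The
function name build_port_info and the 'current' key are assumptions for
the example; only the 'added' and 'removed' items are named in this
message.

```python
def build_port_info(current_ports, previous_ports):
    """Return a port info dict whose keys are always present."""
    current = set(current_ports)
    previous = set(previous_ports)
    return {
        "current": current,
        # Set differences are empty sets when nothing was added or
        # removed, so consumers can index these keys without KeyError.
        "added": current - previous,
        "removed": previous - current,
    }
```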
Closes-Bug: #1783556
Change-Id: I9e5325aa2d8525231353ba451e8ea895be51b1ca
(cherry picked from commit da5b13df2b)
On hosts with dvr_snat agent mode, after restarting OVS agent,
sometimes the SNAT port is processed first instead of the distributed port.
The subnet_info is cached locally via get_subnet_for_dvr when either of
these ports is processed. However, it returns the MAC address of the port
used to query as the gateway for the subnet. Using the SNAT port, this
puts the wrong MAC as the gateway, causing some flows such as the DVR
flows on br-int for local src VMs to have the wrong MAC.
This patch fixes this by calling get_subnet_for_dvr with fixed_ips set to
None for the csnat port, as that causes the server side handler to fill
in the subnet's actual gateway rather than using the port's MAC.
Change-Id: If045851819fd53c3b9a1506cc52bc1757e6d6851
Closes-Bug: #1783470
(cherry picked from commit c6de172e58)
When the Openvswitch agent gets a "port_update" event
(e.g. to set the port as unbound) and the port has already been removed
from br-int by the time the agent tries to get the vif port in the
treat_devices_added_updated() method (e.g. because
nova-compute removed it), resources set
for the port by L2 agent extension drivers (like qos) are not
cleaned up properly.
In such a case the port is added to skipped_ports and is set
as DOWN in neutron-db, but ext_manager is not called
for the port, so it does not clear things like the bandwidth
limit's QoS and queue records or the DSCP marking
openflow rules for this port.
This patch fixes the issue by adding a call to the
ext_manager.delete_port() method for all skipped ports.
Change-Id: I3cf5c57c7f232deaa190ab6b0129e398fdabe592
Closes-Bug: #1737892
(cherry picked from commit a8271e978a)
When an external bridge configured in the OVS agent's bridge_mappings
was destroyed and created again (for example by running the ifup-ovs
script on CentOS), the bridge wasn't reconfigured by the OVS agent.
That might break connectivity for OpenStack's dataplane if the
dataplane network also uses the same bridge.
This patch adds an additional ovsdb-monitor to detect when any
of the physical bridges configured in bridge_mappings is created.
If so, the agent will reconfigure it to restore the proper openflow
rules on it.
Change-Id: I9c0dc587e70327e03be5a64522d0c679665f79bd
Closes-Bug: #1768990
(cherry picked from commit 85b46cd51e)
Add error handling for get_network_info_for_id rpc call in the
ovs_dvr_neutron_agent.
Closes-Bug: #1758093
Change-Id: I44a5911554c712c89cdc8901cbc7b844c4b0a363
(cherry picked from commit c331b898e1)
Inter-tenant traffic between two different networks that belong
to two different tenants is not possible when they are connected
through a shared network and internally connected through DVR
routers.
This issue can be seen in a multinode environment where there
is network isolation.
The issue is that we have two different IPs for the ports that
connect the two routers, and DVR does not expose the router
interfaces outside a compute node, so the traffic is blocked by the
ovs tunnel bridge rules.
This patch fixes the issue by not applying the DVR specific
rules in the tunnel bridge to the shared network ports that
connect the routers.
Closes-Bug: #1751396
Change-Id: I0717f29209f1354605d2f4128949ddbaefd99629
(cherry picked from commit d019790fe4)
When we delete a VM port with an attached QoS policy,
nothing is done if the vif_port does not exist.
This is fine for the egress bandwidth limit as it is configured
directly on the vif_port in OVS.
The ingress bandwidth limit, however, uses additional records in the
Openvswitch database: qos and queue. Those records are not
cleaned up in such a case.
This patch also records the port in self.ports in the case of
bandwidth limit rules, just as in the case of dscp rules.
The port clear is never executed if the vif_port does not exist.
Finally, the ovs driver can clean up such qos and queue records.
Change-Id: Iddeb49e1e6538a178ca468df0fdf9e0617ca4f1c
Closes-Bug: #1726732
(cherry picked from commit ee423e1fa0)
When the OVS agent skips processing a port because it was
not found on the integration bridge, it doesn't send back
any status to the server to notify it. This can cause the
port to get stuck in the BUILD state indefinitely, since
that is the default state it gets before the server tells
the agent to update it.
The OVS agent will now notify the server that any skipped
device should be considered DOWN if it did not exist.
Change-Id: I15dc55951cdb75c6d87d7c645f8e2cbf82b2f3e4
Closes-bug: #1719011
(cherry picked from commit a789d23b02)
As ingress traffic to instance ports when using DVR uses the same
matching openflow rule as the openvswitch firewall driver, setting
admin_state_up of a router deletes firewall rules.
This patch makes the deletion stricter because DVR and ovs-firewall
flows differ in priority. Thus using the priority when removing DVR flows
won't affect ovs-firewall flows.
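The effect of including the priority in the deletion can be sketched with
a toy flow table. Everything here is hypothetical: flows are modelled as
plain dicts, and the table number, priorities and delete_flows helper are
made up for the example rather than taken from the agent.

```python
def delete_flows(flows, match, priority=None):
    """Return the flows that survive a deletion request.

    Without a priority, every flow with the given match fields is
    removed; with one, only flows at exactly that priority are removed.
    """
    def is_deleted(flow):
        if any(flow.get(key) != value for key, value in match.items()):
            return False
        return priority is None or flow["priority"] == priority

    return [flow for flow in flows if not is_deleted(flow)]


# Two flows sharing the same match but owned by different components.
dvr_flow = {"table": 60, "priority": 4, "dl_dst": "fa:16:3e:00:00:01"}
firewall_flow = {"table": 60, "priority": 90, "dl_dst": "fa:16:3e:00:00:01"}
match = {"table": 60, "dl_dst": "fa:16:3e:00:00:01"}
```

With these values, delete_flows([dvr_flow, firewall_flow], match) removes
both flows, while passing priority=4 leaves the firewall flow in place.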
Closes-bug: #1721084
Change-Id: I4eb61b2824579a4f8ba219cd1b1dcf57d38ebc89
(cherry picked from commit 0456515a7a)
Previously, the DP ID was converted to an integer and then back to a
string. As a consequence of the conversion, DP IDs like 000123 were
converted to 123, losing leading zeros. When the self._get_dp_by_dpid()
method raised a RuntimeError exception, the current DP ID of the bridge
was compared to the cached DP ID and, if the IDs were different, the
original exception coming from the ryu library was swallowed. As the
conversion of the cached DP ID removes leading zeros, the original
exception was always swallowed if the bridge's DP ID started with zero.
This patch uses the integer for the comparison between the current and
cached bridge DP ID, hence any exception coming from ryu is not swallowed.
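The loss of leading zeros and the integer comparison can be reproduced in
a few lines. This is a sketch assuming DP IDs are hexadecimal strings; the
helper names round_trip and same_dpid are illustrative and not part of
the patch.

```python
def round_trip(dpid_hex):
    """Convert a DP ID to an integer and back, as the old code did."""
    return format(int(dpid_hex, 16), "x")


def same_dpid(current_hex, cached_hex):
    """Compare two DP IDs as integers so zero padding does not matter."""
    return int(current_hex, 16) == int(cached_hex, 16)


cached = round_trip("000123")
# The string form no longer matches the bridge's DP ID...
strings_differ = cached != "000123"
# ...but the integer comparison still treats them as the same ID.
ints_match = same_dpid("000123", cached)
```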
Closes-bug: #1718235
Conflicts:
neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/openflow/native/test_ovs_bridge.py
Change-Id: I445aa61acc758b56c51a9403df4d92d9c1d40ace
(cherry picked from commit d739d01b6c)
Otherwise we don't see some of them for the agent, for example,
AGENT.root_helper is missing.
To make sure the logging is as early as possible, and to make sure that
options that may be registered by extensions are also logged, some
refactoring was applied to the code to move the extension manager
loading as early as possible, even before agent's __init__ is called.
Related-Bug: #1718767
Change-Id: I823150cf6406f709d1e4ffa74897d598e80f5329
(cherry picked from commit 45be804b40)
In I77650be5f04775a72e2bdf694f93988825a84b72 we added
vnic_type direct to the ovs mechanism driver's supported
vnic_types. This causes problems when working with the ovs and
sriovnicswitch mechanism drivers in that order. In this case ovs will
bind the direct port instead of sriovnicswitch.
This change makes the ovs mech driver bind the direct port only
if the user requested --binding-profile '{"capabilities": ["switchdev"]}'
on the direct port. If the user doesn't request this capability, the
SR-IOV legacy NIC mode is used.
When enable-sriov-nic-features [1] is implemented in nova and
libvirt exposes the switchdev capability [2], nova will be
able to select a host which supports an SR-IOV NIC with switchdev
mode.
[1] - https://review.openstack.org/#/c/435954/11/specs/pike/approved/enable-sriov-nic-features.rst
[2] - https://www.redhat.com/archives/libvir-list/2017-August/msg00583.html
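The binding rule described above can be sketched as a small predicate.
This is an assumption-laden illustration, not the driver code: the
function name ovs_should_bind is hypothetical, and the sketch assumes that
'normal' and 'direct' are the only vnic_types the ovs driver considers.

```python
VNIC_NORMAL = "normal"
VNIC_DIRECT = "direct"


def ovs_should_bind(vnic_type, binding_profile):
    """Decide whether the ovs mechanism driver should bind a port.

    Direct ports are bound only when the binding profile requests the
    'switchdev' capability; otherwise they are left for sriovnicswitch.
    """
    if vnic_type == VNIC_NORMAL:
        return True
    if vnic_type != VNIC_DIRECT:
        return False
    capabilities = (binding_profile or {}).get("capabilities", [])
    return "switchdev" in capabilities
```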
Closes-Bug: #1713590
Change-Id: I0b5f062bcbf02381bdf4f694fc039f9bb17a2db5
(cherry picked from commit b184558ab6)
Added datapath_type to vif_details returned by OVS
mech driver.
Depends-On: Ie523c821995c046c7f77783a34e75053fc0abb3d
Partial-Bug: #1632372
Change-Id: Ief83150caf1a32a2c043b0245b36e5ebc3a16379
This was deprecated over a year ago in [1] so let's
get rid of it to clean up some code.
[1] Ib63ba8ae7050465a0786ea3d50c65f413f4ebe38
Change-Id: I6039fb7e743c5d9a1a313e3c174ada36c9874c70
In kernel 4.8 we introduced the Traffic Control (TC, see [1]) hardware
offloads framework for SR-IOV VFs, which allows us to configure the
NIC [2]. Subsequent OVS patches [3] allow us to use the TC framework
to offload OVS datapath rules.
This patch allows the OVS mech driver to bind a direct (SR-IOV) port.
This makes it possible to offload the OVS flows to the SR-IOV NIC using
tc and gain accelerated OVS.
[1] https://linux.die.net/man/8/tc
[2] http://netdevconf.org/1.2/papers/efraim-gerlitz-sriov-ovs-final.pdf
[3] https://mail.openvswitch.org/pipermail/ovs-dev/2017-April/330606.html
DocImpact: Add SR-IOV offload support for OVS mech driver
Partial-Bug: #1627987
Depends-On: I6bc2539a1ddbf7990164abeb8bb951ddcb45c993
Change-Id: I77650be5f04775a72e2bdf694f93988825a84b72
DVR flows are not compatible with OVS firewall flows as firewall flows
have higher priority. As a consequence, rules for DVR were never matched,
as the firewall uses output directly.
This patch replaces flows using normal or output actions and resends
packets to the TRANSIENT table instead. This transient table then uses
either those normal or output action rules. With this split, we will be
able to match egress/ingress flows in the TRANSIENT table instead of
LOCAL_SWITCHING, putting the DVR pipeline in front of the OVS firewall
pipeline.
Change-Id: I9f738047f131b42d11a90f539435006d16ea7883
Closes-bug: #1696983
Now with the merge of push notifications, processing a port update
no longer automatically implies a transition from ACTIVE to BUILD
to ACTIVE again.
This resulted in a bug where Nova would unplug and replug an interface
quickly during rebuild and it would never get a vif-plugged event.
Nothing in the data model was actually being updated that resulted in
the status being set to DOWN or BUILD and the port would return before
the agent would process it as a removed port to mark it as DOWN.
This fixes the bug by making the agent force the port to DOWN whenever
it loses its VLAN. Watching for the VLAN loss was already introduced
to detect these fast unplug/plug events before so this just adds the
status update.
Closes-Bug: #1694371
Change-Id: Ice24eea2534fd6f3b103ec014218a65a45492b1f
There were a few places left in the code (with TODOs) that were still
mocking out the callback manager. This patch switches them over to
the neutron-lib callback fixture.
Change-Id: I9e710db6a5103436b0f098e8f73625e3941df492
Add support for QoS ingress bandwidth limiting in the
openvswitch agent.
It uses the default OVS QoS policies as the bandwidth limiting
mechanism.
DocImpact: Ingress bandwidth limit in QoS supported by
Openvswitch agent
Change-Id: I9d94e27db5d574b61061689dc99f12f095625ca0
Partial-Bug: #1560961
The ml2 MechanismDriver is now in neutron-lib along with its associated
constants. This patch switches over to the lib versions of those, but
leaves a shim of the MechanismDriver that just references the driver from
lib. This shim allows our broad consumer base of the driver to switch
over at their leisure.
NeutronLibImpact
Change-Id: I99e3de6d933a1bb341394f85415fb07306a82a01
Replace the calls to the OVSPluginAPI info retrieval functions
with reads directly from the push notification cache.
Since we now depend on the cache for the source of truth, the
'port_update'/'port_delete'/'network_update' handlers are configured
to be called whenever the cache receives a corresponding resource update.
The OVS agent will no longer subscribe to topic notifications for ports
or networks from the legacy notification API.
Partially-Implements: blueprint push-notifications
Change-Id: Ib2234ec1f5d328649c6bb1c3fe07799d3e351f48