If a user specifies a header in their metadata request, it could
override what the proxy would have inserted on their behalf. Make
sure to remove any headers we don't want, and override any that
might already be present in the request.
If the agent somehow gets a request with both headers, it will
silently drop it.
Change-Id: Id6c103b7bcebe441c27c6049d349d84ba7fd15a6
Closes-bug: #1865036
(cherry picked from commit 5af046fd4e)
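As an illustration, a minimal sketch of the header sanitization
described above, assuming a webob-style request object; the header
names and the lookup helper are hypothetical, not taken from this
patch:

    # Sketch only: header names and lookup_instance_id() are hypothetical.
    INSTANCE_ID_HEADER = 'X-Instance-ID'      # header the proxy inserts
    FORBIDDEN_HEADERS = ('X-Forwarded-For',)  # user-supplied headers to strip

    def sanitize_headers(req):
        # Drop anything the client must not set; the proxy then re-adds
        # its own authoritative value, overriding any user-supplied one.
        for name in FORBIDDEN_HEADERS + (INSTANCE_ID_HEADER,):
            req.headers.pop(name, None)
        req.headers[INSTANCE_ID_HEADER] = lookup_instance_id(req)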
If a DVR+HA router has an external gateway, the SNAT namespace will
be initialized twice during agent restart. That namespace
initialization function runs many external resource processing
actions, which noticeably increases the L3 agent's startup time.
This patch addresses the issue.
Change-Id: I7719491275fa1ebfa7e881366e5cb066e3d4185c
Closes-Bug: #1850779
(cherry picked from commit 7a9d6d2641)
Patch https://review.opendev.org/#/c/697655/ cannot be backported
because it includes an RPC version change. This patch is for the
stable branches.
Currently the ovs agent calls update_device_list with the
agent_restarted flag set only on the first loop iteration. Then the
server knows to send the l2pop flooding entries for the network to
the agent. But when a compute node with many instances on many
networks reboots, it takes time to re-add all the active devices,
and some may be re-added after the first loop iteration. Then the
server can fail to send the flooding entries, which means there will
be no flood_to_tuns flow, and broadcasts like DHCP will fail.
This patch fixes that by also setting the agent_restarted flag if
the agent has not received the flooding entries for a network.
Change-Id: Iccc4fe4a785ee042fd76a663d0e76a27facd1809
Closes-Bug: #1853613
(cherry picked from commit bc0ab0fcd7)
(cherry picked from commit aee87e72b1)
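A minimal sketch of the extended condition; the attribute names are
hypothetical and only illustrate requesting the flooding entries per
network instead of only on the first loop iteration:

    # Sketch only: networks_with_flood_entries is a hypothetical set.
    def _get_agent_restarted(self, network_id):
        if self.iter_num == 0:
            return True  # original behaviour: first iteration after restart
        # Also set the flag while this network has not yet received
        # its l2pop flooding entries.
        return network_id not in self.networks_with_flood_entries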
Port deletion triggers disassociate_floatingips. This patch ensures
that the method not only clears the port association for a floating
IP, but also removes any DNS record associated with it.
Change-Id: Ia6202610c09811f240af35e2523126447bf02ca5
Closes-Bug: #1812168
(cherry picked from commit 4379310846)
Without this mock, the unit tests related to the configure_ipv6
method were failing, e.g. on systems where IPv6 was disabled or on
systems where such a check should be done differently, like macOS.
Change-Id: I6d13ab1db1d5465b2ff6abf6499e0d17e1ee8bbb
(cherry picked from commit c20b5e347d)
During the HA router state change event, only the L2 binding host of
the gateway port changes, so the L3 agent already has the entire
gateway port information. It is not necessary to send a
router_update message to the L3 agent again.
Depends-On: https://review.opendev.org/708825/
Closes-Bug: #1795127
Change-Id: Ia332421aff995f42e7a6e6e96b74be1338d54fe1
(cherry picked from commit 452b282412)
If both are run under the same process, and api_workers >= 2, the
server process will instantiate two oslo_service.ProcessLauncher
instances. This should be avoided [0], and indeed causes issues with
subprocess and signal handling: killed RPC workers not respawning,
SIGHUP on the master process leading to an unresponsive server,
signals not properly sent to all child processes, etc.
To avoid this, use the WSGI ProcessLauncher instance if it exists.
[0] https://docs.openstack.org/oslo.service/latest/user/usage.html#launchers
Change-Id: Ic821f8ca84add9c8137ef712031afb43e491591c
Closes-Bug: #1780139
(cherry picked from commit 13aa00026f)
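A minimal sketch of the reuse described above; get_wsgi_launcher() is
a hypothetical accessor standing in for however the WSGI launcher is
exposed:

    from oslo_service import service

    # Sketch only: get_wsgi_launcher() is hypothetical.
    def _get_rpc_launcher(conf):
        launcher = get_wsgi_launcher()  # reuse the WSGI launcher if any
        if launcher is None:
            launcher = service.ProcessLauncher(conf, restart_method='mutate')
        return launcher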
A security group can be in a state where its ports dict is empty but
its members dict is not, so we need to skip the flow update only when
the members dict is empty.
Change-Id: I429edb3d2dea5fa97441909b4d2c776f97f0516f
Closes-Bug: #1862703
Related-Bug: #1854131
(cherry picked from commit 6dbba8d5ce)
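A minimal sketch of the corrected skip condition, with a hypothetical
security group object:

    # Sketch only: sg is a hypothetical object with 'ports' and
    # 'members' mappings.
    if not sg.members:
        return  # nothing to propagate; skipping is only safe here
    # sg.ports may be empty while members is not, so do not skip on
    # empty ports: the remote-group flows still need the update.
    update_flows_for_sg(sg)  # hypothetical helper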
A low port-delete priority may lead to duplicate entries in the
network cache if IPs are reused frequently.
Also, there is no strict reason why it should have lower priority.
Change-Id: I55f858d50e636eb9091570b256380330b9ce9cb3
Related-bug: #1862315
Related-bug: #1828423
(cherry picked from commit a0bb5763b2)
A common neutron resource (e.g. Port) consists of:
1. Resource attributes, e.g. Port.mac_address, etc.
2. Standard attributes, e.g. created_at, which are shared among all
neutron resources.
The `sort` opt only supports a limited set of attributes. We need to
filter for attributes defined with `is_sort_key=True`, and it is
preferable to explicitly warn CLI & API users about illegal sort keys
rather than accept them unchecked, pass them forward, and then hit a
confusing internal error.
Depends-on: https://review.opendev.org/#/c/660097/
Change-Id: I8d206f909b09f1279dfcdc25c39989a67bff93d5
Closes-Bug: #1659175
(cherry picked from commit 335ac4e2d9)
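A minimal sketch of the validation idea, assuming an attribute map
shaped like neutron-lib resource definitions (names are illustrative,
not the exact code):

    from neutron_lib import exceptions

    def validate_sort_keys(sorts, attr_info):
        # Only attributes declared with is_sort_key=True are legal.
        legal = {name for name, spec in attr_info.items()
                 if spec.get('is_sort_key')}
        illegal = [key for key, _direction in sorts if key not in legal]
        if illegal:
            # Explicit 400 for the caller instead of a confusing 500 later.
            raise exceptions.BadRequest(
                resource='sorts', msg='Invalid sort keys: %s' % illegal)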
While trying to introduce a central sort-key validation in patch [1]
for the `get_sorts` method in the `neutron.api.api_common` module, I
was blocked by some resource schemas and test cases. After reading
the neutron API docs and some inspection, I believe there are uses of
improper sort keys in test cases, and some resource schemas need to
be kept aligned with the official documents.
* Schemas of the SecurityGroups/SG Rules/Segments resources don't
provide the `is_sort_key` flag for the sort key properties claimed
in the official docs, as neutron-lib does. See [2] for more details.
* Test cases of the NetworkSegmentRange resource use unsupported sort
keys, e.g. physical_network. Replace it with the `name` property.
See [2] for more details.
[1] https://review.opendev.org/#/c/653903/
[2] https://developer.openstack.org/api-ref/network/v2/index.html
Conflicts:
neutron/tests/unit/extensions/test_network_segment_range.py
Change-Id: I45a51736e4075e3dbc16827486869d70b659622d
(cherry picked from commit ed3f1087fa)
(cherry picked from commit 874ffe0d6b)
In neutron.agent.linux.openvswitch_firewall.firewall, make the
update_port_filter method catch OVSFWTagNotFound and log it, to
avoid a traceback in the log files.
Conflicts:
neutron/agent/linux/openvswitch_firewall/firewall.py
Change-Id: I584d867f0e1c47447cb8790fd715fa01ec902438
Closes-Bug: #1811405
(cherry picked from commit 22f55822aa)
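A minimal sketch of the guard; the inner helper name, import paths
and the port dict shape are assumptions:

    # Sketch only: _update_port_filter() stands in for the real body.
    def update_port_filter(self, port):
        try:
            self._update_port_filter(port)
        except exceptions.OVSFWTagNotFound as err:
            LOG.info("Update of port %(port)s skipped: %(err)s",
                     {'port': port['device'], 'err': err})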
The OVS agent processes the port events in a polling loop. It could
happen (and more frequently in a loaded OVS agent) that the "removed"
and "added" events can happen in the same polling iteration. Because
of this, the same port is detected as "removed" and "added".
When the virtual machine is restarted, the port event sequence is
"removed" and then "added". When both events are captured in the same
iteration, the port is already present in the bridge and the port is
discarded from the "removed" list.
Because the port was removed first and then added, the QoS policies
no longer apply (QoS and Queue registers, OF rules). If the QoS
policy does not change, the QoS agent driver will detect it and won't
call the QoS driver methods (based on the OVS agent QoS cache,
storing port and QoS rules). This will leave the port unconfigured.
This patch solves the issue by detecting this double event and
registering it as "removed_and_added". When the "added" port is
handled, the QoS deletion method is called first (if needed) to
remove the unneeded artifacts (OVS registers, OF rules) and remove
the QoS cache entry (port/QoS policy). Then the QoS policy is applied
again on the port.
NOTE: this is going to be quite difficult to test in a fullstack
test.
Conflicts:
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
Change-Id: I51eef168fa8c18a3e4cee57c9ff86046ea9203fd
Closes-Bug: #1845161
(cherry picked from commit 50ffa5173d)
(cherry picked from commit 3eceb6d2ae)
(cherry picked from commit 6376391b45)
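A minimal sketch of the double-event detection, assuming the events
are keyed by hashable port names:

    # Sketch only: dict keys follow the wording of the message.
    removed = set(events['removed'])
    added = set(events['added'])
    re_added = removed & added  # same port seen in both lists
    events['removed_and_added'] = list(re_added)
    events['removed'] = list(removed - re_added)
    events['added'] = list(added - re_added)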
When trunk port deletion is attempted but fails because the driver does
not permit it, the logs do not contain the driver error message that
specifies the precise rationale for preventing the trunk port deletion.
Log it explicitly.
Change-Id: I7ed1a742849dfce9e65b8eb36566112501fb0e39
The OVS agent is a single-threaded module executed in an os-ken
AppManager context. os-ken uses, by default (and no other
implementation is available today [1]), "eventlet" threads. Those
threads are scheduled cooperatively by the code itself; the context
switch is done through yielding. The easiest way to do this is by
executing:
eventlet.sleep()
If the assigned thread is not ready to take the GIL and does not
yield back the executor, other threads will starve and eventually
time out.
This patch removes the "sleep" call during the DP retrieval. This
keeps the executor on the current thread and prevents the execution
timeouts seen in the related bug.
[1]1f751b2d7d/os_ken/lib/hub.py
Closes-Bug: #1861269
Change-Id: I19e1af1bda788ed970d30ab251e895f7daa11e39
(cherry picked from commit 740741864a)
In large-scale deployments, the DVR router will be installed on the
scheduled DHCP hosts. This noticeably increases the pressure on the
L3 agent service, especially under a large number of concurrent
updates or creations, or during agent restarts.
This patch adds a config option ``host_dvr_for_dhcp`` for the DHCP
port device_owner filter used during the DVR host query. With
``host_dvr_for_dhcp = False``, the L3 agent will not host the DVR
router namespace on the DHCP agent hosts of its connected networks.
Closes-Bug: #1609217
Change-Id: I53e20be9b306bf9d3b34ec6a31e3afabd5a0fd6f
(cherry picked from commit 8f057fb49a)
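For illustration, a sketch of how such an option is typically
declared with oslo.config; the default and help text are assumptions
chosen to preserve the old behaviour:

    from oslo_config import cfg

    dvr_opts = [
        cfg.BoolOpt('host_dvr_for_dhcp',
                    default=True,  # assumed: keep the previous behaviour
                    help='Host the DVR router namespace on the DHCP agent '
                         'hosts of its connected networks.'),
    ]
    cfg.CONF.register_opts(dvr_opts)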
This is to fix race conditions on neutron server init.
Please see bug for details.
Conflicts:
neutron/db/models/flavor.py
Change-Id: I943e6397319b9a4a7fc1a5b3acb721920ddffb02
Partial-Bug: #1824299
(cherry picked from commit 38daf9eaae)
(cherry picked from commit d7945c60b0)
When "FirewallDriver.process_trusted_ports" is called with many
ports, "_initialize_egress_no_port_security" retrieves the VIF ports
("Interface" registers in the OVS DB) one per iteration, based on the
port_id. If instead the DB is called only once to retrieve all the
VIF ports, the performance increase is noticeable.
E.g.: bridge with 1000 ports and interfaces.
Retrieving 100 ports:
- Bulk operation: 0.08 secs
- Loop operation: 5.6 secs
Retrieving 1000 ports:
- Bulk operation: 0.08 secs
- Loop operation: 59 secs
Closes-Bug: #1836095
Related-Bug: #1836023
Change-Id: I5b259717c0fdb8991f1df86b1ef4fb8ad0f18e70
(cherry picked from commit ae1d36fa9d)
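A minimal sketch of the loop-vs-bulk difference, modeled on
ovs_lib-style helpers (the method names here are assumptions, not
quoted from the patch):

    # Before (sketch): one OVS DB lookup per port.
    vifs = [self.int_br.get_vif_port_by_id(pid) for pid in port_ids]

    # After (sketch): a single bulk OVS DB lookup, filtered in Python.
    vifs_by_id = self.int_br.get_vifs_by_ids(port_ids)
    vifs = [vif for vif in vifs_by_id.values() if vif]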
In the case when a user's security group contains rules created e.g.
by an admin, and such rules have the admin's tenant as tenant_id, the
owner of the security group should be able to see those rules.
Some time ago this was addressed for the request:
GET /v2.0/security-groups/<sec_group_id>
But it is also required to behave the same way for
GET /v2.0/security-group-rules
So this patch fixes this behaviour for the listing of security group
rules.
To achieve that, this patch also adds a new policy rule,
ADMIN_OWNER_OR_SG_OWNER, which is similar to the already existing
ADMIN_OWNER_OR_NETWORK_OWNER used e.g. for listing or creating
ports.
Conflicts:
neutron/conf/policies/security_group.py
Change-Id: I09114712582d2d38d14cf1683b87a8ce3a8e8c3c
Closes-Bug: #1824248
(cherry picked from commit b898d2e3c0)
Usually Neutron stops the neutron-keepalived-state-change-monitor
process gracefully with SIGTERM.
But if this does not stop the process after some time, Neutron will
try to kill the process with SIGKILL (-9).
That was causing a problem with rootwrap, as the kill filters for
this process only allowed sending "-15" to it.
Now it is possible to kill this process with "-9" too.
Conflicts:
etc/neutron/rootwrap.d/l3.filters
Change-Id: Id019fa7649bd1158f9d56e63f8dad108d0ca8c1f
Closes-bug: #1860326
(cherry picked from commit d6fccd247f)
(cherry picked from commit f4d05266d2)
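For illustration, a KillFilter line of the shape described above, as
it might appear in etc/neutron/rootwrap.d/l3.filters; the filter name
and binary are assumptions, not quoted from the patch:

    kill_keepalived_monitor_py: KillFilter, root, python, -15, -9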
The fetch_and_sync_all_routers method uses Python's range function,
which accepts only integers. This patch fixes the division behaviour
under py3, where the result is a float, by casting the float to an
int, as it is represented in py2.
Change-Id: Ifffdee0d4a3226d4871cfabd0bdbf13d7058a83e
Closes-Bug: #1824334
(cherry picked from commit 49a66dba31)
(cherry picked from commit 7039113990)
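A minimal sketch of the py2/py3 difference and the cast (variable
names hypothetical):

    # py2: 10 / 4 == 2 (int); py3: 10 / 4 == 2.5 (float), and
    # range(2.5) raises TypeError, so cast the result back to int.
    chunks = int(len(routers) / chunk_size)
    for i in range(chunks):
        sync_chunk(routers[i * chunk_size:(i + 1) * chunk_size])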
In [1] a retry of the trunk update was added to avoid StaleDataError
exceptions failing to set the trunk port or subports to ACTIVE state.
But it was only a partial fix for the issue described in the related
bug, and from [2] we know that it can still happen on highly loaded
systems from time to time.
So I checked this issue and the reported bug again, and found out
that the retry was added only in the
_process_trunk_subport_bindings() method. But StaleDataError can also
be raised in other cases where the same trunk is updated, e.g. in the
update_trunk_status() method.
So this commit adds the same retry mechanism to all trunk.update()
actions in the services.trunk.rpc.server module.
[1] https://review.opendev.org/#/c/662236/
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1733197
Conflicts:
neutron/services/trunk/rpc/server.py
Change-Id: I10e3619d5f3600ea97ed695321bb691dece3181f
Partial-Bug: #1828375
(cherry picked from commit ade35a233e)
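A minimal sketch of the retry shape, assuming the neutron_lib
session-retry decorator; the exact decorator and method body used by
the patch are not quoted in the message:

    from neutron_lib.db import api as db_api

    # Sketch only: body abbreviated.
    @db_api.retry_if_session_inactive()
    def update_trunk_status(self, context, trunk_id, status):
        trunk = trunk_objects.Trunk.get_object(context, id=trunk_id)
        if trunk:
            trunk.update(status=status)  # re-run on StaleDataError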
It may happen that one router port is removed and another one (same
IP but new subnet) is added to the router within a short time.
That can lead to a problem where the IP allocated to the new port is
not added to keepalived's VIPs list, because the same IP address is
already in the list (the existing IP address belongs to the old
port).
But a few seconds later the old port is removed, and the router
finally ends up with the new port configured without an IP address.
To avoid this, the patch switches the order of processing new and
deleted ports in the _process_internal_ports() method of the
RouterInfo class.
Now old ports are removed first and new ports are configured
afterwards, so there is no case where an IP address is already in the
VIPs list when it is about to be removed a few seconds later.
Change-Id: I72dc4a06a806731ec5124fa11c9f69c7dd6cbbb0
Closes-Bug: #1857021
(cherry picked from commit 3faba7cae0)
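A minimal sketch of the reordering, with hypothetical helpers around
_process_internal_ports():

    # Old order processed new ports first, so a VIP could collide
    # with a not-yet-removed old port and be skipped.
    for port in old_ports:
        internal_network_removed(port)  # free the IP/VIP first
    for port in new_ports:
        internal_network_added(port)    # safe to re-add the same IP now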
- This change updates _set_bridge_name to set
the bridge name field in the vif binding details.
- This change adds the integration_bridge name
to the agent configuration report.
Closes-Bug: #1788009
Closes-Bug: #1856152
(cherry picked from commit 995744c576)
Change-Id: I454efcb226745c585935d5bd1b3d378f69a55ca2
The skb mark is not supported when using OVS hw-offload, and using it
breaks the VXLAN offload.
This patch clears the skb mark only if OVS hw-offload is disabled.
This should be fine, as OVS with hw-offload runs on the compute node
(DVR is not supported), so clearing the skb mark for the qrouter is
not needed.
Closes-Bug: #1855888
Conflicts:
neutron/agent/common/ovs_lib.py
Change-Id: I71f45fcd9b7e7bdacaafc7fa96c775e88333ab48
(cherry picked from commit a75ec08ddb)
(cherry picked from commit bee5059ccb)
This patch fixes three problems to unblock the gate jobs in Rocky.
1. In Stein the docs target started to fail when a new release of
neutron-lib appeared. This is because tox installs neutron and its
requirements without any constraints. To fix this, both the upper
constraints and the neutron requirements need to be added to the
dependencies of the docs target.
2. Cap hacking in test-requirements.txt
hacking, as a linter, is on the global requirements blacklist and so
is not in constraints. A recent release introduced new rules that
required a fix on master; on stable branches we should rather cap to
the version that was in use during this release's development.
3. Update sphinx requirements
Requirement check is failing on sphinx:
Requirement(package='sphinx', location='', specifiers='!=1.6.6,>=1.6.2', markers='', comment='# BSD', extras=frozenset()) 'markers': '' does not match "python_version>='3.4'"
Could not find a global requirements entry to match package {}. If the package is already included in the global list, the name or platform markers there may not match the local settings.
Conflicts:
tox.ini
Closes-Bug: #1856156
Change-Id: Iea61238f37fdf24c0264f96d104ee0b3b6aec8e2
(cherry picked from commit 07be793435)
(cherry picked from commit d1c4ba5810)
(cherry picked from commit 34239888eb)
The hook starts a DB transaction and should be covered with a DB
retry decorator.
For the Rocky backport, neutron_lib.db had to be imported for
retry_db_errors.
Closes-Bug: #1777965
Closes-Bug: #1771293
Change-Id: I044980a98845edc7b0a02e3323a1e62eb54c10c7
(cherry picked from commit ab286bcdac)
(cherry picked from commit 3ec7aed8a3)
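A minimal sketch of the decorator usage named in the message; the
hook name and body are hypothetical:

    from neutron_lib.db import api as db_api

    @db_api.retry_db_errors
    def _resource_init_hook(context):
        # opens a DB transaction; retried on deadlocks and stale reads
        ...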