Currently, DHCP provisioning of ports is the crucial bottleneck when
booting multiple VMs concurrently.
The root cause is that ports belonging to the same network are processed
one by one by the DHCP agent, and a port whose provisioning is already
complete still blocks the processing of other ports in other DHCP agents.
This patch aims to optimize the strategy for casting port messages to
agents in order to improve the provisioning process.
On the server side, I classify messages into multiple levels. In particular,
I classify the port_update_end and port_create_end messages into two levels:
the high-level message is cast to only one agent, while the low-level message
is cast to all agents. On the agent side, I put these messages into
`resource_processing_queue`; with the queue, we can delete `_net_lock` and
process these messages in order of priority.
Additionally, I modified `resource_processing_queue` to suit this need. I
changed `_queue` from a list to a PriorityQueue in
`ExclusiveResourceProcessor`; this way, all messages cached in
`ExclusiveResourceProcessor` can be sorted by priority.
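The priority-ordered processing described above can be sketched with
Python's standard `queue.PriorityQueue`. This is only an illustration: the
priority constants, the tuple layout, and the `enqueue`/`drain` helpers are
hypothetical and are not the identifiers used in the patch.

```python
import itertools
import queue

# Hypothetical priority levels: a lower number is processed first.
PRIORITY_HIGH = 0
PRIORITY_LOW = 1

# A monotonically increasing counter keeps FIFO order within one priority
# level and stops Python from comparing the payload dicts on a tie.
_sequence = itertools.count()


def enqueue(q, priority, action, payload):
    """Put a message on the queue, tagged with its priority."""
    q.put((priority, next(_sequence), action, payload))


def drain(q):
    """Return the queued (action, payload) pairs in processing order."""
    ordered = []
    while not q.empty():
        _priority, _seq, action, payload = q.get()
        ordered.append((action, payload))
    return ordered


messages = queue.PriorityQueue()
enqueue(messages, PRIORITY_LOW, "port_update_end", {"port_id": "p1"})
enqueue(messages, PRIORITY_HIGH, "port_create_end", {"port_id": "p2"})
enqueue(messages, PRIORITY_LOW, "port_update_end", {"port_id": "p3"})
processed = drain(messages)
```

In this sketch the high-priority message for `p2` is handled first, and the
two low-priority messages keep their arrival order.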
Related-Bug: #1760047
Change-Id: I255caa0571c42fb012fe882259ef181070beccef
Current DHCP port management in Neutron has the server clear the
device_id while the agent is responsible for setting it.
This may cause a potential race condition, for example during network
rescheduling. The server aims to clear the device_id on a DHCP port and
assign the network to another agent while the old agent might just be
taking possession of the port. If the DHCP agent takes possession of the
port (i.e., update port...set the device_id) before the server clears
it, then there is no issue. However, if this happens after the clear
operation by the server, then the DHCP port would be updated/marked as
owned by the old agent.
When the new agent takes over the network scheduled to it, it won't be
able to find a port to reuse, so an extra port might need to be
created. This leads to two issues:
1) an extra port is created and never deleted;
2) the extra port creation may fail if there are no available IP
addresses.
This patch proposes a validation check to prevent an agent from updating
a DHCP port unless the network is bound to that agent.
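A minimal sketch of such a validation check, assuming the server can look
up which hosts a network is currently scheduled to. The function name and
the `bindings` mapping are hypothetical and only illustrate the idea; they
are not the code added by the patch.

```python
def agent_may_update_dhcp_port(host, network_id, bindings):
    """Return True only if `network_id` is scheduled to the agent on `host`.

    `bindings` is a hypothetical mapping of network ID to the set of hosts
    whose DHCP agents currently serve that network.
    """
    return host in bindings.get(network_id, set())


# net-1 has been rescheduled from host-a to host-b.
bindings = {"net-1": {"host-b"}}
old_agent_allowed = agent_may_update_dhcp_port("host-a", "net-1", bindings)
new_agent_allowed = agent_may_update_dhcp_port("host-b", "net-1", bindings)
```

With this check in place, a late update from the old agent on `host-a` is
rejected, so the port stays free for the new agent to reuse.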
Co-authored-by: Allain Legacy <Allain.legacy@windriver.com>
Closes-Bug: #1795126
Story: 2003919
Change-Id: Ie619516c07fb3dc9d025f64c0e1e59d5d808cb6f
The common rpc and exceptions were rehomed into
neutron-lib with [1]. This patch shims those rehomed
modules in neutron to switch over to neutron-lib's
versions under the covers.
To do so:
- The rpc and common exceptions are changed to
reference their counterpart in neutron-lib effectively
swapping the impl over to neutron-lib.
- The fake_notifier is removed from neutron and lib's
version is used instead.
- The rpc tests are removed; they live in lib now.
- A few unit test related changes are required
including changing mock.patch to mock.patch.object,
changing the mock checks for a few UTs as they don't
quite work the same with the shim in place.
- Using the RPC fixture from neutron-lib rather than
that setup in neutron's base test class.
With this shim in place, consumers are effectively using
neutron-lib's RPC plumbing and thus we can move consumers
over to neutron-lib's version at will. Once all
consumers are moved over we can come back and remove
the RPC logic from neutron and follow up with a consumption
patch.
NeutronLibImpact
[1] https://review.openstack.org/#/c/319328/
Change-Id: I87685be8764a152ac24366f13e190de9d4f6f8d8
The remainder of the neutron.plugins.common.utils were rehomed into
neutron-lib with [1][2]. This patch consumes them by using the functions
from neutron-lib and removing the neutron.plugins.common.utils module
altogether, as it's fully rehomed now.
NeutronLibImpact
[1] https://review.openstack.org/#/c/560950/
[2] https://review.openstack.org/#/c/554546/
Change-Id: Ic0f7b37861f078ce8c5ee92d97e977b8d2b468ad
According to [1], when a network contains more than one IPv4
subnet, they are returned in the 'classless-static-routes'
DHCP option, regardless of whether DHCP is enabled for them
or not.
However, the get_active_networks_info() method used for
synchronizing networks after the dhcp agent restarts filters
subnets with "enable_dhcp=True", which differs from the
get_network_info() method. This will block VM access to
other VMs in the dhcp disabled subnets, even though they are
in the same network. This is visible by looking at the "opts"
file before and after a restart.
Change the dhcp agent to ask for all subnets in its
get_active_networks_info() RPC call by adding an
enable_dhcp_filter argument to toggle the behavior, with the
default being True to not break backwards compatibility.
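The toggle can be sketched as follows. Only the `enable_dhcp_filter`
argument name and its default of True come from the description above; the
`select_subnets` helper and the dict layout of a subnet are assumptions made
for illustration.

```python
def select_subnets(subnets, enable_dhcp_filter=True):
    """Return the subnets to report for a network.

    With the default (True), only DHCP-enabled subnets are returned, which
    matches the old behavior. With False, every subnet is returned so that
    routes to DHCP-disabled subnets can still be advertised.
    """
    if enable_dhcp_filter:
        return [s for s in subnets if s["enable_dhcp"]]
    return list(subnets)


subnets = [
    {"id": "s1", "cidr": "10.0.0.0/24", "enable_dhcp": True},
    {"id": "s2", "cidr": "10.0.1.0/24", "enable_dhcp": False},
]
filtered = select_subnets(subnets)
unfiltered = select_subnets(subnets, enable_dhcp_filter=False)
```

Callers that do not pass the argument keep getting the filtered list, which
is what preserves backwards compatibility.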
Based on https://review.openstack.org/#/c/352530/ by Quan Tian.
[1] https://review.openstack.org/#/c/125043/
Change-Id: I11ca1d1a603d02587f3b8d4a5a52a96b0587d61f
Closes-Bug: #1652654
The neutron.common.topics module was rehomed into neutron-lib with
commit Ie88b84949cbd55a4e7ad06341aab77b286cdc485.
This patch consumes it by removing the rehomed module from neutron
and using the module from neutron-lib instead.
NeutronLibImpact
Change-Id: Ia4a4604c259ce862597de80c6deeb3d408bf0e95
The is_extension_supported function now lives in neutron-lib. This patch
removes the function from neutron and uses lib's version instead.
NeutronLibImpact
Change-Id: Iccb72e00f85043b3dff0299df7eb1279655e313e
Inter-tenant traffic between two different networks that belong
to two different tenants is not possible when they are connected
through a shared network and internally connected through DVR
routers.
This issue can be seen in a multinode environment where there
is network isolation.
The issue is that the ports connecting the two routers have two
different IPs, and DVR does not expose the router interfaces
outside a compute node, so the traffic is blocked by the OVS
tunnel bridge rules.
This patch fixes the issue by not applying the DVR specific
rules in the tunnel-bridge to the shared network ports that
are connecting the routers.
Closes-Bug: #1751396
Change-Id: I0717f29209f1354605d2f4128949ddbaefd99629
registry.subscribe() was marked deprecated in Ocata; remove it
now that we're in Rocky.
TrivialFix
Change-Id: Ibdfa39dfb569a804b7612a17516dc41a0a8879bc
This patch switches the ipam_backend_mixin module from the
_get_subnet() method to _get_subnet_object(), so that it uses the
Subnet OVO instead of the db model class directly.
Change-Id: Ibdc91ba20b560dc42319565b5f4aeadba0dfcb29
Partially-Implements: blueprint adopt-oslo-versioned-objects-for-db
Neutron will now use its own registry for versionedobjects.
This avoids problems with loading the wrong OVO objects from
different projects (like os_vif) when names are the same.
Change-Id: I9d4fab591fbe52271c613251321a6d03078976f7
Closes-Bug: #1731948
This change adds a hook so that additional classes
for OVObjects can be registered beyond the base classes.
The intent is to allow stadium projects to register classes
for the OVObjects they use.
Change-Id: Icb6a73aeca0286725c78a3bf403e1df895a34d4e
Needed-By: Ie0f1c9f3f2bd1cf317b921ee1c1691c2816d8832
The neutron-lib commit I360545b6ee4291547e0c5c8e668ad03d3efa4725 moved
the externally consumed globals from neutron.common.constants into lib.
With the exception of PROVISIONAL_IPV6_PD_PREFIX, all other constants
in neutron.common.constants should only be used in neutron, and will
hopefully remain that way. External consumers needing access to other
common constants should move them into lib first.
NeutronLibImpact
Change-Id: Ie4bcffccf626a6e1de84af01f3487feb825f8b65
In Pike, the agent side of the security_groups_provider_updated()
RPC code was changed to a NOOP when the provider rules were
changed to be static (https://review.openstack.org/#/c/432506).
Now that we're in Queens we can deprecate it.
Change-Id: Ie018ff653633d3524f0e80c5e172a5d01bdad437
Firewall drivers check if port security is enabled. After the OVO is sent
over the wire, port_security_enabled is part of the 'security' field.
This patch translates the RPC call from agent to server so that the payload
containing port_security_enabled is in the same place.
We may consider changing the OVO field to contain the boolean
directly.
Change-Id: I647343e84b41da63d7ffcc5a87f3dfa2036adc56
Closes-bug: #1605654
There are several places where the base class setUp() method was
called unnecessarily. This patchset removes those calls.
TrivialFix
Change-Id: I2961fa4a0216f7f1223ab87a249151f0feb91518
Calculate all security group info on the agent from
the push notification cache.
Partially-Implements: blueprint push-notifications
Change-Id: I5c74ba17223a431dad924d31bbe08ad958de3877
In reviews we usually check import grouping, but it is tedious.
By using the flake8-import-order plugin, we can avoid this.
Its checking is loose, so it is a good fit for our needs.
This flake8 plugin is already used in tempest.
Note that the flake8-import-order version is pinned to avoid unexpected
breakage of the pep8 job.
The setup for unit tests of hacking rules is tweaked to disable
flake8-import-order checks. This extension assumes an actual file exists
and causes hacking rule unit tests to fail.
Change-Id: Ib51bd97dc4394ef2b46d4dbb7fb36a9aa9f8fe3d
The well known service type constants are in
neutron_lib.plugins.constants, but for legacy reasons a few still exist
and are referenced from neutron_lib.constants that we'd like to remove.
This patch switches references over to neutron_lib's plugin constants.
Change-Id: I1861448cec303725b30cef8f42029f467f9e03a3
In order to allow the DHCP agent to service other subnets
on the network in other segments via DHCP relay, we need to
return all subnets regardless of segment association.
However, this behavior will break older DHCP agents that
then try to get IPs on the subnets belonging to other segments.
This patch adds a new subnet attribute, 'non_local_subnets'
that will be returned in the DHCP RPC calls, so that agents
that can deal with off-link subnets can handle them
accordingly.
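One way to picture the resulting RPC payload is the sketch below. The
'non_local_subnets' key comes from the description above; the
`split_subnets` helper, the `segment_id` field, and the rule that a subnet
with no segment counts as local are assumptions made for illustration.

```python
def split_subnets(subnets, local_segment_ids):
    """Partition subnets into local ones and those on other segments.

    Assumption: a subnet with no segment, or with a segment reachable from
    the agent's host, is treated as local; everything else is off-link.
    """
    local, non_local = [], []
    for subnet in subnets:
        segment_id = subnet.get("segment_id")
        if segment_id is None or segment_id in local_segment_ids:
            local.append(subnet)
        else:
            non_local.append(subnet)
    return {"subnets": local, "non_local_subnets": non_local}


network_info = split_subnets(
    [
        {"id": "s1", "segment_id": "seg-a"},
        {"id": "s2", "segment_id": "seg-b"},
        {"id": "s3", "segment_id": None},
    ],
    local_segment_ids={"seg-a"},
)
```

An older agent that only reads 'subnets' never sees the off-link subnets,
while a newer agent can also read 'non_local_subnets'.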
Change-Id: I9cce7b8a19c1201435df0c6baac7be57c57847e6
Partial-Bug: #1692486
This code was needed for some Mitaka to Newton upgrade scenarios, but we
are now in Ocata to Pike, so killing the code is overdue.
Change-Id: Ida3fc4ce2d18c50e603b2b71dfb8f884845bf78a
The callback modules have been available in neutron-lib since commit [1]
and are ready for consumption.
As the callback registry is implemented with a singleton manager
instance, sync complications can arise ensuring all consumers switch to
lib's implementation at the same time. Therefore this consumption has
been broken down:
1) Shim neutron's callbacks using lib's callback system and remove
existing neutron internals related to callbacks (devref, UTs, etc.).
2) Switch all neutron's callback imports over to neutron-lib's.
3) Have all sub-projects using callbacks move their imports over to use
neutron-lib's callbacks implementation.
4) Remove the callback shims in neutron-lib once sub-projects are moved
over to lib's callbacks.
5) Follow-on patches moving our existing uses of callbacks to the new
event payload model provided by neutron-lib.callback.events
This patch implements #2 from above, moving all neutron's callback
imports to use neutron-lib's callbacks.
There are also a few places in the UT code that still patch callbacks;
we can address those in step #4, which may need [2].
NeutronLibImpact
[1] fea8bb64ba7ff52632c2bd3e3298eaedf623ee4f
[2] I9966c90e3f90552b41ed84a68b19f3e540426432
Change-Id: I8dae56f0f5c009bdf3e8ebfa1b360756216ab886
Neutron-lib 1.1.0 is now out and contains the portbindings
API definition (as per commit [1]). This patch moves neutron
references over to the neutron-lib version.
NeutronLibImpact
- Consumers using the public constants within neutron's
portbindings API extension must now use the values
from neutron-lib.
[1] 87e42f993c07ae320159d5123662ee9f3bd4d903
Change-Id: I669af9b4c712877772d91a03857ab108714001d4
Setting up rules to allow DHCPv6, DHCP, and RAs from specific
IP addresses based on Neutron resources has a few issues:
1. It violates separation of concerns. We are implementing logic to
calculate where an IPv6 RA advertisement or DHCP advertisement
should be coming from in the security group code. This code should
not be trying to guess IPv6 LLAs, know about subnet modes, DHCP server
implementations, or the type of L3 plugin being used. Currently all
of these assumptions are baked into code that should only be
filtering, which makes it very rigid and brittle when it comes to
other implementations for DHCP and/or RAs.
2. It has scaling issues on large networks. Every time one of these
provider rules is updated, it triggers every L2 agent to refresh
all of the security group rules for ports in that network, which puts
significant load on the server.
3. Its main purpose, preventing spoofing of RA[1,2] and DHCP packets,
has long been superseded by preventing VMs from acting as DHCP/RA
servers[3][4].
This patch completely removes all of this logic and just returns
static provider rules to the agents that allow all DHCP server
and RA traffic ingress to the client. This addresses the issues
highlighted above since the code is significantly simplified and
the provider rules don't require refreshes on the agents.
Now that the provider rules never change, the RPC notification
listener on the agent-side for 'notify_provider_updated' is now
just a NOOP that doesn't trigger any refreshes. The notification
was left in place on the server side for older version agents
that have stale IP-specific provider rules. The entire notification
can be removed in the future.
The one open concern with this approach is that VMs will now be
able to receive DHCP offers from other DHCP servers on the same
network that aren't being filtered (e.g. a VM with port security
disabled or another device on a provider network). In order to
address this for DHCP, this patch adds two rules that only allow
DHCP offers targeted to either the broadcast or the correct client
IP. This prevents incorrect offers from ever reaching the client.
For RAs, this patch just allows all RAs so we may pick up
advertisements from other v6 routers attached to a network;
however, the instance won't actually be allowed to use bad addresses.
1. https://bugs.launchpad.net/neutron/+bug/1262759
2. I1d5c7aaa8e4cf057204eb746c0faab2c70409a94
3. Ice1c9dd349864da28806c5053e38ef86f43b7771
4. https://git.openstack.org/cgit/openstack/neutron/tree/neutron/agent/linux/iptables_firewall.py?h=521b1074f17574a5234843bce68f3810995e0e1d#n475
Closes-Bug: #1653830
Closes-Bug: #1663077
Change-Id: Ibfbf011284cbde396f74db9d982993f994082731
On profiling the get_devices_details communications between
the agent and the server, a significant amount of time
(60% in my dev env) is being spent in the AFTER_UPDATE events
for the port updates resulting from the port status changes.
One of the major offenders is the native DHCP agent notifier.
On each port update it ends up retrieving the network for the
port, the DHCP agents for the network, and the segments.
This patch addresses this particular issue by adding logic to
skip a DHCP notification if the only thing that changed on the
port was the status. The DHCP agent doesn't do anything based on
the status field so there is no need to update it when this is
the only change.
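The status-only check can be sketched as a comparison of the port before
and after the update. The helper name and the flat dict representation of a
port are hypothetical; the patch itself may compare the ports differently.

```python
def only_status_changed(original_port, updated_port):
    """Return True if 'status' is the only field that differs."""
    keys = set(original_port) | set(updated_port)
    changed = {
        key for key in keys
        if original_port.get(key) != updated_port.get(key)
    }
    return changed == {"status"}


before = {"id": "p1", "status": "DOWN", "fixed_ips": ["10.0.0.5"]}
status_update = dict(before, status="ACTIVE")
ip_update = dict(before, status="ACTIVE", fixed_ips=["10.0.0.6"])

skip_for_status = only_status_changed(before, status_update)
skip_for_ip = only_status_changed(before, ip_update)
```

A notifier using this check would skip the DHCP notification for
`status_update` and still send one for `ip_update`.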
Change-Id: I948132924ec5021a9db78cf17efbba96b2500e8e
Partial-Bug: #1665215
Add an RPC interface to retrieve many OVO resources
at once based on filters.
Partially-Implements: blueprint push-notifications
Change-Id: I5e765712ab8b8065d71653c563f004c7a9ce9021
Maintaining the context is important for keeping the request ID
and subsequently operator/developer sanity while debugging.
The resource_type is also helpful to have since a function could be
subscribed for multiple resources.
This maintains and deprecates the existing 'subscribe' method for
backwards compatibility with callbacks that don't support receiving
the context and resource type. A new 'register' method is added
for callbacks to use that are compatible with receiving the context.
Change-Id: I06c8302951c99039b532acd9f2a68d5b989fdab5
Instead of using oslo.versionedobjects UUID type, use a custom UUIDField
class located in common_types that will actually validate passed values
for UUID-ness.
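A minimal sketch of what validating for UUID-ness can look like, using only
the standard `uuid` module. The `is_uuid_like` and `coerce_uuid` helpers are
hypothetical stand-ins; they are not the UUIDField implementation in
common_types and may be stricter or looser than it.

```python
import uuid


def is_uuid_like(value):
    """Return True if `value` is a canonical, hyphenated UUID string."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except (TypeError, ValueError, AttributeError):
        # Non-strings and malformed strings are not UUIDs.
        return False


def coerce_uuid(value):
    """Return `value` unchanged if it is a UUID, else raise ValueError."""
    if not is_uuid_like(value):
        raise ValueError("%r is not a valid UUID" % (value,))
    return value


valid = coerce_uuid("12345678-1234-5678-1234-567812345678")
```

The point of the check is that an arbitrary string is rejected when the
field is set, instead of being accepted silently.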
Closes-Bug: #1614537
Change-Id: I20b24ee57c521b1c68977c2ff7ae56b56875dd64
Neutron Manager is loaded at the very startup of the neutron
server process and with it plugins are loaded and stored for
lookup purposes as their references are widely used across the
entire neutron codebase.
Rather than holding these references directly in NeutronManager
this patch refactors the code so that these references are held
by a plugin directory.
This allows subprojects and other parts of the Neutron codebase
to use the directory in lieu of the manager. The result is
leaner, cleaner, and more decoupled code.
Usage pattern [1,2] can be translated to [3,4] respectively.
[1] manager.NeutronManager.get_service_plugins()[FOO]
[2] manager.NeutronManager.get_plugin()
[3] directory.get_plugin(FOO)
[4] directory.get_plugin()
The more entangled part is in the neutron unit tests, where the
use of the manager can be simplified as mocking is typically
replaced by a call to the directory add_plugin() method. This is
safe as each test case gets its own copy of the plugin directory.
That said, unit tests that look more like API tests and that rely on
the entire plugin machinery, need some tweaking to avoid stumbling
into plugin loading failures.
Due to the massive use of the manager, deprecation warnings are
considered impractical as they cause logs to bloat out of proportion.
Follow-up patches that show how to adopt the directory in neutron
subprojects are tagged with topic:plugin-directory.
NeutronLibImpact
Partially-implements: blueprint neutron-lib
Change-Id: I7331e914234c5f0b7abe836604fdd7e4067551cf
This patch set is for breaking the circular dependency between
Agent/AgentVersionedObject.
See https://review.openstack.org/#/c/297887/ for details.
Change-Id: I7be4ce2513e49e6da46a7bdffb8538613f0be7c7
Partial-Bug: #1597913
Co-Authored-By: Victor Morales <victor.morales@intel.com>
Co-Authored-By: Sindhu Devale <sindhu.devale@intel.com>
This makes the notifier subscribe to core resource events
and leverage them if they are available. This solves the
issue where internal core plugin calls from service plugins
were not generating DHCP agent notifications.
Closes-Bug: #1621345
Change-Id: I607635601caff0322fd0c80c9023f5c4f663ca25
Capture port not found exceptions from port updates of DHCP ports
that no longer exist. The DHCP agent already checks the return
value for None in case any of the other things went missing
(e.g. Subnet, Network), so checking for ports disappearing makes
sense. The corresponding agent-side log message for this has also
been downgraded to debug since this is a normal occurrence.
This also cleans up log noise from calling reload_allocations on
networks that have already been torn down due to all of the subnets
being removed.
Closes-Bug: #1621650
Change-Id: I495401d225c664b8f1cf7b3d51747f3b47c24fc0
Commit 85ed7017ff removed the DBError
handling to let the retry decorator do its magic, however the
full implications of this change were not evaluated. As a result,
DBReferenceError (which derives from DBError) is not processed
correctly and that caused a regression of the existing logic.
Rather than bloat the retry's responsibility even further, this
patch partially reverts commit 85ed7017f by narrowing down the
exception handling to DBReferenceError only.
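The effect of the narrower handler can be sketched with stand-in exception
classes. The classes below only mimic the relationship stated above
(DBReferenceError derives from DBError); `DBDeadlock`, `run_port_action`,
and `failing_with` are hypothetical names used for illustration, and
returning None on a reference error is an assumption.

```python
class DBError(Exception):
    """Stand-in for the generic database error base class."""


class DBReferenceError(DBError):
    """Stand-in for a foreign-key (reference) violation."""


class DBDeadlock(DBError):
    """Stand-in for a transient error that a retry decorator handles."""


def run_port_action(action):
    """Run `action`, handling only reference errors locally."""
    try:
        return action()
    except DBReferenceError:
        # Assumption: the referenced resource is gone, so report "no port".
        return None
    # Any other DBError propagates so an outer retry decorator can see it.


def failing_with(exc_type):
    """Build an action that raises `exc_type` when called."""
    def action():
        raise exc_type()
    return action


result = run_port_action(failing_with(DBReferenceError))
```

Catching the subclass alone keeps the local handling for reference errors
while leaving deadlocks and connection errors to the retry logic.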
Related-bug: #1618216
Change-Id: Icf4e5e4145dcdcdc710b8e42044467913ed01ec1
The DHCP port action handler has been catching DBErrors
since f1b9ac5a54, which is
well before we had the retry decorator to deal with these.
With the port action handler catching these, it means there
will not be retries on deadlocks or connection errors so
transient situations can result in a permanently broken
DHCP service for a network.
This removes the catch for DBError so the decorator can retry
the operation.
Closes-Bug: #1618216
Change-Id: I42031b481958bbfdb8f52902c294022717af7adf
The exception handler checking for concurrently deleted subnets
was missing InvalidInput, which comes from IPAM when a port creation
request references a subnet that doesn't exist.
This patch just catches that exception.
Change-Id: I14f9e4bddde845e3fef2b0d8649d3954ba5c93bd
Closes-Bug: #1618187