Removed E125 (continuation line does not distinguish itself
from next logical line) from the ignore list and fixed all
the indentation issues. Didn't think it was going to be
close to 100 files when I started.
Change-Id: I0a6f5efec4b7d8d3632dd9dbb43e0ab58af9dff3
Under some extreme conditions the router gateway port can end up
unbound. All the centralized floating IPs then become unreachable
because the gateway port is set to the 4095 tag.
This patch adds the HA status to the router-related port
processing code path. If the router is an HA router, the gateway port
goes through the correct HA router processing branch.
Closes-Bug: #1827754
Change-Id: Ida1c9f3a38171ea82adc2f11cb17945d6e2434be
All of the externally consumed variables from neutron.common.constants
now live in neutron-lib. This patch removes neutron.common.constants
and switches all uses over to lib.
NeutronLibImpact
Depends-On: https://review.openstack.org/#/c/647836/
Change-Id: I3c2f28ecd18996a1cee1ae3af399166defe9da87
Reduces E128 warnings by ~260 to just ~900,
no way we're getting rid of all of them at once (or ever).
Files under neutron/tests still have a ton of E128 warnings.
Change-Id: I9137150ccf129bf443e33428267cd4bc9c323b54
Co-Authored-By: Akihiro Motoki <amotoki@gmail.com>
If the l3-agent was restarted by a regular action, such as a config change,
package upgrade or manual service restart, we should not set the
HA port down. We should do so only if the physical host was
rebooted, i.e. the VRRP processes were all terminated.
This patch adds a new RPC call during l3 agent init that first
retrieves the HA router count. It then compares the VRRP process
(keepalived) count and the 'neutron-keepalived-state-change' count
with the hosted router count. If the counts match, the action that
sets the HA port to the 'DOWN' state is not triggered.
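The count comparison described above can be sketched roughly as follows.
This is only an illustration, not the agent's code: the function name and
its parameters are hypothetical, and it assumes one keepalived process and
one neutron-keepalived-state-change process per hosted HA router.

```python
def should_set_ha_ports_down(ha_router_count, keepalived_count,
                             state_change_count):
    """Return True when HA ports should be reset to DOWN on agent start.

    Hypothetical helper: the ports are left alone only if both process
    counts match the number of HA routers hosted by this agent.
    """
    processes_intact = (keepalived_count == ha_router_count and
                        state_change_count == ha_router_count)
    return not processes_intact


# Plain agent restart: all processes survived, keep the ports as they are.
restart = should_set_ha_ports_down(3, 3, 3)
# Host reboot: the processes are gone, so the ports must go DOWN.
reboot = should_set_ha_ports_down(3, 0, 0)
```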
Closes-Bug: #1798475
Change-Id: I5e2bb64df0aaab11a640a798963372c8d91a06a8
This patch implements the plugin.
This patch introduces a new service plugin for port forwarding resources,
named 'pf_plugin', and supports create/update/delete port forwarding
operations on a free Floating IP.
This patch includes the following work:
* Introduces the portforwarding extension and the base class of the plugin
* Introduces the portforwarding plugin, supporting CRUD of port forwarding
resources
* Adds the policy for portforwarding
The race issue is fixed in:
https://review.openstack.org/#/c/574673/
The port forwarding field is added to the floating IP in:
https://review.openstack.org/#/c/575326/
Partially-Implements: blueprint port-forwarding
Change-Id: Ibc446f8234bff80d5b16c988f900d3940245ba89
Partial-Bug: #1491317
The externally consumed APIs from neutron.db.api were rehomed into
neutron-lib with https://review.openstack.org/#/c/557040/
This patch consumes the retry_db_errors function from lib by:
- Removing retry_db_errors from neutron.db.api
- Updating the imports for retry_db_errors to use it from lib
- Using the DB API retry fixture from lib in the UTs where applicable
- Removing the UTs for neutron.db.api as they are now covered in lib
NeutronLibImpact
Change-Id: I1feb842d3e0e92c945efb01ece29856335a398fe
Post-binding information about router ports is missing from the results of RPC
calls made by l3 agents. The sync_routers code ensures that bindings are
present; however, it does not refresh router objects before returning
them. For RPC clients, ports remain unbound until the next sync, and
the address scope information needed to create routes
from fip namespaces to qrouter namespaces is missing.
Change-Id: Ia135f0ed7ca99887d5208fa78fe4df1ff6412c26
Closes-Bug: #1759971
The is_extension_supported function now lives in neutron-lib. This patch
removes the function from neutron and uses lib's version instead.
NeutronLibImpact
Change-Id: Iccb72e00f85043b3dff0299df7eb1279655e313e
Commit I81748aa0e48b1275df3e1ea41b1d36a117d0097d added the l3 extension
API definition to neutron-lib and commit
I2324a3a02789c798248cab41c278a2d9981d24be rehomed the l3 exceptions,
while Ifd79eb1a92853e49bd4ef028e7a7bd89811c6957 shims the l3
exceptions.
This patch consumes the l3 api def by:
- Removing the code from neutron that's now in lib.
- Using lib's version of the code where applicable.
- Tidying up the related unit tests: now that the l3 api def from lib
is used, the necessary fixture is already set up in the parent chain when
setting up the unit test class.
NeutronLibImpact
Change-Id: If2e66e06b83e15ee2851ea2bc3b64ad366e675dd
The router_ids argument to auto_schedule_routers() is
unused, and was marked for deprecation in Queens.
Change-Id: Ie97b1ad05e294b5fe763ae8d7319800eb16ea3dc
In commit 500b255278 we used the
"get_router_ids" RPC to update HA network port status, but that
was only needed so that the commit could be backported to other branches.
As the "get_router_ids" RPC is expected to fetch only router ids and
not to do any other processing, we are adding a new RPC,
"update_ha_network_port_status". The L3 agent will call this new RPC
to set HA network port status to DOWN.
Related-bug: #1597461
Change-Id: I8f34c4f5178d2b422cfcfd082dfc9cf3f89a5d95
The well known service type constants are in
neutron_lib.plugins.constants, but for legacy reasons a few still exist
and are referenced from neutron_lib.constants that we'd like to remove.
This patch switches references over to neutron_lib's plugin constants.
Change-Id: I1861448cec303725b30cef8f42029f467f9e03a3
When an l3 agent node is rebooted, if the HA network port status is already
ACTIVE in the DB, the agent will get this status from the server and then spawn
keepalived (though the l2 agent might not have wired the port),
resulting in multiple HA masters active at the same time.
To fix this, when the L3 agent starts up we can have it explicitly
set the port status to DOWN for all of the HA ports on that node.
Then we are guaranteed that when they go to ACTIVE it will be because
the L2 agent has wired the ports.
Closes-bug: #1597461
Change-Id: Ib0c8a71b6ff97e43a414f3db4882914b12170d53
According to https://wiki.openstack.org/wiki/Python3, we should now avoid
using six.iteritems and replace it with dict.items.
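A minimal before/after sketch of the replacement; the dictionary and
variable names are made up for illustration. Note that on Python 2
dict.items() builds a list, so the change trades a little memory there
for not depending on six.

```python
quotas = {"network": 10, "port": 50}

# Before (needs the third-party six package):
#     for resource, limit in six.iteritems(quotas):
#         ...

# After: dict.items() exists on both Python 2 and Python 3.
lines = []
for resource, limit in quotas.items():
    lines.append("%s=%d" % (resource, limit))
```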
Change-Id: I58a399baa2275f280acc0e6d649f81838648ce5c
Closes-Bug: #1680761
Neutron-lib 1.1.0 is now out and contains the portbindings
API definition (as per commit [1]). This patch moves neutron
references over to the neutron-lib version.
NeutronLibImpact
- Consumers using the public constants within neutron's
portbindings API extension must now use the values
from neutron-lib.
[1] 87e42f993c07ae320159d5123662ee9f3bd4d903
Change-Id: I669af9b4c712877772d91a03857ab108714001d4
The handler was making the incorrect assumption that once a
host_id was set, a port which failed to bind could only be in
the 'binding_failed' state. So it would not try to rebind ports
that encountered an exception during port binding commit that left
them in the unbound state.
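A rough sketch of the corrected check, assuming the port dict carries its
binding state under 'binding:vif_type'. The helper name is hypothetical and
the constants are defined locally so the snippet stands alone.

```python
VIF_TYPE = "binding:vif_type"
VIF_TYPE_UNBOUND = "unbound"
VIF_TYPE_BINDING_FAILED = "binding_failed"


def needs_rebind(port):
    """Hypothetical helper: True if binding should be attempted again.

    A port whose binding commit raised can still be 'unbound', so that
    state is treated the same way as 'binding_failed'.
    """
    return port.get(VIF_TYPE) in (VIF_TYPE_BINDING_FAILED, VIF_TYPE_UNBOUND)
```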
Change-Id: I28bbeda5fed4275ea38e27308518f89df9ab4eff
Closes-Bug: #1648879
Neutron Manager is loaded at the very startup of the neutron
server process and with it plugins are loaded and stored for
lookup purposes as their references are widely used across the
entire neutron codebase.
Rather than holding these references directly in NeutronManager
this patch refactors the code so that these references are held
by a plugin directory.
This allows subprojects and other parts of the Neutron codebase
to use the directory in lieu of the manager. The result is a
leaner, cleaner, and more decoupled code.
Usage pattern [1,2] can be translated to [3,4] respectively.
[1] manager.NeutronManager.get_service_plugins()[FOO]
[2] manager.NeutronManager.get_plugin()
[3] directory.get_plugin(FOO)
[4] directory.get_plugin()
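To make the translation concrete, here is a toy stand-in for such a
directory. The class below is hypothetical and only mirrors the
get_plugin()/add_plugin() calls named in this message; it is not the
real implementation.

```python
class PluginDirectory(object):
    """Toy plugin directory (hypothetical, for illustration only)."""

    CORE = "CORE"

    def __init__(self):
        self._plugins = {}

    def add_plugin(self, alias, plugin):
        # Register a plugin reference under its alias.
        self._plugins[alias] = plugin

    def get_plugin(self, alias=CORE):
        # Without an alias the core plugin is returned; unknown aliases
        # give None instead of raising.
        return self._plugins.get(alias)


directory = PluginDirectory()
directory.add_plugin(PluginDirectory.CORE, "core-plugin")
directory.add_plugin("L3_ROUTER_NAT", "l3-plugin")

core = directory.get_plugin()               # pattern [4]
l3 = directory.get_plugin("L3_ROUTER_NAT")  # pattern [3]
```

In a unit test, a call to add_plugin() with a fake object would take the
place of mocking the manager.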
The more entangled part is in the neutron unit tests, where the
use of the manager can be simplified as mocking is typically
replaced by a call to the directory add_plugin() method. This is
safe as each test case gets its own copy of the plugin directory.
That said, unit tests that look more like API tests and that rely on
the entire plugin machinery, need some tweaking to avoid stumbling
into plugin loading failures.
Due to the massive use of the manager, deprecation warnings are
considered impractical as they cause logs to bloat out of proportion.
Follow-up patches that show how to adopt the directory in neutron
subprojects are tagged with topic:plugin-directory.
NeutronLibImpact
Partially-implements: blueprint neutron-lib
Change-Id: I7331e914234c5f0b7abe836604fdd7e4067551cf
When everything works as expected, hardly anyone pays any attention
to this log trace, which accounts for an incredible amount of log data.
This change proposes to emit the router payload only during failures
(when debugging info is needed the most), and furthermore it relocates
it to the L3 agent log files, where it is more pertinent.
Partial-bug: #1620864
Change-Id: I64281b963ba52c0a100a6194b7cafc5e9b1a8e74
As part of making the DVR portbinding implementation generic, we rename
the dvr portbinding functions to distributed portbinding functions.
In the next patch we make the dvr logic for port binding generic,
so that it is useful for all distributed router ports (for example, HA).
Partial-Bug: #1595043
Partial-Bug: #1522980
Change-Id: I402df76c64299156d4ed48ac92ede1e8e9f28f23
In this patch, router auto scheduling is removed from sync_routers,
so that the reported bug is fixed. A potential race is also avoided,
according to [1].
As a result of this patch, the l3 agent can't get the router info
when the router is not bound to the l3 agent, and the router will
be removed from the agent during agent processing. This makes sense since, in
the neutron server, the router is not tied to the agent. For DVR, if there
are service ports on the agent host, the router info will still be
returned to the l3 agent.
[1] https://review.openstack.org/#/c/317949/
Change-Id: Id0a8cf7537fefd626df06064f915d2de7c1680c6
Co-Authored-By: John Schwarz <jschwarz@redhat.com>
Closes-Bug: #1593653
Currently, a router_centralized_snat port can be bound to a host where
the l3-agent is in standby state (L3 HA + DVR case). As a result, a VM without
a floating ip is unable to reach the external network. This change passes
the ha_router_port flag to _ensure_host_set_on_port when called for
_snat_router_interfaces ports.
Note: this issue is intermittent; without the changes in l3_rpc.py
the unit test does not fail every time.
Co-Authored-By: Oleg Bondarev <obondarev@mirantis.com>
Closes-bug: #1582739
Change-Id: I74bad578361ed7eac8cc6c740b06b66ab1530cd5
Routers auto scheduling works when an l3 agent starts and performs
a full sync with neutron server. Neutron server looks for all
unscheduled routers and schedules them to that agent if applicable.
This was broken by commit 0e97feb0f3
which changed full sync logic a bit: now l3 agent requests all ids
of routers scheduled to it first. get_router_ids() didn't call
routers auto scheduling which caused the regression.
This patch adds routers auto scheduling to get_router_ids().
Closes-Bug: #1541348
Change-Id: If6d4e7b3a4839c93296985e169631e5583d9fa12
If thousands of routers are attached to thousands of
networks, a sync_routers request might take a long time and lead to a timeout
on the agent side, so the agent initiates another resync. This may lead to an
endless loop, overloading the server and leaving the agent unable to sync state.
This patch makes the l3 agent first check how many routers are assigned to
it and then fetch routers in chunks.
The initial chunk size is set to 256 but may be decreased dynamically if
timeouts happen while waiting for a response from the server.
This approach reduces the load on the server side and speeds up
resync on the agent side by starting processing right after receiving
the first chunk.
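The chunked fetch can be sketched as below. This is an illustration under
assumptions, not the agent's code: the exception class, the function name,
halving as the decrease strategy and the minimum chunk size are all
hypothetical. Only the initial size of 256 comes from this message.

```python
class FetchTimeout(Exception):
    """Stands in for an RPC timeout (hypothetical)."""


def fetch_in_chunks(router_ids, fetch, chunk_size=256, min_chunk_size=8):
    """Yield routers chunk by chunk, shrinking the chunk on timeout."""
    i = 0
    while i < len(router_ids):
        chunk = router_ids[i:i + chunk_size]
        try:
            routers = fetch(chunk)
        except FetchTimeout:
            if chunk_size <= min_chunk_size:
                raise  # give up instead of retrying forever
            # Assumption: halve the chunk size and retry the same offset.
            chunk_size = max(chunk_size // 2, min_chunk_size)
            continue
        i += len(chunk)
        # Routers are handed over as soon as their chunk arrives, so
        # processing can start before the whole sync has finished.
        for router in routers:
            yield router
```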
Closes-Bug: #1516260
Change-Id: Id675910c2a0b862bfb9e6f4fdaf3cd9fe337e52f
The L3 agent needs to know the address scope of the fixed ip of each
floating ip because floating ips are a way to cross scope boundaries.
Without the scope information, there could be ambiguity and no way to
know which scope to send it to.
[1] https://review.openstack.org/#/c/189741/
Change-Id: Id9f8c12954a6efbf4d9b99c011652eefbe5f5145
Partially-Implements: blueprint address-scopes
Without "L3_ROUTER_NAT" in neutron's service_plugins, l3_rpc will
fail when getting l3plugin. So, remove the useless "if" block
here.
Change-Id: I56f417e5723ed40c70a186394de0bfcff696e469
Closes-Bug: #1514144
A recent change used a keyword argument when it didn't need to;
this patch corrects it to fix the multinode DVR job.
End of typical traceback:
File "/opt/stack/new/neutron/neutron/api/rpc/handlers/l3_rpc.py",
in delete_agent_gateway_port(admin_ctx, network_id, host_id=host)
TypeError: delete_floatingip_agent_gateway_port() got multiple
values for keyword argument 'host_id'
Introduced in commit 639f1893dd
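The TypeError in the traceback above can be reproduced with a small
stand-in. The function below is hypothetical; it only assumes that
host_id is the second positional parameter, which is what the error
message implies.

```python
def delete_gateway_port(context, host_id, ext_net_id):
    """Hypothetical stand-in: host_id is the second positional parameter."""
    return host_id, ext_net_id


try:
    # "net-1" binds positionally to host_id, then host_id= binds it again.
    delete_gateway_port("ctx", "net-1", host_id="host-a")
except TypeError as exc:
    error = str(exc)

# Passing both values positionally, in signature order, avoids the clash.
result = delete_gateway_port("ctx", "host-a", "net-1")
```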
Related-bug: #1495147
Change-Id: Id2522bc843bc7b089b7783d3f765900a50a0033f
Today the FloatingIP Agent gateway port is deleted and
re-created by the plugin for DVR based routers based on floatingip
association and disassociation with VMs on compute
nodes.
This puts a lot more strain on the plugin to
create and delete these ports when VMs that are associated
with FloatingIPs come up and get deleted.
This patch introduces an RPC call for the agent
to initiate an agent gateway port delete.
The agent will also look for the last floatingip that
it manages, and if the condition is satisfied, the agent will
request the server to remove the FloatingIP Agent
Gateway port.
Change-Id: I47694b2ee60c363e2fe59ad5f7d168252da08a45
Related-Bug: #1468007
Related-Bug: #1408855
Related-Bug: #1450982
This patch is to address the failure of manual move of
dvr_snat routers from one service node to another.
The entry in the csnat_l3_agent_bindings table is now removed
during the router to agent unbind operation.
Appropriate notification is now sent to the agent to remove
snat/qrouter namespace.
There were other places in the code
that needed to examine the snat binding table to
check if updates were required -
validate_agent_router_combination() and
check_agent_router_scheduling_needed().
Additionally, schedule_routers() was made optional
within the rpc _notification path since it can
override the manual move being attempted.
Change-Id: Iac9598eb79f455c4ef3d3243a96bed524e3d2f7c
Closes-Bug: #1369721
Co-Authored-By: Ila Palanisamy <ilavajuthy.palanisamy@hp.com>
Co-Authored-By: Oleg Bondarev <obondarev@mirantis.com>
A misnamed function call and execution order issue was causing
update_subnet to fail when a PD enabled subnet received a new CIDR.
This patch fixes the issues, and introduces an rpc api test to
ensure the function works. This includes altering the process_prefix_update
RPC handler to expose the issue to the test.
Change-Id: Id1e781291f711865fd783ed5e0208694097b7024
Closes-Bug: #1482676
This patch includes the DB, IPAM & RPC changes needed for the IPv6 Prefix
Delegation feature.
To enable this feature, the subnetpool_id attribute of subnets has been
modified to allow for a special subnetpool identifier - "prefix_delegation".
WORKFLOW:
1. Admin sets default_ipv6_subnet_pool in neutron.conf to "prefix_delegation"
2. User creates a new IPv6 subnet without a CIDR or subnetpool ID
3. User creates an interface between this subnet and a router with an existing
external interface
The agent-side changes will follow in separate patches.
A documentation patch is up for review here:
https://review.openstack.org/#/c/178739
Video guides for configuring and using this feature are available on
YouTube:
https://www.youtube.com/watch?v=wI830s881HQ
https://www.youtube.com/watch?v=zfsFyS01Fn0
Change-Id: Ic0c6ed4dba74da94a75838178a1837f93d2d0885
Co-Authored-By: Baodong (Robert) Li <baoli@cisco.com>
Partially-Implements: blueprint ipv6-prefix-delegation
The decorator was previously added at the API layer
(commit 4e77442d52,
commit d04335c448).
However some RPC handlers are also dealing with port
create/update/delete operations, like dhcp ports for example.
We need to cover these cases too.
Also remove db retry from ml2 plugin delete_port()
as it's not needed once we retry at the API and RPC layers.
(there is already a unit test on this)
The patch also adds a unit test for checking deadlock
handling during port creation at API layer.
Though it's not directly related to the current fix,
I decided to leave it for regression preventing purposes.
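For readers unfamiliar with the pattern, a retry decorator of this kind
can be sketched as below. Everything here is hypothetical (the exception
class, the decorator name, the retry count and the sample handler); it
only illustrates retrying a handler when a deadlock is reported.

```python
import functools


class DBDeadlock(Exception):
    """Stands in for a database deadlock error (hypothetical)."""


def retry_on_deadlock(max_retries=3):
    """Retry the wrapped callable when it raises DBDeadlock."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    if attempt == max_retries:
                        raise  # out of retries, let the caller see it
        return wrapper
    return decorator


attempts = []


@retry_on_deadlock(max_retries=3)
def create_port(name):
    # Fails twice with a deadlock, then succeeds.
    attempts.append(name)
    if len(attempts) < 3:
        raise DBDeadlock()
    return {"name": name}


port = create_port("dhcp-port")
```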
Closes-Bug: #1479738
Change-Id: I7793a8f7c37ca542b8bc12372168aaaa0826ac4c
An HA port needs to point to the correct host (where the master router
is running) in order for L2Population to work.
Hence, this patch introduces two fixes:
* When a port owned by an HA router is up we make sure it points to the
right node where the master is running, or a random node if there is
no master yet (this corner case is fixed by the 2nd bullet point).
* When an L3 agent reports it's hosting a master, we need to update the
port binding to the host the master is now running on. This fixes
both routers with no elected master (yet) and failovers.
This patch also changes the L3 HA failover test to use l2pop.
Note that the test does not pass when using l2pop without this patch.
Closes-Bug: #1365476
Co-Authored-By: Assaf Muller <amuller@redhat.com>
Change-Id: I8475548947526d8ea736ed7aa754fd0ca475cae2
This also adds a check to neutron/hacking/checks.py that should catch this
error in the future.
Blueprint: neutron-python3
Change-Id: Ie7b833ffa173772d39b85ee3ecaddace18e1274f
In the L3 RPC code if the host for a port is not
present, it ends up calling update_port with the
host_id set to None. This does not update the host
id at all because it's treated as an unset attribute
which leads to the same thing happening on the next
iteration. These pointless update calls are expensive
because they involve a semaphore and calls to mechanism
drivers.
This patch adjusts the logic to only send a port
update if there is actually a host to set on
the port.
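The guard can be sketched as follows. The function name, its parameters
and the extra "already on this host" check are assumptions made for the
example, not the handler's actual code.

```python
def maybe_update_port_host(port, host, update_port):
    """Hypothetical helper: update the port only when a host is known.

    Returns True if update_port was called.
    """
    if not host:
        # No host to set: an update with host_id=None would change nothing.
        return False
    if port.get("binding:host_id") == host:
        # Assumption: skip the update when the host is already set.
        return False
    update_port(port["id"], {"binding:host_id": host})
    return True


updates = []
port = {"id": "p1", "binding:host_id": ""}
skipped = maybe_update_port_host(port, None, lambda *a: updates.append(a))
sent = maybe_update_port_host(port, "node-1", lambda *a: updates.append(a))
```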
Change-Id: Ic55496dd2ba3abcef0a2de9fc8699c391b79fa51
Partial-Bug: #1445412
The get_routers method in the l3 RPC code has a log.debug
statement that formats all of the router data as indented
JSON. This method can be expensive if there are hundreds
of routers being synced and it happens even if debugging
is disabled since the function call result is the parameter
to the debug statement.
This patch adds and leverages a small helper class that takes a
callable and its args and defers calling it until its __str__ method
is called, i.e. when it is actually rendered to a string.
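A minimal sketch of such a helper, assuming the standard logging module
(which only formats its arguments when a record is actually emitted).
The class name and the sample data are illustrative.

```python
import json
import logging


class DelayedStringRenderer(object):
    """Defer calling function(*args, **kwargs) until str() is needed."""

    def __init__(self, function, *args, **kwargs):
        self.function = function
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        # The expensive call happens here, not at construction time.
        return str(self.function(*self.args, **self.kwargs))


LOG = logging.getLogger("l3_rpc_example")
routers = [{"id": "r1", "name": "router1"}]

# json.dumps runs only if the debug record is really rendered.
LOG.debug("Routers returned to l3 agent:\n %s",
          DelayedStringRenderer(json.dumps, routers, indent=5))
```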
Change-Id: I2bfceb286ce30f2a3595381b62bdc6dd71ed8483
Partial-Bug: #1445412
The L3 agent gets keepalived state change notifications via
a unix domain socket. These events are now batched and
sent out as a single RPC to the server. If the same
router was updated multiple times during the batch period,
only the latest state is sent.
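The "latest state wins" batching can be sketched as below. The class and
method names are hypothetical, and the timer that ends a batch period is
left out; the caller is assumed to call flush() when the period expires.

```python
class StateChangeBatcher(object):
    """Collect router state changes and send them as one call."""

    def __init__(self, send):
        self._send = send      # callable taking {router_id: state}
        self._pending = {}

    def queue(self, router_id, state):
        # A later event for the same router overwrites the earlier one.
        self._pending[router_id] = state

    def flush(self):
        if not self._pending:
            return
        states, self._pending = self._pending, {}
        self._send(states)


sent = []
batcher = StateChangeBatcher(sent.append)
batcher.queue("router-1", "backup")
batcher.queue("router-2", "master")
batcher.queue("router-1", "master")   # replaces the earlier "backup"
batcher.flush()
```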
Partially-Implements: blueprint report-ha-router-master
Change-Id: I36834ad3d9e8a49a702f01acc29c7c38f2d48833
The following patch will re-add it with its intended parameters
and use it in the agent.
Change-Id: Idffe963fffe5fdde6f474046a50208a2974edfa0
Partially-Implements: blueprint report-ha-router-master
It's mostly a matter of changing imports to a new location.
Non-obvious changes needed:
* pass overwrite= argument to oslo_context since oslo.log reads context
from its thread local store and not local.store from incubator
* don't store context at local.store now that there is no code that
would consume it
* LOG.deprecated() -> versionutils.report_deprecated_feature()
* dropped LOG.audit check from hacking rule since now the method does
not exist
* WritableLogger is now located in oslo_log.loggers
Dropped log module from the tree. Also dropped local module that is now
of no use (and obsolete, as per oslo team).
Added versionutils back to openstack-common.conf since now we use the
module directly from neutron code and not just as a dependency of some
other oslo-incubator module.
Note: tempest tests are expected to be broken now, so instead of fixing
all the oslo.log related issues for the subtree in this patch, I only
added TODOs with directions for later fix.
Closes-Bug: #1425013
Change-Id: I310e059a815377579de6bb2aa204de168e72571e
This removes the unused RPC method from
l3_rpc.py. 'get_snat_router_interface_ports'
was defined but not used by any agent.
Change-Id: Ide08e2a4b183b4f2616550efd5b1fb726b016b4c
Closes-Bug: #1421011