When a router is manually moved from one dvr_snat node to
another, the agent on the originating node should remove the
snat_namespace and the agent on the destination node should
re-create it.
But if the originating agent is dead at that time, the
router_update message only reaches it after it restarts. At that
point the agent should remove the snat_namespace, since it no
longer hosts it.
The agent already has logic to clean up the snat namespace when
gw_port_host does not match the agent's host, but in this
particular case self.snat_namespace is always set to None by the
dvr_edge_router init call when the agent restarts.
This patch fixes the issue by initializing the snat namespace
object during router_init. With a valid snat namespace object in
place, the agent can clean up the namespace when gw_port_host
does not match.
Change-Id: I30524dc77b743429ef70941479c9b6cccb21c23c
Closes-Bug: #1557909
(cherry picked from commit 9dc70ed77e)
Currently the 'force_gateway_on_subnet' option defaults to True,
which requires the gateway to be on the subnet. With this fix
'force_gateway_on_subnet' can be set to False, and a gateway
outside the subnet can be added.
Before adding the default route, a route to the gateway IP is
added. This applies to both external and internal networks.
Change-Id: I3a942cf98d263681802729cf09527f06c80fab2b
Closes-Bug: #1335023
Closes-Bug: #1398768
(cherry picked from commit b6126bc0f1)
Today static routes are added to the SNAT namespace
for DVR routers. But they are not added to the qrouter
namespace.
Also, when the static routes are configured in the SNAT
namespace, the router is not checked for the existence
of a gateway.
When routes are added to a router without a gateway the
routes are only configured in the router namespace, but
when a gateway is set later, those routes have to be
populated in the snat_namespace as well.
This patch addresses the above mentioned issues.
Closes-Bug: #1499785
Closes-Bug: #1499787
Conflicts:
neutron/agent/l3/dvr_edge_router.py
neutron/tests/functional/agent/test_l3_agent.py
neutron/tests/functional/agent/l3/framework.py
neutron/tests/functional/agent/l3/test_dvr_router.py
Change-Id: I37e0d0d723fcc727faa09028045b776957c75a82
(cherry picked from commit 158f9eabe2)
In big and busy clusters there can be periods when the
rabbitmq clustering mechanism synchronizes queues. During
such a period, agents connected to that rabbitmq instance
can't communicate with the server, so the server considers them
dead and moves resources away. After an agent becomes active
again, it needs to clean up state entries and synchronize its
state with neutron-server.
The solution is to make agents aware of their state from
neutron-server's point of view. This is done by changing state
reports from cast to call, so that the agent's status is returned.
When an agent was dead and becomes alive, it receives a special
AGENT_REVIVED status indicating that it should refresh its
local data, which it would not do otherwise.
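
A minimal sketch of the idea, not the actual implementation: the
server compares the time since the last report against a down-time
threshold and returns a status, and the agent resyncs when it sees
AGENT_REVIVED. Apart from AGENT_REVIVED, every name here (the classes,
the other status values, the threshold) is an assumption made for
illustration.

```python
AGENT_NEW = "new"          # hypothetical status value
AGENT_ALIVE = "alive"      # hypothetical status value
AGENT_REVIVED = "revived"  # status named in this commit message


class FakeServer(object):
    """Stand-in for neutron-server's handling of state reports."""

    def __init__(self, agent_down_time):
        self.agent_down_time = agent_down_time
        self.last_seen = {}

    def report_state(self, host, now):
        previous = self.last_seen.get(host)
        self.last_seen[host] = now
        if previous is None:
            return AGENT_NEW
        if now - previous > self.agent_down_time:
            # The server had considered this agent dead.
            return AGENT_REVIVED
        return AGENT_ALIVE


class FakeAgent(object):
    """Stand-in for an agent that reports state with a blocking call."""

    def __init__(self, host, server):
        self.host = host
        self.server = server
        self.full_syncs = 0

    def report(self, now):
        # A call (not a cast): the agent waits for the returned status.
        status = self.server.report_state(self.host, now)
        if status == AGENT_REVIVED:
            self.full_syncs += 1  # refresh local data
        return status


server = FakeServer(agent_down_time=75)
agent = FakeAgent("host-1", server)
statuses = [agent.report(0), agent.report(30), agent.report(200)]
```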
Conflicts:
neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py
neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
neutron/tests/unit/agent/dhcp/test_agent.py
neutron/tests/unit/plugins/ml2/drivers/linuxbridge/agent/test_linuxbridge_neutron_agent.py
neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/test_ovs_neutron_agent.py
Closes-Bug: #1505166
Change-Id: Id28248f4f75821fbacf46e2c44e40f27f59172a9
(cherry picked from commit 3b6bd917e4)
There seems to be a timing issue between the
ARP entries arriving at the agent from the server
and the agent creating the internal qr-device.
ARP entries that arrive before the device exists
cannot be applied and are dropped.
This patch makes sure that the early ARP entries
are cached in the agent and then applied when
the internal device is up.
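
The caching described above could look roughly like the following
sketch. The class, its attributes and its method names are hypothetical
and do not mirror the agent code; only the "cache early, replay when
the device is up" behaviour is taken from this message.

```python
class ArpCacheSketch(object):
    """Hold ARP entries that arrive before the internal device exists."""

    def __init__(self):
        self.devices_up = set()  # subnets whose internal device exists
        self.pending = {}        # subnet_id -> [(ip, mac), ...]
        self.applied = []        # entries actually programmed

    def add_arp_entry(self, subnet_id, ip, mac):
        if subnet_id not in self.devices_up:
            # Device not created yet: cache instead of dropping.
            self.pending.setdefault(subnet_id, []).append((ip, mac))
            return False
        self.applied.append((subnet_id, ip, mac))
        return True

    def internal_device_added(self, subnet_id):
        self.devices_up.add(subnet_id)
        # Replay everything that arrived too early for this subnet.
        for ip, mac in self.pending.pop(subnet_id, []):
            self.applied.append((subnet_id, ip, mac))


cache = ArpCacheSketch()
early = cache.add_arp_entry("subnet-a", "10.0.0.5", "fa:16:3e:00:00:05")
cache.internal_device_added("subnet-a")
late = cache.add_arp_entry("subnet-a", "10.0.0.6", "fa:16:3e:00:00:06")
```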
Closes-Bug: #1501086
Change-Id: I9ec5412f14808de73e8dd86e3d51593946d312a0
(cherry picked from commit d9fb3a66b4)
If the L3 agent fails to configure a router, commit 4957b5b435
changed it so that instead of performing an expensive full sync,
only that router is reconfigured. However, it tries to reconfigure the
cached router. This is a change of behavior from the fullsync
days. The retry is more likely to succeed if the
router is retrieved from the server, instead of using
the locally cached version, in case the user or operator
fixed bad input, or if the router was retrieved in a bad
state due to a server-side race condition.
Note that this is only relevant to full syncs, as those retrieve
routers from the server and queue updates with the router object.
Incremental updates queue up updates without router objects,
so if one of those fails it would always be resynced on a
second attempt.
Related-Bug: #1494682
Change-Id: Id0565e11b3023a639589f2734488029f194e2f9d
(cherry picked from commit 822ad5f06b)
If an exception occurs while processing a router update in the
_process_router_update method, we try to do a full_sync.
We only need to re-sync the router whose update failed.
This also addresses a TODO in the same method along similar lines.
Change-Id: I7c43a508adf46d8524f1cc48b83f1e1c276a2de0
Closes-Bug: #1494682
(cherry picked from commit 4957b5b435)
When enable_metadata_proxy is false, the agent instance does
not have a metadata_driver, so the agent should avoid using it.
Change-Id: Ia18dc5dea23de49b97c8f225532531eb9232fb51
Closes-Bug: #1510399
(cherry picked from commit ce3a31faff)
Given the context, the exception to catch here should be KeyError.
AttributeError cannot be raised here. More details can be found
in the bug report.
Change-Id: Id6351172703ac492e86475f75bf1be03f4e4e8a3
Closes-bug: #1506934
(cherry picked from commit 9d65841200)
Currently the l3 agent skips the status update for floating ips
whose status didn't change. This can be wrong if the status changed
on the server side while the agent was processing. See bug for details.
The l3 agent skips floating ip processing if the ip address already
exists on the external device, so we can still skip the status update
for such floating ips.
Closes-Bug: #1505557
Change-Id: I908fe5a0555f68ab85e7d199c36a903b915e103f
(cherry picked from commit 7592b17b87)
An explicit call to periodic resync after start may lead to
double syncing. See bug for details.
Closes-Bug: #1505282
Change-Id: Ib5e481d579039b2c3e87d4f12cad1241d02fe060
(cherry picked from commit cbc268e839)
For every router interface added to a router
with a default gateway, an internal SNAT port is
generated, and the L3 Agent requires it to
process the SNAT rules.
This bug was introduced by change
Icc099c1a97e3e68eeaf4690bc83167ba30d8099a.
When the gateway is removed these ports have to
be removed from the namespace. These ports are
cached in the router_info and should be provided
to the get_snat_port_for_internal_port function
when it is called from external_gateway_removed or
from _dvr_internal_network_removed.
This patch fixes this problem.
Closes-Bug: #1496578
Change-Id: Id5af4774ba246e24f343f5623af5ea9143bd5f6b
The if statement for calling create_rtr_2_fip_link and kicking
the FW agent includes a check on floating_ips that has already
been performed by the previous if block. Pull this block into
the previous block for code clarity.
Change-Id: I8661aa3998bda9341f558d0ecbc8e2663cd95aca
Signed-off-by: Ryan Moats <rmoats@us.ibm.com>
Co-Authored-By: Brian Haley <brian.haley@hpe.com>
I was just going over this class trying to understand what methods
really are used outside of the class. I found that these two are not.
I thought I'd submit a quick patch to mark them "private".
Change-Id: Id91907996631b670e23a506e0a1feae4518e42ba
This patch adds the common framework to be used by specific
implementations of the DHCPv6 protocol for Prefix Delegation.
It also includes a reference implementation based on the Dibbler
DHCPv6 client. Dibbler version 1.0.1 or greater is required.
Sanity tests are included to verify the installed version.
A patch for admin/user documentation is up for review here:
https://review.openstack.org/#/c/178739
Video guides for configuring and using this feature are available on
YouTube:
https://www.youtube.com/watch?v=wI830s881HQ
https://www.youtube.com/watch?v=zfsFyS01Fn0
Co-Authored-By: Baodong (Robert) Li <baoli@cisco.com>
Co-Authored-By: Sam Betts <sam@code-smash.net>
Change-Id: Id94acbbe96c717f68f318b2d715dd9cb9cc7fe4f
Implements: blueprint ipv6-prefix-delegation
The patch makes the L3 agent aware of possible SNAT role
rescheduling to/from it.
The gist is to detect a change of the gw_port host.
If it changed and the agent is not on the target host, the
agent needs to clear the snat namespace if one exists. If the
agent is on the target host, it needs to create the snat
namespace from scratch if it doesn't exist.
The host field was excluded from the gw_port comparison on
the agent side as part of the HA Router feature implementation.
This code was moved to the corresponding module.
Closes-Bug: #1472205
Change-Id: I840bded9eb547df014c6fb2b4cbfe4a876b9b878
A centralized router can add routes, but a distributed router
cannot: the neutron router-update operation fails silently. This
is because on a distributed router commands need to be run in the
snat-* namespace, and not the qrouter-* namespace as on a
centralized router.
Change-Id: I517effcfc299c67c3413f7dc3352b97515ff69db
Closes-Bug: #1405910
Co-Authored-By: Ryan Moats <rmoats@us.ibm.com>
There is nothing Linux or agent specific in the function. I need to use
it outside agent code in one of the dependent patches, hence moving it
to a better location while leaving the previous symbol in place, with a
deprecation warning, for backwards compatibility.
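
A minimal sketch of leaving a deprecated alias behind after a move,
using only the standard library's warnings module. This is an
assumption about the general pattern, not the mechanism used in the
patch; new_helper, old_helper and the module names in the warning text
are hypothetical.

```python
import functools
import warnings


def new_helper(value):
    """Stand-in for the function at its new, generic location."""
    return value * 2


def _deprecated_alias(func, old_name, new_name):
    """Wrap func so that calls through the old symbol emit a warning."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        warnings.warn(
            "%s is deprecated; use %s instead" % (old_name, new_name),
            DeprecationWarning,
            stacklevel=2,
        )
        return func(*args, **kwargs)
    return wrapper


# The old symbol stays importable but warns when called.
old_helper = _deprecated_alias(
    new_helper, "old_module.helper", "new_module.helper")
```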
Change-Id: I252356a72f3c742e57c1b6127275030f0994a221
Since a packet can only have one mark, and we will need to mark a
packet for multiple purposes, we need to use a coordinated bitmask for
the two cases of simple marking that we currently do in Neutron
leaving the other bits for address scopes.
DocImpact
Change-Id: Id0517758d06e036a36dc8b8772e41af55d986b4e
Partially-Implements: blueprint address-scopes
map() returns an iterator in python 3. Where a list is expected,
we wrap map() in a list call.
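
For example (port_ids and the sample data are hypothetical, only the
map()/list() behaviour is the point):

```python
def port_ids(ports):
    """Return port IDs as a real list on both Python 2 and Python 3."""
    return list(map(lambda port: port["id"], ports))


ports = [{"id": "a"}, {"id": "b"}]
ids = port_ids(ports)

# Without the list() call, Python 3 gives a lazy iterator that has no
# len() and cannot be indexed.
lazy = map(str.upper, ids)
```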
Change-Id: I623d854c410176c8ec43b732dc8f4e087dadefd9
Blueprint: neutron-python3
This indirection seems complicated to me. I don't know the history
behind it but it made some of the address scope work more difficult
than I think it needs to be.
Change-Id: I1e716135b542c09ec852f6ab7af5153a65803ba3
Partially-Implements: blueprint address-scopes
The one thing that I see that the two dvr classes have in common is
the ability to map internal ports to snat ports. The dvr local router
needs it to set up a redirect to the central part. The central part
needs it to create the port for the internal network.
This change renames the mapping method to something more logical and
removes snat_ports as an argument to two methods because it is a quick
O(1) operation to get it from the router dict and passing it around
just tangles things up.
Change-Id: Icc099c1a97e3e68eeaf4690bc83167ba30d8099a
oslo_utils raises ImportError if an import fails. We should propagate
other failures to callers. Otherwise we may hide issues.
Also report the exact failure from import_object in case the L3 agent
fails to import interface_driver.
As part of the job, consolidated the code that loads the interface
driver into a common function.
Also, stopped checking for specific log messages in dhcp and l3 agent
unit tests: it's too fragile and actually not something we need a unit
test for.
To avoid introducing more work for people who handle the py3 porting
effort, added the unit test to the list of those that are executed for
the py34 job until the whole suite is ready for python3.
Change-Id: I10cdb8414c9fb4ad5cfd3f3b2630811f50ffb0c7
A few methods were left in the wrong class when splitting up the dvr
classes. This commit reduces the amount of dependency between the
two.
Change-Id: Id1b4f4e99a5c51576eddadd5eb0c973c0d5b46b8
Future work will extend init_l3 with more code specific to router
ports. It makes sense to separate these out into one basic method
for basic L3 setup and another for router port specific logic.
Change-Id: Iec9a46cd0490c4f48bb306083711ff0c5e70ba87
Partially-Implements: blueprint address-scopes
oslo.service has graduated, so neutron should consume it.
Closes-Bug: #1466851
Depends-On: Ie0fd63f969f954029c3c3cf31337fbe38f59331a
Depends-On: I2093b37d411df9a26958fa50ff523c258bbe06ec
Depends-On: I4823d344878fc97e66ddd8fdae25c13a34dede40
Change-Id: I0155b3d8b72f6d031bf6f855488f80acebfc25d4
setup_test_registry_instance() in the base test case class gives
each test its own registry by mocking out the get_callback_manager.
The L3 agent test cases were duplicating this.
Partial-Bug: #1468998
Change-Id: I7356daa846524611e9f92365939e8ad15d1e1cd8
test_spawn_radvd called mock.patch on ensure_dirs after the
setup method had already patched it out. This causes issues when
mock.patch.stopall() is called because the mocks are stored
as a set and are unwound in a non-deterministic order.[1]
So some of the time they are undone correctly, but at other
times a monkey-patched mock is left in place, causing the
ensure_dir test to fail.
1. http://bugs.python.org/issue21239
Closes-Bug: #1467908
Change-Id: I321b5fed71dc73bd19b5099311c6f43640726cd4
DVR depends on the portbinding host to determine
where to start the FloatingIP Namespace when a floatingip
is configured. But when we assign a floatingip to a port
that is not bound, even though the API will succeed, the
FloatingIP Namespace will not be created by the Agent and
so the FloatingIP will not be functional.
This patch addresses the issue by creating the Namespace
and configuring the rules when the late binding happens.
The agent requests the FIP agent gateway port if required,
and then proceeds to configure the FloatingIP
Namespace.
Change-Id: I9b9158bddb626c2bb535acd709452560546fd184
Closes-Bug: #1447034
Closes-Bug: #1460408
Currently the same dvr router class is used both by the L3 Agent
on the compute nodes, where it is responsible for the virtual router
namespace and the fip namespace, and by the centralized
SNAT L3 Agent on the network node.
This is the first step to decompose the two into different
classes.
The above means that we have one class of DVR router which is used
for two jobs (the virtual router namespace wiring and the fips wiring
on the compute node on one hand and the centralized snat wiring on the
other).
The end goal is to separate the two into different classes,
which will make them easier to maintain and also help projects that
want to use one but not the other (for example only use the centralized
SNAT behaviour with their own DVR implementation).
Change-Id: I581a097b9e7c49f20d0eb0e4ca66a25e90d9511b
Partial-Bug: #1458541
Partially-Implements: blueprint dvr-router-code-decompose
If a router is deleted during l3 agent resync,
the "deleted" event is processed with higher priority, and then
the resync event for the router may be processed, which recreates
the already deleted router.
This happens because the timestamp is not properly updated for a
deleted router in the router processor.
The fix adds a timestamp update for the deleted router.
The functional test will be updated in a follow-up patch.
Logging was improved to make debugging a bit easier.
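
The effect of the timestamp update can be sketched as follows. The
class and its attributes are hypothetical and much simpler than the
real router processor; the point is only that recording the timestamp
on delete lets an older resync event be recognized as stale.

```python
class RouterProcessorSketch(object):
    """Skip events that are older than the last one processed."""

    def __init__(self):
        self.last_processed = {}  # router_id -> timestamp of last event
        self.routers = set()

    def handle(self, router_id, action, timestamp):
        newest = self.last_processed.get(router_id, float("-inf"))
        if timestamp < newest:
            return False  # stale event, ignore it
        # Record the timestamp for deletes as well as updates.
        self.last_processed[router_id] = timestamp
        if action == "delete":
            self.routers.discard(router_id)
        else:
            self.routers.add(router_id)
        return True


proc = RouterProcessorSketch()
proc.handle("r1", "update", 0)
# The delete (t=2) is processed before the queued resync (t=1).
deleted = proc.handle("r1", "delete", 2)
resynced = proc.handle("r1", "update", 1)
```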
Closes-Bug: #1455439
Change-Id: I2d060064acccc10591a3d90be9011f116548cfce