When the L3 agent runs in dvr_snat mode on a compute node, as it
does in some of the gate jobs, a router may be scheduled in standby
mode on that compute node while an instance on the same node is
connected to it.
In that case the metadata proxy needs to be spawned in the router
namespace even though the router is in standby mode.
Conflicts:
neutron/tests/unit/agent/l3/test_agent.py
Change-Id: Id646ab2c184c7a1d5ac38286a0162dd37d72df6e
Closes-Bug: #1817956
Closes-Bug: #1606741
(cherry picked from commit 6ae228cc2e)
Centralized floating IPs need to be passed as preserve_ips to
_external_gateway_added during DVR router update.
Otherwise the IP addresses will be deleted from the gw device in
certain cases: specifically, when a router with active centralized
floating IPs is being scheduled to a new dvr_snat L3 agent
(rescheduled from a down one).
Please see corresponding traces in the bug description.
Change-Id: Iaeb9fbed73144df6fcd9092c665ed19986e85f4d
Closes-bug: #1817306
(cherry picked from commit 1ee18775a9)
With DVR routers, if a port is associated with a FloatingIP,
before it is used by a VM, the FloatingIP will be initially
started at the Network Node SNAT Namespace, since the port
is not bound to any host.
Then when the port is attached to a VM, the port gets its
host binding, and then the FloatingIP setup should be migrated
to the Compute host and the original FloatingIP in the Network
Node SNAT Namespace should be cleared.
But the original FloatingIP setup in SNAT Namespace was not
cleared by the agent.
This patch addresses the issue.
Change-Id: I55a16bcc0020087aa1abe76f5bc85cd64ccdaecd
Closes-Bug: #1796491
(cherry picked from commit cd0cc47a6a)
When 2 dvr routers are connected to each other through a
tenant network, those routers always need to be deployed
on the same compute nodes.
So this patch changes the dvr router scheduler so that it creates
a dvr router on each host on which there are vms or other dvr routers
connected to the same subnets.
Co-Authored-By: Swaminathan Vasudevan <SVasudevan@suse.com>
Closes-Bug: #1786272
Conflicts:
neutron/agent/l3/agent.py
neutron/db/l3_dvr_db.py
neutron/tests/unit/agent/l3/test_agent.py
Change-Id: I579c2522f8aed2b4388afacba34d9ffdc26708e3
(cherry picked from commit 5018d70241)
(cherry picked from commit b127433f38)
For L3 DVR HA routers, the centralized floating IP nat rules are not
installed in the snat namespace on every HA node. So, install the rules
in the router's snat namespace on every scheduled HA router host.
Conflicts:
neutron/tests/common/l3_test_common.py
neutron/tests/functional/agent/l3/test_dvr_router.py
Conflicts:
neutron/tests/common/l3_test_common.py
Closes-Bug: #1793527
Change-Id: I08132510b3ed374a3f85146498f3624a103873d7
(cherry picked from commit ee7660f593)
(cherry picked from commit 2a1cdf01b5)
(cherry picked from commit b93ef2f7e8)
Move the iptables metadata marking rule earlier in
router init, that way any stray metadata requests
that arrive before the filter metadata redirect rule is
installed will just be dropped. We do this regardless
of whether we will be running the metadata proxy.
Partial-bug: #1735724
Change-Id: I8982523dbb94a7c5b8a4db88a196fabc4dd2873f
(cherry picked from commit 6941977827)
radvd needs to run as root, but it can drop privileges on
Linux hosts. Currently the radvd process does not use this feature,
which can be considered a serious risk.
In addition, on some distributions such as SUSE the radvd process runs
as a non-privileged user by default. radvd then fails to daemonize
because it can't write the pid file in the corresponding neutron folder,
and IPv6 functionality breaks.
This patch allows the radvd process to run as the same user that
neutron uses. To allow this, it changes the radvd config file
permissions to 444, because radvd doesn't accept a config file that is
writable by self/group. The read-only mode is not a problem when updating
the file because of the way the neutron_lib replace_file function handles
file operations.
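Why a 444 file can still be updated is easiest to see with a write-then-rename pattern. The following is a stand-alone sketch of that pattern, not the neutron_lib code itself; the function name replace_file_sketch and its signature are hypothetical.

```python
import os
import tempfile


def replace_file_sketch(path, data, mode=0o444):
    """Atomically replace `path` with `data`, leaving it with `mode`."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory so the final
    # rename stays on one filesystem and is atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(data)
        os.chmod(tmp_path, mode)
        # The rename needs write permission on the directory, not on
        # the old file, so a read-only target is replaced without
        # ever being opened for writing.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Calling it twice on the same path overwrites the read-only file, because the old file is never opened for writing.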
Closes-Bug: #1777922
Change-Id: Ic5d976ba71a966a537d1f31888f82997a7ccb0de
Signed-off-by: aojeagarcia <aojeagarcia@suse.com>
(cherry picked from commit 9f2b40f2ce)
l3-agent checks the HA state of routers when a router is updated.
To ensure that the HA state is only checked on HA routers the following
check is performed: `if router.get('ha') and not is_dvr_only_agent`.
This check should ensure that the check is only performed on
DvrEdgeHaRouter and HaRouter objects.
Unfortunately, there are cases where we have DvrEdgeRouter objects
running on 'dvr_snat' agents. E.g. when deploying a loadbalancer with
neutron-lbaas in a landscape with 6 network nodes and
max_l3_agents_per_router set to 3, it may happen that the loadbalancer
is placed on a network node that does not have a DvrEdgeHaRouter running
on it. In such a case, neutron will deploy a DvrEdgeRouter object on the
network node to serve the loadbalancer, just like it would deploy a
DvrEdgeRouter on a compute node when deploying a VM.
Under such circumstances each update to the router will lead to an
AttributeError, because the DvrEdgeRouter object does not have the
ha_state attribute.
This patch circumvents the issue by doing an additional check on the
router object to ensure that it actually has the ha_state attribute.
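A minimal sketch of the extended check, assuming it is a plain attribute test. The classes below are empty stand-ins for the real router classes, and should_check_ha_state is a hypothetical helper name.

```python
class DvrEdgeRouter:
    """Stand-in for a router class without HA support."""


class DvrEdgeHaRouter(DvrEdgeRouter):
    """Stand-in for an HA-capable router class."""

    def __init__(self):
        self.ha_state = "standby"


def should_check_ha_state(router, ri, is_dvr_only_agent):
    """Return True only when the router object really carries ha_state."""
    return bool(router.get("ha")
                and not is_dvr_only_agent
                and hasattr(ri, "ha_state"))
```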
Closes-Bug: #1755243
Change-Id: I755990324db445efd0ee0b8a9db1f4d7bfb58e26
(cherry picked from commit 8c2dae659a)
When an HA router initialization fails early, it can lead to:
AttributeError: 'HaRouter' object has no attribute 'process_monitor'
Add init of 'self.process_monitor' in RouterInfo init code in
case we try and cleanup early.
Change-Id: Iddeaeef13adee10f7b130e3f9e584b6e9f037030
Closes-bug: #1735557
(cherry picked from commit c62d54d0c2)
Before this change, DVR_SNAT agents would get no routers when
asking for updates due to provisioning of DHCP ports on the
node they are running on. This means that there's no connectivity
between the DHCP port and the network gateway (that may be
hosted on a different node), and therefore things like DNS may
break when a VM attempts resolution when talking to the affected
DHCP port.
This change relaxes a conditional that prevented the right list of
routers from being compiled and returned from the server to the agent.
The agent on the other hand needs to make sure to allocate the
right type of router based on what is being returned from the server.
Closes-bug: #1733987
Change-Id: I6124738c3324e0cc3f7998e3a541ff7547f2a8a7
(cherry picked from commit b24013f569)
As soon as we call router_info.initialize(), we could
possibly try and process a router. If it is HA, and
we have not fully initialized the HA port or keepalived
manager, we could trigger an exception.
Move the call to check_ha_state_for_router() into the
update notification code so it's done after the router
has been created. Updated the functional tests for this
since the unit tests are now invalid.
Also added a retry counter to the RouterUpdate object so
the l3-agent code will stop re-enqueuing the same update
in an infinite loop. We will delete the router if the
limit is reached.
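The bounded re-enqueue can be sketched as follows. This is an illustration of the idea only: the attribute name tries, the helper process_updates, and the default limit are assumptions, not the actual agent code.

```python
from collections import deque

DEFAULT_TRIES = 5  # hypothetical limit, not taken from the patch


class RouterUpdate:
    """Queue entry that carries its remaining number of attempts."""

    def __init__(self, router_id, tries=DEFAULT_TRIES):
        self.router_id = router_id
        self.tries = tries

    def hit_retry_limit(self):
        return self.tries <= 0


def process_updates(queue, process, delete_router):
    """Drain the queue, retrying failed updates a bounded number of times."""
    while queue:
        update = queue.popleft()
        try:
            process(update)
        except Exception:
            update.tries -= 1
            if update.hit_retry_limit():
                # Give up: remove the router instead of looping forever.
                delete_router(update.router_id)
            else:
                queue.append(update)
```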
Finally, have the L3 HA code verify that ha_port and
keepalived_manager objects are valid during deletion since
there is no need to do additional work if they are not.
Change-Id: Iae65305cbc04b7af482032ddf06b6f2162a9c862
Closes-bug: #1726370
(cherry picked from commit d2b909f533)
Since ri.ex_gw_port can be None, the l3-agent can throw an
exception when looking for ports it might have in a given
network.
(cherry picked from 2cea213d94)
Change-Id: I3ab3e9c012022cd7eefa5c609ca9540649079ad3
Closes-bug: #1724043
With the current change in allowing the unbound fip
to be associated with the snat node, we are seeing
that all floating IPs that are associated with an
unbound port are created at the snat node.
This is also applicable for floating IPs that are
created just before associating the port to a VM.
We have seen such scenarios in the test cases.
This is the right behavior as per design. But when
the port is bound to a host, the floating IP should
be migrated to the respective host.
This patch fixes the issue by sending a notification to
the respective node when the port is bound, and by
clearing the fip from the snat node.
Closes-Bug: #1718788
Change-Id: I6b1f3ffc3c3336035632f6a82d3a87b3be57b403
(cherry picked from commit 27fcf86bcb)
The agent is not currently checking for the host bound
before configuring the floatingip. That leads to
floatingips being configured on multiple hosts.
This is a partial fix on the agent side to prevent
configuring a floatingip that is not bound to
this host.
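The host check amounts to a filter like the one below. It is a sketch that assumes each floating IP is a dict with an optional 'host' key; the function name is hypothetical.

```python
def floating_ips_for_host(floating_ips, agent_host):
    """Keep only the floating IPs bound to the host this agent runs on."""
    # .get() is used so that entries without a 'host' key are skipped
    # instead of raising KeyError.
    return [fip for fip in floating_ips if fip.get("host") == agent_host]
```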
Related-Bug: #1712412
Related-Bug: #1713927
Change-Id: I1bc8c42425f97234f56412a2f109a996d9f896de
(cherry picked from commit afd1995d91)
get_router_cidrs is overridden in dvr_edge_router. This function
is not returning the centralized_floating_ip cidrs when called.
This patch fixes the issue.
Change-Id: I95a2e3b1b1474aba15a7cdceeb4ed951b672e2ca
Closes-Bug: #1712728
(cherry picked from commit 8479741c30)
_get_floatingips_bound_to_host function was introduced
recently in dvr_local_router to retrieve the external
interface name for centralizing the floatingip.
This function was throwing a 'KeyError' on fip['host'] and
is no longer required for centralized floatingips.
The get_external_device_interface_name in dvr_local_router
will try to get the 'fg' interface that is required for
the bound floating-ips to clear up some of the rules.
In the case of the centralized unbound floating-ips, the
'qg' external interface is retrieved from
get_snat_external_device_interface_name, which is defined
in 'dvr_edge_router' and is based on the namespace.
So _get_floatingips_bound_to_host can be removed from
get_external_device_interface_name.
Closes-Bug: 1712412
Change-Id: I94c0a071df32f572745a2c29942956c3da9f309b
(cherry picked from commit 47fbc6157a)
This patch makes the L3 agent update its ports' MTU when it is changed
on the core plugin side.
Related-Bug: #1671634
Change-Id: I4444da6358e8b8420a3a365e1107b02f5bb1161d
(cherry picked from commit cc69828ff0)
This patch is the agent side patch that takes care of configuring
the centralized floatingips for the unbound ports in the snat_namespace.
Change-Id: I595ce4d6520adfd57bacbdf20ed03ffefd0b190a
Closes-Bug: #1583694
When using dual stack, the IPv6 router interface responds
to ARP requests that only the IPv4 interface should.
This results in ARP flux and can cause a guest to address
packets to the wrong layer-2 address when sending traffic
to the IPv4 gateway.
Change arp_ignore and arp_announce sysctl options on interfaces
in the router namespace to be more strict in how we respond.
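The sysctl changes could look roughly like this. The text above does not name the values; arp_ignore=1 (reply only when the target IP is configured on the receiving interface) and arp_announce=2 (always use the best local source address) are assumed here as the stricter settings, and the helper name is hypothetical.

```python
def strict_arp_sysctl_args(device):
    """Build sysctl commands that tighten ARP behaviour on one device."""
    return [
        # Assumed value: only answer ARP for addresses on this interface.
        ["sysctl", "-w", "net.ipv4.conf.%s.arp_ignore=1" % device],
        # Assumed value: announce only the best matching local address.
        ["sysctl", "-w", "net.ipv4.conf.%s.arp_announce=2" % device],
    ]
```

The commands would have to be executed inside the router namespace for each interface.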
Closes-bug: 1692007
Change-Id: Ic3c2370995abb027a3412b473ce6bc63790c1105
The well known service type constants are in
neutron_lib.plugins.constants, but for legacy reasons a few still exist
and are referenced from neutron_lib.constants that we'd like to remove.
This patch switches references over to neutron_lib's plugin constants.
Change-Id: I1861448cec303725b30cef8f42029f467f9e03a3
When we create agent gateway port on all the nodes irrespective
of the floatingips we can basically use that agent gateway port to
forward traffic in and out of the nodes if the address_scopes match,
since we don't need SNAT functionality if address scopes match.
If a gateway is configured and if it has internal ports that belong
to the same address_scopes then there is no need to add the redirect rules.
At the same time we should also add a static route in the fip namespace
for every interface that is connected to the router that belongs to
the same address scope.
Change-Id: I617e2fc5a70852c6f2e925ac7244f2a205d60de4
Closes-Bug: #1577488
This reverts commit fb2093c365.
This patch started spamming logstash like crazy with ERRORs.
Closes-Bug: #1693539
Change-Id: I81627f1bac1b981f930b66c126abd8285653bf49
Trying to check HA state on a DVR-only compute node
can trigger:
AttributeError: 'DvrLocalRouter' object has no attribute 'ha_state'
Also moved the mode assignment outside of the loops
since it only needs to be done once.
Co-Authored-By: Sean Redmond <sean.redmond1@gmail.com>
Closes-bug: #1691427
Change-Id: I3e48e06e76325939fbc9533b0198924bc96d600e
When we create agent gateway port on all the nodes irrespective
of the floatingips we can basically use that agent gateway port to
forward traffic in and out of the nodes if the address_scopes match,
since we don't need SNAT functionality if address scopes match.
If a gateway is configured and if it has internal ports that belong
to the same address_scopes then there is no need to add the redirect rules.
At the same time we should also add a static route in the fip namespace
for every interface that is connected to the router that belongs to
the same address scope.
Change-Id: Iaf6d3b38b1fb45772cf0b88706586c057ddb0230
Closes-Bug: #1577488
Without this commit, the run_as_root parameter is always True when
stopping a process, which leads to the usage of unnecessary sudo such as
in some functional tests, like the keepalived ones.
This commit fixes the aforementioned problem by taking run_as_root into
account when stopping a process. However, run_as_root will still always
be True if the process is spawned in a netns.
Closes-Bug: #1491581
Change-Id: Ib40e1e3357b9a38e760f4e552bf615cdfd54ee5a
Signed-off-by: Hunt Xu <mhuntxu@gmail.com>
According to https://wiki.openstack.org/wiki/Python3,
we should now avoid using six.iteritems and replace
it with dict.items.
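A small before/after illustration of the replacement; the dictionary contents are made up for the example.

```python
ports = {"qr-1": 1500, "qg-1": 1450}

# Before: for name, mtu in six.iteritems(ports): ...
# After: plain dict.items(), which works on Python 2 and Python 3.
summary = sorted("%s=%d" % (name, mtu) for name, mtu in ports.items())
```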
Change-Id: I8753e80b34c0f86cf70aebc3bcbd3392ee933f62
Partial-Bug: #1680761
In order to route traffic between the internal subnets and the
external subnet that belong to the same address_scopes we need
to create the gateway port and the fip namespace irrespective of
the configured floatingips for the internal subnet.
This will consume an additional IP from the external subnet on
all nodes, but with the introduction of service_type networks,
this will not be an issue any more.
This patch is the first in series that creates the agent gateway
port and the fip namespace on every node when the gateway is set
for the router. For every router created it will connect the
router namespace to the fip namespace.
Partial-Bug: #1577488
DocImpact: Document the change in behavior for fip-agent-gw create
Change-Id: I30c4f7fc250e486fe9a71b68540e783e90a6cf15
Since [1], when the l3 agent does fullsync, for every router, it calls
ensure_snat_cleanup depending on whether the agent is dvr_snat or not.
However, DVR+HA routers always have snat namespaces on dvr_snat agents
holding themselves for keepalived. Therefore, the cleanup call is
unexpected and will cause a series of issues.
This patch ensures that snat namespaces of DVR+HA routers will not be
cleaned up when the agent does a fullsync.
[1] https://review.openstack.org/#/c/326729/
Change-Id: I5df0a1404f1a80ab0b226d7a60c2885e24247e02
Closes-Bug: #1632540
Neutron-lib 1.1.0 is now out and contains the portbindings
API definition (as per commit [1]). This patch moves neutron
references over to the neutron-lib version.
NeutronLibImpact
- Consumers using the public constants within neutron's
portbindings API extension must now use the values
from neutron-lib.
[1] 87e42f993c07ae320159d5123662ee9f3bd4d903
Change-Id: I669af9b4c712877772d91a03857ab108714001d4
When router_info initialize() fails (with a trace), some resources
(like the keepalived process) may not be created. While handling this
exception, the l3 agent calls _process_updated_router instead of
calling _process_added_router again, which also fails when trying to
access resources that were not created.
With this change, the agent registers the new router_info (i.e.
self.router_info[router_id] = ri) only when initialize() succeeds.
When initialize() fails, router_info is not part of the agent, so
"_process_router_if_compatible" will call initialize() again.
We also clean up router_info when initialize() fails.
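The register-only-on-success flow can be sketched like this. The Agent class, the router_factory argument, and the process_router method are simplified, hypothetical stand-ins for the real agent code.

```python
class Agent:
    """Toy agent that tracks routers only after a successful initialize()."""

    def __init__(self, router_factory):
        self.router_info = {}
        self._router_factory = router_factory

    def _router_added(self, router_id):
        ri = self._router_factory(router_id)
        try:
            ri.initialize()
        except Exception:
            # Clean up partial state and leave the router unregistered,
            # so the next attempt takes the "added" path again.
            ri.delete()
            raise
        self.router_info[router_id] = ri

    def process_router(self, router_id):
        if router_id not in self.router_info:
            self._router_added(router_id)
            return "added"
        return "updated"
```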
Closes-bug: #1662804
Change-Id: I278ac83de57713c93d6e50846d79034d774c5d47
Neutron does not disable ipv6 forwarding for HA routers and it's
enabled by default in all router namespaces. For ipv6, this means
that it will automatically join the following groups:
* link-local all-routers multicast group (ff02::2)
* interface-local all routers multicast group (ff01::2)
* site-local all routers multicast group (ff05::2)
As a side effect it will answer to multicast listener queries, thus
causing external switch to learn its MAC address and disrupting traffic
to the master instance.
This patch will enable ipv6 forwarding on the gateway interface only
for master instances and disable it otherwise to fix the issue.
Also, the accept_ra procfs entry was enabled under certain
circumstances but it wasn't disabled otherwise. This patch will
disable RA on the gateway interface for non master instances.
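As a sketch, the per-state settings for the gateway interface could be computed as below. The function name and the "master" state string are assumptions; the master case leaves accept_ra untouched because the text only says it is enabled under certain circumstances.

```python
def gateway_ipv6_sysctls(gw_device, ha_state):
    """Return the procfs settings for the gateway device in a given HA state."""
    is_master = ha_state == "master"
    settings = {
        # Forward IPv6 only on the master instance.
        "net.ipv6.conf.%s.forwarding" % gw_device: 1 if is_master else 0,
    }
    if not is_master:
        # Non-master instances must not accept router advertisements.
        settings["net.ipv6.conf.%s.accept_ra" % gw_device] = 0
    return settings
```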
Closes-Bug: #1667756
Change-Id: I9bc890b43f750cad68fc67f4c79f1426c3506863
Refactoring Neutron configuration options for agent common config to be
in neutron/conf/agent/common. This will allow centralization of all
configuration options and provide an easy way to import.
Partial-Bug: #1563069
Change-Id: Iebac0cdd3bcfd0135349128921b7ad7a1a939ab8
Needed-By: Ib676003bbe909b5a9013a3178b12dbe291d936af
The following enhancements are added:
-- PD keeps track of status of neutron routers: active or
standalone (master), or standby (not master),
-- PD DHCP clients are only spawned in the active router. In the
standby router, PD keeps track of the assigned prefixes, but
doesn't spawn DHCP clients.
-- When switchover occurs, on the router becoming standby, PD
clients are "killed" so that they don't send prefix withdrawals
to the DHCP server. On the router becoming active, PD spawns DHCP
clients with the assigned prefixes configured as hints in the
DHCP client's configuration
Closes-Bug: #1651465
Change-Id: I17df98128c7a88e72e31251687f30f569df6b860
For IPv6, the csnat port list could have multiple
subnets contained in it, but we were only ever
looking at the one associated with the first fixed
IP when trying to match an internal port. Change
to check all subnets on all port combinations
(internal and csnat) before giving up.
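The broader match can be sketched as a subnet intersection. The port dicts are assumed to follow the usual 'fixed_ips' / 'subnet_id' layout, and find_csnat_port is a hypothetical name.

```python
def find_csnat_port(internal_port, csnat_ports):
    """Return the first csnat port sharing any subnet with internal_port."""
    internal_subnets = {ip["subnet_id"] for ip in internal_port["fixed_ips"]}
    for port in csnat_ports:
        # Compare every fixed IP, not just the first one.
        if any(ip["subnet_id"] in internal_subnets
               for ip in port["fixed_ips"]):
            return port
    return None
```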
Change-Id: I9c0ac933c08734a3f6738a233fdf6021ce9bd375
Closes-bug: #1624515
Unlike Legacy routers, DVR Edge Routers have their gateway
interfaces in the SNAT namespace as opposed to the router
namespace. Added a new method to the router_info class,
get_gw_ns_name(), so callers can determine which namespace
the gateway device lives in. This can then be over-ridden
in the DVR Edge router class.
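The override pattern looks roughly like this. The classes are reduced stand-ins, and the 'qrouter-' and 'snat-' prefixes are used only to make the two namespaces distinguishable in the example.

```python
class RouterInfo:
    """Base router: the gateway device lives in the router namespace."""

    def __init__(self, router_id):
        self.router_id = router_id
        self.ns_name = "qrouter-" + router_id

    def get_gw_ns_name(self):
        return self.ns_name


class DvrEdgeRouter(RouterInfo):
    """DVR edge router: the gateway device lives in the SNAT namespace."""

    def __init__(self, router_id):
        super().__init__(router_id)
        self.snat_ns_name = "snat-" + router_id

    def get_gw_ns_name(self):
        return self.snat_ns_name
```

Callers can then ask any router object for get_gw_ns_name() without knowing which class they hold.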
The Prefix Delegation code will also now listen on the
update_router event from the l3-agent and reset the namespace
name if it changes.
Closes-Bug: #1541406
Change-Id: If6ada5027d0483fac7fc3ff935fee1edfc6e2759
Make sure the correct iptables rule is added when the router gets
an interface on a PD-enabled subnet. This will allow traffic on PD
subnets to reach the external network.
Includes a unit test for the new function, and modifies an
existing test to verify the adding and removal of the rule.
Change-Id: I42f8f42995e9809e5bda2b29726f7244c052ca1c
Closes-Bug: #1570122