Updating a port may take an excessive number of seconds to
complete if DVR routers are running on more than 100
compute nodes. This patch saves some time by removing
unnecessary calls from the loop over hosts.
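The kind of change described above can be sketched as follows. This is
only an illustration of moving host-independent work out of a per-host
loop; fetch_router_info and notify_host are hypothetical callables, not
Neutron functions, and the real patch may differ.

```python
def update_hosts_slow(hosts, router_id, fetch_router_info, notify_host):
    # Repeats the same (hypothetical) lookup once per host.
    for host in hosts:
        info = fetch_router_info(router_id)
        notify_host(host, info)


def update_hosts_fast(hosts, router_id, fetch_router_info, notify_host):
    # The lookup does not depend on the host, so it is done only once.
    info = fetch_router_info(router_id)
    for host in hosts:
        notify_host(host, info)
```

With more than 100 hosts, the second form makes one lookup instead of
one per host.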
Change-Id: Ide740e0c5c43c2d2b842460a37c8ce125da12b28
Closes-Bug: #1830456
(cherry picked from commit 00eb6f26f6)
We have a problem with SNAT when too many connections use the
same source and destination on the network nodes.
In addition, we can see in the conntrack table that the
"insert_failed" counter increases.
This might be a generic problem with conntrack and Linux.
We suspect that we are hitting a limitation or bug
in the kernel.
There seems to be a workaround that alleviates this behavior:
setting the --random-fully flag in iptables for port consumption.
This patch fixes the problem by adding --random-fully to
the SNAT rules.
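A minimal sketch of what such a rule looks like, assuming the rule is
assembled as a list of iptables arguments. build_snat_rule, the
interface name and the address are hypothetical placeholders, not the
code in iptables_manager.py.

```python
def build_snat_rule(out_interface, source_ip, random_fully=True):
    """Return the iptables arguments for a SNAT rule as a list."""
    rule = ["-o", out_interface, "-j", "SNAT", "--to-source", source_ip]
    if random_fully:
        # Only valid when the installed iptables supports this flag.
        rule.append("--random-fully")
    return rule


print(" ".join(build_snat_rule("qg-example", "203.0.113.10")))
# -o qg-example -j SNAT --to-source 203.0.113.10 --random-fully
```

Keeping the flag optional leaves room for hosts whose iptables does
not support --random-fully.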
Conflicts:
neutron/agent/linux/iptables_manager.py
neutron/common/constants.py
neutron/tests/unit/agent/l3/test_agent.py
Change-Id: I246c1f56df889bad9c7e140b56c3614124d80a19
Closes-Bug: #1814002
(cherry picked from commit 30f35e08f9)
A subnet may be connected to a DVR router using an IP address
different from the subnet's gateway_ip.
So in br-tun, ARP to the DVR router's port should be dropped instead of
ARP to the subnet's gateway_ip (or MAC in case of IPv6).
Conflicts:
neutron/tests/unit/plugins/ml2/drivers/openvswitch/agent/test_ovs_neutron_agent.py
Change-Id: Ida6b7ae53f3fc76f54e389c5f7131b5a66f533ce
Closes-bug: #1831575
(cherry picked from commit ae3aa28f5a)
This patch adds a check to determine if the 'segments' service plugin is
enabled. The segment-host mapping db table should only be saved and
updated if the user has configured the 'segments' service
plugin in the config file. The data should be available only in a routed
network resource situation.
Conflicts:
neutron/tests/unit/extensions/test_segment.py
NOTE(s10): conflict is due to If6792d121e7b8e1ab4c7a548982a42e69023da2b
not being in Queens
Change-Id: I65a42aa2129bef696906a18d82575461dc02ba21
Closes-Bug: #1799328
(cherry picked from commit 06ba6a1ace)
(cherry picked from commit 7f8c446d5f)
In functional tests for the L3 HA agent, e.g.
L3HATestFailover.test_ha_router_failover,
it may happen that the L3 agent has not yet changed the ipv6 accept_ra
knob, and the test fails because it checks that only once, just
after the router state is changed.
This patch fixes that race by waiting up to 60 seconds for the
ipv6 accept_ra change.
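The polling approach described above can be sketched like this. The
helper name wait_for and its defaults are assumptions for
illustration; the test framework may use a different helper.

```python
import time


def wait_for(predicate, timeout=60, sleep=1):
    """Poll predicate() until it is true or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while not predicate():
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "condition not met within %s seconds" % timeout)
        time.sleep(sleep)
```

A test would pass a predicate that reads the accept_ra value and
compares it with the expected one, instead of asserting on it once.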
Conflicts:
neutron/tests/functional/agent/l3/framework.py
Change-Id: I459ce4b791c27b1e3d977e0de9fbdb21a8a379f5
Closes-Bug: #1829889
(cherry picked from commit 62b2f2b1b1)
Sometimes when a port is created on the dhcp agent's side, it may happen
that the same port is already in the network cache.
Before this patch, if a port with the same IP address was already in the
cache, a resync was scheduled because duplicate IPs were found in the cache.
Now a resync is scheduled only if the duplicate IP address belongs to
a port with a different MAC address or a different id.
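The new condition can be sketched as below. The dict layout ("id",
"mac", "ips") and the name needs_resync are assumptions made for this
example and do not mirror the dhcp agent's real data structures.

```python
def needs_resync(new_port, cached_ports):
    """Return True if a cached port with a different identity holds
    one of the new port's IP addresses."""
    new_ips = set(new_port["ips"])
    for cached in cached_ports:
        if not new_ips & set(cached["ips"]):
            continue  # no shared IP address, nothing to check
        if (cached["id"] != new_port["id"]
                or cached["mac"] != new_port["mac"]):
            return True
    return False
```

A port that is already cached with the same id and MAC address no
longer counts as a duplicate.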
Change-Id: I23afbc10725f5dc78e3c63e6e505ef89ba8dc4a5
Closes-Bug: #1824802
(cherry picked from commit 5c433a027d)
The new IP command introduced by Ie3fe825d65408fc969c478767b411fe0156e9fbc
requires only privsep initialization. This patch removes the privsep
error FailedToDropPrivileges raised when executed under neutron-rootwrap.
Closes-Bug: #1823038
Change-Id: I6cde3c9dae7ffdccce49e88c3c79d1c379f291cf
(cherry picked from commit aacd11ab9f)
Some extreme conditions can leave the router gateway port
unbound. Then none of the centralized floating IPs will
be reachable, since the gateway port was set to tag 4095.
This patch adds the HA status to the router-related port
processing code path. If it is an HA router, the gateway port
will go to the right HA router processing code branch.
Closes-Bug: #1827754
Change-Id: Ida1c9f3a38171ea82adc2f11cb17945d6e2434be
(cherry picked from commit 3d99147e73)
Some policy rules, e.g. for create_port, use the rule "network:shared",
in which the "shared" field relates to the network resource instead of
the port directly.
Because of that, "shared" was missing from "target" in the policy
enforcement module, so validation wasn't working properly for such rules.
This patch fixes it by adding to the FieldCheck checker the ability to
get the network object and use its "shared" field to validate the policy.
Conflicts:
neutron/tests/unit/test_policy.py
Change-Id: I56c99883fce40c37a5ee26e6e661c0cc0783c42f
Closes-Bug: #1808112
(cherry picked from commit 0396912208)
(cherry picked from commit fcfd46b231)
Rule checks for rules like e.g.
"create_port:fixed_ips:subnet" couldn't be created and
passed to the policy enforcer because the policy module could only
create rule checks for sub-attributes which are dict types.
With this patch, checks for such rules can also be created for
attributes which are lists of dicts, e.g. fixed_ips in the port
resource.
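As a rough illustration of the idea, the sketch below derives
sub-attribute rule names from either a dict or a list of dicts. The
function sub_attribute_rules is hypothetical and much simpler than the
policy module; the keys used in the example are only sample data.

```python
def sub_attribute_rules(action, attr_name, value):
    """Build '<action>:<attribute>:<sub-attribute>' rule names."""
    if isinstance(value, dict):
        items = [value]
    elif isinstance(value, list):
        # New case: attributes that are lists of dicts.
        items = [item for item in value if isinstance(item, dict)]
    else:
        return []
    keys = sorted({key for item in items for key in item})
    return ["%s:%s:%s" % (action, attr_name, key) for key in keys]
```

For a port created with fixed_ips=[{"subnet_id": "s1"}], this yields
the single rule name "create_port:fixed_ips:subnet_id".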
Conflicts:
etc/policy.json
neutron/tests/etc/policy.json
Change-Id: I02fffe77f57a513d2362df78885d327042bb8095
Closes-Bug: #1822105
(cherry picked from commit 9318fb8bb9)
(cherry picked from commit a238b1bed6)
(cherry picked from commit 73bbfa4315)
This patchset adds missing policy actions to the policy.json
file for several reasons:
1) It signals to operators all the policy actions that are
enforced in the system. With the governance spec [0]
urging projects toward policy in code documentation,
it makes sense to document all policy actions in the
policy.json as Neutron doesn't have policy in code.
2) It is consistent with Neutron's policy enforcement
documentation [1]:
"For each attribute which has been explicitly specified in the
request create a rule matching policy names in the form
<operation>_<resource>:<attribute> rule"
So it makes sense to capture each policy that is enforced,
including all those with these special attributes.
3) Why include "update_router:external_gateway_info" but not
"create_router:external_gateway_info"? This is inconsistent.
4) It makes it difficult to validate Neutron's policy via Patrole
if the policies aren't contained in the policy.json -- how else
is it possible to determine which policies to expect if they
aren't documented anywhere?
[0] https://governance.openstack.org/tc/goals/queens/policy-in-code.html
[1] https://docs.openstack.org/neutron/pike/contributor/internals/policy.html#authorization-workflow
Change-Id: I40f84134f0b56cfd574dfd69e5ebbf6a3fc2b3df
(cherry picked from commit 41fe927c80)
Once the HA port is set, it must keep this value no matter
what the server returns, because there is a race condition
between the l3-agent syncing router info for processing
and the server deleting the router.
This patch adds a helper function for every ha_port set
action. If ha_port is not None, it will always keep
its original value.
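A minimal sketch of such a helper, assuming a simplified router
object. HaRouterSketch and set_ha_port are illustrative names, not the
code in ha_router.py.

```python
class HaRouterSketch:
    def __init__(self):
        self.ha_port = None

    def set_ha_port(self, port_from_server):
        if self.ha_port is not None:
            # Already set: keep the original value, whatever the
            # server returned this time (it may even be None while
            # the router is being deleted on the server side).
            return
        self.ha_port = port_from_server
```

Routing every assignment through one helper keeps the rule in a single
place instead of repeating the check at each call site.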
Conflicts:
neutron/tests/unit/agent/l3/test_ha_router.py
Closes-Bug: #1826726
Change-Id: I96a088d25048be02a9c5b12c1d087df075b36fc4
(cherry picked from commit 45957f12c8)
(cherry picked from commit 13cb3cd34c)
_get_ports_query adds filters after a get_collection. If limits are
applied to the query, then no additional filter is allowed.
This patch extracts an eventual limit argument, to apply it only after
the additional filters.
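The ordering can be sketched with plain lists standing in for a
database query. get_ports_sketch and its arguments are hypothetical;
the point is only that the limit is popped from the keyword arguments
and applied last.

```python
def get_ports_sketch(ports, extra_filter=None, **kwargs):
    # Take the limit out first so it is not applied too early.
    limit = kwargs.pop("limit", None)
    result = list(ports)
    if extra_filter is not None:
        result = [port for port in result if extra_filter(port)]
    if limit is not None:
        # Applied only after every filter has been added.
        result = result[:limit]
    return result
```

Applying the limit first would also change the result: the filter
would then only see the first few rows.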
Change-Id: I83394394860d10e27379efe0356d0fa9c567140e
Closes-Bug: #1826186
(cherry picked from commit cf8f3326be)
In conjunction with the prior fix to only get a subset of fields
when needed, this makes querying non-rules SG objects
very fast.
Before the two fixes, if you have about ten security groups with 2000 rules each:
list all: 14s
list all, just 'id' field: 14s
list one: 0.6s
list one, just 'id' field: 0.6s
With just the previous partial fix:
list all: 14s
list all, just 'id' field: 6s
list one: 0.6s
list one, just 'id' field: 0.2s
Now with this change:
list all: 14s
list all, just 'id' field: 0.04s
list one: 0.6s
list one, just 'id' field: 0.03s
Closes-Bug: #1810563
Change-Id: I15df276ba7dbcb3763ab20b63b26cddf2d594954
(cherry picked from commit 1e9086f6e2)
SQLAlchemy may asynchronously push models out of session cache in which
case we may receive DetachedInstanceError.
In the test case, instead of deepcopying models to compare, compare
each modified attribute independently.
This change also includes conversion from InstrumentedLists to regular
lists when converting model attributes to object fields. The fact that
we were returning InstrumentedLists was always an oversight but it
revealed itself after the modification of the test case that is the
core of this patch.
When converting object fields to db, convert Port's distributed_binding
None value to an empty list to reflect that the relationship of the Port
database model is a list. It was not an issue before the patch because
we were not comparing model attributes for equality but for inequality,
and so None was always != [].
Finally, this patch moves a bunch of TODOs to better reflect where they
belong to.
Closes-Bug: #1770452
Change-Id: I42cdf540129bd4470ec1a59345db9845a6198328
(cherry picked from commit 1a8a15f630)
To fix bug 1722584 we inserted a checksum-fill rule for
metadata proxy replies. Recent kernels have disabled
this support for TCP because it was invalid, and
supposedly not doing anything, so let's get ahead of
things and remove the code.
Kernel mailing list discussion is at
https://lore.kernel.org/patchwork/patch/824819/
Partially reverts ed1c3b0217
Change-Id: Ib7cc8f82a91972f17987fb95130edc4069d9423f
Related-bug: #1722584
(cherry picked from commit b1b8a438fe)
1. give each HA failover case an independent vrrp_id
2. give each HA port an independent IP address, so the
interface IPs for router HA ports will be:
169.254.192.100 and 169.254.192.101
169.254.192.102 and 169.254.192.103
169.254.192.104 and 169.254.192.105
169.254.192.106 and 169.254.192.107
VIP of each case will be:
169.254.0.10/24
169.254.0.11/24
169.254.0.12/24
169.254.0.13/24
169.254.0.14/24
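The numbering above could be generated along these lines. Indexing the
test cases from 0 and the helper names are assumptions for
illustration; the test code may assign the addresses differently.

```python
import ipaddress


def ha_port_ips(case_index, first_ip="169.254.192.100"):
    """Return the two HA port IPs for a given test case index."""
    base = ipaddress.IPv4Address(first_ip) + 2 * case_index
    return str(base), str(base + 1)


def ha_vip(case_index, first_vip="169.254.0.10", prefixlen=24):
    """Return the VIP (with prefix length) for a test case index."""
    vip = ipaddress.IPv4Address(first_vip) + case_index
    return "%s/%d" % (vip, prefixlen)
```

Deriving both values from the case index keeps concurrently running
failover cases from sharing an address.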
Conflicts:
neutron/tests/common/l3_test_common.py
Closes-Bug: #1819160
Change-Id: I1216d96af40449ec16a852cc1f6c4f15c85f4546
(cherry picked from commit c69a87405a)
(cherry picked from commit 2c5957f56d)
(cherry picked from commit c50bdf2329)
For auto-address IPv6 subnets, postcommit performs an update port action
if the network already has ports. This results in a
"cannot be called within a transaction" error for bulk IPv6 subnet
create.
Closes-Bug: #1822582
Change-Id: Ia32ec4c11c0793e7df07dcce19c122b3c7f865e1
(cherry picked from commit 14c76d3181)
The transition of a router from distributed to centralized may
mistakenly update its 'ha' attribute at the same time. The
side effect is that the router may become HA-enabled unexpectedly.
This patch fixes the mismatched attribute in the _update_router_provider
method, which addresses the issue cited above.
Closes-Bug: #1780094
Change-Id: Ib00de137692979229d1b7ba033ecff04e9cc9db0
(cherry picked from commit 886e241553)
When two routers are created at the same time, we can't assume the
status of each one. Instead of this, the status of each router is
first checked and then compared to the other router status.
Change-Id: If20a3a414986ea29fbfd50616761c14e5b249b2c
Closes-Bug: #1819160
(cherry picked from commit 8f35331c91)
The test bridge veth pair devices are not up, so
VRRP advertisement packets cannot reach each HA port. As a
result, multiple master routers come up. This patch just sets the veth
pair devices up.
Closes-Bug: #1819160
Change-Id: I0e0d0311d73bce83d3c7341e7a0167917818b1ff
(cherry picked from commit 8cc480bd01)
The ovs-agent can process ports in large sets, and all
of these ports then need their DB status or attributes updated.
But the neutron server is centralized. It may have other
work to do, or the database processing can itself be
time-consuming. Because of this, it sometimes returns
an RPC timeout exception to the ovs-agent, and a full sync
will be triggered in the next rpc loop. The restart time
becomes longer and longer.
This patch adds a default step for updating ports, to reduce
the probability of an RPC timeout.
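Updating in steps amounts to splitting the port list into fixed-size
chunks and sending one request per chunk. The generator below is a
sketch of that idea; in_steps and the step size used in the example
are assumptions, not the agent's actual option or default.

```python
def in_steps(items, step):
    """Yield consecutive slices of items with at most step entries."""
    if step <= 0:
        raise ValueError("step must be positive")
    for start in range(0, len(items), step):
        yield items[start:start + step]


for chunk in in_steps(["port-%d" % i for i in range(5)], 2):
    print(chunk)
# ['port-0', 'port-1']
# ['port-2', 'port-3']
# ['port-4']
```

Each chunk becomes a smaller RPC call, so a single slow database
operation is less likely to exceed the RPC timeout.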
Related-Bug: #1813703
Related-Bug: #1813704
Related-Bug: #1813706
Related-Bug: #1813707
Conflicts:
neutron/tests/unit/plugins/ml2/test_rpc.py
Change-Id: Ie37f4a4869969e235ce16b73cdfcbdc98626823e
(cherry picked from commit 8408af4f17)
(cherry picked from commit d7d30ea950)
(cherry picked from commit 5d705468de)
The code that ensures the fpr/rfp veth pair exists
between the qrouter and fip namespace was only setting
the mtu of the devices if it had to create them. Set
it all the time to support the mtu being changed.
Change-Id: I176b5f4d4f12cf09f930e2c1944e98082a09bcc6
Closes-bug: #1823798
(cherry picked from commit 6ded6d217a)
In some cases, when a db test fails due
to a timeout, oslo_db.exception.DBConnectionError may be raised
instead of sqlalchemy_exc.InterfaceError.
This patch adds handling of that case to the skip_if_timeout decorator.
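A simplified sketch of a decorator that treats both exception types as
a timeout. The two exception classes here are local stand-ins so the
example is self-contained, and the decorator takes no arguments; the
real skip_if_timeout may be shaped differently.

```python
import functools
import unittest


class InterfaceErrorSketch(Exception):
    """Stand-in for sqlalchemy_exc.InterfaceError."""


class DBConnectionErrorSketch(Exception):
    """Stand-in for oslo_db.exception.DBConnectionError."""


def skip_if_timeout(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except (InterfaceErrorSketch, DBConnectionErrorSketch) as exc:
            # Either error is treated as a timeout: skip, don't fail.
            raise unittest.SkipTest("db test timed out: %s" % exc)
    return wrapper
```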
Change-Id: I7350d5c884784317c94ff42f28526065ff399b40
Related-Bug: #1687027
(cherry picked from commit b7458b6159)
Currently, DHCP provisioning of ports is the crucial bottleneck
when booting multiple VMs concurrently.
The root cause is that these ports are processed one by one by the dhcp
agent when they belong to the same network, and a port whose provisioning
is already complete still blocks the processing of other ports in other
dhcp agents. This patch aims to optimize the dispatch strategy of the port
casts to agents to improve the provisioning process.
On the server side, I classify messages into multiple levels. In particular,
I classify the port_update_end and port_create_end messages into two levels:
a high-level message is cast to only one agent, and a low-level message is
cast to all agents. On the agent side, I put these messages into
`resource_processing_queue`; with the queue, we can delete `_net_lock` and
process these messages in order of priority.
Additionally, I modified `resource_processing_queue` for this purpose. I
changed `_queue` from a list to a PriorityQueue in
`ExclusiveResourceProcessor`; this way, all messages cached in
`ExclusiveResourceProcessor` can be sorted by priority.
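The effect of moving from a list to a PriorityQueue can be sketched
with the standard library. MessageQueueSketch and the two priority
constants are hypothetical; they are not the classes in
resource_processing_queue.py.

```python
import itertools
import queue

PRIORITY_HIGH = 0  # lower number is served first
PRIORITY_LOW = 1


class MessageQueueSketch:
    def __init__(self):
        self._queue = queue.PriorityQueue()
        self._counter = itertools.count()

    def add(self, priority, message):
        # The counter breaks ties, so messages of equal priority keep
        # their arrival order and are never compared to each other.
        self._queue.put((priority, next(self._counter), message))

    def drain(self):
        while not self._queue.empty():
            _, _, message = self._queue.get()
            yield message
```

With this ordering, a high-priority port message is handled before any
low-priority messages that arrived earlier.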
Conflicts:
neutron/api/rpc/agentnotifiers/dhcp_rpc_agent_api.py
Related-Bug: #1760047
Change-Id: I255caa0571c42fb012fe882259ef181070beccef
(cherry picked from commit 99f4495c94)
(cherry picked from commit 740295d94d)
Moved the router processing queue code to the agent/common
directory and renamed it "resource processing queue". This
way it can be consumed by other agents, or possibly even
moved to neutron-lib in the future.
Conflicts:
neutron/agent/common/resource_processing_queue.py
neutron/agent/l3/agent.py
neutron/tests/unit/agent/l3/test_agent.py
Change-Id: I735cf5b0a915828c420c3316b78a48f6d54035e6
(cherry picked from commit f24f3b6b7b)
(cherry picked from commit 8f8c899c69)
Used to be, we would return an empty list. Now, as of change
https://review.openstack.org/#/c/630401/, we don't return the
field at all. That's an API regression.
Go back to returning an empty list.
Change-Id: I295076155eea518152e2479f93f3cf1ea811a207
(cherry picked from commit cc4d5a2561)
Kernel 4.4.0-145 backported a change to the IPv6 fragmentation API, so
update the OVS version checked out for fullstack tests to a hash
that includes the needed compatibility layer changes.
Change-Id: Ia9383c02e1c62e31db9493729aedbed5b94a3a3f
Closes-bug: #1823155
(cherry picked from commit 004caf773a)