neutron/neutron
Nate Johnston 1c768810cd Wait before deleting trunk bridges for DPDK vhu
DPDK vhostuser mode (DPDK/vhu) means that when an instance is powered
off the port is deleted, and when an instance is powered on a port is
created.  This means a reboot is functionally a super fast
delete-then-create.  Neutron trunking mode in combination with DPDK/vhu
implements a trunk bridge for each tenant, and the ports for the
instances are created as subports of that bridge.  The standard way a
trunk bridge works is that when all the subports are deleted, a thread
is spawned to delete the trunk bridge, because that is an expensive and
time-consuming operation.  That means that if the port in question is
the only port on the trunk on that compute node, this happens:

1. The port is deleted
2. A thread is spawned to delete the trunk
3. The port is recreated

If the trunk is deleted after #3 happens then the instance has no
networking and is inaccessible; this is the scenario that was dealt with
in a previous change [1].  However, errors of the form
"RowNotFound: Cannot find Bridge with name=tbr-XXXXXXXX-X" still occur.
In this case the trunk is deleted while #3 is executing: the bridge
stops existing partway through the port creation logic, before the
port is actually recreated.

Since this is a timing issue between two different threads, it is
difficult to stamp out entirely, but I think the best mitigation is to
add a slight delay in the trunk deletion thread, just a second or two.
That will give the port time to come back online and avoid the trunk
deletion entirely.
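
The idea can be illustrated with a small, self-contained sketch.  This
is not Neutron's actual code: TrunkBridge, delete_trunk_after_delay and
reboot_instance are hypothetical names, and the sketch assumes that the
deletion thread re-checks for subports once the delay has elapsed.

```python
import threading
import time


class TrunkBridge:
    """Toy stand-in for a trunk bridge (hypothetical, not Neutron's class)."""

    def __init__(self, name):
        self.name = name
        self.subports = set()
        self.exists = True
        self.lock = threading.Lock()

    def add_subport(self, port_id):
        with self.lock:
            if not self.exists:
                # Mimics the failure mode described in the commit message.
                raise LookupError(
                    "Cannot find Bridge with name=%s" % self.name)
            self.subports.add(port_id)

    def remove_subport(self, port_id):
        """Remove a subport; return True if the bridge is now empty."""
        with self.lock:
            self.subports.discard(port_id)
            return not self.subports


def delete_trunk_after_delay(bridge, delay):
    """Wait, then delete the bridge only if it is still empty."""
    time.sleep(delay)
    with bridge.lock:
        if bridge.subports:
            return False  # A port came back during the delay; keep it.
        bridge.exists = False
        return True


def reboot_instance(bridge, port_id, delete_delay, recreate_after):
    """Simulate a reboot: delete the port, then recreate it shortly after."""
    worker = None
    if bridge.remove_subport(port_id):
        # Last subport gone: spawn the (delayed) trunk deletion thread.
        worker = threading.Thread(
            target=delete_trunk_after_delay, args=(bridge, delete_delay))
        worker.start()
    time.sleep(recreate_after)
    try:
        bridge.add_subport(port_id)
    finally:
        if worker is not None:
            worker.join()
    return bridge.exists


bridge = TrunkBridge("tbr-demo")
bridge.add_subport("port-1")
survived = reboot_instance(
    bridge, "port-1", delete_delay=0.3, recreate_after=0.05)
```

With the deletion delay longer than the time the port takes to come
back, the bridge survives the reboot.  The delay narrows the window for
the race rather than closing it, which matches the "difficult to stamp
out entirely" caveat above.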

[1] https://review.opendev.org/623275

Related-Bug: #1869244
Change-Id: I36a98fe5da85da1f3a0315dd1a470f062de6f38b
(cherry picked from commit e37722c0f5)
2020-04-03 21:12:10 +00:00
agent Merge "Add accepted egress direct flow" into stable/queens 2020-03-09 23:47:13 +00:00
api Merge "List SG rules which belongs to tenant's SG" into stable/queens 2020-02-19 03:35:50 +00:00
cmd Secure dnsmasq process against external abuse 2019-01-25 13:58:19 +00:00
common DVR: Ignore DHCP port during DVR host query 2020-01-04 08:27:57 +08:00
conf Add accepted egress direct flow 2020-02-25 07:32:29 +08:00
core_extensions use qos constants from neutron-lib 2017-10-26 19:57:19 +00:00
db Ensure that default SG exists during list of SG rules API call 2020-03-03 09:09:40 +00:00
debug Change ip_lib network namespace code to use pyroute2 2017-10-04 21:09:28 +00:00
extensions Improve invalid port ranges error message 2019-03-21 10:18:01 -04:00
hacking hacking: Remove dead code 2017-07-19 13:43:44 +02:00
ipam Add bulk IP address assignment to ipam driver 2020-03-26 12:31:05 +00:00
locale Imported Translations from Zanata 2018-03-14 06:20:49 +00:00
notifiers use callback payloads for REQUEST/RESPONSE events 2017-12-24 07:27:11 +00:00
objects Use dynamic lazy mode for fetching security group rules 2019-04-30 14:00:38 -06:00
pecan_wsgi Set DB retry for quota_enforcement pecan_wsgi hook 2019-12-16 11:16:23 +00:00
plugins Add accepted egress direct flow 2020-02-25 07:32:29 +08:00
privileged Check the namespace is ready in test_mtu_update tests 2019-09-16 09:31:34 +00:00
quota Set DB retry for quota_enforcement pecan_wsgi hook 2019-12-16 11:16:23 +00:00
scheduler Fetch specific columns rather than full ORM entities 2018-09-27 19:12:37 +02:00
server Re-use existing ProcessLauncher from wsgi in RPC workers 2020-02-20 09:39:20 +00:00
services Wait before deleting trunk bridges for DPDK vhu 2020-04-03 21:12:10 +00:00
tests Add bulk IP address assignment to ipam driver 2020-03-26 12:31:05 +00:00
__init__.py
_i18n.py Make code follow log translation guideline 2017-08-14 02:01:48 +00:00
auth.py Use oslo.context class method to construct context object 2017-03-23 09:02:46 +00:00
manager.py Do not load default service plugins if core plugin is not DB based 2017-11-09 20:34:52 +00:00
neutron_plugin_base_v2.py Do not load default service plugins if core plugin is not DB based 2017-11-09 20:34:52 +00:00
opts.py Merge "Remove deprecated cache_url" 2018-01-03 06:35:59 +00:00
policy.py Treat networks shared by RBAC in same way as shared with all tenants 2019-06-28 06:05:44 +00:00
service.py Re-use existing ProcessLauncher from wsgi in RPC workers 2020-02-20 09:39:20 +00:00
version.py
worker.py replace WorkerSupportServiceMixin with neutron-lib's WorkerBase 2017-06-14 06:56:48 -06:00
wsgi.py Re-use existing ProcessLauncher from wsgi in RPC workers 2020-02-20 09:39:20 +00:00