neutron/neutron
Nate Johnston c0b15a8cbe Wait before deleting trunk bridges for DPDK vhu
DPDK vhostuser mode (DPDK/vhu) means that when an instance is powered
off the port is deleted, and when an instance is powered on a port is
created.  This means a reboot is functionally a super fast
delete-then-create.  Neutron trunking mode in combination with DPDK/vhu
implements a trunk bridge for each tenant, and the ports for the
instances are created as subports of that bridge.  The standard way a
trunk bridge works is that when all the subports are deleted, a thread
is spawned to delete the trunk bridge, because that is an expensive and
time-consuming operation.  That means that if the port in question is
the only port on the trunk on that compute node, this happens:

1. The port is deleted
2. A thread is spawned to delete the trunk
3. The port is recreated

If the trunk is deleted after step 3 completes, the instance has no
networking and is inaccessible; a previous change [1] dealt with that
scenario.  However, errors of the form "RowNotFound: Cannot find Bridge
with name=tbr-XXXXXXXX-X" still occur.  In this case the trunk is
deleted while step 3 is executing, so the bridge stops existing partway
through the port creation logic, before the port is actually recreated.

Since this is a timing issue between two different threads it's
difficult to stamp out entirely, but I think the best way to do it is to
add a slight delay in the trunk deletion thread, just a second or two.
That will give the port time to come back online and avoid the trunk
deletion entirely.
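
The idea can be illustrated with a small, self-contained sketch.  This is
not the Neutron or OVS agent code: TrunkBridge, delete_if_still_empty and
reboot are hypothetical names, and the re-check of the port list after
the delay is an assumption about how the deletion is avoided.  The
deletion thread sleeps first and removes the bridge only if it is still
empty.

```python
import threading
import time


class TrunkBridge:
    """Toy stand-in for a trunk bridge (hypothetical, not the OVS API)."""

    def __init__(self, name):
        self.name = name
        self.ports = set()
        self.exists = True
        self.lock = threading.Lock()

    def add_port(self, port):
        with self.lock:
            if not self.exists:
                # Mirrors the kind of failure described in the commit message.
                raise LookupError(
                    "Cannot find Bridge with name=%s" % self.name)
            self.ports.add(port)

    def remove_port(self, port):
        with self.lock:
            self.ports.discard(port)


def delete_if_still_empty(bridge, delay):
    """Wait, then delete the bridge only if no port came back meanwhile."""
    time.sleep(delay)
    with bridge.lock:
        if bridge.ports:
            return False  # a port was recreated; skip the deletion
        bridge.exists = False
        return True


def reboot(bridge, port, delay, recreate_after):
    """Simulate delete-then-create of the only port on the bridge."""
    result = {}
    bridge.remove_port(port)  # 1. the port is deleted

    def worker():
        result["deleted"] = delete_if_still_empty(bridge, delay)

    thread = threading.Thread(target=worker)
    thread.start()  # 2. a thread is spawned to delete the trunk
    time.sleep(recreate_after)
    bridge.add_port(port)  # 3. the port is recreated
    thread.join()
    return result["deleted"]
```

With a delay longer than the time the port needs to reappear, for example
reboot(bridge, "vhu0", delay=0.5, recreate_after=0.05), the bridge
survives the reboot.  The delay narrows the race window rather than
closing it, which matches the commit's remark that the issue is hard to
stamp out entirely.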

[1] https://review.opendev.org/623275

Related-Bug: #1869244
Change-Id: I36a98fe5da85da1f3a0315dd1a470f062de6f38b
(cherry picked from commit e37722c0f5)
2020-04-03 21:11:57 +00:00
agent Add accepted egress direct flow 2020-03-26 08:23:32 +00:00
api Merge "Fix bug: AttributeError arises while sorting with standard attributes" into stable/rocky 2020-02-26 14:08:37 +00:00
cmd Secure dnsmasq process against external abuse 2019-02-01 09:07:14 +00:00
common Merge "Trigger router update only when gateway port IP changed" into stable/rocky 2020-02-24 19:04:21 +00:00
conf Add accepted egress direct flow 2020-03-26 08:23:32 +00:00
core_extensions Refactor duplicated implementation of _get_policy_obj 2018-06-20 09:51:02 +08:00
db Ensure that default SG exists during list of SG rules API call 2020-03-03 09:09:29 +00:00
debug Fix all pep8 E265 errors 2018-04-30 16:35:52 -04:00
extensions Fix resource schemas and releated `get_sorts` test cases 2020-02-15 12:31:31 +01:00
hacking use sqla functions from neutron-lib 2018-07-25 21:04:20 +00:00
ipam Add bulk IP address assignment to ipam driver 2020-03-26 12:30:50 +00:00
locale Imported Translations from Zanata 2018-11-30 09:16:33 +00:00
notifiers Fix W503 pep8 warnings 2018-04-17 14:22:58 +00:00
objects Handle ports assigned to routers without routerports 2019-10-15 10:44:54 +00:00
pecan_wsgi Set DB retry for quota_enforcement pecan_wsgi hook 2019-12-09 12:08:47 +00:00
plugins Add accepted egress direct flow 2020-03-26 08:23:32 +00:00
privileged Check the namespace is ready in test_mtu_update tests 2019-09-16 09:30:43 +00:00
quota Set DB retry for quota_enforcement pecan_wsgi hook 2019-12-09 12:08:47 +00:00
scheduler Fetch specific columns rather than full ORM entities 2018-09-27 16:28:37 +00:00
server Re-use existing ProcessLauncher from wsgi in RPC workers 2020-02-19 07:59:06 +00:00
services Wait before deleting trunk bridges for DPDK vhu 2020-04-03 21:11:57 +00:00
tests Merge "Add bulk IP address assignment to ipam driver" into stable/rocky 2020-03-31 10:35:23 +00:00
__init__.py
_i18n.py Make code follow log translation guideline 2017-08-14 02:01:48 +00:00
auth.py Use oslo.context class method to construct context object 2017-03-23 09:02:46 +00:00
manager.py Avoid loading same service plugin more than once 2019-04-12 08:33:38 +00:00
neutron_plugin_base_v2.py Do not load default service plugins if core plugin is not DB based 2017-11-09 20:34:52 +00:00
opts.py Merge "Remove deprecated cache_url" 2018-01-03 06:35:59 +00:00
policy.py List SG rules which belongs to tenant's SG 2020-01-29 08:04:16 +00:00
service.py Re-use existing ProcessLauncher from wsgi in RPC workers 2020-02-19 07:59:06 +00:00
version.py
worker.py replace WorkerSupportServiceMixin with neutron-lib's WorkerBase 2017-06-14 06:56:48 -06:00
wsgi.py Re-use existing ProcessLauncher from wsgi in RPC workers 2020-02-19 07:59:06 +00:00