92b2d9c25a
In DPDK vhostuser mode (DPDK/vhu), an instance's port is deleted when
the instance is powered off and created when it is powered on, so a
reboot is functionally a very fast delete-then-create. Neutron trunking
mode in combination with DPDK/vhu implements a trunk bridge for each
tenant, and the ports for the instances are created as subports of that
bridge. Normally, when all the subports of a trunk bridge are deleted, a
thread is spawned to delete the trunk bridge, because that is an
expensive and time-consuming operation. That means that if the port in
question is the only port on the trunk on that compute node, this
happens:
1. The port is deleted
2. A thread is spawned to delete the trunk
3. The port is recreated
If the trunk is deleted after #3 happens, the instance has no
networking and is inaccessible; this is the scenario that was dealt with
in a previous change [1]. But errors such as
"RowNotFound: Cannot find Bridge with name=tbr-XXXXXXXX-X" continue to
occur. In this case the trunk is deleted in the middle of the execution
of #3: it stops existing partway through the port creation logic, before
the port is actually recreated.
Since this is a timing issue between two different threads it's
difficult to stamp out entirely, but I think the best way to do it is to
add a slight delay in the trunk deletion thread, just a second or two.
That will give the port time to come back online and avoid the trunk
deletion entirely.
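
The timing argument above can be illustrated with a small standalone
sketch. It is not the Neutron code: TrunkBridge, remove_port, reboot and
the delay values are hypothetical names chosen for illustration, and the
re-check for subports after the delay is an assumption of this sketch
about how a delayed deletion would back off.

```python
import threading
import time


class TrunkBridge:
    """Toy stand-in for a trunk bridge (hypothetical, not the Neutron/OVS object)."""

    def __init__(self, name):
        self.name = name
        self.ports = set()
        self.exists = True
        self.lock = threading.Lock()

    def add_port(self, port):
        with self.lock:
            if not self.exists:
                # Loosely mirrors the "Cannot find Bridge" failure in the text.
                raise LookupError("Cannot find Bridge with name=%s" % self.name)
            self.ports.add(port)

    def remove_port(self, port, delete_delay):
        """Remove a subport; if none remain, spawn the deletion thread."""
        with self.lock:
            self.ports.discard(port)
            empty = not self.ports
        if not empty:
            return None
        worker = threading.Thread(target=self._delete_if_unused,
                                  args=(delete_delay,))
        worker.start()
        return worker

    def _delete_if_unused(self, delay):
        # The slight delay gives a rebooting instance time to re-add its port.
        time.sleep(delay)
        with self.lock:
            if not self.ports:  # assumption: re-check before deleting
                self.exists = False


def reboot(bridge, port, delete_delay, recreate_after):
    """Simulate a reboot: delete the port, then recreate it shortly after."""
    worker = bridge.remove_port(port, delete_delay)
    time.sleep(recreate_after)
    bridge.add_port(port)
    if worker is not None:
        worker.join()
    return bridge.exists
```

With delete_delay larger than recreate_after, the port is back before the
deletion thread wakes up, so the bridge survives. With delete_delay=0 the
bridge is gone by the time the port is recreated and add_port raises.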
[1] https://review.opendev.org/623275
Related-Bug: #1869244
Change-Id: I36a98fe5da85da1f3a0315dd1a470f062de6f38b
(cherry picked from commit
drivers
rpc
seg_types
__init__.py
callbacks.py
exceptions.py
models.py
plugin.py
rules.py
utils.py