neutron/neutron/common
Terry Wilson 1f30f2dfff Rely on worker count for HashRing caching
The current code looks at a hash ring node's created_at/updated_at
fields and tries to determine whether the node has been updated
based on whether updated_at - created_at > 1 second (the initial
values of the two fields differ by microseconds because they are
filled in by different methods). Unfortunately, notify() calls the
hash ring node's touch_node(), so a node can be updated in under
a second, meaning we will prevent caching for much longer than
we intend.
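
For illustration only, the timestamp heuristic described above can be
sketched as follows. The Node class and looks_updated() are hypothetical
stand-ins, not the actual Neutron code; the sketch only shows why a touch
that lands within the first second goes unnoticed.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Node:
    # Hypothetical stand-in for a hash ring node row.
    created_at: datetime
    updated_at: datetime


def looks_updated(node, threshold=timedelta(seconds=1)):
    # The heuristic: a node counts as "updated" only if its two
    # timestamps differ by more than the threshold.
    return node.updated_at - node.created_at > threshold


start = datetime(2021, 5, 6, 14, 0, 0)

# At creation the two timestamps differ by microseconds only.
node = Node(created_at=start, updated_at=start + timedelta(microseconds=50))
fresh_result = looks_updated(node)

# A touch 300 ms after creation does update the row, but the
# heuristic still reports the node as not updated.
node.updated_at = start + timedelta(milliseconds=300)
touched_result = looks_updated(node)
```

In this sketch both results are False, so the early touch is
indistinguishable from a freshly created node.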

When using an in-memory sqlite db, continually re-creating the
Hash Ring objects for every processed event exposes an issue
where rows that should be in the db just *aren't*.

This patch instead limits the hash ring nodes to api workers and
prevents caching only until the number of nodes == number of api
workers on the host. The switch from spawning hash ring nodes
where !is_maintenance to is_api_worker is primarily because it
seems to be difficult to get a list of *all* workers from which to
subtract the maintenance worker so that _wait_startup_before_caching
can wait for that specific number of workers. In practice, this
means that RpcWorker and ServiceWorker workers would not process
HashRing events.
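
A minimal sketch of the worker-count approach described above, under
stated assumptions: HashRingCacheGate and should_bypass_cache() are
hypothetical names, active_nodes stands for whatever list of hash ring
nodes the host currently has registered, and treating "at least as many
nodes as workers" as satisfying the condition is a choice made for this
sketch rather than something the commit message specifies.

```python
class HashRingCacheGate:
    """Hypothetical helper: skip caching until all API workers register."""

    def __init__(self, api_workers):
        self.api_workers = api_workers
        self._startup_done = False

    def should_bypass_cache(self, active_nodes):
        # Once startup has completed, caching stays enabled for good.
        if self._startup_done:
            return False
        # Startup completes when the node count reaches the number of
        # API workers configured on this host.
        if len(active_nodes) >= self.api_workers:
            self._startup_done = True
            return False
        return True


gate = HashRingCacheGate(api_workers=3)
during_startup = gate.should_bypass_cache(["node-1", "node-2"])
after_startup = gate.should_bypass_cache(["node-1", "node-2", "node-3"])
```

Unlike the timestamp comparison, this check does not depend on when a
node was last touched, only on how many nodes exist.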

A note on bug 1903008: While this change will greatly reduce the
likelihood of this issue taking place, we still have some work to
do in order to fully understand why the previous behavior upsets
the database backend. Thus, we will make this change 'related to'
the bug instead of closing it.

Related-Bug: #1894117
Related-Bug: #1903008
Change-Id: Ia198d45f49bddda549a0e70a3374b8339f88887b
(cherry picked from commit c4007b0833)
2021-05-06 14:31:44 +00:00
ovn Rely on worker count for HashRing caching 2021-05-06 14:31:44 +00:00
__init__.py Update License Headers to replace Nicira with VMware 2014-02-27 08:11:15 +00:00
_constants.py Auto-remove floating agent gw ports on net/subnet delete 2021-01-26 10:55:25 +00:00
_deprecate.py Fix flake8 N534 untranslated exception message 2018-10-19 15:46:04 -04:00
cache_utils.py Fix return correct cache when reusing port 2020-03-27 16:48:57 +04:00
config.py Enable ovsdb debug messages in functional and fullstack 2020-01-29 17:11:44 +00:00
coordination.py Remove usage of six.PY2 2020-05-22 12:59:01 -04:00
eventlet_utils.py Bump pylint version to support python 3.8 2020-08-06 16:00:30 +02:00
ipv6_utils.py Adding check for IPv6 address in setup_controllers 2019-10-29 15:10:25 +00:00
profiler.py Make code follow log translation guideline 2017-08-14 02:01:48 +00:00
test_lib.py Revert "Removed test_lib module" 2015-06-29 08:27:41 +00:00
utils.py Remove class "Timer" 2021-04-08 13:23:02 +00:00