nova/nova at 61fc81a6761d34afdfc4a6d1c4c953802fd8a179 - nova - OpenDev: Free Software Needs Free Tools


Balazs Gibizer 61fc81a676 Prevent leaked eventlets to send notifications

In our functional tests we run nova services as eventlets. Those services can also spawn their own eventlets for RPC or other parallel processing. The test case executor only sees and tracks the main eventlet where the code of the test case is running. When that finishes, the test executor considers the test case to be finished regardless of the other spawned eventlets. This could lead to leaked eventlets that are running in parallel with later test cases.

One way that it can cause trouble is via the global variables in the nova.rpc module. Those globals are re-initialized for each test case, so they are not directly leaking information between test cases. However, if a late eventlet calls nova.rpc.get_versioned_notifier() it will get a totally usable FakeVersionedNotifier object regardless of which test case this notifier belongs to or which test case the eventlet belongs to. This way the late eventlet can send a notification to the currently running test case and therefore can make it fail.

The current case we saw is the following:

1) The test case nova.tests.functional.test_servers.ServersTestV219.test_description_errors creates a server but doesn't wait for it to reach a terminal state (ACTIVE / ERROR). This test case finishes quickly but leaks running eventlets in the background waiting for some RPC call to return.
2) As the test case finished, the cleanup code deletes the test case specific setup, including the DB.
3) The test executor moves forward and starts running another test case.
4) 60 seconds later the leaked eventlet times out waiting for the RPC call to return and tries doing things, but fails as the DB is already gone. Then it tries to report this as an error notification. It calls nova.rpc.get_versioned_notifier() and gets a fresh notifier that is connected to the currently running test case. Then it emits the error notification there.
5) The currently running test case also waits for an error notification to be triggered by the currently running test code. But it gets the notification from the late eventlet first. As the content of the notification does not match the expectations, the currently running test case fails.

The late eventlet prints a lot of errors about the DB being gone, making the troubleshooting pretty hard.

This patch proposes a way to fix this by marking each eventlet at spawn time with the id of the test case that directly or indirectly started it. Then when the NotificationFixture gets a notification it compares the test case id stored in the calling eventlet with the id of the test case that initialized the NotificationFixture. If the two ids do not match then the fixture ignores the notification and raises an exception to the caller eventlet to make it terminate.

Change-Id: I012dcf63306bae624dc4f66aae6c6d96a20d4327
Closes-Bug: #1946339		2021-10-14 18:27:30 +02:00
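
The fix described in the commit message can be illustrated with a small, self-contained sketch. It is not the code from the patch: it uses standard-library threads in place of eventlet greenthreads, and the names `spawn`, `set_current_test`, `LeakedWorkerError`, `demo`, and this simplified `NotificationFixture` are all hypothetical stand-ins chosen for illustration. The idea shown is only the one stated above: tag each worker at spawn time with the id of the test case that started it, and have the fixture reject notifications whose caller carries a different id.

```python
import threading


class LeakedWorkerError(Exception):
    """Raised to a worker that outlived the test case that spawned it."""


# Per-thread storage for the id of the owning test case (hypothetical).
_local = threading.local()


def set_current_test(test_case_id):
    """Mark the calling thread as running the given test case."""
    _local.test_case_id = test_case_id


def spawn(func, *args):
    """Start func in a worker that inherits the spawner's test case id."""
    owner = getattr(_local, "test_case_id", None)

    def _run():
        _local.test_case_id = owner  # tag the worker at spawn time
        func(*args)

    worker = threading.Thread(target=_run)
    worker.start()
    return worker


class NotificationFixture:
    """Simplified stand-in: collects notifications for one test case."""

    def __init__(self, test_case_id):
        self.test_case_id = test_case_id
        self.notifications = []

    def notify(self, payload):
        caller = getattr(_local, "test_case_id", None)
        if caller != self.test_case_id:
            # Ignore the notification and make the late caller terminate.
            raise LeakedWorkerError(
                f"worker of {caller!r} notified {self.test_case_id!r}")
        self.notifications.append(payload)


def demo():
    current = {}  # stands in for re-initialized module-level globals
    release = threading.Event()
    rejected = []

    def late_worker():
        release.wait()  # simulates the RPC call that times out late
        try:
            current["fixture"].notify("error from test_a")
        except LeakedWorkerError as exc:
            rejected.append(str(exc))

    # First test case: spawns a worker and finishes without waiting for it.
    set_current_test("test_a")
    current["fixture"] = NotificationFixture("test_a")
    leaked = spawn(late_worker)

    # Second test case: gets a fresh fixture, then the leaked worker wakes up.
    set_current_test("test_b")
    current["fixture"] = NotificationFixture("test_b")
    release.set()
    leaked.join()
    current["fixture"].notify("error from test_b")
    return current["fixture"].notifications, rejected
```

Calling `demo()` walks through the failure scenario from the numbered steps: the worker spawned by the first test case looks up whatever fixture is current, finds the one owned by the second test case, and is rejected because its stored id differs. Only the notification sent by the second test case itself is recorded.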
accelerator	smartnic support - reject server move and suspend	2021-08-05 15:58:41 +08:00
api	Merge "Support interface attach / detach with new resource request format"	2021-09-02 19:03:47 +00:00
cmd	nova-manage: Ensure mountpoint is passed when updating attachment	2021-09-29 11:53:02 +01:00
compute	Store old_flavor already on source host during resize	2021-09-27 12:01:20 +02:00
conductor	Merge "Support move ops with extended resource request"	2021-08-31 21:38:24 +00:00
conf	Merge "workarounds: Remove rbd_volume_local_attach"	2021-09-02 12:16:53 +00:00
console	Merge "console: Improve logging"	2021-09-07 14:29:08 +00:00
db	Add missing __init__.py in nova/db/api	2021-09-20 11:28:46 +02:00
hacking	Add two new hacking rules	2021-09-01 12:26:52 +01:00
image	glance: Remove [glance]/allowed_direct_url_schemes	2021-01-28 12:46:57 +00:00
keymgr	…
locale	Imported Translations from Zanata	2020-04-26 07:51:21 +00:00
network	Support interface attach / detach with new resource request format	2021-09-01 15:51:47 +02:00
notifications	Merge "Allow 'bochs' as a display device option"	2021-09-03 15:07:35 +00:00
objects	Update min supported service version for Yoga	2021-10-01 13:09:02 +00:00
pci	mypy: Add type annotations to 'nova.pci'	2021-04-26 18:06:21 +01:00
policies	policy: Deprecate field from 'os-extended-server-attributes' policy	2021-08-26 10:54:25 +01:00
privsep	Retry lvm volume and volume group query	2021-06-15 12:39:26 +02:00
scheduler	Support interface attach / detach with new resource request format	2021-09-01 15:51:47 +02:00
servicegroup	Remove six.binary_type/integer_types/string_types	2020-12-13 11:25:14 +00:00
storage	Stop leaking ceph df cmd in RBD utils	2021-05-11 17:28:56 +02:00
tests	Prevent leaked eventlets to send notifications	2021-10-14 18:27:30 +02:00
virt	Merge "hardware: Add TODO to remove '(un)pin_cpu_with_siblings'"	2021-09-11 09:15:52 +00:00
volume	Remove six.text_type (1/2)	2020-12-13 11:25:31 +00:00
__init__.py	…
availability_zones.py	Remove six.PY2 and six.PY3	2020-08-15 07:45:23 +00:00
baserpc.py	…
block_device.py	fup: Remove unused legacy block_device_info format	2021-08-20 13:26:46 +01:00
cache_utils.py	trivial: Remove unused 'cache_utils' APIs	2020-02-05 17:20:28 +00:00
config.py	db: Post reshuffle cleanup	2021-08-09 15:34:40 +01:00
context.py	db: Unify 'nova.db.api', 'nova.db.sqlalchemy.api'	2021-08-09 15:34:40 +01:00
crypto.py	Replace md5 for fips	2021-02-25 16:01:43 -05:00
debugger.py	trivial: Remove remaining '_LW' instances	2020-05-18 17:00:41 +01:00
exception.py	Convert features not supported error to HTTPBadRequest	2021-09-01 09:09:58 -05:00
exception_wrapper.py	rpc: Rework 'get_notifier', 'wrap_exception'	2021-03-01 11:06:48 +00:00
filters.py	trivial: Remove remaining '_LI' instances	2020-05-18 17:00:57 +01:00
i18n.py	trivial: Remove remaining '_LI' instances	2020-05-18 17:00:57 +01:00
loadables.py	…
manager.py	db: Unify 'nova.db.api', 'nova.db.sqlalchemy.api'	2021-08-09 15:34:40 +01:00
middleware.py	Allow X-OpenStack-Nova-API-Version header in CORS	2021-06-15 07:35:36 -04:00
monkey_patch.py	Correctly disable greendns	2020-09-11 12:42:04 -04:00
policy.py	Reuse code from oslo lib for JSON policy migration	2021-01-14 22:41:33 +00:00
profiler.py	…
quota.py	db: Post reshuffle cleanup	2021-08-09 15:34:40 +01:00
rpc.py	rpc: Rework 'get_notifier', 'wrap_exception'	2021-03-01 11:06:48 +00:00
safe_utils.py	…
service.py	Restore retrying the RPC connection to conductor	2020-11-13 18:02:00 +01:00
service_auth.py	…
test.py	Prevent leaked eventlets to send notifications	2021-10-14 18:27:30 +02:00
utils.py	Replace getargspec with getfullargspec	2021-05-12 10:50:52 +08:00
version.py	Change API unexpected exception message	2021-02-17 21:30:07 +00:00
weights.py	Remove six.add_metaclass	2020-08-15 07:45:39 +00:00
wsgi.py	trivial: Remove remaining '_LI' instances	2020-05-18 17:00:57 +01:00