Suppress Chassis Not Found on API Operation

In a multi-db deployment, or even with many different threads
operating on the same server in different transactions, one
thread can start a transaction to get a list of nodes while
another triggers a delete of the chassis (and most likely the
node too, though there is really no way to detect that and
act on it).

So while the API is processing the response and building the
JSON result set, the query to resolve a chassis_id on a node
object can fail.

Before this patch, this would raise an exception to the client.

Now, we just suppress the error, and return the field value
as None.

In the grand scheme, the node has likely already been
deleted as well.

Change-Id: I3594ac580c01454c70922a965a2a653a8b568cbb
Closes-Bug: 1508995
Story: 1508995
Task: 10038
Julia Kreger 2022-08-01 18:18:53 -07:00
parent 45c9c3029f
commit fb253a670f
3 changed files with 36 additions and 2 deletions

@@ -1172,8 +1172,16 @@ def _get_chassis_uuid(node):
     """
     if not node.chassis_id:
         return
-    chassis = objects.Chassis.get_by_id(api.request.context, node.chassis_id)
-    return chassis.uuid
+    try:
+        chassis = objects.Chassis.get_by_id(api.request.context,
+                                            node.chassis_id)
+        return chassis.uuid
+    except exception.ChassisNotFound:
+        # NOTE(TheJulia): This is a case where multiple threads are racing
+        # and the chassis was not found... or somebody edited the database
+        # directly. Regardless, operationally, there is no chassis, and we
+        # return as there is nothing to actually return to the API consumer.
+        return
 
 
 def _replace_chassis_uuid_with_id(node_dict):
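
The hunk above can be reduced to a standalone sketch of the same pattern: look the chassis up, and treat a "not found" raised mid-request as "no chassis" rather than an error. Everything here is a hypothetical stand-in for illustration (the local `ChassisNotFound` class, the `CHASSIS_BY_ID` dict acting as the database, and both helper functions); none of it is ironic's actual code.

```python
class ChassisNotFound(Exception):
    """Hypothetical stand-in for the ChassisNotFound exception in the hunk."""


# Hypothetical in-memory table standing in for the chassis database table.
CHASSIS_BY_ID = {1: "uuid-chassis-1"}


def get_chassis_by_id(chassis_id):
    """Stand-in lookup that raises when the row no longer exists."""
    try:
        return CHASSIS_BY_ID[chassis_id]
    except KeyError:
        raise ChassisNotFound(chassis_id) from None


def get_chassis_uuid(chassis_id):
    """Return the chassis UUID, or None when there is no chassis to report."""
    if not chassis_id:
        return None
    try:
        return get_chassis_by_id(chassis_id)
    except ChassisNotFound:
        # The chassis disappeared between the node query and this lookup;
        # report the field as empty instead of failing the whole request.
        return None
```

Calling `get_chassis_uuid(1)` returns the stored UUID while the row exists; once the entry is removed from `CHASSIS_BY_ID`, the same call returns `None` instead of raising.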

@@ -600,6 +600,23 @@ class TestListNodes(test_api_base.BaseApiTest):
         self.assertCountEqual(['driver_info', 'links'], data)
         self.assertEqual('******', data['driver_info']['fake_password'])
 
+    def test_get_one_with_deleted_chassis(self):
+        node = obj_utils.create_test_node(self.context,
+                                          chassis_id=self.chassis.id)
+        with mock.patch.object(self.dbapi,
+                               'get_chassis_by_id',
+                               autospec=True) as mock_gc:
+            # Explicitly return a chassis not found, and make sure the API
+            # hides this from the API consumer as this is likely just an
+            # in-flight deletion across multiple DB sessions or different
+            # API surfaces (or, just slow DB replication.)
+            mock_gc.side_effect = exception.ChassisNotFound(
+                chassis=self.chassis.id)
+            data = self.get_json(
+                '/nodes/%s' % node.uuid,
+                headers={api_base.Version.string: str(api_v1.max_version())})
+            self.assertIsNone(data['chassis_uuid'])
+
     def test_get_network_interface_fields_invalid_api_version(self):
         node = obj_utils.create_test_node(self.context,
                                           chassis_id=self.chassis.id)
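
The test in this hunk simulates the race by making the database lookup raise through a mock's `side_effect`. A minimal self-contained sketch of that technique, using only `unittest.mock` from the standard library, might look like the following. `FakeDB`, `chassis_uuid_for`, `run_check`, and the local `ChassisNotFound` class are assumptions made up for the example and do not exist in ironic.

```python
from unittest import mock


class ChassisNotFound(Exception):
    """Hypothetical stand-in for the exception used in the test above."""


class FakeDB:
    """Hypothetical database API exposing a single lookup method."""

    def get_chassis_by_id(self, chassis_id):
        return {"uuid": "uuid-%s" % chassis_id}


def chassis_uuid_for(db, chassis_id):
    """Resolve a chassis UUID, hiding a concurrent deletion from the caller."""
    try:
        return db.get_chassis_by_id(chassis_id)["uuid"]
    except ChassisNotFound:
        return None


def run_check():
    """Force the lookup to fail and return what the caller would see."""
    db = FakeDB()
    with mock.patch.object(db, "get_chassis_by_id",
                           autospec=True) as mock_gc:
        # Simulate the chassis being deleted by another thread mid-request.
        mock_gc.side_effect = ChassisNotFound("chassis 42")
        result = chassis_uuid_for(db, 42)
        mock_gc.assert_called_once_with(42)
    return result
```

Patching the method on the instance keeps the sketch independent of any real database, and `autospec=True` makes the mock reject calls that do not match the real method's signature.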

@@ -0,0 +1,9 @@
+---
+fixes:
+  - |
+    Fixes an issue where an API user, when requesting a node list or single
+    node object, could get an error indicating that the request was bad as
+    the chassis was not found. This can occur when in-flight delete
+    operations are in progress on another thread. Instead of surfacing a
+    request breaking error, the API now suppresses the error and just
+    treats it as if there is no Chassis.