Add extra safety to metastatic bnr cleanup

In case the backing node no longer exists during the cleanup method
(perhaps due to a bug or forcible admin operation), add some extra
safety to ensure that the metastatic driver can still clean up
its internal records and recover.

See https://review.opendev.org/c/zuul/nodepool/+/924932 for more
background and a traceback.

Change-Id: Ib59f20a6c8ab1283f5bea397ff156d824d71d5e4
James E. Blair 2024-08-14 16:21:05 -07:00
parent 4c00f6a72d
commit 325455ccc3

@@ -321,10 +321,18 @@ class MetastaticAdapter(statemachine.Adapter):
                     self.log.info("Backing node %s has been idle for "
                                   "%s seconds, releasing",
                                   bnr.node_id, now - bnr.last_used)
+                    # Set the bnr to failed just in case something
+                    # goes wrong after this point, we won't use it
+                    # any more.
+                    bnr.failed = True
                     node = self._getNode(bnr.node_id)
-                    node.state = zk.USED
-                    self.zk.storeNode(node)
-                    self.zk.forceUnlockNode(node)
+                    if node:
+                        # In case the node was already removed due
+                        # to an error, allow the bnr to still be
+                        # cleaned up.
+                        node.state = zk.USED
+                        self.zk.storeNode(node)
+                        self.zk.forceUnlockNode(node)
                     backing_node_records.remove(bnr)
         return []
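
The guarded cleanup in this change can be exercised in isolation. The following is a minimal sketch, not the nodepool implementation: `Node`, `BackingNodeRecord`, `FakeZK`, and `release_backing_node` are hypothetical stand-ins invented for illustration, and the state strings are placeholders for the real `zk` constants. It shows that a record whose backing node has already disappeared is still removed from the internal list.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical, simplified stand-in for a ZooKeeper node object.
    id: str
    state: str = "in-use"


@dataclass
class BackingNodeRecord:
    # Hypothetical, simplified stand-in for the driver's bnr.
    node_id: str
    failed: bool = False


class FakeZK:
    """Hypothetical in-memory replacement for the ZooKeeper client."""

    def __init__(self, nodes):
        self.nodes = dict(nodes)
        self.unlocked = []

    def getNode(self, node_id):
        # Returns None when the node no longer exists.
        return self.nodes.get(node_id)

    def storeNode(self, node):
        self.nodes[node.id] = node

    def forceUnlockNode(self, node):
        self.unlocked.append(node.id)


def release_backing_node(zk, records, bnr):
    # Mark the record failed first so it is never reused, even if a
    # later step raises.
    bnr.failed = True
    node = zk.getNode(bnr.node_id)
    if node is not None:
        # Only touch the node if it still exists.
        node.state = "used"
        zk.storeNode(node)
        zk.forceUnlockNode(node)
    # The record is dropped whether or not the node was found.
    records.remove(bnr)


zk = FakeZK({"n1": Node("n1")})
records = [BackingNodeRecord("n1"), BackingNodeRecord("gone")]
for bnr in list(records):
    release_backing_node(zk, records, bnr)
```

In this sketch the record for the missing node `"gone"` is removed without raising, which is the recovery behavior the commit message describes.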