Don't force_reconnect() on unhandled Idl exception

There's no reason to believe that reconnecting to ovsdb-server will
resolve an unhandled exception in python-ovs. In addition, since users
often subclass Idl and add their own notify() methods, there could be
exceptions thrown from that code.

The best we can do is log what is going on and rely on users to fix
the issue. Delaying with sleep() is usually a bad idea: if there was
some kind of ovsdb reconnection, the sleep delays the calls to
Idl.run() that handle that reconnection over several calls.
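
As a rough illustration of the behaviour described above, the sketch below
logs an exception raised from an overridden notify() and keeps calling run(),
with no reconnect and no sleep. StubIdl, BuggyIdl and run_loop are
hypothetical stand-ins written for this example; they are not the python-ovs
or ovsdbapp classes, and the real Connection.run() loop also handles the
poller and the transaction queue.

```python
import logging

LOG = logging.getLogger(__name__)


class StubIdl:
    """Hypothetical stand-in for an Idl; NOT the real ovs.db.idl.Idl."""

    def __init__(self, events):
        self.events = list(events)
        self.processed = []
        self.reconnects = 0

    def force_reconnect(self):
        # Counted only so the example can show it is never called.
        self.reconnects += 1

    def notify(self, event):
        # Users typically override this in a subclass.
        pass

    def run(self):
        # Deliver at most one pending event per call.
        if self.events:
            event = self.events.pop(0)
            self.notify(event)
            self.processed.append(event)


class BuggyIdl(StubIdl):
    """Subclass whose notify() has a bug for one particular event."""

    def notify(self, event):
        if event == "bad":
            raise ValueError("bug in user notify() code")


def run_loop(idl, iterations):
    """Log unhandled exceptions and continue; no reconnect, no sleep."""
    failures = 0
    for _ in range(iterations):
        try:
            idl.run()
        except Exception as e:
            # Log the traceback and move on to the next iteration.
            LOG.exception(e)
            failures += 1
            continue
    return failures


idl = BuggyIdl(["a", "bad", "b"])
failures = run_loop(idl, 3)
```

In this sketch the faulty event is logged once and the events on either side
of it are still processed, since nothing delays the following run() calls.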

Change-Id: Iab2177fb9fa653292a3805689895f98e0833dc4a
(cherry picked from commit cd70d1e29033f50e23ccd97aacde6cc476baf52e)
Terry Wilson 2022-10-24 14:58:46 -05:00 committed by Jakub Libosvar
parent 92cbba4481
commit ab3e0cb0d0


@@ -93,7 +93,6 @@ class Connection(object):
         self.thread.start()
 
     def run(self):
-        errors = 0
         while self.is_running:
             # If we fail in an Idl call, we could have missed an update
             # from the server, leaving us out of sync with ovsdb-server.
@@ -108,22 +107,10 @@ class Connection(object):
                     self.idl.run()
             except Exception as e:
                 # This shouldn't happen, but is possible if there is a bug
-                # in python-ovs
-                errors += 1
+                # in python-ovs, or an unhandled exception in overridden
+                # Idl.notify() code
                 LOG.exception(e)
-                with self.lock:
-                    self.idl.force_reconnect()
-                try:
-                    idlutils.wait_for_change(self.idl, self.timeout)
-                except Exception as e:
-                    # This could throw the same exception as idl.run()
-                    # or Exception("timeout"), either way continue
-                    LOG.exception(e)
-                sleep = min(2 ** errors, 60)
-                LOG.info("Trying to recover, sleeping %s seconds", sleep)
-                time.sleep(sleep)
                 continue
-            errors = 0
             txn = self.txns.get_nowait()
             if txn is not None:
                 try: