Fix TRY_AGAIN handling

I believe removing wait_for_change back in the day was an error.
We can't do the exponential backoff ourselves because sleeping also
delays reconnecting to the db, since idl.run() needs to be called to
reconnect. Also, do_commit() doesn't ensure that idl.run() is called
if status is TRY_AGAIN. wait_for_change() will ensure that we call
idl.run() to reconnect quickly and don't try the txn again until we
have reconnected and the seqno has changed.
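
As a rough illustration of the behaviour described above, here is a minimal
sketch of waiting on a seqno change while still driving idl.run(). It is not
the ovsdbapp implementation: FakeIdl, wait_for_change_sketch and poll_interval
are hypothetical names invented for this example, and the real helper blocks
on a poller rather than sleeping.

```python
import time


class FakeIdl:
    """Hypothetical stand-in for an OVSDB IDL (not the real ovs.db.idl.Idl)."""

    def __init__(self, reconnect_after_runs):
        self.change_seqno = 0
        self.run_calls = 0
        self._reconnect_after_runs = reconnect_after_runs

    def run(self):
        # Each call advances the (pretend) connection state machine one step.
        self.run_calls += 1
        if self.run_calls == self._reconnect_after_runs:
            # Pretend the reconnect finished and fresh data arrived.
            self.change_seqno += 1
            return True
        return False


def wait_for_change_sketch(idl, timeout, seqno, poll_interval=0.01):
    """Keep calling idl.run() until change_seqno differs from seqno.

    Raises TimeoutError if nothing changes before the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while idl.change_seqno == seqno:
        idl.run()  # this is what lets a reconnect make progress
        if idl.change_seqno != seqno:
            break
        if time.monotonic() > deadline:
            raise TimeoutError("no change before timeout")
        time.sleep(poll_interval)


idl = FakeIdl(reconnect_after_runs=3)
seqno = idl.change_seqno          # captured before building the txn
wait_for_change_sketch(idl, timeout=1.0, seqno=seqno)
```

The point of the sketch is that the retry is gated on the seqno moving, not
on a fixed sleep, so the wait ends as soon as the reconnect completes.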

Revert "Don't spam retries 100s of times a second"
This reverts commit 6596164f51217cc7fabf302ce14ccc9d9beaff1f.

Revert "Ensure idl.run() called on TRY_AGAIN"
This reverts commit 1810faecc9ad2345f3e2f9185ac64194c5a0d711.

Revert "Don't wait on TRY_AGAIN when calling commit_block()"
This reverts commit 158ae06bce0f56e93677f94c59f81e5e76ee1ccc.

Closes-Bug: #1988457
Change-Id: I237136262862d5117d08eb3b513a0b8658a79f05
Terry Wilson 2022-09-01 09:48:38 -05:00
parent 96cf8d6288
commit c3bacb3ba3

@@ -23,7 +23,6 @@ from ovsdbapp.backend.ovs_idl import idlutils
 from ovsdbapp import exceptions

 LOG = logging.getLogger(__name__)
-MAX_SLEEP = 8


 class Transaction(api.Transaction):
@@ -75,7 +74,6 @@ class Transaction(api.Transaction):
     def do_commit(self):
         self.start_time = time.time()
         attempts = 0
-        retries = 0
         if not self.commands:
             LOG.debug("There are no commands to commit")
             return []
@@ -84,6 +82,7 @@ class Transaction(api.Transaction):
                 raise RuntimeError("OVS transaction timed out")
             attempts += 1
             # TODO(twilson) Make sure we don't loop longer than vsctl_timeout
+            seqno = self.api.idl.change_seqno
             txn = idl.Transaction(self.api.idl)
             self.pre_commit(txn)
             for i, command in enumerate(self.commands):
@@ -105,20 +104,9 @@ class Transaction(api.Transaction):
             status = txn.commit_block()
             if status == txn.TRY_AGAIN:
                 LOG.debug("OVSDB transaction returned TRY_AGAIN, retrying")
-                # In the case that there is a reconnection after
-                # Connection.run() calls self.idl.run() but before do_commit()
-                # is called, commit_block() can loop w/o calling idl.run()
-                # which does the reconnect logic. It will then always return
-                # TRY_AGAIN until we time out and Connection.run() calls
-                # idl.run() again. So, call idl.run() here just in case.
-                self.api.idl.run()
-                # In the event that there is an issue with the txn or the db
-                # is down, don't spam new txns as fast as we can
-                time.sleep(min(2 ** retries, self.time_remaining(), MAX_SLEEP))
-                retries += 1
+                idlutils.wait_for_change(self.api.idl, self.time_remaining(),
+                                         seqno)
                 continue
-            retries = 0
             if status in (txn.ERROR, txn.NOT_LOCKED):
                 msg = 'OVSDB Error: '
                 if status == txn.NOT_LOCKED: