Restart tiller on openstack pending install check

When the armada-api pod, which runs helmv2-cli, starts, it connects
tiller to a postgres instance running on the active controller, using
that controller's floating IP address. On setups with more than one
controller, the connection is no longer valid after a host swact,
because it still points to the old controller.

After the swact, the first helmv2-cli command fails with a broken
pipe error. One of the steps in the OpenStack installation checks the
system for pending helm installations by running a helmv2-cli command,
so the application operation fails if it is run right after a host
swact.

This patch is a workaround for that problem. It forces a tiller restart
by catching the first HelmTillerFailure exception caused by the broken
pipe error and retrying the operation, which reestablishes the
connection between tiller and the correct postgres instance.
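
The retry-once behaviour described above can be sketched with the
standard library alone. This is a minimal illustration, not the code in
this patch: the patch itself uses the retrying package, and the
HelmTillerFailure class, the retry_on decorator, and the FakeTiller
object below are hypothetical stand-ins introduced only for the example.

```python
import functools
import logging

LOG = logging.getLogger(__name__)


class HelmTillerFailure(Exception):
    """Stand-in for sysinv's exception.HelmTillerFailure (assumption)."""


def retry_on(exc_type, max_attempts=2):
    """Retry the wrapped call when it raises exc_type.

    max_attempts is the total number of calls, so 2 means one retry.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exc_type as ex:
                    # Out of attempts: let the caller see the failure.
                    if attempt == max_attempts:
                        raise
                    LOG.info('Caught %s on attempt %d. Retrying... '
                             'Exception: %s',
                             exc_type.__name__, attempt, ex)
        return wrapper
    return decorator


class FakeTiller(object):
    """Hypothetical tiller: the first call fails, later calls succeed."""

    def __init__(self):
        self.calls = 0

    def list_pending(self):
        self.calls += 1
        if self.calls == 1:
            raise HelmTillerFailure('broken pipe')
        return []


tiller = FakeTiller()


@retry_on(HelmTillerFailure, max_attempts=2)
def get_pending_install_charts():
    return tiller.list_pending()
```

Only the named exception type triggers a retry here; any other
exception propagates on the first failure, and a second
HelmTillerFailure is re-raised to the caller.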

Closes-Bug: #1917308
Change-Id: Ia09f9d2844611471314d5d3af70e9bbb0938437c
Signed-off-by: Gustavo Santos <gustavofaganello.santos@windriver.com>
Gustavo Santos 2021-03-26 18:11:47 -03:00
parent 26d65c3ed7
commit 1c69d99d4f
1 changed files with 9 additions and 0 deletions


@@ -20,6 +20,7 @@ from sysinv.openstack.common import context
import tempfile
import threading
import psutil
import retrying
LOG = logging.getLogger(__name__)
@@ -179,6 +180,14 @@ def delete_helm_release(release):
        timer.cancel()


def _retry_on_HelmTillerFailure(ex):
    LOG.info('Caught HelmTillerFailure exception. Retrying... '
             'Exception: {}'.format(ex))
    return isinstance(ex, exception.HelmTillerFailure)


@retrying.retry(stop_max_attempt_number=2,
                retry_on_exception=_retry_on_HelmTillerFailure)
def get_openstack_pending_install_charts():
    env = os.environ.copy()
    env['PATH'] = '/usr/local/sbin:' + env['PATH']