[TRAIN-only] Don't stop scale down if ansible fails

This is Train-only because the fix in future versions may be different.

If the ansible action fails, we still want to continue with the node
removal. Currently, if all nodes being removed are down, the scale-down
action, which is invoked using the config-download workflow, will error.
There is no way to remove the nodes if we block on this. Scale-down
actions should be best effort, so we should just emit a warning and
clean up the node(s). In future versions we could add better handling of
specific error codes via ansible-runner, but the mistral execution is
not flexible enough to skip only some errors. We had addressed this
condition previously; however, it appears that newer versions of ansible
return a non-zero rc even if we have ignore_unreachable set to
true.

Change-Id: I1fe1fecffcf3b44895721118d675205e612155eb
Related-Bug: #1887702
Alex Schultz 2020-12-16 12:43:32 -07:00
parent 43773f5a05
commit f5892a583a
2 changed files with 5 additions and 3 deletions

@@ -144,7 +144,8 @@ class TestDeleteNode(fakes.TestDeleteNode):
                 'timeout': 240
             })
-    def test_node_delete_wrong_instance(self):
+    @mock.patch('tripleoclient.workflows.scale.ansible_tear_down')
+    def test_node_delete_wrong_instance(self, mock_tear_down):
         argslist = ['wrong_instance', '--templates',
                     '--stack', 'overcloud', '--yes']
@@ -162,7 +163,7 @@ class TestDeleteNode(fakes.TestDeleteNode):
         }])
         # Verify
-        self.assertRaises(exceptions.DeploymentError,
+        self.assertRaises(exceptions.InvalidConfiguration,
                           self.cmd.take_action, parsed_args)
     @mock.patch('tripleoclient.workflows.baremetal.expand_roles',
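
The test change above relies on how `mock.patch`, used as a decorator, passes the replacement mock to the test as an extra argument, which is why the signature gains `mock_tear_down`. The following is a minimal, self-contained sketch of that mechanism. `Scale`, `take_action`, and the local `InvalidConfiguration` class are hypothetical stand-ins, not the real tripleoclient code, and the ordering of teardown before validation is an assumption made for illustration.

```python
import unittest
from unittest import mock


class InvalidConfiguration(Exception):
    """Hypothetical stand-in for the client's InvalidConfiguration error."""


class Scale:
    """Hypothetical stand-in for a module holding the teardown helper."""

    @staticmethod
    def ansible_tear_down():
        # Unpatched, this would try to reach a real deployment.
        raise RuntimeError("no deployment available in a unit test")


def take_action(requested, known):
    """Hypothetical command body: tear down first, then validate the nodes."""
    Scale.ansible_tear_down()
    missing = [name for name in requested if name not in known]
    if missing:
        raise InvalidConfiguration("Unknown nodes: %s" % ", ".join(missing))


class TestDeleteNode(unittest.TestCase):

    # The decorator replaces Scale.ansible_tear_down with a MagicMock for
    # the duration of the test and passes that mock as the last argument.
    @mock.patch.object(Scale, 'ansible_tear_down')
    def test_node_delete_wrong_instance(self, mock_tear_down):
        self.assertRaises(InvalidConfiguration,
                          take_action, ['wrong_instance'], ['node-1'])
        # The patched helper was called but did nothing.
        mock_tear_down.assert_called_once_with()
```

With the helper patched out, the teardown step can no longer fail, so the error the test observes comes from the later validation step.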

@@ -37,7 +37,8 @@ def ansible_tear_down(clients, **workflow_input):
         if payload['status'] == 'SUCCESS':
             print("Scale-down configuration completed.")
         else:
-            raise exceptions.DeploymentError("Scale-down configuration failed.")
+            print("WARNING: Scale-down configuration error. Manual cleanup of "
+                  "some actions may be necessary. Continuing with node removal.")
 def delete_node(clients, timeout, **workflow_input):
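
The best-effort behaviour described in the commit message can be sketched in isolation as follows. This is only an illustration under stated assumptions: `payloads` is a hypothetical list of status dictionaries standing in for the workflow messages, and `scale_down` with its `nodes` argument is a hypothetical caller, not the real tripleoclient entry point.

```python
def ansible_tear_down(payloads):
    """Best-effort teardown: report failures but never raise.

    `payloads` is a hypothetical iterable of dicts with a 'status' key.
    Returns True only if every payload reported SUCCESS.
    """
    all_ok = True
    for payload in payloads:
        if payload['status'] == 'SUCCESS':
            print("Scale-down configuration completed.")
        else:
            all_ok = False
            print("WARNING: Scale-down configuration error. Manual cleanup "
                  "of some actions may be necessary. Continuing with node "
                  "removal.")
    return all_ok


def scale_down(payloads, nodes):
    """Hypothetical caller: teardown failures do not block node removal."""
    ansible_tear_down(payloads)
    # Stand-in for the real node deletion step.
    removed = list(nodes)
    return removed
```

For example, `scale_down([{'status': 'FAILED'}], ['node-1'])` prints the warning and still returns `['node-1']`, whereas raising in the `else` branch would have stopped before any node was removed.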