This also moves some helper functions from the tripleo-common
mistral actions to utils.py. A subsequent patch will clean up
the mistral actions in tripleo-common.
Closes-Bug: #1915780
Change-Id: I8cf4c983c8810a42bd703b097dc1cb8034798314
Ansible runner does not serialize nested dicts correctly and
also raises an "argument list too long" error.
Change-Id: I758b5d51d4fa3ea921e032b51f06d2486f137a45
Closes-Bug: #1914369
This change updates the stack without using the plan and
also enables server-side environment merging, as we no longer
use the plan-environment.
It also changes the derive params playbooks so that they are
called without a plan.
Depends-On: https://review.opendev.org/c/openstack/tripleo-ansible/+/772197
Change-Id: I8caad3e9185f1c6d23b0941b966192957ca8320b
This change uses the stack environment instead of the swift
plan for overcloudrc generation. It also simplifies the
generation by making a direct tripleo-common API call.
Change-Id: Id0458bfdf1b819fe783e721741cd56c89b03ecd1
Let's drop this annoying thing:
/usr/lib/python3.6/site-packages/tripleoclient/workflows/plan_management.py:328: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
"Updating plan with passwords.".format(name))
Change-Id: Ia161b57b1442c2cc4bc83b1c7f94bfaf59548e2c
The client utils will now run a new playbook to ensure that the local
archive directory is created early in the deployment process. This
change will allow us to build toward a swift-less deployment. All of
the client calls, save one, have been moved to use tripleo-common, which
will help us manage swift storage and migrate from it to a
local archive.
> As a product of this change all of the "webhook" calls, which were
deprecated as part of the Zaqar and Mistral work, have been removed.
These calls were removed because several swift calls were tied into
them. Because mistral is no longer part of the stack, and has
been gone for a few cycles, we can safely remove these calls, which
do nothing.
Depends-On: Ibe9b2ffe94cdf493fc84366979d1d78b8528ea1b
Change-Id: I7531612a49527f8a21df415c648acb41ac7a0b10
Signed-off-by: Kevin Carter <kecarter@redhat.com>
Let's make the monitoring of the mistral executions a bit more robust.
If for some reason a TCP reset occurs while monitoring the execution
state of a workflow, tripleoclient completely gives up:
2020-12-21 17:16:25 | 2020-12-21 17:16:25.015 297753 ERROR tripleoclient.v1.overcloud_upgrade.UpgradeRun [-] Exception occured while running the command:
keystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to https://192.168.24.2:13989/v2/executions/dfe7ee67-6cd0-407c-9f61-b355a1cf0b25:
('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
This is bad UX because the execution is actually running via mistral
in the background just fine, it's just the monitoring that did not
survive a hiccup.
With this patch we were able to inject artificial problems on mistral
with the reproducer below [1] and the client survived them just fine:
TASK [Gathering Facts] *********************************************************
Tuesday 22 December 2020 13:27:38 +0000 (0:00:03.337) 0:00:03.438 ******
ok: [controller-1]
2020-12-22 13:27:41.242 3728 WARNING tripleoclient.workflows.base [-] Connection failure while fetching execution ID. Retrying: Unable to establish connection to https://192.168.24.2:13989/v2/executions/bb693f44-180f-4b16-a215-d379e0fd9209: HTTPSConnectionPool(host='192.168.24.2', port=13989): Max retries exceeded with url: /v2/executions/bb693f44-180f-4b16-a215-d379e0fd9209 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fb4b22d9e80>: Failed to establish a new connection: [Errno 111] Connection refused',)): keystoneauth1.exceptions.connection.ConnectFailure: Unable to establish connection to https://192.168.24.2:13989/v2/executions/bb693f44-180f-4b16-a215-d379e0fd9209: HTTPSConnectionPool(host='192.168.24.2', port=13989): Max retries exceeded with url: /v2/executions/bb693f44-180f-4b16-a215-d379e0fd9209 (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fb4b22d9e80>: Failed to establish a new connection: [Errno 111] Connection refused',))
ok: [controller-0]
PLAY [Load global variables] ***************************************************
The mistral execution continues correctly and tripleoclient deals with the hiccup without erroring out.
[1] A quick reproducer for this issue is to run a longer workflow (minor
update or FFU of a node) and, once tripleoclient is only monitoring the
mistral execution, run:
iptables -I INPUT -p tcp --dport 13989 -j REJECT
sleep 13
iptables -D INPUT 1
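The retry behaviour described above can be sketched roughly as follows.
This is a minimal illustration, not the code from this patch:
ConnectFailure here is a local stand-in class, and fetch_with_retry, its
parameters and the retry count are hypothetical names chosen for the
example.

```python
import time


class ConnectFailure(Exception):
    """Local stand-in for a transient connection error (hypothetical)."""


def fetch_with_retry(fetch, retries=10, delay=1.0, sleep=time.sleep):
    """Call fetch() and retry on ConnectFailure, up to `retries` attempts."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return fetch()
        except ConnectFailure as exc:
            last_error = exc
            print("Connection failure (attempt %d/%d). Retrying: %s"
                  % (attempt, retries, exc))
            # Only wait if another attempt is still to come.
            if attempt < retries:
                sleep(delay)
    # Every attempt failed: surface the last error to the caller.
    raise last_error


# Simulate two refused connections followed by a successful poll.
responses = iter([ConnectFailure("refused"), ConnectFailure("refused"),
                  "RUNNING"])


def flaky_fetch():
    item = next(responses)
    if isinstance(item, Exception):
        raise item
    return item


state = fetch_with_retry(flaky_fetch, retries=5, delay=0,
                         sleep=lambda _: None)
```

The monitoring loop only gives up once every attempt has failed, so a
short outage such as the one produced by the reproducer does not abort
the client.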
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: Ie08f3bc7c7cd7796f067a9ee0a99c017a5567ea2
Closes-Bug: #1909019
Users may want to skip the container prepare process to ensure they
don't update their containers. This change adds a
--disable-container-prepare flag that should skip the preparation
steps for the following actions:
- overcloud deploy (and updates)
- overcloud plan actions
- undercloud deploy (and upgrades)
- tripleo deploy (and standalone deploy)
Closes-Bug: #1896757
Depends-On: https://review.opendev.org/#/c/737522/
Change-Id: I30b448930f53aef108d9bdb544a6d02b18658b0d
Ansible forks are now configurable for a deployment. The default
will also no longer exceed 100 forks (but --ansible-forks can).
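A rough sketch of how a capped default can coexist with an explicit
override is shown below. Only the 100-fork cap and the --ansible-forks
option come from this message; the ten-forks-per-CPU heuristic and the
function names default_forks and resolve_forks are assumptions made for
illustration.

```python
import argparse
import multiprocessing

MAX_DEFAULT_FORKS = 100  # cap on the default, as stated in this message


def default_forks(cpu_count=None):
    """Derive a default fork count that never exceeds MAX_DEFAULT_FORKS.

    The 10-forks-per-CPU heuristic is an assumption for this sketch.
    """
    if cpu_count is None:
        cpu_count = multiprocessing.cpu_count()
    return min(cpu_count * 10, MAX_DEFAULT_FORKS)


def resolve_forks(argv, cpu_count=None):
    """Return the explicit --ansible-forks value, else the capped default."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--ansible-forks", type=int, default=None)
    args = parser.parse_args(argv)
    if args.ansible_forks is not None:
        # An explicit value is taken as-is and may exceed the cap.
        return args.ansible_forks
    return default_forks(cpu_count)
```

With this shape, resolve_forks([], cpu_count=64) stays at the cap, while
resolve_forks(["--ansible-forks", "250"]) passes the operator's value
through unchanged.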
Change-Id: Iaa6763a3124e45c1f0297cd59cc57f206dfc5cda
Depends-On: I57345d5b100efce143fa940b56c81f5e6bc6c390
Signed-off-by: Luke Short <ekultails@gmail.com>
This patch makes sure that the following two parameters are correctly
passed when creating the deployment plan.
- use_default_templates
- source_url
It also removes the validate_stack parameter, which is never used
in the playbooks invoked to create the deployment plan.
Depends-on: https://review.opendev.org/#/c/743128/
Change-Id: If54940ec8ee127c73d0b5820704f119466f80df0
This patch makes sure that explicit arguments are defined for the
update_deployment_plan method.
With this change, update_deployment_plan should raise TypeError when
the method is called with the wrong arguments. This helps us detect
interface mismatches more easily than the KeyError it currently
raises.
This patch also removes the validate_stack parameter from update deployment
plan, which is always False.
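The difference between the two failure modes can be illustrated with a
small sketch. Both functions below are hypothetical stand-ins and their
signatures are assumptions; they are not the real update_deployment_plan.

```python
def update_plan_from_dict(params):
    """Old style (hypothetical): a missing key surfaces as KeyError."""
    return (params["container"], params["source_url"])


def update_plan_explicit(container, source_url=None,
                         use_default_templates=False):
    """New style (hypothetical signature): bad calls raise TypeError."""
    return (container, source_url, use_default_templates)


def error_name(func, *args, **kwargs):
    """Return the name of the exception raised by func, or None."""
    try:
        func(*args, **kwargs)
    except (KeyError, TypeError) as exc:
        return type(exc).__name__
    return None


# A missing dict key is only noticed when the key is looked up.
missing_key = error_name(update_plan_from_dict, {"container": "overcloud"})

# An unexpected keyword argument is rejected at call time.
bad_kwarg = error_name(update_plan_explicit, container="overcloud",
                       validate_stack=False)
```

With explicit arguments the mismatch is reported at the call site, which
is usually easier to trace than a KeyError raised deeper in the method.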
Change-Id: I7dd35e64a32b99dd018ef8835d51ebb26fce8051
Now that we've decided to ignore skiplist when limit
is specified, ensure we do that in master/ussuri.
Change-Id: I8a67554f268042d2cceebf3ec3b1f098f001bbfd
Related-Bug: #1857298
--node-timeout - Maximum timeout for node introspection
--max-retries - Maximum introspection retries
--retry-timeout - Maximum timeout between introspection retries
Change-Id: I9c245dbc258c9714bb5a581d6d4d23b42cf53198
This change allows us to identify a set of parameters which should
not be passed in the upgrade prepare or upgrade converge steps.
As it is now, it is mostly intended to block the converge step
if the FFU parameters (Stein registry parameters) were left in
the environment files before running the converge step. However,
it will also allow blocking the upgrade prepare step if a
deprecated or not recommended parameter is provided in the templates.
It works by converting every YAML file passed in the environment
files into a list of keys (only for parameter_defaults so far)
and then intersecting the list of forbidden parameters with that
list of keys. If there is a match, an exception is raised showing
those parameters:
ERROR openstack [-] The following parameters should be removed from
the environment files:
ceph3_namespace
name_suffix_stein
tag_stein
name_prefix_stein
ceph3_image
namespace_stein
ceph3_tag
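The key intersection described above can be sketched as follows. This is
an illustration under assumptions, not the code in this change: the
environments are taken as already-parsed dicts, the section is assumed to
be the Heat parameter_defaults key, keys are collected recursively, and
the names collect_keys, find_forbidden and check_environments are
hypothetical.

```python
def collect_keys(node):
    """Recursively collect every dict key found under node."""
    keys = set()
    if isinstance(node, dict):
        for key, value in node.items():
            keys.add(key)
            keys |= collect_keys(value)
    elif isinstance(node, list):
        for item in node:
            keys |= collect_keys(item)
    return keys


def find_forbidden(environments, forbidden):
    """Return the sorted forbidden parameters present in the environments."""
    keys = set()
    for env in environments:
        # Only the parameter_defaults section is inspected (assumption).
        keys |= collect_keys(env.get("parameter_defaults") or {})
    return sorted(keys & set(forbidden))


def check_environments(environments, forbidden):
    """Raise ValueError listing any forbidden parameters that were found."""
    found = find_forbidden(environments, forbidden)
    if found:
        raise ValueError(
            "The following parameters should be removed from "
            "the environment files:\n" + "\n".join(found))


# Example input: one parsed environment with a leftover Stein parameter.
example_env = {
    "parameter_defaults": {
        "ContainerImagePrepare": [
            {"set": {"tag_stein": "latest", "namespace": "example"}},
        ],
    },
}
leftovers = find_forbidden([example_env], {"tag_stein", "ceph3_tag"})
```

Using a set intersection keeps the check independent of how many
environment files are passed and of the order in which they are merged.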
Change-Id: I24715f5e55d4cd6cf9879345980d3a3c5ab8830c
The __future__ module [1] was used in this context to ensure compatibility
between python 2 and python 3.
We previously dropped support for python 2.7 and now support only
python 3, so we no longer need this module or the imports
listed below.
Change-Id: I19fbdebe406575d2567f98a322ff68e6e992fac7
It seems we want to output stack events irrespective of verbosity,
as per https://review.opendev.org/#/c/724856/. There is no point in
keeping the unneeded verbosity logic in wait_for_stack_ready().
This also fixes the stack update for scale-down so that it lists events.
Change-Id: I96a2a2255253aa2feac62b67ad5d5813e3126a20
This change reverts the change to verbose_level because it enables
ansible verbosity by default. Additionally this change hard codes true
for the wait_for_stack_ready function call so we always get the heat
stack creation output.
Revert "Don't override default verbose_level"
This reverts commit c8cd714588.
Change-Id: Idcc00fa04d4a356efb9e0bc30bc24e0121f31961
This change will provide the operator the ability to better control
a given deployment or operational task while leveraging the
tripleoclient.
A utility has been added to sanitize user input. This will ensure
the parsed string is in a valid ansible limit format.
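A minimal sketch of such a sanitizer is shown below. The function name
sanitize_limit and the exact set of accepted separators are assumptions
for illustration; the utility added by this change may behave
differently.

```python
import re


def sanitize_limit(raw):
    """Normalise comma-, space- or colon-separated hosts to 'a:b:c'."""
    parts = re.split(r"[,\s:]+", raw.strip())
    # Drop empty fragments left behind by leading or repeated separators.
    return ":".join(part for part in parts if part)


limit = sanitize_limit("controller-0, compute-1  compute-2")
```

Characters that are not separators, such as a leading "!" used to
exclude a host pattern, are passed through untouched.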
Change-Id: I190f6efe8d728f124c18ce80be715ae7c5c0da01
Depends-On: I0056fdbe3d9807e6baf4a1645a632ab9eb1b2668
Signed-off-by: Luke Short <ekultails@gmail.com>
Co-Authored-By: Kevin Carter <kecarter@redhat.com>
This was added to fetch the ansible errors via the mistral API. We
removed the json_error.py callback plugin, which used to write
ansible-errors.json, long ago when moving the ansible code to the
tripleo-ansible repo, so the command is useless now. Also, users
won't need it as they can get the errors from ansible.log.
Change-Id: Iceb4c282f3ae9fcb2c3836263ba66f5792e42334
We don't use mistral anymore to set the deployment status.
This simplifies setting of the deployment status and also
removes the usage of DeploymentStatusAction.
Change-Id: I5454510dc3aae681d5b37570a15cf0d0c8a3a114
We need to set the deployment status before the stack deployment
and, on playbook failure, after the stack deployment.
Change-Id: Ic353b6552d28d66bcc53a7a4df318e60de5bd320
Fix the misused ansible connection timeout and deployment timeout passed
in config download and the ansible runner utility.
Allow the ansible runner utility to be given a job_timeout as well.
Also fix the misuse of timeout parameters in related workflows. Add
--overcloud-ssh-port-timeout and use it to configure the ansible
connection timeout for the DeleteNode interface of the involved
workflows. Then use the timeout parameter as a real timeout instead of
mistakenly passing it as a connection timeout.
Add new unit test for ansible timeout in config_download. Add missing
coverage for the existing timeout-related params in other unit tests.
Closes-Bug: #1868063
Co-authored-by: Kevin Carter <kevin@cloudnull.com>
Change-Id: I2a4d151bcb83074af5bcf7d1b8c68d81d3c0400d
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
In order to make it configurable via env/settings,
use writable tmp paths for ansible runner. This also aligns with
the way we call it in other places.
Change-Id: I64999f19b4ce2083f05e09c40d6b89c8d8ba2cdd
Related-bug: #1868063
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
This change will set the verbosity consistently whenever a playbook is
executed via the client.
All tests have been updated to ensure that the verbosity setting is always
defined when a playbook is executed.
Change-Id: I35b10d48344c8b7f71186bc529a300f75d7b8d63
Signed-off-by: Kevin Carter <kecarter@redhat.com>
This workflow is no longer required because it has been converted to an
ansible playbook.
Story: 2007212
Task: 39045
Depends-On: I3d1c736f6d1ee704ece0101134f95582a5d060eb
Change-Id: I829217c99456f8ac8f49542eb1842c451fe77ad1
Signed-off-by: Kevin Carter <kecarter@redhat.com>
This change removes the `stack_management.py` workflow file. This file
is being deleted because it is unused since we converted the overcloud
delete workbook.
Story: 2007212
Task: 38418
Change-Id: I8adf62fce31ecd69464c429d5988c8f171bb640f
Signed-off-by: Kevin Carter <kecarter@redhat.com>
As we now have an ansible playbook running the stack create/update,
which only finishes once the stack has reached *_COMPLETE/*_FAILED,
we need to get the event marker before doing the update; otherwise
no events would be listed for a subsequent deploy (stack update).
The call to utils.wait_for_stack_ready() is left in place for the
time being. We could instead just get the events once the stack
operation has finished, or move the event polling to the playbook.
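The reason the marker has to be captured before the update can be shown
with a small simulation. Events are modelled as plain dicts in an
in-memory list; latest_marker and events_after are hypothetical helpers
and do not represent the heat client API.

```python
def latest_marker(events):
    """Return the id of the newest event, or None if there are none."""
    return events[-1]["id"] if events else None


def events_after(events, marker):
    """Return the events newer than marker (all events if it is unknown)."""
    ids = [event["id"] for event in events]
    if marker is None or marker not in ids:
        return list(events)
    return events[ids.index(marker) + 1:]


# Events that exist from the initial deployment.
events = [{"id": "e1", "status": "CREATE_IN_PROGRESS"},
          {"id": "e2", "status": "CREATE_COMPLETE"}]

# Capture the marker BEFORE the blocking update starts.
marker_before = latest_marker(events)

# The playbook runs the update to completion and emits new events.
events += [{"id": "e3", "status": "UPDATE_IN_PROGRESS"},
           {"id": "e4", "status": "UPDATE_COMPLETE"}]

# A marker taken only after the update points at the last event.
marker_after = latest_marker(events)

shown_with_early_marker = events_after(events, marker_before)
shown_with_late_marker = events_after(events, marker_after)
```

With the early marker the two update events are listed; with the late
marker nothing is left to list, which matches the problem described
above.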
Change-Id: I746f3bb135fac309bb7e780648e3e8216d76770e