This change updates the stack without using the plan and enables
server-side environment merging, as we no longer use the
plan-environment.
It also changes the derive params playbooks to be called
without a plan.
Depends-On: https://review.opendev.org/c/openstack/tripleo-ansible/+/772197
Change-Id: I8caad3e9185f1c6d23b0941b966192957ca8320b
As we've moved to a new way of generating nic configs
with only ansible, this change checks whether there is
any custom heat nic config mapping and allows it only
if the user sets the 'NetworkConfigWithAnsible'
parameter to false.
Depends-On: https://review.opendev.org/758333
Change-Id: Ief2e6bb41687233d226ab5cb186fc6dbae191ce2
There have been cases where operators inadvertently changed the
CephClusterFSID on a stack update, which is unsupported by both
Ceph and OpenStack.
For this reason we need to check that the existing deployed Ceph
cluster ID present in the stack is consistent with the value of
the environment, raising an InvalidConfiguration exception if
they are different.
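A minimal sketch of such a check, assuming the deployed and the new parameter values are available as plain dicts. The function name and the local `InvalidConfiguration` class are illustrative stand-ins, not the actual client code:

```python
class InvalidConfiguration(Exception):
    """Stand-in for the client's InvalidConfiguration exception."""


def check_ceph_fsid_matches_env_files(stack_params, env_params):
    """Raise if CephClusterFSID in the environment differs from the stack.

    Both arguments are hypothetical plain dicts of parameter values.
    """
    deployed = stack_params.get('CephClusterFSID')
    wanted = env_params.get('CephClusterFSID')
    # Only compare when both sides actually define the parameter.
    if deployed and wanted and deployed != wanted:
        raise InvalidConfiguration(
            'CephClusterFSID %s in the environment does not match the '
            'deployed value %s' % (wanted, deployed))
```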
Change-Id: I6aca5c701cb00c82c6b3f92db72b5547799a10bf
Closes-Bug: #1882548
When using the ironic "direct" deploy interface, 'raw' images
are streamed to the target node. In comparison, 'qcow2' images
are transferred to the baremetal node and then converted
to 'raw' in RAM. This puts a high RAM requirement
on the baremetal nodes.
This change updates the image upload/update code to convert
the 'qcow2' image to a 'raw' image prior to upload/update.
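As a rough illustration, the conversion could be driven through `qemu-img`. The helper names and the choice to place the raw file next to the source are assumptions made for this sketch:

```python
import os


def raw_image_path(qcow2_path):
    """Return the raw image path next to the qcow2 source (assumed naming)."""
    base, _ext = os.path.splitext(qcow2_path)
    return base + '.raw'


def convert_command(qcow2_path):
    """Build the qemu-img command line that converts qcow2 to raw."""
    return ['qemu-img', 'convert', '-f', 'qcow2', '-O', 'raw',
            qcow2_path, raw_image_path(qcow2_path)]
```

The resulting list can be handed to `subprocess.check_call()` before the upload step.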
Related-Bug: #1893912
Change-Id: I4774e6afc3844ee7c1e8900f2509a2c402abf490
Don't use return code from ansible_runner and manage with a flag.
This also removes the fail_on_rc flag from run_ansible_playbook()
and makes it consistent to raise RuntimeError if rc != 0
everywhere including _standalone_deploy().
Change-Id: Ia5971af601d1d9500f33045768b38ac7937117f5
Closes-Bug: #1889394
This change will ensure that an ansible.cfg file stored in the config-download
directory is used and persists across runs.
Change-Id: I3b546921689d00b2cc1bde5a4d09363e65df79b5
Signed-off-by: Kevin Carter <kecarter@redhat.com>
This change will provide the operator the ability to better control
a given deployment or operational task while leveraging the
tripleoclient.
A utility has been added to sanitize user input. This will ensure
the parsed string is in valid ansible limit format.
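One way such a sanitizer might look, assuming users separate hosts with commas or whitespace and that the desired result is a colon-separated pattern. The function name and the accepted separators are hypothetical:

```python
import re


def sanitize_limit(user_input):
    """Normalise a user-supplied host list into a colon-separated limit."""
    # Split on commas and whitespace, dropping empty items.
    hosts = [h for h in re.split(r'[,\s]+', user_input.strip()) if h]
    return ':'.join(hosts)
```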
Change-Id: I190f6efe8d728f124c18ce80be715ae7c5c0da01
Depends-On: I0056fdbe3d9807e6baf4a1645a632ab9eb1b2668
Signed-off-by: Luke Short <ekultails@gmail.com>
Co-Authored-By: Kevin Carter <kecarter@redhat.com>
This change will provide the ability to load extra vars from files, instead
of having to pass options through the CLI parser. By loading vars from file
we can ensure options are made more safe and better handled, especially in
cases when a given option may be massive, as is the case with
`parameter_defaults`.
> A new argument has been added to the ansible playbook runner which will
allow us to pass options into the method that will be stored in an
extravars file, which is then dynamically loaded by ansible-runner.
Information on extravars files can be seen here: [0].
> A test has been added to exercise the new extravars file capability.
[0]: https://ansible-runner.readthedocs.io/en/latest/intro.html#env-extravars
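A small sketch of writing such a file, assuming the `env/extravars` layout described at [0]. The helper name is hypothetical, and JSON is used only because it needs nothing beyond the standard library:

```python
import json
import os


def write_extravars(private_data_dir, extra_vars):
    """Write extra_vars to <private_data_dir>/env/extravars; return the path."""
    env_dir = os.path.join(private_data_dir, 'env')
    os.makedirs(env_dir, exist_ok=True)
    path = os.path.join(env_dir, 'extravars')
    with open(path, 'w') as f:
        # JSON is a subset of YAML, so the file stays loadable as YAML.
        json.dump(extra_vars, f)
    return path
```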
Closes-Bug: #1871338
Change-Id: I9675e587abf3f07e91319a40620a8f4c67fbf97b
Signed-off-by: Kevin Carter <kecarter@redhat.com>
Fix misused ansible connection timeout and deployment timeout passed in
config download and ansible runner utility.
Allow ansible runner utility to be given a job_timeout as well.
Also fix the misuse of timeout parameters in related workflows. Add
--overcloud-ssh-port-timeout and use it to configure the ansible connection
timeout for the DeleteNode interface of the involved
workflows. Then use the timeout parameter as a real timeout instead of
mistakenly passing it as a connection timeout.
Add new unit test for ansible timeout in config_download. Add missing
coverage for the existing timeout-related params in other unit tests.
Closes-Bug: #1868063
Co-authored-by: Kevin Carter <kevin@cloudnull.com>
Change-Id: I2a4d151bcb83074af5bcf7d1b8c68d81d3c0400d
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
This used to be created earlier by the mistral workflows running before
setting the status. As we're removing those, it has to be created
if it does not exist.
Change-Id: I9600f8b08391b36eae02a051713967b932fa06d3
This change adds a switch that will enable or disable raising an
exception when a playbook executes. This will allow some methods to
return the RC and status information when a playbook is run, even when
there's a failure. The default behaviour is to raise an exception on
failure, but when fail_on_rc is set to False the run_ansible_playbook
method will return the status and rc information regardless of any
failures.
To ensure we're not raising an exception from the ansible runner library,
the raised exception has been changed to RuntimeError.
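The behaviour of the switch can be sketched as follows. The helper name is hypothetical and only mirrors the described handling of the status and rc values:

```python
def handle_playbook_result(status, rc, fail_on_rc=True):
    """Return (status, rc), or raise RuntimeError on failure when requested."""
    if rc != 0 and fail_on_rc:
        raise RuntimeError(
            'Ansible execution failed: status=%s, rc=%s' % (status, rc))
    return status, rc
```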
Change-Id: I3af652615b5227144256074c05170d148f19bc1d
Closes-Bug: #1859182
Signed-off-by: Kevin Carter <kecarter@redhat.com>
This change removes the use of mistral from the update static methods. These
methods will now call the required functions directly using
`deployment.config_download`, which will save time and improve reliability.
* The run_update_ansible_action method has been updated to ensure we're supporting
multi-playbook execution properly.
Story: 2007212
Task: 38435
Task: 38438
Change-Id: Ic324847341142829e986128d502fdcab2cbddcd8
Signed-off-by: Kevin Carter <kecarter@redhat.com>
This change converts plan status checks to using a direct call instead of
executing via mistral.
* A new method was added allowing us to update the deployment status object
when required.
* Tests have been added for the new static method update_deployment_status.
Story: 2007212
Task: 38430
Change-Id: Ie19be2078e2f349bf06e5b99ab93ca843e367463
Signed-off-by: Kevin Carter <kecarter@redhat.com>
This change adds a raised exception when the return code from a given
playbook run is not 0. This will ensure that any playbook failure is
captured, raising an exception from within the executing method. Prior
to this change exception handling was expected to be done in the calling
method, however, doing that would create lots of unnecessary code
duplication.
Change-Id: Ic742ec4653eb45a66c0d5c86d3d0ff31947be5c4
Signed-off-by: Kevin Carter <kecarter@redhat.com>
Returning and raising exceptions are mutually exclusive. We need to
return rc properly from run_ansible_playbook() for it to be used
later in tripleo deploy.
Change-Id: Ia07433fb6886931530afebad49c8b6bf1f062af5
Closes-Bug: #1859182
This change replaces all of the ansible shell commands with the
python library, ansible-runner. This library is supported by
upstream ansible, is approved by the openstack foundation, is
supported in global requirements, and provides a better, more
programmatic interface for running ansible playbooks.
All tests that interacted with the old shell commands have been
updated to now test using the library.
Change-Id: I8db50da826e2fbc074f4e7986d6fd00f6d488648
Signed-off-by: Kevin Carter <kecarter@redhat.com>
In rare cases, such as an httpd reload triggered by logrotate, heat-api returns
a 500 code. If this happens, tripleoclient can't get the status of
the stack even though the process is still ongoing.
To handle this situation, tripleoclient should retry when it can't
get the stack information because of a 500 code.
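A generic sketch of the retry idea. `ServerError` is a hypothetical stand-in for whatever exception the Heat client raises on a 500 response, and the retry count and delay are arbitrary:

```python
import time


class ServerError(Exception):
    """Hypothetical stand-in for an HTTP 500 error from the Heat client."""


def get_stack_with_retry(get_stack, retries=3, delay=0):
    """Call get_stack(), retrying when it raises ServerError."""
    for attempt in range(1, retries + 1):
        try:
            return get_stack()
        except ServerError:
            # Give up only once the last attempt has failed.
            if attempt == retries:
                raise
            time.sleep(delay)
```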
Change-Id: I97a6825f4ff9f125eb597e5b7bd0c553c37e49e7
Closes-Bug: #1855633
This patch adds a new parameter called 'gathering_policy' (defaults to
None) to the 'run_ansible_playbook' function. This parameter
controls the default policy for Ansible fact gathering. When left at
None, the default Ansible policy (i.e. 'implicit') is used.
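For illustration only, the policy could be mapped onto Ansible's `ANSIBLE_GATHERING` environment variable. The message does not say how the parameter is applied, so the helper below is an assumption:

```python
def gathering_env(gathering_policy=None):
    """Return environment overrides for the requested gathering policy."""
    env = {}
    if gathering_policy is not None:
        if gathering_policy not in ('implicit', 'explicit', 'smart'):
            raise ValueError('Unknown gathering policy: %s' % gathering_policy)
        env['ANSIBLE_GATHERING'] = gathering_policy
    # An empty dict leaves Ansible on its own default policy.
    return env
```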
Change-Id: I0668241a1675dd4e344cc24b6ff2cbb8f93b7a45
Signed-off-by: Gael Chamoulaud <gchamoul@redhat.com>
This patch adds a new parameter called 'plan' (defaults to "overcloud")
to the 'run_ansible_playbook' function. It allows executing
validations with Ansible against a plan or stack other than the default
one through the TripleO CLI. Note that this was previously possible only
when using Mistral, not with Ansible.
This patch also brings:
- Change the default values of the 'tags' and 'skip_tags' arguments
to 'None' and fix their non-Pythonic tests.
- Add a '--stack' alias to the '--plan' argument of the 'validator run'
command.
Change-Id: I6f8f55963f3f5261ec1497b650e0ca509d31dd32
Signed-off-by: Gael Chamoulaud <gchamoul@redhat.com>
The 'openstack tripleo validator list' subcommand can now get only the
available parameters for the validations using the new --parameters
argument.
```
$ openstack tripleo validator list \
--parameters \
[--validation-name <validation_id>[,<validation_id>,...] |
--group <group>[,<group>,...]]
```
Here is an output example:
```
Waiting for messages on queue 'tripleo' with no timeout.
{
"undercloud-cpu": {
"parameters": {
"min_undercloud_cpu_count": 8
}
},
"undercloud-ram": {
"parameters": {
"min_undercloud_ram_gb": 24
}
}
}
```
The --create-vars-file argument allows the operator to generate either a JSON or a
YAML file containing only the parameters of one or multiple validations.
This file can then be passed as extra vars to the validations
execution.
```
$ openstack tripleo validator list \
--parameters \
--create-vars-file [json|yaml] /home/stack/myvars \
[--validation-name <validation_id>[,<validation_id>,...] |
--group <group>[,<group>,...]]
```
Change-Id: I6e2255c0d490ee8105f0757d02f5d8fba1d4fa20
Signed-off-by: Gael Chamoulaud <gchamoul@redhat.com>
This patch adds a new parameter called 'extra_vars' (defaults to None) to
the 'run_ansible_playbook' function. This parameter passes additional
variables to the 'ansible-playbook' command. It accepts either a
dict or the absolute path of a file (YAML or JSON format).
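A sketch of how the two accepted forms might be turned into `ansible-playbook` arguments, using the `-e` option with inline JSON or an `@file` reference. The helper name is hypothetical:

```python
import json
import os


def extra_vars_args(extra_vars=None):
    """Translate extra_vars (dict or absolute file path) into CLI arguments."""
    if extra_vars is None:
        return []
    if isinstance(extra_vars, dict):
        # Inline variables are passed as a JSON string.
        return ['-e', json.dumps(extra_vars)]
    if os.path.isabs(extra_vars):
        # A leading '@' tells ansible-playbook to load variables from a file.
        return ['-e', '@' + extra_vars]
    raise ValueError('extra_vars must be a dict or an absolute file path')
```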
Change-Id: Ib25ee9593528ad680b14ca09c62addbbd0b773a3
Signed-off-by: Gael Chamoulaud <gchamoul@redhat.com>
This patch switches over to Ansible for running validations by name by
default. The --use-mistral argument has to be used in order to
execute them through Mistral.
Co-Authored-By: Gaël Chamoulaud <gchamoul@redhat.com>
Change-Id: Ia393f4d776ab2c09439e7772b5596ddbb47e0a5e
This patch adds a new parameter called 'log_path_dir' (Defaults to None)
to the 'run_ansible_playbook' function. The Ansible log file will be
created in the location of the playbook by default, otherwise in the given
directory path.
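The selection logic amounts to something like the sketch below. The helper name and the `ansible.log` file name are assumptions:

```python
import os


def ansible_log_path(playbook, log_path_dir=None):
    """Return the log file path: next to the playbook unless a dir is given."""
    base_dir = log_path_dir or os.path.dirname(os.path.abspath(playbook))
    return os.path.join(base_dir, 'ansible.log')
```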
Change-Id: I7222a116974458b9149771cb44f7d5f7bc51bc79
Signed-off-by: Gael Chamoulaud <gchamoul@redhat.com>
If the heat api is overloaded or temporarily unavailable, we might get a
503 or 504 from haproxy during the deployment. We should retry polling
for events in this case so as not to exit the deployment prematurely.
Change-Id: I947cd0f9bf4a97e46c3d2bf3e9b986f7d38e9357
Closes-Bug: #1833452
Ironic can use HTTP links or local files, and we already put the images to
a location accessible inside of ironic containers (for introspection).
This change switches to using file images for IPA. The existing Glance
images are not deleted since some nodes may be using them. Multi-arch
layout of [[PLATFORM-]ARCH/]agent.EXT is reused from the unit tests
of the `image upload` command, assuming that's what people are using.
Change-Id: Ie6fa04112e3348f429dc42b28442f8996ab03f29
Implements: blueprint nova-less-deploy
Depends-On: https://review.opendev.org/#/c/663897/
We extend run_update_ansible_action
to allow running the Ansible playbooks either with Mistral
or directly through Ansible.
This is needed for the exceptional case where we have to run
a task that depends on Mistral
while Mistral is broken, for example retrying
an upgrade operation after Mistral has broken.
Change-Id: I15511b4f36260292e0ea4100b15b8e65a701b38b
We are duplicating the code to fetch a specific value from the stack
outputs. This adds a get_stack_output_item that can be used to fetch
any named stack output and refactors the other functions that did this
same type of action.
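A sketch of the shared lookup, assuming the stack is represented as a dict whose 'outputs' entry is a list of `output_key`/`output_value` pairs, as Heat reports them. The real helper operates on a stack object, so the dict form here is a simplification:

```python
def get_stack_output_item(stack, item):
    """Return the value of the stack output named item, or None if absent."""
    for output in stack.get('outputs', []):
        if output.get('output_key') == item:
            return output.get('output_value')
    return None
```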
Change-Id: I9f45661d432c47f9df962009cc3c6b9182006d1c
The problem we're solving here is that operators using SSL + FQDN
based endpoints will have failures during the deployment because we
don't resolve the FQDNs to IP addresses, which are needed later in the
deployment for proper binding.
This patch transforms undercloud_*_host parameters into IP addresses:
- We raise if lookup returns nothing.
- We raise if lookup returns more than one IP.
- We support both IPv4 and IPv6.
- We raise if the IP is a loopback.
- We raise if the returned IP is invalid.
Utils changes:
* Introduce utils.is_valid_ip.
Return True if the IP is either v4 or v6. Return False otherwise.
* Introduce utils.is_loopback.
Return True if the given host is a loopback. Return False otherwise.
* Introduce utils.get_host_ips.
Return the list of IPs that a host resolves to.
* Introduce utils.get_single_ip.
Translate a hostname or FQDN into an IP address if it resolves to a valid IP.
Return it unchanged if it is already an IPv4 or IPv6 address.
If the host is not reachable, raise an exception.
By default it excludes loopbacks, but they can be allowed by setting
allow_loopback = True.
* Use utils.get_single_ip to translate undercloud_admin_host and
undercloud_public_host to IP addresses.
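The helpers listed above could be approximated with the standard library as follows. This is a sketch under the stated rules, not the actual utils code, and the choice of ValueError for the failure cases is an assumption:

```python
import ipaddress
import socket


def is_valid_ip(ip):
    """Return True if ip is a valid IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(ip)
        return True
    except ValueError:
        return False


def get_host_ips(host):
    """Return the sorted, de-duplicated IPs that host resolves to."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def is_loopback(host):
    """Return True if any address of host is a loopback address."""
    return any(ipaddress.ip_address(ip).is_loopback
               for ip in get_host_ips(host))


def get_single_ip(host, allow_loopback=False):
    """Translate host into exactly one IP address, or raise."""
    if is_valid_ip(host):
        ip = host
    else:
        # getaddrinfo raises socket.gaierror when the lookup fails.
        ips = get_host_ips(host)
        if len(ips) != 1:
            raise ValueError('%s resolves to %d addresses' % (host, len(ips)))
        ip = ips[0]
    if not allow_loopback and ipaddress.ip_address(ip).is_loopback:
        raise ValueError('%s is a loopback address' % ip)
    return ip
```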
Related-Bug: #1763776
Change-Id: Ic008cc758493aa95e8aa237d23c2f66c0a930509
With the new validation framework, all the validations have been
migrated into their own Ansible Roles. The validations are now usable
from tripleo-validations/playbooks directory.
Change-Id: Id836ba2d88f04a90c1738bb71f2aeb07211d2620
Implements: blueprint validation-framework
Signed-off-by: Gael Chamoulaud <gchamoul@redhat.com>
The push=False flag used with kolla-build.conf does not work when we
use buildah. A clear example is the
tripleo-build-containers-centos-7-buildah job: it's supposed to work like
the docker one, but it pushes images too.
Closes-Bug: #1822752
Change-Id: I01788b3c11ac701b2cf8c151f95ccad7046532de
As suggested by Jiri in this review adding a list() around
'map(replaced_list_value, template_part)' fixes the py3 issue
for me.
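The underlying difference is that map() returns a lazy iterator on Python 3 rather than a list. In the sketch below, `replaced_list_value` is a hypothetical stand-in for the real helper:

```python
def replaced_list_value(value):
    """Hypothetical stand-in for the real replacement helper."""
    return value.replace('old', 'new')


template_part = ['old-a', 'old-b']

# On Python 3 this is a one-shot iterator, not a list.
lazy = map(replaced_list_value, template_part)

# Wrapping in list() restores the list semantics of Python 2.
eager = list(map(replaced_list_value, template_part))
```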
Closes-bug: #1819737
Change-Id: I1241985eac1aa7e092a11db5a443da0055ff0141
Unfortunately strftime does not always return a timezone that matches up
with what is available via zoneinfo on the file system. So instead of
using that, this patch creates a function that uses timedatectl on the host
itself to determine the currently configured timezone.
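A sketch of the approach, assuming `timedatectl status` prints a "Time zone:" line. The function names and the exact parsing are illustrative:

```python
import re
import subprocess


def parse_timezone(timedatectl_output):
    """Extract the zone name from timedatectl output, or None if not found."""
    match = re.search(r'Time ?zone:\s*(\S+)', timedatectl_output)
    return match.group(1) if match else None


def get_local_timezone():
    """Ask timedatectl on the host for the configured timezone."""
    out = subprocess.check_output(['timedatectl', 'status'],
                                  universal_newlines=True)
    return parse_timezone(out)
```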
Change-Id: I0d64cb0a534b48f1aa747655f7b7d997c74d77bc
Closes-Bug: #1820081
We've been deploying with keystone v3 for some releases now, so it's
time to get rid of this file; given that it's the same content as
overcloudrc.
Change-Id: I137e08213ef7f0f49510e2ebc905e351fb25b85a
Closes-Bug: #1733640
The symlink needs to be created with sudo because if validations are
enabled, the call to run ansible-playbook-3 happens before the
"sudo tripleo deploy" command.
This patch moves the symlink into a function, that will be called before
the preflight if enabled and during the tripleo deploy/upgrade command,
which should cover standalone deployment and upgrade.
Change-Id: I1ce9b3df6c937cd796728d4fc5921fcf1023e5db
Check defined networks in the resource registry of the stack against the
networks defined in the environment files. If the environment provided
doesn't have the networks defined, it's likely they were improperly
dropped which can lead to deployment issues. This is a light check that
only checks for the existence of the networks but not the contents of
those networks. This handles the case where a user forgets to include
the network-isolation configuration on a subsequent update to the cloud.
This does not prevent a user from changing the contents of the networks
to something that breaks their deployment.
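The comparison can be sketched as a set difference over resource registry keys. The function name and the assumption that networks are identified by an `OS::TripleO::Network::` prefix are illustrative:

```python
NETWORK_PREFIX = 'OS::TripleO::Network::'


def missing_networks(stack_registry, env_registry):
    """Return network types defined in the stack but absent from the env."""
    deployed = {k for k in stack_registry if k.startswith(NETWORK_PREFIX)}
    provided = {k for k in env_registry if k.startswith(NETWORK_PREFIX)}
    # Only existence is compared, never the contents of each network.
    return sorted(deployed - provided)
```

A non-empty result would indicate that some previously deployed networks were dropped from the environment files.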
Partial-Bug: #1817631
Change-Id: Ia97a2367770e37bf8c55f2fd04c9a9efde914a67