When deploying TripleO in dry-run (Ansible check mode set to True), we don't
need to generate the Hieradata: the role expects actual files
and data, which don't exist in check mode.
Let's just skip the whole role when check mode is used.
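A minimal sketch of the idea, assuming a role include guarded by the
built-in ansible_check_mode fact (the role name is only a placeholder):

    # Hypothetical sketch: skip hieradata generation entirely in check mode
    - name: Generate hieradata
      include_role:
        name: tripleo-hieradata        # placeholder role name
      when: not ansible_check_mode | bool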
Change-Id: I1f54878ef6a59ab772e6c2f75493efb2eb45cf3a
Instead of including the entire deploy_steps_tasks.yaml tasks file for
each role at each step, use the per-step files, and only include them
when they exist.
This cuts down on the amount of time that ansible has to spend skipping
tasks that don't get run at a certain step, which can be significant at
scale.
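A minimal sketch of the pattern, assuming the file path and variable
names below (they are illustrative, not the exact template output):

    # Only include a per-step tasks file when it actually exists
    - name: Check for per-step tasks file
      stat:
        path: "{{ playbook_dir }}/deploy_steps_tasks_step{{ step }}.yaml"
      delegate_to: localhost
      register: step_tasks_stat

    - name: Run per-step tasks
      include_tasks: "deploy_steps_tasks_step{{ step }}.yaml"
      when: step_tasks_stat.stat.exists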
Change-Id: I06ee04b9b47226433f25e3cff08c461462a907d9
Depends-On: Id5fdb4dd1a6290d1097d2d81523161c87ab6d4dd
Instead of executing these debug tasks for every host, we can delegate
them to localhost and only run them once.
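A hedged sketch of the pattern (the task and message are illustrative):

    - name: Print the current deploy step        # illustrative debug task
      debug:
        msg: "Running deploy step {{ step }}"
      delegate_to: localhost
      run_once: true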
Change-Id: I367db4c9e743c77f121751ecca6bb4996b8a3df9
include_tasks is dynamic and the tasks are either included (or not) at
runtime. This has the advantage that if a "when" keyword excludes the
include_tasks, then all the tasks are excluded as a group.
This is opposed to import_tasks, which happens at playbook parse time. The
"when" keyword is inherited by each individual task that was imported.
While the two are functionally equivalent for these use cases,
import_tasks ends up being much slower, since ansible then has to
compute a much larger set of tasks to skip at runtime. Using
include_tasks is much faster, even at small scale (~50 hosts).
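A hedged sketch of the difference (the file name is illustrative):

    # include_tasks: the "when" is evaluated once; if it is false, nothing
    # inside step_1_tasks.yaml is even loaded.
    - include_tasks: step_1_tasks.yaml
      when: step|int == 1

    # import_tasks: every task inside step_1_tasks.yaml is parsed and
    # inherits the "when", so each one has to be skipped individually at
    # runtime.
    - import_tasks: step_1_tasks.yaml
      when: step|int == 1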
Change-Id: I2db81d39b3294aa2784a340562f10fd9bf3fe9ee
- Force fact gathering so that we are guaranteed to have the proper FQDN
- Update the start sequence so that our scale-down process does not start
from irrelevant steps
- Correct the list evaluation. The compute service argument should have one
item in the list. Prior to this change it was expecting zero items,
which caused the removal tasks to be skipped (see the sketch after this list).
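A hedged sketch of the kind of condition involved (the task, file and
variable names are illustrative, not the actual change):

    # Before: the removal tasks expected an empty list and were skipped
    #   when: compute_service_list | length == 0
    # After: expect exactly one matching compute service
    - name: Run scale-down removal tasks
      include_tasks: remove_compute_service.yaml
      when: compute_service_list | length == 1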
Co-Authored-By: "Cedric Jeanneret (Tengu) <cjeanner@redhat.com>"
Co-Authored-By: "Emilien Macchi <emilien@redhat.com>"
Co-Authored-By: "Kevin Carter (cloudnull) <kecarter@redhat.com>"
Change-Id: I7c1615685718924e872a2f9173b15c63bba8c482
Closes-Bug: #1856062
"service_names" was a useful hieradata which listed the services enabled
for a specific roles vs "enabled_services" which are for all the
services enabled in the cloud, no matter the role.
This is re-added for backward compatibility.
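A hedged sketch of the distinction in hieradata (the service lists are
illustrative):

    # Role-specific, e.g. on a Compute node
    service_names:
      - nova_compute
      - nova_libvirt
    # Cloud-wide, identical on every node regardless of role
    enabled_services:
      - nova_compute
      - nova_libvirt
      - nova_api
      - haproxy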
Depends-On: I75b1112089a66cf5db0a2fd651bb24428cf861fd
Related-Bug: #1855138
Change-Id: I7339b8791817bdaffa65c928d424796114efdf57
Just to ensure we have the right label, even if something mounts the
directory with re-labelling. This avoids any chance of a race condition.
Also update the old svirt_sandbox_file_t alias, since the common type is
now "container_file_t".
Change-Id: Ic036ad901885f9d8c8072b560f2d9f3c8e919d58
Closes-Bug: #1854377
This has been observed downstream via https://bugzilla.redhat.com/show_bug.cgi?id=1777529
The scenario is as follows:
1) Undercloud VM is configured with a bunch of interfaces using DHCP
2) One NIC (eth0 in this case) is on a network without a dhcp server
3) Said NIC is configured in undercloud.conf to be part of the
br-ctlplane bridge
In the above scenario the undercloud install will fail with the
following error:
"Unable to start service network: Job for network.service failed because the control process exited with error code...."
The reason is that eth0 times out when asking for dhcp and so the whole
network.service start command fails.
We can just move the network service enablement *after* os-net-config
has run. This fixes the scenario (I just tested it) and still covers
enabling the network service in order to maintain network
connectivity at reboot, which was the reason for adding this code in the
first place anyway.
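A hedged sketch of the ordering (task names and the os-net-config
invocation are illustrative):

    - name: Apply network configuration           # illustrative invocation
      command: os-net-config -c /etc/os-net-config/config.yaml

    - name: Enable the network service only after interfaces are configured
      systemd:
        name: network
        enabled: true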
Change-Id: I0d13c9ee2490aa765d546961c9a1fc14e931c0c7
Since the "render all_nodes data as group_vars" tasks are skipped when
ansible check is true, we should not run the "set all_nodes data as
group_vars" tasks as well. It was missed and now failing when --check is
used (see bug report).
Change-Id: I2852c8285a0e72d855bfed216b53de6bdeeabe68
Closes-Bug: #1854246
It may happen that this service isn't enabled (especially on newer
CentOS/RHEL releases), leading to network misconfigurations.
Ensuring it's actually enabled allows br-ctlplane to get configured upon
reboot.
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1774581
Change-Id: I4a56d837e08498bc4d25d25c7203bfd7d012974e
Co-Authored-By: Alex Schultz <aschultz@redhat.com>
That step includes, in particular, the creation of the
container-puppet.py/sh files plus some other tasks that should happen
very early during the deploy process.
During an update, if a change to container-puppet.py happens in the
template, we have to trigger those steps to pick up the changes. This is
especially critical because we run the puppet container configuration
during the update stage.
Closes-Bug: #1853156
Change-Id: I26406da82c584dc5093c17ad26f263057a5cbcaa
This patch updates the templates to use the new ansible role,
tripleo-hosts-entries, for managing the entries in /etc/hosts instead of
the values from the Heat stack.
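A hedged sketch of how the role can be pulled in from the generated
playbooks (placement is illustrative):

    - name: Manage /etc/hosts entries
      include_role:
        name: tripleo-hosts-entries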
Change-Id: I606e0f27f9f9ae9d85bc0fc653f8985eb734d004
Depends-On: Ia02ca1263590e2b579f2534e99119d7b1cd4b39a
We don't want to run the "Render all_nodes data as group_vars for overcloud"
tasks in check mode, since they rely on the "whoami" command, which
doesn't execute in check mode.
Change-Id: Ia8df4794fcf8eff9a2d5a8a7e99e0e5ebf1f8e1f
For each step and each container, generate a unique JSON file; Paunch
will be able to read them all thanks to the patch in the dependency
(tripleo-ansible).
Old location:
/var/lib/tripleo-config/container-startup-config-step1.json
We keep the old files for backward compatibility.
New location:
/var/lib/tripleo-config/container-startup-config/step_1/haproxy.json
Note: hashed files won't be generated for the old location anymore,
since that is handled by container-puppet.py, to which we now pass the
new location.
Story: 2006732
Task: 37162
Depends-On: If0f1c6c308cd58f7baa9a8449fbf685ff10f0e0a
Change-Id: I1cf8923a698d0f6e0b1e00a7985f363a83e914c4
Because these steps are shared with the undercloud/standalone/etc, it
doesn't necessarily make sense to declare they are Overcloud steps.
Let's just call them Deploy steps.
Change-Id: I60124d27f305333c8d54175ac5f2b2e500b45409
Previously, the use of double quotes in these tasks caused ansible to
escape the double quote when printing the debug message. The escape
character (\) made copy/pasting the task name more difficult.
Switching to single quotes around the task name makes it easier for the
operator to copy/paste so that --start-at-task can be easily used.
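A hedged before/after sketch (the task name and message are illustrative):

    # Before: double quotes around the name get escaped in the debug output
    - debug:
        msg: "Use --start-at-task \"Deploy step 3\" to resume"
    # After: single quotes need no escaping and stay copy/paste friendly
    - debug:
        msg: "Use --start-at-task 'Deploy step 3' to resume"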
Change-Id: I5e5837cc22462769de05be73249633de4fa65fcf
Earlier, KernelArgs had been configured using ansible
tasks that were part of the THT repo. Those ansible tasks have been
moved to the tripleo-kernel role of tripleo-ansible. This
role will be invoked from the boot-params-service.
boot-params-service has been moved from pre network to
the deployment/kernel directory.
OvS-DPDK configuration was done using puppet-vswitch
module by invoking puppet in PreNetworkConfig's
ExtraConfig script. A new ansible role tripleo-ovs-dpdk
has been created to apply the DPDK configurations via
ansible instead of puppet. This role will be common
for both ml2-ovs and ml2-ovn. Common parameter merging
has been enhanced to provide common deploy steps.
ODL is not validated as it has been deprecated and currently has
no active usage or development.
Depends-On: https://review.opendev.org/#/c/688864/
Change-Id: I4b6f7d73cf76954e93760c59d522d485187157cf
After the refactoring of step_1 to improve speed during deployment, we
were no longer updating, for instance, the
/var/lib/container-puppet/container-puppet.json or the
container-startup-configs.json.
The net result is that the configuration is done on the initially
deployed containers and non-HA containers are not updated.
Note that not all step 1 tasks were refactored into
deploy_steps_tasks_step_1. So we still need to loop over step 1 in the
deploy_steps_tasks.
Closes-Bug: #1848359
Change-Id: I76b16cd7781ea601778004afa4e0bc3020ce3c59
The default is True for backward compatibility. For now, only Paunch is
supported, but there are some alternatives under development:
I2f88caa8e1c230dfe846a8a0dd9f939b98992cd5
Change-Id: Iceff88c6f4710c8023541314ac08c1ea4100cee7
The role name was missing from the task name, so the debug task that
refers to the task name was incorrect. This patch adds the name so that
--start-at-task can be used with Host prep steps for a specific
role.
Change-Id: Ie0888d5d075f0fae528bc4c71482246d334632ef
container-puppet.py had within it a nested bash script embedded as
a string literal. This change moves that script into a stand-alone
file which will be copied to the host and used when
needed. This change begins to deconstruct
the container-puppet.py process, allowing us to simplify it and,
in turn, make deployments more reliable, faster,
and easier to understand.
Change-Id: I447f33f159c5a50e55401b4e0a893c250869d185
Signed-off-by: Kevin Carter <kecarter@redhat.com>
Use update_serial as an ansible variable so that the value can be
overridden with -e on the ansible-playbook command line.
This should result in the following:
at the heat level:
* if no update_serial is provided in the roles_data, the default
value is 1
* if update_serial is provided by the roles_data, the default value
for the ansible var is the one from the roles_data.
* if ansible-playbook -e update_serial=<something> is used, the serial
value is the one provided by the CLI.
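A hedged sketch of the precedence (play layout and file names are
illustrative):

    # The play consumes the ansible variable, so -e overrides the
    # rendered default from roles_data.
    - hosts: Compute
      serial: "{{ update_serial | default(1) }}"
      tasks:
        - debug:
            msg: "Updating {{ inventory_hostname }}"

    # Override from the CLI:
    #   ansible-playbook update_steps_playbook.yaml -e update_serial=5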
Change-Id: Id4a6326977bebba56c3da2dbf3c4113b6658d433
These appear to be dumping the vars to the logs, which is quite verbose
and unnecessary. Let's no_log the include vars tasks so we don't end up
with the vars constantly dumped out.
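A hedged sketch of the change (the file name is illustrative):

    - name: Load additional variables
      include_vars: global_vars.yaml    # illustrative file name
      no_log: true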
Change-Id: I06a5f893adef4eaf6db17bf5ead4653cfed599f1
The large block of tasks in deploy-steps-tasks.yaml that is run only at
step 1 results in a large number of tasks being skipped at steps 2-5.
Task skipping is not "free", as ansible has to repeatedly compute what
needs to be skipped.
Note that the previous usage with the block statement causes the "when"
statement to be inherited by each task, so it has to be recalculated for
every task across every node for all steps.
The issue starts to become apparent with deployments with a larger
number of nodes.
This commit refactors the step 1 tasks into their own tasks file, and
then uses the include_tasks module to only include the tasks at step 1.
A separate play is also used for these step 1 tasks so that ansible
doesn't have to repeatedly calculate whether the include_tasks should be
executed for steps 2-5 as well.
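A hedged sketch of the resulting structure (play, host and file names
are illustrative):

    # Separate play: the include is evaluated once per host, only at step 1
    - hosts: overcloud
      name: Deploy step tasks for step 1
      tasks:
        - include_tasks: deploy-steps-tasks-step-1.yaml
          when: step|int == 1

    # The common deploy-steps-tasks.yaml keeps running for every step,
    # without the big step-1 block to skip.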
Change-Id: Id5985ce8ac741baa9adc9f5874df0459fd4c24b2
Up to now, each of the upgrade steps to gather facts from the
Undercloud and Overcloud nodes did contain a tag named facts.
This allowed us to run the facts gathering at will; however,
there is a problem with the upgrade_tasks, as it is allowed to
pass a specific tag during their execution. So, if the tag passed
is different from facts, for example system_upgrade, then the
facts gathering tasks won't be executed and the upgrade tasks
will fail when trying to access some of the ansible facts (such
as ansible_hostname).
The solution for this is to always run the facts gathering: no
matter what tag is being passed, we'll have the facts available
during the execution.
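A hedged sketch of how the fact gathering can be forced to run
regardless of the requested tags (the task is illustrative):

    - name: Gather facts from the overcloud nodes
      setup:
        gather_subset:
          - "!all"
          - min
      tags:
        - facts
        - always    # runs even with e.g. --tags system_upgrade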
Change-Id: Ibee072bec9916804163dab29e164eb1423cd08f8
Named debug ansible tasks have been added to the plays that get
generated in deploy_steps_playbook.yaml (from common/deploy-steps.j2).
The explicitly named tasks allow for using ansible-playbook's
--start-at-task option to resume a deployment from the start of a given
play.
For example, this could be used to resume a deployment with:
ansible-playbook ... --start-at-task "Overcloud common deploy step tasks 3" ...
Previously this was not possible since many of the tasks that got
generated in common_deploy_steps_tasks.yaml used an ansible variable in
the name, which is not resolved until runtime, so --start-at-task is
ignored.
Change-Id: If40a5ecaacf8c74c98775eb6bde05d967694f640
There are places where we have include_tasks in nested tasks. It means
that -t "tag_from_nested_task" won't work, as all nested tasks would need
either the "always" tag or would have to use import_tasks instead.
This patch replaces include_tasks with import_tasks for the External
steps, allowing us to run
openstack overcloud external-update run --tags tag_from_nested_task
openstack overcloud external-upgrade run --tags tag_from_nested_task
without hacking all the nested tasks.
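A hedged sketch of why import_tasks helps here (file and tag names are
illustrative):

    # include_tasks: the nested tasks are only loaded at runtime, so their
    # tags are invisible to --tags and the whole include gets skipped.
    - include_tasks: external_update_tasks.yaml

    # import_tasks: the nested tasks and their tags are known at parse
    # time, so --tags tag_from_nested_task can match them directly.
    - import_tasks: external_update_tasks.yaml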
Closes-Bug: #1842410
Change-Id: I51a148cdc5538d5a1106d58d227d361d1e6f9e19
To reflect the change done in dependency, we need to change the variable
name.
Change-Id: Id0f708e6d6ce9565a5569716edfe993c7eceabaa
Related-Bug: #1842141
Depends-On: Idbdd6a21eb2cd488daa1e3ddc844b4fc5267047c
Instead of using one task include per role in the same play, use
separate plays. This reduces the amount of task skipping that
Ansible has to do, since each include only applies to a single role.
In a deployment with many roles, this will improve performance.
Change-Id: I01ef631ea3dad8b9c030d61ed0883a9af13616ad
When using --tags to run only a subset of tasks from
common/deploy-steps-tasks.yaml, the condition that checks the result of
container_startup_configs_json_stat.stat.exists was failing since the
task that defined it was skipped.
This patch adds an additional "is defined" check to ensure the var is
defined before the result is checked.
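A hedged sketch of the guard; the surrounding task and the polarity of
the exists check are illustrative:

    - name: Act on the startup configs file       # illustrative task
      command: /bin/true                           # placeholder action
      when:
        # "is defined" comes first so the second condition is never
        # evaluated when the stat task was skipped by --tags
        - container_startup_configs_json_stat is defined
        - container_startup_configs_json_stat.stat.exists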
Change-Id: Iadde90ed9416902848df2e60551470c0f1689a32
Once the tripleo-ansible patch has merged that also expects the
bootstrap node to be chosen from the sorted inventory group, this
variable can be removed.
Depends-On: I4b280bcafce42bba4c823b8205f296b83e2f3e5d
Change-Id: Icf654f6597291ddff8aa90b3afb8679dce9e2284
Previously, the task that checks whether container_puppet_tasks should be
written to the bootstrap node did not sort the inventory group, which
meant the order coming out of the Ansible inventory was undefined.
This patch adds a sort filter to the check so that by convention the
first node is chosen from the inventory.
A variable, sorted_bootstrap_node, is also added to control the sorting
in the bootstrap_node.j2 template from the tripleo-hieradata role so
that the tripleo-ansible patch is able to merge first.
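A hedged sketch of the pattern (group and variable names are
illustrative):

    # Pick a deterministic bootstrap node by sorting the group first
    - set_fact:
        bootstrap_node: "{{ groups['Controller'] | sort | first }}"

    - name: Write container_puppet_tasks on the bootstrap node only
      debug:
        msg: "Bootstrap node is {{ bootstrap_node }}"
      when: inventory_hostname == bootstrap_node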
Depends-On: I3d595ea5b84f940a3b2dbc69798f69fe03529c10
Change-Id: I6b93f5b0747c5a11d24615a3bbb5516f9be81401