The playbook fails when removing unreachable nodes from the deployment
with `openstack overcloud delete node` because some tasks are missing
`ignore_unreachable: true`. Also, one cannot use `any_errors_fatal: true`
when ignoring unreachable nodes; otherwise the playbook execution will
stop after the current task.
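As a minimal sketch of the pattern described above (the host group and
the task are hypothetical, not taken from the actual playbook), a play
that tolerates unreachable nodes might look like this:

```yaml
# Hypothetical play: the group name and the task are illustrative only.
- hosts: overcloud
  gather_facts: false
  # any_errors_fatal is deliberately left unset; per the description
  # above, combining it with ignore_unreachable stops the run.
  tasks:
    - name: Hypothetical cleanup step on a node being removed
      command: /bin/true
      # Keep going with the remaining tasks if this host is unreachable.
      ignore_unreachable: true
```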
Change-Id: Ibcb84e58bac1975490df281c0de950cdf74337b2
When refactoring deploy-steps to add a common playbook [0] it seems that
the post_upgrade_steps_playbook block was missed. As a consequence, when
executing the post_upgrade_tasks some of the common Ansible variables
are not available.
[0] - Ib00e8aa9f7d06517290543a8aaf8a2527969bd3c
Change-Id: I04704a14a8b932e21d21348e10014c707a87eeeb
Closes-Bug: #1875579
Since I2cc721676005536b14995980f7a042991c92adcc we can no longer assume that
an overcloud group exists in the inventory. Default to the <stack_name> group
instead.
Change-Id: I895e315ff3984ebf1806288a8275a8b0d74bef49
Closes-bug: #1875429
Currently if you have selinux enabled on the undercloud but disable it
for the overcloud, selinux is disabled on the undercloud during the
deployment. This can be resolved by only managing the selinux setting
for the deployment target hosts rather than for all hosts.
Change-Id: I94b81ea0b954cdba7704720a145b752fa58d4308
Closes-Bug: #1874828
This one toggles the no_log parameter. It is directly related to
#1873770 and allows deeper debugging within CI.
Change-Id: I27f677467263c0e6cc78d775edff55b3811fec1f
Related-Bug: #1873770
This simplifies all the split/join transformations and reduces the
memory footprint by keeping HostsEntryValue as a list of unique
entries (it was originally required to store the final data for hosts
entries as a single, quite long string value).
That improves hosts entries processing for large-scale deployments
and removes possible limitations on string sizes.
Closes-bug: #1869375
Change-Id: I5ac498621e9e3c49def565744a7b521cb2cc5c25
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
Hosts entries are now configured via tripleo_ansible's
tripleo_hosts_entries.
Ifd4bc4ce5618587c341ecbf37f82777ae6fc2f4a removed the use
of WRITE_HOSTS, which leaves hosts-config.yaml "headless": it passes
no real data to the hosts-config.sh template that generates outputs
for OS::TripleO::Hosts::SoftwareConfig.
Also, I606e0f27f9f9ae9d85bc0fc653f8985eb734d004 removed the use of
HOST_ENTRY, so hosts-config.sh receives an empty value for it.
Together, these changes probably make it safe to remove any use of
hosts-config.sh and hosts-config.yaml, along with the corresponding
OS::TripleO::Hosts::SoftwareConfig, completely.
Change-Id: Id04767ae0c32caf62271cf564608350974fefd1b
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
Many of the lines are difficult to grasp due to the crazy
quotation and concatenation implemented to get the desired
result from generating playbooks via a jinja template.
We can make it easier to read and easier to understand by
using the jinja raw tag instead. This eases the maintenance
burden on us all and helps us sleep better at night.
Change-Id: I82c4de4a63817707a2b0ed0ced827be37c0d0463
When checking for the existence of a file on the host where
Ansible is being executed, using the stat module with localhost
delegation is rather heavy-handed. We can instead just make use
of the 'exists' test.
This should improve execution time just a little bit and reduce
the number of tasks for us to maintain. We also remove the
repetition of the task file path by using a variable.
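A minimal sketch of the approach, assuming a hypothetical variable name
and file path (neither is taken from the actual templates). The
'exists' test is evaluated on the host where Ansible runs, so no stat
task or localhost delegation is needed:

```yaml
# Hypothetical example: the variable name and the path are assumptions.
- name: Run extra tasks only if the file exists on the Ansible host
  vars:
    extra_tasks_file: "{{ playbook_dir }}/extra_tasks.yaml"
  include_tasks: "{{ extra_tasks_file }}"
  # The path is stored once in a variable and reused in the test.
  when: extra_tasks_file is exists
```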
Change-Id: I8b278ca83b2afb07575dbae2496ec265c3a06473
It would appear that we use many of the same plays in several
of the playbooks. In this patch we extract these into a common
playbook file so that we only need to repeat a single import
in each of the playbooks. This way we reduce the maintenance
burden because we only need to maintain it in one file.
Change-Id: Ib00e8aa9f7d06517290543a8aaf8a2527969bd3c
- When Paunch is disabled, don't create container-puppet.py and if the
file exists, make sure we remove it so operators don't run it by
accident.
- Remove references to that script from the README and the commands,
to make it clear there is a new tool now.
Change-Id: I5032eef6567b37c02fe53dea852aadff3e185eec
It would appear that we use the same str_replace params most
of the time, and there's no harm in using aliases where only
some of them are used. This way we reduce the maintenance
burden because we only need to maintain it in one place.
Change-Id: Ib034405a15ade9e9fb234a9875ebbe922abfdfc6
include_tasks is dynamic and the tasks are either included (or not) at
runtime. This has the advantage that if a "when" keyword excludes the
include_tasks, then all the tasks are excluded as a group.
This is opposed to import_tasks, which happens at playbook parse time.
The "when" keyword is inherited by each individual task that was imported.
While the two are functionally equivalent for these use cases,
import_tasks ends up being much slower, since ansible then has to
compute a much larger set of tasks to skip at runtime. Using
include_tasks is much faster, even at small scale (~50 hosts).
This is applying what was done in https://review.opendev.org/697510
to the update/upgrade tasks.
When doing include_tasks, we ensure that we also apply the 'always' tag
so that we have access to use the tags in the included task files. See
[a] for further details.
[a] https://odyssey4.me/2019/11/26/ansible-include-tags.html
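The difference can be sketched as follows. The file name and the
`step` variable are assumptions used for illustration only:

```yaml
# Static import: "when" is copied onto every imported task, so each
# task is evaluated and skipped individually at runtime.
- import_tasks: hypothetical_update_tasks.yaml
  when: step|int == 1

# Dynamic include: "when" is evaluated once on the include itself; if
# it is false, none of the tasks in the file are loaded at all.
- include_tasks: hypothetical_update_tasks.yaml
  when: step|int == 1
  # Make sure the include itself always runs when --tags is used, so
  # that tags set inside the included file can still be selected.
  tags:
    - always
```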
Change-Id: I2eab008ca27546acbd2b1275f07bcca0b84b858c
Except for standalone004 and ovb-ha jobs which still run on Docker,
let's enable tripleo-ansible to manage the containers instead of paunch.
Depends-On: https://review.opendev.org/#/c/709043
Change-Id: Ib29e7c9ce4028e1cb6f6ea6c0ae77890aefde93b
We need to make sure that the hiera data is fresh before the update
step so that anyone using that data during those steps sees the
latest information from heat.
Factor out the hiera generation and include it in deployment and
update playbooks.
The double tasks definition in the deployment playbook seemed to be
redundant, so it has been removed.
Change-Id: I6b6c676880ccc8cbed23af135e5865c222a8f1d0
Closes-Bug: #1861799
Running os-net-config as async with failed_when set to false
results in an undefined variable error if the async task times out.
Instead of ignoring the failure of the task, check for the
presence of 'rc' in the results of the command execution. If it
is not defined, then the rest of the tasks are not useful.
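A minimal sketch of the idea, not the actual task from the templates:
the command line, the timeouts, and the registered variable name are
all assumptions chosen for illustration.

```yaml
# Hypothetical tasks: command, timeouts, and names are assumptions.
- name: Run os-net-config asynchronously
  command: os-net-config -c /etc/os-net-config/config.json
  async: 300
  poll: 10
  register: os_net_config_result
  # Fail only when no return code was recorded (for example, the async
  # job timed out), instead of using "failed_when: false".
  failed_when: os_net_config_result.rc is not defined

- name: Report the return code
  debug:
    msg: "os-net-config exited with rc={{ os_net_config_result.rc }}"
```

With this condition the play stops at the first task when 'rc' is
missing, so later tasks never reference an undefined value.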
Closes-Bug: #1862627
Change-Id: Ibbcde856ac69bf73a47086d95a52c3b1a0d10911
While they are, at SELinux level, exactly the same (one is an alias to
the other), the "container_file_t" name is easier to understand (and
shorter to write).
A second pass in a couple of days or weeks will be needed in order to
change files that were merged after this first pass.
Change-Id: Ib4b3e65dbaeb5894403301251866b9817240a9d5
When performing the Undercloud upgrade from OSP13
to OSP16, we start from an almost empty Undercloud
node which has been upgraded to RHEL8 via Leapp.
The /var/lib/config-data directory is lost during the
upgrade procedure, so this task makes sure that the
directory exists before checking the selinux state.
Change-Id: I760a4e532e0c299efcf57cee68e8e8f93795ea29
The container-puppet tasks only need to be run if tasks actually exist, which
is already being checked on the ansible control node.
A "when" statement is then applied to the set of tasks necessary to run
the container-puppet tasks, when the tasks are actually defined.
This patch moves that set of tasks to a separate tasks file and uses a
dynamic include. This results in fewer tasks being skipped, which can
save several minutes at scale: 3 fewer tasks need to be skipped at
each of steps 1-5, which equates to 15 tasks overall, when no
container-puppet tasks actually exist.
When container-puppet tasks do exist, all the tasks will be executed as
necessary.
Change-Id: Ifad32bf79942cde58295fd9aae7e23e2f62c1ae2
Prior to this patch, /etc/hosts was generated only on the overcloud
nodes, leading to some issues when it comes to TLS-Everywhere, as raised
in the associated bug.
Depends-On: https://review.opendev.org/706242
Change-Id: I836ab1a23c8aea35c0cea54d0765c7313a4b9038
Closes-Bug: 1861782
The generate-config tasks, which run on the host to generate config data
under /var/lib/config-data, only run at step1.
There are several tasks that used a when statement to only run the
related tasks at step1. This patch moves all the related generate-config
tasks to a separate tasks file, which can then be dynamically included
at step1.
Using a dynamic include results in fewer tasks being skipped, which can
save several minutes at scale: 4 fewer tasks need to be skipped at
each of steps 2-5, which equates to 16 tasks overall.
Change-Id: Ifdddcb13362e26babedd47e674089fb0e2a37994
Update deploy_steps_playbook.yaml to use the new action plugin for
rendering the all_nodes data. The native python is much faster than the
jinja2 template:
Change-Id: I3ac05c30f7c5d136c5da9441faf7890cb6fb9d05
These roles were not renamed when we removed all of the hyphens.
This change removes the remaining hyphenated roles.
Change-Id: I10a0064fa0bdb80957a3ef7acfe376c745d8512b
Signed-off-by: Kevin Carter <kecarter@redhat.com>
In Id5985ce8ac741baa9adc9f5874df0459fd4c24b2 the step1 tasks were moved
into their own file with the leading whitespace and block statement left
exactly as they were.
This patch removes the unnecessary block statement, moves the block name
to the parent inclusion and removes all the leading whitespace.
Change-Id: I243c761a88f746a6abb4ddb13845e813eaf7155c
There was already a play/task for "common deploy step tasks 1", so this
was a bad name to begin with. Use common bootstrap tasks instead.
Also adds a missing debug task.
Change-Id: I9840a26f10d8ad72b5fa187e56b1b3dbfd63e40d
It would appear that we use many of the same tasks in several
of the plays. We can use anchors/aliases to reduce this
repetition. This way we reduce the maintenance burden because
we only need to maintain it in one place.
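As a minimal sketch of YAML anchors and aliases in a playbook (the
task, the anchor name, and the host groups are hypothetical):

```yaml
- hosts: Controller
  tasks:
    # The anchor defines the shared task once.
    - &show_step
      name: Show the current step
      debug:
        msg: "Running step {{ step | default(1) }}"

- hosts: Compute
  tasks:
    # The alias reuses the anchored task verbatim.
    - *show_step
```

Anchors are resolved by the YAML parser, so an alias can only refer to
an anchor defined earlier in the same file.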
Change-Id: I2c8a4a0270c99d76500ac42d90fffdc0475cb995
The next iteration of fast-forward-upgrade will be
from queens through to train, so we update the names
accordingly.
Change-Id: Ia6d73c33774218b70c1ed7fa9eaad882fde2eefe
When upgrading from Rocky to Stein, an upgrade of the operating system is
performed. This upgrade from RHEL7 to RHEL8 implies the removal of the
default /usr/bin/python binary. As the facts cache is enabled, Ansible
does not refresh the cached facts, and therefore we try to run the
ansible playbook using the old python binary when running the upgrade.
This fails with the error: /usr/bin/python: No such file or directory.
This patch makes use of the setup task in combination with gather_facts
false, to ensure that the facts are gathered and refreshed for the
Overcloud nodes. This way, we make sure that we are using the right
python binary. Because a similar situation occurs during scale
operations, this patch adds the same logic to scale_playbook.
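The shape of such a play might look like the following sketch. The
host group name is an assumption, and the real playbook may pass
additional arguments to the setup task:

```yaml
# Hypothetical play: the group name is illustrative only.
- hosts: overcloud
  # Skip implicit fact gathering, which may rely on the facts cache.
  gather_facts: false
  tasks:
    # An explicit setup call gathers facts again on every run.
    - name: Gather and refresh facts for the Overcloud nodes
      setup:
```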
Closes-Bug: #1856313
Change-Id: I87974e88c38b42e90bc3cd801fcf1deaf268720c
Ansible has decided that roles with hyphens in them are no longer supported
by not including support for them in collections. This change renames all
the roles we use to the new role name.
Depends-On: Ie899714aca49781ccd240bb259901d76f177d2ae
Change-Id: I4d41b2678a0f340792dd5c601342541ade771c26
Signed-off-by: Kevin Carter <kecarter@redhat.com>
- Remove /var/lib/docker-puppet which was deprecated in Stein, and not
used since. The new directory is /var/lib/container-puppet.
- Remove /var/lib/tripleo-config/container-startup-config-*.json
generation, since it's now done per step and per container in
/var/lib/tripleo-config/container-startup-config/step_X/*.json
- Adapt container-puppet.py to point to the right json file by default.
- Adapt deploy-steps.j2 to check the step configs directory instead of
the deprecated json file.
Change-Id: I98963941c9d969ab1dfd92d70f973013f84e1c25
Note: this patch won't be backported to Train.
Each play in deploy_steps_playbook.yaml should set
any_errors_fatal=true. This patch adds that argument where it was
missing.
We don't yet have the capability in TripleO to continue the deployment
if some hosts fail because we don't have per-role logic specifying which
roles need 100% success and which ones don't. Once that is available,
any_errors_fatal=true could be removed.
Change-Id: I1b6dc3cec6199fd50a779cde3a8199ba19297191
Closes-Bug: #1859175