368 Commits

Author SHA1 Message Date
Emilien Macchi
f1e1f4ba04 Do not configure Hiera and Hieradata in Ansible check mode
When deploying TripleO in dry-run (Ansible check mode set to True), we
don't need to generate the Hieradata: it expects actual files
and data, which don't exist in check mode.

Let's just skip the whole role if the check mode is used.

Change-Id: I1f54878ef6a59ab772e6c2f75493efb2eb45cf3a
2019-12-20 17:33:30 -05:00
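A minimal sketch of the skip described in the commit above (the role name is illustrative, not necessarily the exact TripleO one):

```yaml
# Skip the whole hieradata role when running under --check;
# ansible_check_mode is a built-in boolean magic variable.
- name: Configure Hiera and Hieradata
  include_role:
    name: tripleo-hieradata   # illustrative role name
  when: not ansible_check_mode
```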
James Slagle
2a6336a742 Execute deploy_steps_tasks per step
Instead of including the entire deploy_steps_tasks.yaml tasks file for
each role at each step, use the per-step files and only if they exist.

This cuts down on the amount of time that ansible has to spend skipping
tasks that don't get run at a certain step, which can be significant at
scale.

Change-Id: I06ee04b9b47226433f25e3cff08c461462a907d9
Depends-On: Id5fdb4dd1a6290d1097d2d81523161c87ab6d4dd
2019-12-15 17:22:21 -05:00
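The per-step inclusion described above could look roughly like this; the file layout and variable names are assumptions, not the exact TripleO ones:

```yaml
# Look for a per-step tasks file and include it only if it exists,
# instead of importing the whole deploy_steps_tasks.yaml at every step.
- name: Check for per-step tasks file
  stat:
    path: "{{ playbook_dir }}/{{ tripleo_role_name }}/deploy_steps_tasks_step{{ step }}.yaml"
  register: step_tasks_stat
  delegate_to: localhost

- name: Run this role's tasks for the current step
  include_tasks: "{{ step_tasks_stat.stat.path }}"
  when: step_tasks_stat.stat.exists
```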
James Slagle
a85d3d7069 Delegate and run once debug start-at-task messages
Instead of executing these debug tasks for every host, we can delegate
them to localhost and only run them once.

Change-Id: I367db4c9e743c77f121751ecca6bb4996b8a3df9
2019-12-13 14:29:20 -05:00
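The delegation pattern from this commit, sketched with an illustrative task name:

```yaml
# Emit the informational message once from the control node rather
# than executing the debug task on every overcloud host.
- name: Deploy step 3 start marker   # illustrative name
  debug:
    msg: "Resume with --start-at-task 'Deploy step 3 start marker'"
  delegate_to: localhost
  run_once: true
```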
Zuul
45ff61361a Merge "Use include_tasks instead of import_tasks" 2019-12-13 17:04:47 +00:00
James Slagle
6f8b2db26a Use include_tasks instead of import_tasks
include_tasks is dynamic and the tasks are either included (or not) at
runtime. This has the advantage that if a "when" keyword excludes the
include_tasks, then all the tasks are excluded as a group.

This is opposed to import_tasks, which happens at playbook parse time. The
"when" keyword is inherited by each individual task that was imported.

While the two are functionally equivalent for these use cases,
import_tasks ends up being much slower, since ansible then has to
compute a much larger set of tasks to skip at runtime. Using
include_tasks is much faster, even at small scale (~50 hosts).

Change-Id: I2db81d39b3294aa2784a340562f10fd9bf3fe9ee
2019-12-12 16:23:08 -05:00
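The difference can be sketched as follows (file and variable names are illustrative):

```yaml
# include_tasks: the "when" is evaluated once, for the include itself;
# if it is false, the inner tasks are never loaded at all.
- include_tasks: step_1_tasks.yaml
  when: step | int == 1

# import_tasks: the file is parsed at playbook parse time and the
# "when" is inherited by every imported task, each of which must then
# be individually skipped on every host at every other step.
- import_tasks: step_1_tasks.yaml
  when: step | int == 1
```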
Emilien Macchi
fe6b235e5f scale: fixes for compute scale down
- Force fact gathering so that we're guaranteed to have the proper FQDN
- Update the start sequence so that our scale down process does not start
  from irrelevant steps
- Correct a list evaluation. The compute service argument should have one
  item in the list. Prior to this change it was expecting zero items,
  which was causing the removal tasks to skip.

Co-Authored-By: "Cedric Jeanneret (Tengu) <cjeanner@redhat.com>"
Co-Authored-By: "Emilien Macchi <emilien@redhat.com>"
Co-Authored-By: "Kevin Carter (cloudnull) <kecarter@redhat.com>"

Change-Id: I7c1615685718924e872a2f9173b15c63bba8c482
Closes-Bug: #1856062
2019-12-12 13:54:31 +00:00
Zuul
d642bf9ef9 Merge "Re-enable "service_names" hieradata" 2019-12-06 09:04:03 +00:00
Zuul
64a2ee13bc Merge "Ensure we set proper SELinux label on container-puppet.sh" 2019-12-06 07:16:29 +00:00
Zuul
56c197b8c8 Merge "Use ansible for hosts entries" 2019-12-06 01:37:00 +00:00
Emilien Macchi
b5eff67830 Re-enable "service_names" hieradata
"service_names" was a useful hieradata which listed the services enabled
for a specific roles vs "enabled_services" which are for all the
services enabled in the cloud, no matter the role.

This is re-added for backward compatibility.

Depends-On: I75b1112089a66cf5db0a2fd651bb24428cf861fd
Related-Bug: #1855138
Change-Id: I7339b8791817bdaffa65c928d424796114efdf57
2019-12-04 17:26:19 -05:00
Zuul
8baf366b6d Merge "Don't set all_nodes data as group_vars in check mode" 2019-11-29 00:39:36 +00:00
Cédric Jeanneret
3b146b1e45 Ensure we set proper SELinux label on container-puppet.sh
Just to ensure we have the right label, even if something mounts the
directory with re-labelling. This avoids any chance of a race condition.

Also update old svirt_sandbox_file_t alias since the common thing is
"container_file_t".

Change-Id: Ic036ad901885f9d8c8072b560f2d9f3c8e919d58
Closes-Bug: #1854377
2019-11-28 17:01:05 +01:00
Michele Baldessari
b7e28c9a85 Move 'Ensure network service is enabled' after os-net-config has run
This has been observed downstream via https://bugzilla.redhat.com/show_bug.cgi?id=1777529
The scenario is as follows:
1) Undercloud VM is configured with a bunch of interfaces in dhcp
2) One NIC (eth0 in this case) is on a network without a dhcp server
3) Said NIC is configured in undercloud.conf to be part of the
   br-ctlplane bridge

In the above scenario the undercloud install will fail with the
following error:
 "Unable to start service network: Job for network.service failed because the control process exited with error code...."

The reason is that eth0 times out when asking for dhcp and so the whole
network.service start command fails.

We can just move the network service enablement *after* os-net-config
has run. This fixes this scenario (I just tested it) and it still
covers for enabling the network service in order to maintain network
connectivity at reboot, which was the reason for adding this code in the
first place anyway.

Change-Id: I0d13c9ee2490aa765d546961c9a1fc14e931c0c7
2019-11-28 11:43:25 +01:00
Emilien Macchi
0362abcbd1 Don't set all_nodes data as group_vars in check mode
Since the "render all_nodes data as group_vars" tasks are skipped when
ansible check is true, we should not run the "set all_nodes data as
group_vars" tasks as well. It was missed and now failing when --check is
used (see bug report).

Change-Id: I2852c8285a0e72d855bfed216b53de6bdeeabe68
Closes-Bug: #1854246
2019-11-27 22:06:03 -05:00
Zuul
c3434df84a Merge "Ensure "network" service is enabled" 2019-11-22 21:49:21 +00:00
Cédric Jeanneret
804dd0f341 Ensure "network" service is enabled
It may happen that this service isn't enabled (especially on newer
CentOS/RHEL releases), leading to network misconfigurations.

Ensuring it's actually enabled allows br-ctlplane to get configured upon
reboot.

Related: https://bugzilla.redhat.com/show_bug.cgi?id=1774581
Change-Id: I4a56d837e08498bc4d25d25c7203bfd7d012974e
Co-Authored-By: Alex Schultz <aschultz@redhat.com>
2019-11-22 08:19:36 -07:00
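A minimal sketch of such an enablement task, using Ansible's systemd module:

```yaml
# Make sure the legacy "network" service starts at boot so that
# br-ctlplane gets (re)configured after a reboot.
- name: Ensure network service is enabled
  systemd:
    name: network
    enabled: true
```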
Sofer Athlan-Guyot
10a523a03d Make sure we apply all deploy step-0 during update.
That step includes, in particular, the creation of the
puppet-container.py/sh files, plus some other tasks that should happen
very early in the deploy process.

During an update, if a change to puppet-container.py happens in the
template, we have to trigger those steps to pick up the changes. This is
especially critical because we run the puppet container configuration
during the update stage.

Closes-Bug: #1853156
Change-Id: I26406da82c584dc5093c17ad26f263057a5cbcaa
2019-11-20 20:38:28 +01:00
James Slagle
597cdb6796 Use ansible for hosts entries
This patch updates the templates to use the new ansible role,
tripleo-hosts-entries, for managing the entries in /etc/hosts instead of
the values from the Heat stack.

Change-Id: I606e0f27f9f9ae9d85bc0fc653f8985eb734d004
Depends-On: Ia02ca1263590e2b579f2534e99119d7b1cd4b39a
2019-11-14 11:29:51 -05:00
Emilien Macchi
427e54164d Do not run rendering all_nodes data as group_vars in check mode
We don't want to run the "Render all_nodes data as group_vars for overcloud"
tasks in check mode, since they rely on the command "whoami", which
doesn't execute in check mode.

Change-Id: Ia8df4794fcf8eff9a2d5a8a7e99e0e5ebf1f8e1f
2019-11-05 14:21:21 +01:00
Zuul
ad27bcc3ea Merge "Move KernelArgs and OvS-DPDK deployment to ansible role" 2019-11-04 17:09:19 +00:00
Emilien Macchi
b5a33168a7 Generate startup configs files per step and per container
For each step and each container, generate a unique JSON, and Paunch
will be able to read them all thanks to the patch in dependency
(tripleo-ansible).

Old location:
/var/lib/tripleo-config/container-startup-config-step1.json
We keep the old files for backward compatibility.

New location:
/var/lib/tripleo-config/container-startup-config/step_1/haproxy.json

Note: hashed files won't be generated for the old location anymore,
since it's done via container-puppet.py in which we now give the new
location.

Story: 2006732
Task: 37162
Depends-On: If0f1c6c308cd58f7baa9a8449fbf685ff10f0e0a
Change-Id: I1cf8923a698d0f6e0b1e00a7985f363a83e914c4
2019-10-25 12:13:13 +00:00
Alex Schultz
6d76836e4c Drop the overcloud designation for deploy steps
Because these steps are shared with the undercloud/standalone/etc, it
doesn't necessarily make sense to declare they are Overcloud steps.
Let's just call them Deploy steps.

Change-Id: I60124d27f305333c8d54175ac5f2b2e500b45409
2019-10-24 13:56:04 +00:00
James Slagle
bd0e7c4cca Use single quotes for --start-at-task debug tasks
Previously, the use of double quotes in these tasks caused ansible to
escape the double quote when printing the debug message. The escape
character (\) made copy/pasting the task name more difficult.

Switching to single quotes around the task name makes it easier on the
operator to copy/paste so that --start-at-task can be easily used.

Change-Id: I5e5837cc22462769de05be73249633de4fa65fcf
2019-10-24 13:56:00 +00:00
Saravanan KR
16679d0ec4 Move KernelArgs and OvS-DPDK deployment to ansible role
Earlier, KernelArgs had been configured using ansible
tasks that were part of the THT repo. Those ansible tasks have
been moved to the tripleo-kernel role of tripleo-ansible. This
role will be invoked from the boot-params-service.
boot-params-service has been moved from pre-network to
the deployment/kernel directory.

OvS-DPDK configuration was done using puppet-vswitch
module by invoking puppet in PreNetworkConfig's
ExtraConfig script. A new ansible role tripleo-ovs-dpdk
has been created to apply the DPDK configurations via
ansible instead of puppet. This role will be common
for both ml2-ovs and ml2-ovn. Common parameter merging
has been enhanced to provide common deploy steps.

ODL is not validated, as it has been deprecated and
currently has no active usage or development.

Depends-On: https://review.opendev.org/#/c/688864/
Change-Id: I4b6f7d73cf76954e93760c59d522d485187157cf
2019-10-23 10:12:42 +05:30
Sofer Athlan-Guyot
6e649ccc7e Fix update of non HA container during update.
After the refactoring of step_1 to improve speed during deployment, we
were not updating /var/lib/container-puppet/container-puppet.json
or container-startup-configs.json, for instance.

The net result is that the configuration is done on the initially
deployed containers and non-HA containers are not updated.

Note that not all step 1 tasks were refactored into
deploy_steps_tasks_step_1, so we still need to loop over step 1 in the
deploy_steps_tasks.

Closes-Bug: #1848359

Change-Id: I76b16cd7781ea601778004afa4e0bc3020ce3c59
2019-10-17 10:22:14 +02:00
Zuul
4b39d1fcf8 Merge "Use update_serial as an ansible variable" 2019-10-12 04:41:05 +00:00
Emilien Macchi
474d3d3a55 Add toggle to enable/disable Paunch for a deployment
Default is True for backward compatibility. For now, only Paunch is
supported but there are some alternatives under development:
I2f88caa8e1c230dfe846a8a0dd9f939b98992cd5

Change-Id: Iceff88c6f4710c8023541314ac08c1ea4100cee7
2019-10-10 12:01:23 -04:00
Zuul
f7dd66ca72 Merge "Add missing role name to Host prep steps task" 2019-10-08 17:33:22 +00:00
James Slagle
2b9f06fc80 Add missing role name to Host prep steps task
The role name was missing from the task name, so the debug task that
refers to the task name was incorrect. This patch adds the name so that
--start-at-task can be used with Host prep steps for a specific
role.

Change-Id: Ie0888d5d075f0fae528bc4c71482246d334632ef
2019-10-07 12:13:26 +00:00
Kevin Carter
29b0763921 Move nested bash script into a real file
container-puppet.py had within it a nested bash script stored as
a string literal. This change moves that script into a stand-alone
file which will be copied to the host and used when
needed. This begins to deconstruct the
container-puppet.py process, allowing us to simplify it and, in
turn, make deployments more reliable, faster,
and easier to understand.

Change-Id: I447f33f159c5a50e55401b4e0a893c250869d185
Signed-off-by: Kevin Carter <kecarter@redhat.com>
2019-10-04 13:42:12 +00:00
Mathieu Bultel
c6cd605781 Use update_serial as an ansible variable
Use update_serial as an ansible variable so the value can be
overridden with the CLI or the ansible-playbook command line.

This results in the following behavior at the heat level:
  * if no update_serial is provided in the roles_data, the default
    value is 1
  * if update_serial is provided by the roles_data, the default value
    for the ansible var is the one in the roles_data.
  * if ansible-playbook -e update_serial=<something> is used, the
    serial value is the one provided by the CLI.

Change-Id: Id4a6326977bebba56c3da2dbf3c4113b6658d433
2019-09-30 13:10:09 +00:00
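The override chain described above can be sketched as (the host pattern is illustrative):

```yaml
# serial comes from an ansible variable; the default (from roles_data,
# or 1) can be overridden with: ansible-playbook -e update_serial=3
- hosts: Compute              # illustrative host pattern
  serial: "{{ update_serial | default(1) }}"
  tasks:
    - name: Run the update tasks for this batch of nodes
      debug:
        msg: "updating {{ inventory_hostname }}"
```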
Zuul
239ef885d2 Merge "Add no_log to include_vars" 2019-09-27 19:54:37 +00:00
Zuul
cf80bb320c Merge "Move common step 1 tasks to their own file" 2019-09-25 05:34:50 +00:00
Alex Schultz
07c94dd1d8 Add no_log to include_vars
These appear to be dumping the vars to the logs, which is quite verbose
and unnecessary. Let's no_log the include_vars so we don't end up with
the vars constantly dumped out.

Change-Id: I06a5f893adef4eaf6db17bf5ead4653cfed599f1
2019-09-24 11:08:06 +00:00
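A sketch of the silenced include (the file name is an assumption):

```yaml
# no_log keeps the loaded values out of the ansible output, which
# would otherwise print every variable at each include.
- name: Load role variables
  include_vars: "{{ tripleo_role_name }}_vars.yaml"   # illustrative name
  no_log: true
```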
Zuul
282386e0b5 Merge "Run facts gathering always for upgrades." 2019-09-10 19:27:23 +00:00
Zuul
fd051e610e Merge "Add named debug tasks to each play" 2019-09-10 00:37:00 +00:00
Zuul
12de47bc59 Merge "Ensure container_startup_configs_json_stat is defined" 2019-09-07 04:04:44 +00:00
Zuul
619ce3331e Merge "Remove sorted_bootstrap_node var" 2019-09-07 02:34:37 +00:00
James Slagle
8cfe97cb96 Move common step 1 tasks to their own file
The large block of tasks in deploy-steps-tasks.yaml that are run only at
step 1, results in a large number of tasks being skipped at steps 2-5.
Task skipping is not "free", as ansible has to repeatedly compute what
needs to be skipped.

Note that the previous usage with the block statement causes the "when"
statement to be inherited to each task, so it has to be recalculated for
every task across every node for all steps.

The issue starts to become apparent with deployments with larger number
of nodes.

This commit refactors the step 1 tasks into their own tasks file, and
then uses the include_tasks module to only include the tasks at step 1.

A separate play is also used for these step 1 tasks so that ansible
doesn't have to repeatedly calculate if the include_task should be
executed for steps 2-5 as well.

Change-Id: Id5985ce8ac741baa9adc9f5874df0459fd4c24b2
2019-09-06 18:14:12 -04:00
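The refactoring described above, in rough outline (play and file names are illustrative):

```yaml
# A dedicated play for the step-1-only tasks: the include condition is
# evaluated once here, instead of being inherited (and re-checked) by
# every task on every host at steps 2-5.
- hosts: overcloud
  name: Overcloud common deploy step tasks for step 1
  tasks:
    - include_tasks: deploy-steps-tasks-step-1.yaml   # illustrative file
      when: step | int == 1
```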
Jose Luis Franco Arza
205ac0f123 Run facts gathering always for upgrades.
Up to now, each of the upgrade steps that gather facts from the
Undercloud and Overcloud nodes contained a tag named facts.
This allowed us to run the facts gathering at will. However,
there is a problem with the upgrade_tasks, as it is allowed to
pass a specific tag during their execution: if the tag passed
is different from facts, for example system_upgrade, then the
facts gathering tasks won't be executed and the upgrade tasks
will fail when trying to access some of the ansible facts (such
as ansible_hostname).
The solution is to always run the facts gathering; no matter
what tag is passed, we'll have the facts available during the
execution.

Change-Id: Ibee072bec9916804163dab29e164eb1423cd08f8
2019-09-06 10:59:10 +02:00
Zuul
581993a9e7 Merge "Rename pre/post deployments host vars" 2019-09-06 06:21:21 +00:00
James Slagle
7859700354 Add named debug tasks to each play
Named debug ansible tasks have been added to the plays that get
generated in deploy_steps_playbook.yaml (from common/deploy-steps.j2).
The explicitly named tasks allow for using ansible-playbook's
--start-at-task option to resume a deployment from the start of a given
play.

For example, this could be used to resume a deployment with:
ansible-playbook ... --start-at-task "Overcloud common deploy step tasks 3" ...

Previously this was not possible since many of the tasks that got
generated in common_deploy_steps_tasks.yaml used an ansible variable in
the name, which is not resolved until runtime, so --start-at-task is
ignored.

Change-Id: If40a5ecaacf8c74c98775eb6bde05d967694f640
2019-09-05 17:34:54 -04:00
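A sketch of such a marker task (the name shown is taken from the example in the commit message):

```yaml
# A statically named task at the start of each play; because the name
# contains no runtime variables, --start-at-task can match it.
- name: Overcloud common deploy step tasks 3
  debug:
    msg: Use --start-at-task 'Overcloud common deploy step tasks 3' to resume
```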
Zuul
9b88629d63 Merge "Use separate plays for Host prep steps" 2019-09-05 06:27:02 +00:00
Sergii Golovatiuk
74a1cd7d13 Replace include_tasks with import_tasks
There are places where we have include_tasks in nested tasks. This means
that -t "tag_from_nested_task" won't work, as all nested tasks would
need either the "always" tag or would have to use "import_tasks" instead.

This patch replaces include_tasks with import_tasks for the External
steps, allowing us to run

openstack overcloud external-update run --tags tag_from_nested_task
openstack overcloud external-upgrade run --tags tag_from_nested_task

without hacking all nested tasks.

Closes-Bug: #1842410

Change-Id: I51a148cdc5538d5a1106d58d227d361d1e6f9e19
2019-09-04 15:26:12 +00:00
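The tag behavior can be sketched as (the file name is an assumption):

```yaml
# With import_tasks, "--tags tag_from_nested_task" reaches the nested
# tasks, because tags applied to an import are inherited at parse time.
# With include_tasks, the inner tasks are only loaded if the include
# itself matches the requested tag (or carries "tags: always").
- import_tasks: external_update_tasks.yaml   # illustrative file name
  tags:
    - external
```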
Emilien Macchi
20a329f87d Rename pre/post deployments host vars
To reflect the change done in dependency, we need to change the variable
name.

Change-Id: Id0f708e6d6ce9565a5569716edfe993c7eceabaa
Related-Bug: #1842141
Depends-On: Idbdd6a21eb2cd488daa1e3ddc844b4fc5267047c
2019-09-01 03:45:21 +00:00
Zuul
94fd8052a2 Merge "Sort bootstrap node check for container_puppet_tasks" 2019-08-31 02:29:38 +00:00
James Slagle
493d1c62fd Use separate plays for Host prep steps
Instead of using one task include per role in the same play, use
separate plays instead. This reduces the amount of task skipping that
Ansible has to do since each include only applies to a single role.

In a deployment with many roles, this will improve performance.

Change-Id: I01ef631ea3dad8b9c030d61ed0883a9af13616ad
2019-08-28 18:02:10 -04:00
James Slagle
7bd3bbbd06 Ensure container_startup_configs_json_stat is defined
When using --tags to run only a subset of tasks from
common/deploy-steps-tasks.yaml, the condition that checks the result of
container_startup_configs_json_stat.stat.exists was failing since the
task that defined it was skipped.

This patch adds an additional "is defined" check to ensure the var is
defined before the result is checked.

Change-Id: Iadde90ed9416902848df2e60551470c0f1689a32
2019-08-28 18:00:03 -04:00
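The guarded check can be sketched as follows (task names and the follow-up action are illustrative):

```yaml
- name: Stat the startup configs json
  stat:
    path: /var/lib/tripleo-config/container-startup-configs.json
  register: container_startup_configs_json_stat

# Guard against the registering task having been skipped by --tags:
# verify the variable is defined before dereferencing .stat.exists.
- name: Act when the startup configs json is missing
  debug:
    msg: regenerating startup configs   # illustrative action
  when:
    - container_startup_configs_json_stat is defined
    - not container_startup_configs_json_stat.stat.exists
```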
James Slagle
1b9d3566d1 Remove sorted_bootstrap_node var
Once the tripleo-ansible patch has merged that also expects the
bootstrap node to be chosen from the sorted inventory group, this
variable can be removed.

Depends-On: I4b280bcafce42bba4c823b8205f296b83e2f3e5d
Change-Id: Icf654f6597291ddff8aa90b3afb8679dce9e2284
2019-08-28 18:00:03 -04:00
James Slagle
e995415d86 Sort bootstrap node check for container_puppet_tasks
Previously, the task that checks if container_puppet_tasks should be
written to the bootstrap node was not sorted, which meant the order was
undefined coming out of the Ansible inventory.

This patch adds a sort filter to the check so that by convention the
first node is chosen from the inventory.

A variable, sorted_bootstrap_node, is also added to control the sorting
in the bootstrap_node.j2 template from the tripleo-hieradata role so
that the tripleo-ansible patch is able to merge first.

Depends-On: I3d595ea5b84f940a3b2dbc69798f69fe03529c10

Change-Id: I6b93f5b0747c5a11d24615a3bbb5516f9be81401
2019-08-27 13:25:58 -04:00
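The sorted selection can be sketched as (group name and action are illustrative):

```yaml
# Sorting the inventory group makes the bootstrap-node choice
# deterministic: by convention the first host in sorted order wins.
- name: Write container_puppet_tasks on the bootstrap node only
  debug:
    msg: running bootstrap-only tasks   # illustrative action
  when: inventory_hostname == (groups['Controller'] | sort | first)
```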