With I57047682cfa82ba6ca4affff54fab5216e9ba51c Heat has added
a new template version for wallaby. This would allow us to use
2-argument variant of the ``if`` function that would allow for
e.g. conditional definition of resource properties and help
cleanup templates. If only two arguments are passed to ``if``
function, the entire enclosing item is removed when the condition
is false.
Change-Id: I25f981b60c6a66b39919adc38c02a051b6c51269
This was mainly there as an legacy interface which was
for internal use. Now that we pull the passwords from
the existing environment and don't use it, we can drop
this.
Reduces a number of heat resources.
Change-Id: If83d0f3d72a229d737a45b2fd37507dc11a04649
It is best to avoid placing db creds on the compute nodes to limit the
exposure if an attacker succeeds in gaining access to the hypervisor
host.
Related patches in puppet-nova remove the credentials from nova.conf
however the current scope of db credential hieradata is all nova tripleo
services - so it will but written to the hieradata keys on compute
nodes.
This patch refactors the nova hieradata structure, splitting the
nova-api/nova database hieradata out into individual templates and
selectively including only where necessary, ensuring we have no db
creds on a compute node (unless it is an all-in-one api+compute node).
Depends-On: I07caa3185427b48e6e7d60965fa3e6157457018c
Change-Id: Ia4a29bdd2cd8e894bcc7c0078cf0f0ab0f97de0a
Closes-bug: #1871482
The minor update procedure has been suggesting the execution of the
online db migrations step for few releases already, by the execution
of the openstack overcloud external-update run --tags online_upgrade.
However, this command has never executed any task at all as the
online_upgrade tasks belong to the external_upgrade_tasks section.
Instead of changing the interface in all releases by modifying the docs
let's just fix this in the code.
Change-Id: I4453e7c755cccc52bb135a14be1b04fdea53de6b
Closes-Bug: #1900467
Now that the FFU process relies on the upgrade_tasks and deployment
tasts there is no need to keep the old fast_forward_upgrade_tasks.
This patch removes all the fast_forward_upgrade_tasks section from
the services, as well as from the common structures.
Change-Id: I39b8a846145fdc2fb3d0f6853df541c773ee455e
With multi-cell we can have stacks with no nova-api so the nova database
operations must be coupled to nova-conductor, not nova-api.
Change-Id: I27aa62d00b77ca7ff61bb31c6f396f03e148a144
Closes-bug: #1883098
When checking if keystone/nova healthchecks are healthy, make sure the
registered fact is set (which can slip to a further retry if podman
inspect took too much time to execute).
That way, we process the retries without an error like found in the bug
report.
Change-Id: I9f5063c9c3b598afd5bd01447f00a1146a20f4c3
Closes-Bug: #1878063
They were disabled until the native podman healthcheck was integrated in
tripleo-ansible and it finally merged; so we can remove that safeguard
and it should be working.
Change-Id: I03361c33e54f0c8e71b420b144464ccb29a1ca4e
The systemd healthchecks are moving away, so we can use the native
podman healthchecks interface.
See I37508cd8243999389f9e17d5ea354529bb042279 for the whole context.
This patch does the following:
- Migrate the healthcheck checks to use podman inspect instead of
systemd service status.
- Force the tasks to not run, because we first need
https://review.opendev.org/#/c/720061 to merge
Once https://review.opendev.org/#/c/720061 is merged, we'll remove the
condition workaround and also migrate to unify the way containers are
checked; and use the role in tripleo-validations.
Depends-On: https://review.opendev.org/720283
Change-Id: I7172d81d305ac8939bee5e7f64960b0a9fea8627
- deploy-steps-tasks-step-1.yaml: Do not ignore errors when dealing
with check-mode directories. The file module is resilient enough to
not fail if the path is already absent.
- deploy-steps-tasks.yaml: Replace ignore_errors by another condition,
"not ansible_check_mode"; this task is not needed in check mode.
- generate-config-tasks.yaml: Replace ignore_errors by another
condition, "not ansible_check_mode"; this task is not needed in check mode.
- Neutron wrappers: use fail_key: False instead of ignore_errors: True
if a key can't be found in /etc/passwd.
- All services with service checks: Replace "ignore_errors: true" by
"failed_when: false". Since we don't care about whether or not the
task returns 0, let's just make the task never fail. It will only
improve UX when scrawling logs; no more failure will be shown for
these tasks.
- Same as above for cibadmin commands, cluster resources show
commands and keepalived container restart command; and all other shell
or command or yum modules uses where we just don't care about their potential
failures.
- Aodh/Gnocchi: Add pipefail so the task isn't support to fail
- tripleo-packages-baremetal-puppet and undercloud-upgrade: check shell
rc instead of "succeeded", since the task will always succeed.
Change-Id: I0c44db40e1b9a935e7dde115bb0c9affa15c42bf
The next iteration of fast-forward-upgrade will be
from queens through to train, so we update the names
accordingly.
Change-Id: Ia6d73c33774218b70c1ed7fa9eaad882fde2eefe
Certain config containers might need to be replaced and re-run
regardless of whether configuration changes on update and upgrade.
Adding the DeployIdentifier to the env will ensure that they are.
Change-Id: I150212ebac3fed471ffb4e7ed7b6eb6c7af3fad9
Closes-Bug: #1860571
In nova, enable_numa_live_migration was deprecated in train release,
so remove the corresponding parameter, NovaEnableNUMALiveMigration,
in templates.
Change-Id: I9616b290bf4ee6fefee66efb6924a3fd6699ccae
Ansible has decided that roles with hypens in them are no longer supported
by not including support for them in collections. This change renames all
the roles we use to the new role name.
Depends-On: Ie899714aca49781ccd240bb259901d76f177d2ae
Change-Id: I4d41b2678a0f340792dd5c601342541ade771c26
Signed-off-by: Kevin Carter <kecarter@redhat.com>
When podman parses such volume map it removes the slash
automatically and shows in inspection volumes w/o slash.
When comparing configurations it turns to be a difference and
it breaks idempotency of containers, causing them to be recreated.
Change-Id: Ifdebecc8c7975b6f5cfefb14b0133be247b7abf0
When upgrading from Rocky to Stein we moved also from using the docker
container engine into Podman. To ensure that every single docker container
was removed after the upgrade a post_upgrade task was added which made
use of the tripleo-docker-rm role that removed the container. In this cycle,
from Stein to Train both the Undercloud and Overcloud work with Podman, so
there is no need to remove any docker container anymore.
This patch removes all the tripleo-docker-rm post-upgrade task and in those
services which only included a single task, the post-upgrade-tasks section
is also erased.
Change-Id: I5c9ab55ec6ff332056a426a76e150ea3c9063c6e
Moving all the container environments from lists to dicts, so they can
be consumed later by the podman_container ansible module which uses
dict.
Using a dict is also easier to parse, since it doesn't involve "=" for
each item in the environment to export.
Change-Id: I894f339cdf03bc2a93c588f826f738b0b851a3ad
Depends-On: I98c75e03d78885173d829fa850f35c52c625e6bb
Inflight validations are now properly deactivated within the
tripleoclient/tripleo-common code.
This reverts commit 1761fc81c252e3dd565fe4f27e13f2c26426c806.
Change-Id: I4ea9bfadbcc71c847232c8585d99f8698daffc9a
The systemd healthcheck timer first triggers 120s after activation.
The initial value for ExecMainStatus is 0, resulting in false positives if we
check this too early.
This changes waits (up to 5 mins) for ExecMainPID to be set and the service to
return to an inactive/failed state.
Change-Id: Iad4ebb283a7a6559b6fffead4145cc9bbad45e4e
Depends-On: Ia2897a6be3e000a9594103502b716431baa615b1
Related-bug: #1843555
The validation tasks added in I2c044e3d2af7f747acde5ad3bf256386b8c550a3 are not
valid on docker. As it's now deprecated we can just skip them.
Change-Id: I4ff530af8ad7f864b8038e5e509ec38840096c5d
Related-bug: #1842687
Instead of running "podman exec" to test the container healthchecks, we
should rather rely on the status of systemd timers which reflect the
real state of the healthchecks, since they run under a specific user and
pid.
Also, we should only test the healthchecks if
ContainerHealthcheckDisabled is set to False.
Change-Id: I2c044e3d2af7f747acde5ad3bf256386b8c550a3
Closes-Bug: #1842687
This patch removes fluentd composable service in favor of rsyslog composable service
and modifies *LoggingSource configuration accordingly.
Change-Id: I1e12470b4eea86d8b7a971875d28a2a5e50d5e07
Before we start services on upgraded bootstrap
controller (usually controller-0), we need to
stop services on unupgraded controllers
(usually controller-1 and controller-2).
Also we need to move the mysql data transfer
to the step 2 as we need to first stop the
services.
Depends-On: I4fcc0858cac8f59d797d62f6de18c02e4b1819dc
Change-Id: Ib4af5b4a92b3b516b8e2fc1ae12c8d5abe40327f
The tripleo-docker-rm role has been replaced by tripleo-container-rm [0].
This role will identify the docker engine via the container_cli variable
and perform a deletion of that container. However, these tasks inside the
post_upgrade_tasks section were thought to remove the old docker containers
after upgrading from rocky to stein, in which podman starts to be the
container engine by default.
For that reason, we need to ensure that the container engine in which the
containers are removed is docker, as otherwise we will be removing the
podman container and the deployment steps will fail.
Closes-Bug: #1836531
[0] - 2135446a35
Depends-On: https://review.opendev.org/#/c/671698/
Change-Id: Ib139a1d77f71fc32a49c9878d1b4a6d07564e9dc
This converts all Docker*Image parameter varients into
Container*Image varients.
The commit was autogenerated with the following shell commands:
for file in $(grep -lr Docker.*Image --include \*.yaml --exclude-dir releasenotes); do
sed -e "s|Docker\([^ ]*Image\)|Container\1|g" -i $file
done
Change-Id: Iab06efa5616975b99aa5772a65b415629f8d7882
Depends-On: I7d62a3424ccb7b01dc101329018ebda896ea8ff3
Depends-On: Ib1dc0c08ce7971a03639acc42b1e738d93a52f98
UpgradeRemoveUnusedPackages is not used anymore. All packages are
supposed to be removed on undercloud upgrade to 14.
Change-Id: Ie6b739390ec0ae0c5773a5a6c63b49422195623a
This change combines the previous puppet and docker files
into a single file that performs the docker service installation
and configuration.
Change-Id: I9bd5c9f007d9f69d7310cdd0106bcc923c1b0acd
During some of the nova service flattening it was included some of the
baremetal upgrade_tasks into the containerized services. This patch removes
them.
Change-Id: I4a569195deeadb34180561c778dabe77be4f6466
Closes-Bug: #1816453
Live migration is currently totally broken if a NUMA topology is
present. This affects everything that's been regrettably stuffed in with
NUMA topology including CPU pinning, hugepage support and emulator
thread support. Side effects can range from simple unexpected
performance hits (due to instances running on the same cores) to
complete failures (due to instance cores or huge pages being mapped to
CPUs/NUMA nodes that don't exist on the destination host).
Until such a time as we resolve these issues, we should alert users to
the fact that such issues exist. A workaround option is provided for
operators that _really_ need the broken behavior, but it's defaulted to
False to highlight the brokenness of this feature to unsuspecting
operators.
The related nova change is I217fba9138132b107e9d62895d699d238392e761
The proposed change allows to configure the 'enable_numa_live_migration'
workarounds option through TripleO. By default this feature will be
disabled for NUMA topology instances.
Depends-On: I16794fbfef0e6e83d3fcebb9e6bc2fcf478ebf72
Change-Id: I523756b418afe1827490c936966af8936ffdbaa6
- uses split-control-plane
- adds a new CellController role
- nova-conductor, message rpc (not notifications) and db
- move nova dbsync from nova-api to nova-conductor
- nova db is more tightly coupled to conductor/computes
- we don't have a nova-api services on a CellController
- super-conductor on Controller will sync cell0 db
- new 'magic' MysqlCellInternal endpoint
- always refers the to local MysqlInternal endpoint
- identical to MysqlInternal for regular deployment
- but doesn't get overridden when inheriting EndpointMap from parent
control-plane stack
- duplicate service node name hiera for transport_urls on cell stack
- nova -> cell oslo messaging rpc nodes
- neutron agent -> global messaging rpc nodes
- run cell host discovery only on default cell, for additional cells
the cell needs to be created first
bp tripleo-multicell-basic
Co-Authored-By: Martin Schuppert <mschuppert@redhat.com>
Change-Id: Ife9bf12d3a6011906fa8d9f97f7524b51aef906a
Depends-On: I79c1080605611c5c7748a28d2afcc9c7275a2e5d
This change combines the previous puppet and docker files
into a single file that performs the docker service installation
and configuration. With this patch the baremetal version of
nova has been removed.
Change-Id: Ic577851f8d865d5eec41dbfb00c27520bedc3fdb