With I57047682cfa82ba6ca4affff54fab5216e9ba51c Heat has added
a new template version for wallaby. This allows us to use the
2-argument variant of the ``if`` function, which allows for
e.g. conditional definition of resource properties and helps
clean up templates. If only two arguments are passed to the ``if``
function, the entire enclosing item is removed when the condition
is false.
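
A minimal Python sketch of the behaviour described above. It is a toy
model, not Heat's resolver: the ``resolve`` function, the ``_REMOVED``
sentinel and the sample property names are assumptions made for
illustration only.

```python
# Toy model of the 2-argument ``if``; not Heat's actual implementation.
_REMOVED = object()  # hypothetical sentinel meaning "drop the enclosing item"


def resolve(node, conditions):
    """Recursively resolve ``{"if": [...]}`` nodes in a nested structure."""
    if isinstance(node, dict):
        if set(node) == {"if"}:
            args = node["if"]
            if conditions[args[0]]:
                return resolve(args[1], conditions)
            if len(args) == 2:
                # No "else" value was given: remove the enclosing item.
                return _REMOVED
            return resolve(args[2], conditions)
        resolved = {key: resolve(value, conditions) for key, value in node.items()}
        return {k: v for k, v in resolved.items() if v is not _REMOVED}
    if isinstance(node, list):
        resolved = [resolve(item, conditions) for item in node]
        return [item for item in resolved if item is not _REMOVED]
    return node


# Sample properties (names are illustrative).
properties = {
    "name": "server",
    "key_name": {"if": ["use_key", "mykey"]},            # 2-argument form
    "flavor": {"if": ["big", "m1.large", "m1.small"]},   # 3-argument form
}
result = resolve(properties, {"use_key": False, "big": False})
```

With both conditions false, ``key_name`` disappears from ``result``
while ``flavor`` falls back to its third argument.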
Change-Id: I25f981b60c6a66b39919adc38c02a051b6c51269
Nova vnc configuration right now uses NovaVncProxyNetwork,
NovaLibvirtNetwork and NovaApiNetwork to configure the different
components (novnc proxy, nova-compute and libvirt) for vnc.
If one of the networks gets changed from internal_api, the service
configuration between libvirt, nova-compute and novnc proxy becomes
inconsistent and the console is broken.
This change uses just NovaLibvirtNetwork for configuring the vnc
endpoints and removes NovaVncProxyNetwork completely.
Change-Id: Icef2481b65b41b524ad44eeecfbee4451006e1d2
Closes-Bug: #1917719
This uses the linux-system-roles.certificate ansible role,
which replaces puppet-certmonger for submitting certificate
requests to certmonger. Each service is configured through
its heat template.
Partial-Implements: blueprint ansible-certmonger
Depends-On: https://review.rdoproject.org/r/31713
Change-Id: Ib868465c20d97c62cbcb214bfc62d949bd6efc62
This was mainly there as a legacy interface for
internal use. Now that we pull the passwords from
the existing environment and don't use it, we can drop
it.
This also reduces the number of heat resources.
Change-Id: If83d0f3d72a229d737a45b2fd37507dc11a04649
nova-consoleauth was removed in Stein. We need to delete the compute
services during major upgrades.
Related: https://bugzilla.redhat.com/1825849
Change-Id: I74465f5ae77a0666540d3465e2ad29b03f9bd3c3
Add the ability to specify the private key size
used when creating the certificate. The default value
stays the same as before: 2048 bits.
It is also possible to override the key_size value
per service.
Depends-on: I4da96f2164cf1d136f9471f1d6251bdd8cfd2d0b
Change-Id: Ic2edabb7f1bd0caf4a5550d03f60fab7c8354d65
It is best to avoid placing db creds on the compute nodes to limit the
exposure if an attacker succeeds in gaining access to the hypervisor
host.
Related patches in puppet-nova remove the credentials from nova.conf;
however, the current scope of db credential hieradata is all nova tripleo
services, so it will still be written to the hieradata keys on compute
nodes.
This patch refactors the nova hieradata structure, splitting the
nova-api/nova database hieradata out into individual templates and
selectively including only where necessary, ensuring we have no db
creds on a compute node (unless it is an all-in-one api+compute node).
Depends-On: I07caa3185427b48e6e7d60965fa3e6157457018c
Change-Id: Ia4a29bdd2cd8e894bcc7c0078cf0f0ab0f97de0a
Closes-bug: #1871482
We recently introduced a change that allowed operators to pass novnc TLS
cipher parameters to puppet-nova:
https://review.opendev.org/#/c/723920/10
Unfortunately, the default values for NovaVNCProxySSLCiphers and
NovaVNCProxySSLMinimumVersion conflict with puppet-nova and cause TLS-e
deployments to fail with the following error during the overcloud
deployment:
/var/log/containers/nova/nova-novncproxy.log:2020-08-03 04:45:41.120 8
ERROR nova oslo_config.cfg.ConfigFileValueError: Value for option
ssl_minimum_version from LocationInfo(location=<Locations.user: (4,
True)>, detail='/etc/nova/nova.conf') is not valid: Valid values are
[default, tlsv1_1, tlsv1_2, tlsv1_3], but found ''
This is because the values don't match what puppet-nova is expecting and
it causes the containers to fail.
This commit attempts to add some reasonable defaults in THT that align
more closely with the puppet-nova defaults. It also only sets the
ciphers if they're set by the end user.
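
The intent can be sketched in Python as follows. This is an
illustration only: the function name, the dict keys and the "default"
fallback are assumptions, while the list of valid values is taken from
the error message above.

```python
# Valid values as reported by the oslo.config error quoted above.
VALID_MINIMUM_VERSIONS = ("default", "tlsv1_1", "tlsv1_2", "tlsv1_3")


def vnc_tls_settings(minimum_version="default", ciphers=""):
    """Build the settings to hand over, skipping unset values.

    The "default" fallback and the dict keys are illustrative assumptions.
    """
    if minimum_version not in VALID_MINIMUM_VERSIONS:
        raise ValueError(f"invalid ssl_minimum_version: {minimum_version!r}")
    settings = {"ssl_minimum_version": minimum_version}
    if ciphers:
        # Only pass ciphers along when the end user actually set them.
        settings["ssl_ciphers"] = ciphers
    return settings
```

An empty minimum version, as in the failing deployment, is rejected up
front instead of being written to nova.conf.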
Change-Id: I2663bc9154846cc4642c3a030be0c57df4f25e1b
We recently introduced new variables to puppet-nova that expose nova VNC
settings for SSL ciphers:
I3a1262f70f6a801db276701a39ebb01f40025192
However, when we updated tht we used the actual names of the nova
settings instead of the names of the puppet variables:
Ida03a0aa54ca15b343339d92abb9c105ead8b0b6
This patch changes tht to use the puppet variables instead of the nova
configuration names. Without this patch, setting the heat variables has
no effect on tls settings in nova.conf.
Change-Id: Iacc25c694f2e5491e971b9cef4e984602be25c04
Now that the FFU process relies on the upgrade_tasks and deployment
tasks there is no need to keep the old fast_forward_upgrade_tasks.
This patch removes all the fast_forward_upgrade_tasks sections from
the services, as well as from the common structures.
Change-Id: I39b8a846145fdc2fb3d0f6853df541c773ee455e
When checking if keystone/nova healthchecks are healthy, make sure the
registered fact is set (which can slip to a further retry if podman
inspect took too much time to execute).
That way, we process the retries without an error like found in the bug
report.
Change-Id: I9f5063c9c3b598afd5bd01447f00a1146a20f4c3
Closes-Bug: #1878063
They were disabled until the native podman healthcheck was integrated in
tripleo-ansible. It has finally merged, so we can remove that safeguard
and it should work.
Change-Id: I03361c33e54f0c8e71b420b144464ccb29a1ca4e
The systemd healthchecks are moving away, so we can use the native
podman healthchecks interface.
See I37508cd8243999389f9e17d5ea354529bb042279 for the whole context.
This patch does the following:
- Migrate the healthcheck checks to use podman inspect instead of
systemd service status.
- Force the tasks to not run, because we first need
https://review.opendev.org/#/c/720061 to merge
Once https://review.opendev.org/#/c/720061 is merged, we'll remove the
condition workaround and also migrate to unify the way containers are
checked; and use the role in tripleo-validations.
Depends-On: https://review.opendev.org/720283
Change-Id: I7172d81d305ac8939bee5e7f64960b0a9fea8627
This adds new NovaVNCProxySSLCiphers and NovaVNCProxySSLMinimumVersion
parameters to manage the allowed TLS ciphers and minimum protocol
version to enforce for incoming client connections to the VNC proxy
service.
Change-Id: Ida03a0aa54ca15b343339d92abb9c105ead8b0b6
Related-Bug: 1842149
- deploy-steps-tasks-step-1.yaml: Do not ignore errors when dealing
with check-mode directories. The file module is resilient enough to
not fail if the path is already absent.
- deploy-steps-tasks.yaml: Replace ignore_errors by another condition,
"not ansible_check_mode"; this task is not needed in check mode.
- generate-config-tasks.yaml: Replace ignore_errors by another
condition, "not ansible_check_mode"; this task is not needed in check mode.
- Neutron wrappers: use fail_key: False instead of ignore_errors: True
if a key can't be found in /etc/passwd.
- All services with service checks: Replace "ignore_errors: true" by
"failed_when: false". Since we don't care about whether or not the
task returns 0, let's just make the task never fail. It will only
improve UX when crawling through logs; no more failures will be shown for
these tasks.
- Same as above for cibadmin commands, cluster resources show
commands and keepalived container restart command; and all other shell
or command or yum module uses where we just don't care about their potential
failures.
- Aodh/Gnocchi: Add pipefail so the task isn't supposed to fail
- tripleo-packages-baremetal-puppet and undercloud-upgrade: check shell
rc instead of "succeeded", since the task will always succeed.
Change-Id: I0c44db40e1b9a935e7dde115bb0c9affa15c42bf
The next iteration of fast-forward-upgrade will be
from queens through to train, so we update the names
accordingly.
Change-Id: Ia6d73c33774218b70c1ed7fa9eaad882fde2eefe
Deployment is failing with error [1] because the owner/group
of the TLS generated certificate and key were set to 'qemu'.
This user and group exist on compute nodes, but not on controller.
[1] Error: Could not find group qemu
This patch adds 'qemu' user/group on controller node to
resolve the issue as this user is required to retrieve the cert,
used by the VNC proxy, the same way as on the compute nodes.
Change-Id: I3aa774c06d91a3b67726fad0d0ca409cda5b78b9
Closes-Bug: #1860971
Ansible has decided that roles with hyphens in them are no longer supported
by not including support for them in collections. This change renames all
the roles we use to the new role name.
Depends-On: Ie899714aca49781ccd240bb259901d76f177d2ae
Change-Id: I4d41b2678a0f340792dd5c601342541ade771c26
Signed-off-by: Kevin Carter <kecarter@redhat.com>
When podman parses such a volume map, it removes the slash
automatically and shows the volumes w/o the slash on inspection.
When comparing configurations this shows up as a difference and
it breaks idempotency of containers, causing them to be recreated.
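
A minimal Python sketch of why the comparison fails, assuming the slash
in question is a trailing slash on the volume paths. The function names
and the ``src:dst[:opts]`` handling are illustrative and are not taken
from the actual comparison code.

```python
def normalize_volume(volume):
    """Strip trailing slashes from the path parts of a ``src:dst[:opts]`` map."""
    parts = volume.split(":")
    # Only the first two fields are paths; anything after is an option string.
    paths = [p.rstrip("/") or "/" for p in parts[:2]]
    return ":".join(paths + parts[2:])


def volumes_match(configured, inspected):
    """Compare two volume lists after normalizing both sides."""
    return sorted(map(normalize_volume, configured)) == sorted(
        map(normalize_volume, inspected)
    )
```

A plain string comparison of ``/etc/foo/:/etc/foo/:ro`` against
``/etc/foo:/etc/foo:ro`` reports a difference, whereas the normalized
comparison does not.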
Change-Id: Ifdebecc8c7975b6f5cfefb14b0133be247b7abf0
This patch fixes the following issues, which cause the rsyslog service
to fail to start:
- Changes LoggingSource configuration key 'path' to 'file' for various services
- Fixes LoggingSource configuration key 'startmsg.regex' for pacemaker
- Removes nonexistent log files from LoggingSource of keystone
Change-Id: I7fe6456a1d2a3ba4300a82c57b76774152422250
This change converts our firewall deployment practice to use
the tripleo-ansible firewall role. This change creates a new
"firewall_rules" object which is queried using YAQL from the
"FirewallRules" resource.
A new parameter has been added allowing users to input
additional firewall rules as needed. The new parameter is
`ExtraFirewallRules` and will be merged on top of the YAQL
interface.
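
The merge can be pictured with a short Python sketch. It assumes the
rules are keyed by name and that operator-supplied rules win on a name
clash; neither detail is stated above, and the function name is
hypothetical.

```python
def merge_firewall_rules(service_rules, extra_rules):
    """Return the service rules with operator-supplied rules merged on top."""
    merged = dict(service_rules)
    # Assumed precedence: an extra rule replaces a service rule of the same name.
    merged.update(extra_rules)
    return merged


# Illustrative rule names and contents.
service_rules = {"100 example_service": {"dport": [8080]}}
extra_rules = {
    "100 example_service": {"dport": [8080, 8443]},
    "300 allow_custom": {"dport": [9999], "proto": "tcp"},
}
firewall_rules = merge_firewall_rules(service_rules, extra_rules)
```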
Depends-On: Ie5d0f51d7efccd112847d3f1edf5fd9cdb1edeed
Change-Id: I1be209a04f599d1d018e730c92f1fc8dd9bf884b
Signed-off-by: Kevin Carter <kecarter@redhat.com>
When upgrading from Rocky to Stein we also moved from the docker
container engine to Podman. To ensure that every single docker container
was removed after the upgrade, a post_upgrade task was added which made
use of the tripleo-docker-rm role that removed the container. In this cycle,
from Stein to Train, both the Undercloud and Overcloud work with Podman, so
there is no need to remove any docker container anymore.
This patch removes all the tripleo-docker-rm post-upgrade tasks, and in those
services which only included a single task, the post-upgrade-tasks section
is also erased.
Change-Id: I5c9ab55ec6ff332056a426a76e150ea3c9063c6e
The two services use the same parameter for the location of the
CA cert. This causes problems when trying to deploy both services
on the same machine, for example in standalone mode.
Change-Id: Ie67bac28ac6097cba810b51496493584be0edcc8
Moving all the container environments from lists to dicts, so they can
be consumed later by the podman_container ansible module which uses
dict.
Using a dict is also easier to parse, since it doesn't involve "=" for
each item in the environment to export.
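
As an illustration of the conversion, a minimal Python sketch follows.
The function name and the sample values are assumptions, and the
templates themselves are edited directly rather than converted at
runtime.

```python
def env_list_to_dict(env_list):
    """Convert ``["KEY=value", ...]`` into ``{"KEY": "value", ...}``."""
    env = {}
    for item in env_list:
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = item.partition("=")
        env[key] = value
    return env


environment = env_list_to_dict(
    ["KOLLA_CONFIG_STRATEGY=COPY_ALWAYS", "EXAMPLE_OPTS=a=b"]
)
```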
Change-Id: I894f339cdf03bc2a93c588f826f738b0b851a3ad
Depends-On: I98c75e03d78885173d829fa850f35c52c625e6bb
Inflight validations are now properly deactivated within the
tripleoclient/tripleo-common code.
This reverts commit 1761fc81c252e3dd565fe4f27e13f2c26426c806.
Change-Id: I4ea9bfadbcc71c847232c8585d99f8698daffc9a
The systemd healthcheck timer first triggers 120s after activation.
The initial value for ExecMainStatus is 0, resulting in false positives if we
check this too early.
This change waits (up to 5 mins) for ExecMainPID to be set and the service to
return to an inactive/failed state.
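
The wait condition can be sketched in Python as below. This is not the
task from the patch: the function names, the injected ``probe``
callable and the split of the 5 minute budget into 30 retries of 10s
are assumptions.

```python
import time


def healthcheck_result_ready(exec_main_pid, active_state):
    """True once the check has run at least once and is no longer running."""
    return exec_main_pid != 0 and active_state in ("inactive", "failed")


def wait_for_result(probe, retries=30, delay=10, sleep=time.sleep):
    """Poll ``probe()`` -> (ExecMainPID, ActiveState) until a result is ready.

    30 retries x 10s is an assumed way of spending the 5 minute budget.
    """
    for attempt in range(retries):
        pid, state = probe()
        if healthcheck_result_ready(pid, state):
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False
```

Only once ``wait_for_result`` returns true is ExecMainStatus worth
reading; before that, its initial value of 0 says nothing about the
healthcheck.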
Change-Id: Iad4ebb283a7a6559b6fffead4145cc9bbad45e4e
Depends-On: Ia2897a6be3e000a9594103502b716431baa615b1
Related-bug: #1843555
The validation tasks added in I2c044e3d2af7f747acde5ad3bf256386b8c550a3 are not
valid on docker. As it's now deprecated we can just skip them.
Change-Id: I4ff530af8ad7f864b8038e5e509ec38840096c5d
Related-bug: #1842687
Instead of running "podman exec" to test the container healthchecks, we
should rather rely on the status of systemd timers which reflect the
real state of the healthchecks, since they run under a specific user and
pid.
Also, we should only test the healthchecks if
ContainerHealthcheckDisabled is set to False.
Change-Id: I2c044e3d2af7f747acde5ad3bf256386b8c550a3
Closes-Bug: #1842687
This patch removes fluentd composable service in favor of rsyslog composable service
and modifies *LoggingSource configuration accordingly.
Change-Id: I1e12470b4eea86d8b7a971875d28a2a5e50d5e07
We believe this change induced a regression[1] that is further breaking TripleO TLS-Everywhere deployments. Submitting a revert patch while we investigate and work on a more robust solution.
[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1743485
This reverts commit fc914e96116532985fef5b7e02e1dbbc8842f81e.
Change-Id: I5dc334d5b5232b7e0097d0a0e735abc911060917
Before we start services on upgraded bootstrap
controller (usually controller-0), we need to
stop services on unupgraded controllers
(usually controller-1 and controller-2).
Also we need to move the mysql data transfer
to the step 2 as we need to first stop the
services.
Depends-On: I4fcc0858cac8f59d797d62f6de18c02e4b1819dc
Change-Id: Ib4af5b4a92b3b516b8e2fc1ae12c8d5abe40327f
In case the freeipa CA is a sub CA of an external CA the
InternalTLSVncCAFile requested does not have the full CA
chain and only has the FreeIPA CA. As a result, qemu
cannot verify the vnc certificate sent by the
vnc-proxy. The issue is in certmonger[1] as it does not return the
full CA chain.
As a workaround, until certmonger is fixed, this change points the
InternalTLSVncCAFile to /etc/ipa/ca.crt which has the full CA chain.
[1] - https://bugzilla.redhat.com/show_bug.cgi?id=1710632
Change-Id: I750c5572505ff58b8164906754f1bcaf4fd256e0
The tripleo-docker-rm role has been replaced by tripleo-container-rm [0].
This role will identify the docker engine via the container_cli variable
and perform a deletion of that container. However, these tasks inside the
post_upgrade_tasks section were meant to remove the old docker containers
after upgrading from rocky to stein, in which podman starts to be the
container engine by default.
For that reason, we need to ensure that the container engine in which the
containers are removed is docker, as otherwise we will be removing the
podman container and the deployment steps will fail.
Closes-Bug: #1836531
[0] - 2135446a35
Depends-On: https://review.opendev.org/#/c/671698/
Change-Id: Ib139a1d77f71fc32a49c9878d1b4a6d07564e9dc
The network for nova-vnc is not set correctly when determining
principals and hostnames for TLS certs.
Closes-Bug: #1832013
Change-Id: Ie8f31413a485c7a91a421ffeefe230a353266993
This converts all Docker*Image parameter variants into
Container*Image variants.
The commit was autogenerated with the following shell commands:
for file in $(grep -lr Docker.*Image --include \*.yaml --exclude-dir releasenotes); do
sed -e "s|Docker\([^ ]*Image\)|Container\1|g" -i $file
done
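
For readers less familiar with sed, the substitution above corresponds
to the following Python sketch. The function name is illustrative and
the commit itself was generated with the shell commands, not with this
code.

```python
import re

# Same pattern as the sed expression; "\n" is excluded as well because sed
# works line by line and therefore never matches across a line break.
_PATTERN = re.compile(r"Docker([^ \n]*Image)")


def rename_image_params(text):
    """Rename Docker*Image parameter names to Container*Image."""
    return _PATTERN.sub(r"Container\1", text)


renamed = rename_image_params(
    "DockerNovaVncProxyImage: {get_param: DockerNovaVncProxyImage}"
)
```

Names that start with ``Docker`` but do not contain ``Image`` are left
untouched by the pattern.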
Change-Id: Iab06efa5616975b99aa5772a65b415629f8d7882
Depends-On: I7d62a3424ccb7b01dc101329018ebda896ea8ff3
Depends-On: Ib1dc0c08ce7971a03639acc42b1e738d93a52f98
With cellsv2 multicell in each cell there needs to be a novnc proxy as the
console token is stored in the cell conductor database. This change adds
the NovaVncProxy service to the CellController role and configures the
endpoint to the local public address of the cell.
Closes-Bug: #1822607
Depends-On: https://review.openstack.org/649265
Change-Id: Ia3a36d369fdc18685f4c965a9e371ca3143967bf
UpgradeRemoveUnusedPackages is not used anymore. All packages are
supposed to be removed on undercloud upgrade to 14.
Change-Id: Ie6b739390ec0ae0c5773a5a6c63b49422195623a
This change combines the previous puppet and docker files
into a single file that performs the docker service installation
and configuration.
Change-Id: I9bd5c9f007d9f69d7310cdd0106bcc923c1b0acd