With I57047682cfa82ba6ca4affff54fab5216e9ba51c, Heat added
a new template version for Wallaby. It lets us use the
two-argument variant of the ``if`` function, which allows,
e.g., conditional definition of resource properties and helps
clean up templates. If only two arguments are passed to the ``if``
function, the entire enclosing item is removed when the condition
is false.
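The removal semantics described above can be sketched in Python. This is only an illustrative model, not Heat's implementation: the ``resolve`` function, the ``_REMOVE`` sentinel and the shape of the ``conditions`` map are assumptions made for the example.

```python
# Illustrative model of the two-argument ``if`` semantics; not Heat code.
# ``_REMOVE`` is a hypothetical sentinel meaning "drop the enclosing item".
_REMOVE = object()


def resolve(node, conditions):
    """Resolve {'if': [...]} nodes inside a nested dict/list structure."""
    if isinstance(node, dict):
        if set(node) == {"if"}:
            args = node["if"]
            if conditions[args[0]]:
                return resolve(args[1], conditions)
            if len(args) == 3:
                # Three-argument form: fall back to the "else" value.
                return resolve(args[2], conditions)
            # Two-argument form with a false condition: remove the item.
            return _REMOVE
        resolved = {key: resolve(value, conditions) for key, value in node.items()}
        return {key: value for key, value in resolved.items() if value is not _REMOVE}
    if isinstance(node, list):
        resolved = [resolve(item, conditions) for item in node]
        return [item for item in resolved if item is not _REMOVE]
    return node
```

With a false condition, the property that holds the two-argument ``if`` disappears from the result instead of being set to an empty value.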
Change-Id: I25f981b60c6a66b39919adc38c02a051b6c51269
Let's remove these in master; they are not needed
now that we're fully CentOS/RHEL 8-based on master.
Change-Id: I1192c263e08e98a7465d92d8565845ab191ea626
Nova vnc configuration right now uses NovaVncProxyNetwork,
NovaLibvirtNetwork and NovaApiNetwork to configure the different
components (novnc proxy, nova-compute and libvirt) for vnc.
If one of the networks gets changed from internal_api, the service
configuration between libvirt, nova-compute and novnc proxy becomes
inconsistent and the console breaks.
This change switches to using only NovaLibvirtNetwork for configuring the
vnc endpoints and removes NovaVncProxyNetwork completely.
Change-Id: Icef2481b65b41b524ad44eeecfbee4451006e1d2
Closes-Bug: #1917719
Nova supports configuring resource provider inventory and traits using a
standardized YAML file format starting with the Victoria release [1]. This
introduces the CustomProviderInventories role parameter to configure the
custom provider YAML.
[1] https://docs.openstack.org/nova/latest/admin/managing-resource-providers.html
Depends-On: If12d7f5a8c331e051eb543f88187c31e676f3b62
Depends-On: I509eec3bf37368640ae8a3df8271b769d29f70c4
Change-Id: I25ea828397fcc968d07b0d5e87bdc9445ac690e2
In order to set ANSIBLE_INJECT_FACT_VARS=False, we have to use ansible_facts
instead of ansible_* vars. This change switches our distribution and
hostname related items to use ansible_facts instead.
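As a rough illustration of the kind of rewrite involved, the sketch below maps a few injected variable names to their ``ansible_facts`` equivalents. The helper name and the list of covered variables are assumptions for the example; the actual change edits the templates by hand.

```python
import re

# Hypothetical helper: covers only distribution*, hostname and fqdn variables.
_INJECTED = re.compile(r"\bansible_(distribution(?:_\w+)?|hostname|fqdn)\b")


def use_ansible_facts(expression):
    """Rewrite injected ansible_* variables to ansible_facts lookups."""
    return _INJECTED.sub(lambda m: "ansible_facts['%s']" % m.group(1), expression)
```

The mapping relies on the fact that an injected variable ``ansible_<key>`` corresponds to ``ansible_facts['<key>']``.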
Change-Id: I49a2c42dcbb74671834f312798367f411c819813
Related-Bug: #1915761
Add the RootStackName variable to the scale tasks so that
we can reference it instead of the existing environment
variables. This ensures that the scale down uses the
environment variables from clouds.yaml and gets the
OS_CACERT when talking to the overcloud endpoints.
Change-Id: Ia8868172fb16b294208ee3d6b03c09442fe39443
Closes: #1913275
This change switches from using service facts to using systemctl
commands to do service checks. This is done to reduce the amount of
memory used as part of the deployment.
Change-Id: I0cd5b24933e50680baefd055d6e68e277ab09315
Related-Bug: #1915761
This was mainly there as a legacy interface for
internal use. Now that we pull the passwords from
the existing environment and don't use it, we can drop
this.
This reduces the number of Heat resources.
Change-Id: If83d0f3d72a229d737a45b2fd37507dc11a04649
If rbd is used for glance but compute is using local ephemeral storage,
nova-compute can download the images directly from the glance ceph pool
via rbd instead of going through the glance api.
This change introduces new compute role based parameters to enable direct
download of glance images via rbd. If NovaGlanceEnableRbdDownload is set,
the global RBD glance parameters are used by default: CephClientUserName,
GlanceRbdPoolName, and CephClusterName for the ceph.conf to use.
Glance also supports multiple storage backends, which can be configured
using GlanceMultistoreConfig. If additional RBD glance backends are
configured, NovaGlanceRbdDownloadMultistoreID can be used to point to the
hash key (backend ID) of GlanceMultistoreConfig to use.
Depends-On: https://review.opendev.org/c/openstack/puppet-tripleo/+/772168
Depends-On: https://review.opendev.org/c/openstack/puppet-nova/+/770687
Change-Id: I020da468d909bd98819f1e3618bf905260791d9b
Add the NovaLibvirtMaxQueues role parameter to set [libvirt]/max_queues in
nova.conf on the compute. The default of 0 corresponds to not set, meaning
the legacy limits based on the reported kernel major version are used.
Depends-On: Ieaa29b51257f5ea3a5e4d6c678140fd9ae052d88
Change-Id: I353e8ca2676bbdceb056f8b2b084bc5102f52c1f
When a node has hugepages enabled, we can help with live migrations by
enabling NovaLiveMigrationPermitPostCopy and
NovaLiveMigrationPermitAutoConverge.
Related: https://bugzilla.redhat.com/1298201
Change-Id: I1133c210f35181d44f8ba56f09b52f00589e035c
After change [1], nova-compute launches libguestfs using the default
``qemu:///system``, but when ``inject_password`` is set to true and a
user creates a VM, the VM creation succeeds while a libguestfs error
shows up in the nova-compute logs.
This change forces libvirt to use ``direct`` when launching instances
on the host.
[1] Ib55936ea562dfa965be0764647e2b8e3fa309fd6
Change-Id: I195358742c19d6ea0a3d32979896c0268e3b55a6
Closes-bug: #1912141
libvirt-daemon is part of the default overcloud image but it's also
possible that it's not installed or simply removed by operators. In this
case, tripleo_nova_libvirt_guests will fail.
Related: https://bugzilla.redhat.com/1810319
Change-Id: I0814bd8794ab82792837b27d0128e15c34b90adc
Add support for the [compute]/image_type_exclude_list parameter to
prevent image types being reported as supported by a compute node.
Depends-On: I389d4b586468720d73ac69b025a3c34df54fe73e
Change-Id: I326cb9facf33693fdf8f361f9bc58aa28b3c20af
Default CephAnsibleSkipClient to True and CephConfigPath to
/var/lib/tripleo-config/ceph (instead of /etc/ceph) and set
these parameters explicitly in scenario00{1,4}. This will
result in all Ceph client configuration being done not by
ceph-ansible but by the new tripleo-ceph-client role from
tripleo-ansible.
Add the CephClient service to all Controller* roles which will
use Ceph. The service could have always been there, as there are
Ceph clients on these controllers, but it was not because
ceph-ansible configured clients as a side effect. With the new
CephConfigPath default they no longer overlap, so the service
is required.
Add support for CephExternalMultiConfig via tripleo-ceph-client
by looping on the contents of the CephExternalMultiConfig list
and passing each map as the dcn variable while including the
tripleo-ceph-client role each time.
Related-Bug: #1708302
Depends-On: I938ab604859fda88f3491399444841a3a373d162
Change-Id: I784e6a476752ed701192b3a0155c42edd4836d97
We need an optional delay on nova-compute when it's waiting for ceph to
be healthy. This commit is adding a wrapper that will be deployed when
necessary.
Related: https://bugzilla.redhat.com/show_bug.cgi?id=1498621
Change-Id: Ie7ad2d835c1762dc4b9341e305e6a428cb087935
This change introduces a new CephConfigPath parameter that can be used
by all the OpenStack clients when looking for Ceph client related info
(ceph.conf and keyrings).
By overriding this parameter we can make the containers able to pull
data from a different path than /etc/ceph, which was hardcoded.
On top of this change, a new bool is added to prevent the ceph-ansible
client role being executed.
When this boolean is true, the 'ceph_client' tag is added to the list
of tags that should be skipped in ceph-ansible.
By doing this, ceph-ansible won't run the client role [1]; the new
tripleo_ceph_client role is imported and executed instead.
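The tag handling can be pictured with a small Python sketch. The function name and the example base tags are assumptions for illustration; only the 'ceph_client' tag itself comes from this change.

```python
def ceph_ansible_skip_tags(base_tags, skip_client):
    """Return the comma-separated skip tags to pass to ceph-ansible."""
    tags = list(base_tags)
    # When the boolean is true, the client role is skipped via its tag.
    if skip_client and "ceph_client" not in tags:
        tags.append("ceph_client")
    return ",".join(tags)
```

For example, ``ceph_ansible_skip_tags(["package-install"], True)`` yields a tag list that also contains ``ceph_client``, while passing ``False`` leaves the base tags untouched.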
[1] https://github.com/ceph/ceph-ansible/blob/master/site-container.yml.sample#L269
Depends-On: Iaabb66cd26f0246defe391a4e34f4eab3c3c5fee
Depends-On: Ia60bc6d5d1a04bd560f2fcb05a4b64078015ae9d
Change-Id: I36673367411cc8d68ffb9ec4a2fbff64ebf12f29
https://review.opendev.org/q/I8df21d5d171976cbb8670dc5aef744b5fae657b2
introduced THT parameters to set libvirt/cpu_mode. That patch wrongly set
the NovaLibvirtCPUMode default to the string 'none', so puppet-nova does
not handle the default cases correctly and sets libvirt/cpu_mode to none.
This results in the 'qemu64' CPU model, which is highly buggy and
undesirable for production usage. This change sets the default to the
recommended CPU mode, 'host-model', for various benefits documented
elsewhere.
Closes-Bug: #1905544
Change-Id: Iea8cccd77caac4b84764d84a213918ed57bd4e3e
It is best to avoid placing db creds on the compute nodes to limit the
exposure if an attacker succeeds in gaining access to the hypervisor
host.
Related patches in puppet-nova remove the credentials from nova.conf.
However, the current scope of the db credential hieradata is all nova
tripleo services, so it will still be written to the hieradata keys on
compute nodes.
This patch refactors the nova hieradata structure, splitting the
nova-api/nova database hieradata out into individual templates and
selectively including only where necessary, ensuring we have no db
creds on a compute node (unless it is an all-in-one api+compute node).
Depends-On: I07caa3185427b48e6e7d60965fa3e6157457018c
Change-Id: Ia4a29bdd2cd8e894bcc7c0078cf0f0ab0f97de0a
Closes-bug: #1871482
When using RHSM Service (deployment/rhsm/rhsm-baremetal-ansible.yaml) based
registration of the overcloud nodes and enabling KSM using
NovaComputeEnableKsm=True, the overcloud deployment fails because the
RHSM registration and the ksm task both run as host_prep tasks. Enabling
and disabling ksm is now handled in deploy step 1.
Closes-Bug: #1904184
Change-Id: I75a59f3d4b640f3146f2a865eff8be3f1383e078
Trilio currently mounts an NFS export in /var/lib/nova to make it accessible
from within the nova_compute and nova_libvirt containers.
This can result in considerable delays when walking the directory tree to
ensure the ownership is correct.
This patch adds the ability to skip paths when recursively setting the
ownership and selinux context in /var/lib/nova. The list of paths to skip
can be set via the NovaStatedirOwnershipSkip heat parameter. This defaults
to the Trilio dir.
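A minimal sketch of such a pruned walk, assuming a plain ``os.walk`` traversal; the function name is hypothetical and the real script also applies the ownership and selinux context rather than just collecting paths.

```python
import os


def paths_to_chown(root, skip_paths):
    """Collect paths under root, never descending into skipped paths."""
    skip = {os.path.normpath(path) for path in skip_paths}
    selected = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not enter skipped directories,
        # which is what avoids walking a large NFS export.
        dirnames[:] = [
            name for name in dirnames
            if os.path.normpath(os.path.join(dirpath, name)) not in skip
        ]
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            if os.path.normpath(path) not in skip:
                selected.append(path)
    return sorted(selected)
```

Pruning ``dirnames`` in place is the important design choice: filtering the results afterwards would still pay the cost of traversing the skipped tree.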
Change-Id: Ic6f053d56194613046ae0a4a908206ebb453fcf4
This exposes the nova workaround to disable downloading images from glance to
rbd (vs a cheap COW clone) when nova-compute and glance are not backed by the
same ceph cluster.
Related nova change: I069b6b1d28eaf1eee5c7fb8d0fdef9c0c229a1bf
Depends-On: I8329810d6c047c0d94e7b123e7cdc1263a7856cd
Change-Id: Ib5478e53eb1f216bf6924ff30ea8502cb8529d00
Since multiple types of computes can be deployed, we should allow the
customization of these containers to be role specific.
Change-Id: Ie91633c2bcc8011cc62b46452ea5b444cf12029f
This change adds new THT parameters `NovaLibvirtCPUMode`,
`NovaLibvirtCPUModels` and `NovaLibvirtCPUModelExtraFlags`,
which allow configuring the `libvirt/cpu_mode`, `libvirt/cpu_models`
and `libvirt/cpu_model_extra_flags` parameters respectively.
Change-Id: I8df21d5d171976cbb8670dc5aef744b5fae657b2
This change replaces the usage of deprecated image cache parameters of
the nova::compute::libvirt class, by the new nova::compute::image_cache
class which was recently added to puppet-nova.
Depends-on: https://review.opendev.org/#/c/757079/
Depends-on: https://review.opendev.org/#/c/757871/
Change-Id: I85001599cf25ee3c9c82bfc50cb3bd8a71c7bcd9
This change enforces the usage of internal api for token verification,
so that internal requests to keystone use the internal endpoint instead
of the admin endpoint, which is deployed on the provisioning network by
default.
Change-Id: I8b5ac36ff1da46844d18fa73f835175e52719a63
Closes-Bug: #1899266
Adds the ability to enable or disable irqbalance on compute
nodes.
Based on the tuning recommendation for realtime compute nodes, irqbalance
should be stopped and disabled, and tuned will be responsible for
managing IRQ balancing instead of irqbalance.
Change-Id: Ibefb8e472c68901a74d76769b5314bef81fd5b15
Add a single new parameter, NovaEnableVTPM, which will configure vTPM
support by setting nova's '[libvirt] swtpm_enabled' config option. We do
not yet expose nova's '[libvirt] swtpm_user' and '[libvirt] swtpm_group'
options since the Fedora RPM specfile, upon which CentOS' and RHEL's
specfiles are based, uses the standard user and group [1].
[1] https://src.fedoraproject.org/rpms/swtpm/blob/master/f/swtpm.spec
Change-Id: If90979c4b1bda279eca6dba46e3f53ab402b04c3
Depends-On: https://review.opendev.org/752904
Depends-On: https://review.opendev.org/753586
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Change I95cd5017fdbbec257d274b805be4509ec32f9019
added `NovaComputeOptVolumes` and `NovaComputeOptEnvVars`,
but the description was not complete.
This change completes the description for the above parameters.
Change-Id: If91014ebca60dac43516d760d87171831816aca0
To support configuring multiple vgpu types, add a new
parameter `NovaVGPUTypesDeviceAddressesMapping`, where the vgpu type
is the key and a list of device addresses is the value.
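To illustrate the shape of the mapping, the sketch below turns it into nova.conf-style sections. The function name is hypothetical, and the option names used (`[devices]/enabled_vgpu_types` and per-type `[vgpu_<type>]/device_addresses`) are an assumption about the Nova release of that time; the actual rendering is done by puppet-nova.

```python
import configparser


def render_vgpu_config(mapping):
    """Build nova.conf-style sections from a vgpu-type -> addresses mapping."""
    config = configparser.ConfigParser()
    config["devices"] = {"enabled_vgpu_types": ",".join(mapping)}
    for vgpu_type, addresses in mapping.items():
        # Only types with explicit device addresses get their own section.
        if addresses:
            config["vgpu_" + vgpu_type] = {"device_addresses": ",".join(addresses)}
    return config
```

A mapping such as ``{"nvidia-35": ["0000:84:00.0"], "nvidia-36": []}`` enables both types but pins only the first one to a device address.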
Depends-On: https://review.opendev.org/#/c/750148/
Change-Id: Ifc30bbef66717cafb5ec2262be8fe07af1e60772
This exposes remove_unused_original_minimum_age_seconds from nova.conf,
which controls the time (in seconds) that nova compute should continue
caching an image once it is no longer used by any instances on the host.
Change-Id: Idfa892bab0cb59de5a418f0dc23a6e7d60100a49
1- Add specific mounts in nova_libvirt
They are needed in order to get SELinux support within the container
2- Remove now deprecated docker_enable condition
Since this one isn't needed anymore, just drop it.
3- Drop "z" flag from libvirt related mounts
This avoids relabelling issues from non-privileged containers
4- Set specific labels for the container itself.
See note 2 for more details.
Notes:
1- This will require to patch podman-1.6.4 in order to allow to actually
use security-opt when --privileged and/or --pid=host are passed[1].
2- The "container_share_t" filetype will be updated in a follow-up to
the newer version, "container_ro_file_t". This makes backports easier
to older releases that might not be aware of this new type.
The follow-up change is purely cosmetic in order to reflect the
actual behavior of SELinux and has no functional change.
Testing:
The first tests were done using podman 1.9.3 in order to work around the
mentioned issues.
Newer tests were done using podman 1.6.4 scratch-builds in order to ensure
the reported issues were fixed.
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1846364
Depends-On: https://review.opendev.org/735063
Depends-On: https://review.opendev.org/745173
Co-Authored-By: Daniel Berrange <berrange@redhat.com>
Co-Authored-By: Kashyap Chamarthy <kchamart@redhat.com>
Change-Id: I9e0da2a48c23c35e084bea831fc744b9f053508b
Since we're using native podman healthcheck, we don't need to manage the
systemd service anymore.
Closes-Bug: #1889221
Change-Id: Id5d693b6111f5906a99ffe4fec4befffc3940e23
Now that the FFU process relies on the upgrade_tasks and deployment
tasks, there is no need to keep the old fast_forward_upgrade_tasks.
This patch removes all the fast_forward_upgrade_tasks sections from
the services, as well as from the common structures.
Change-Id: I39b8a846145fdc2fb3d0f6853df541c773ee455e
Generally it is recommended[1] to use native YAML syntax in ansible
instead of one-line definitions, because it brings some benefits such as
clearer diffs in git.
This patch updates existing mount tasks to follow that recommendation.
[1] https://www.ansible.com/blog/ansible-best-practices-essentials
Change-Id: I42c55ee0f69234fd54003e9cc471570f668c17b6
There is some interplay between this option and other memory-related
options. Document this.
Change-Id: I408feb3ac8e30b67be8b01388926c6ab3d43bfac
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Related-Bug: #1882821
Paunch was deprecated in Ussuri and is now being retired, to be fully
replaced by the new tripleo-ansible role, tripleo_container_manage.
This patch:
- Removes common/container-puppet.py (was only useful when paunch is
enabled, since that script was converted to the container_puppet_config
Ansible module in tripleo-ansible).
- Updates all comments referring to paunch, replacing it with
tripleo_container_manage.
- Deprecates the EnablePaunch parameter.
- Removes paunch as a Python dependency.
Depends-On: https://review.opendev.org/#/c/731545/
Change-Id: I9294677fa18a7efc61898a25103414c8191d8805
After removing the libvirt dependencies, the default libvirt-client
systemd units are no longer on the host, so deployment fails
when using podman as the container cli with
NovaResumeGuestsStateOnHostBoot enabled.
Closes-Bug: #1880002
Change-Id: I2c899e80ddb42479322c948daddd67f990d5d5e6
When checking if keystone/nova healthchecks are healthy, make sure the
registered fact is set (which can slip to a further retry if podman
inspect took too much time to execute).
That way, we process the retries without an error like the one found in
the bug report.
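The retry logic can be sketched in Python as follows. The function name, the ``inspect`` callable and the ``status`` key are assumptions for the example; in the templates this is an Ansible ``until`` condition on a registered result.

```python
import time


def wait_until_healthy(inspect, retries=10, delay=0.0):
    """Retry until inspect() returns a result that reports 'healthy'."""
    for _ in range(retries):
        result = inspect()
        # A missing result (inspect took too long) counts as "not yet",
        # not as an error, so the next retry can pick it up.
        if result is not None and result.get("status") == "healthy":
            return True
        time.sleep(delay)
    return False
```

The key point is the ``result is not None`` guard: without it, an attempt with no registered result would raise instead of simply moving on to the next retry.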
Change-Id: I9f5063c9c3b598afd5bd01447f00a1146a20f4c3
Closes-Bug: #1878063
With ansible 2.9 IHA breaks like this:
TASK [If instance HA is enabled on the node activate the evacuation completed check] ***
Tuesday 05 May 2020 08:36:49 +0000 (0:00:00.403) 0:15:33.184 ***********
fatal: [compute-0]: FAILED! => {"msg": "The conditional check 'iha_nodes.stdout|lower | search('\"'+ansible_hostname|lower+'\"')'
failed. The error was: template error while templating string: no filter named 'search'. String: {% if iha_nodes.stdout|lower |
search('\"'+ansible_hostname|lower+'\"') %} True {% else %} False {% endif %}\n\nThe error appears to be in
'/var/lib/mistral/overcloud/ComputeInstanceHA/host_prep_tasks.yaml': line 391, column 5, but may\nbe elsewhere in the file depending
on the exact syntax problem.\n\nThe offending line appears to be:\n\n register: iha_nodes\n - file:
path=/var/lib/nova/instanceha/enabled state=touch\n ^ here\n"}
This is because ansible 2.9 changed the way filters work.
We need to use the 'is' keyword just like we do in a number of other
places in tripleo-ansible.
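For reference, the ``search`` test is a regular expression search on the value. The Python sketch below approximates what the conditional evaluates once it is written in test form (``value is search(pattern)``); the function names are hypothetical and the hostname is not escaped, mirroring the original expression.

```python
import re


def search_test(value, pattern):
    """Approximate what ``value is search(pattern)`` evaluates to."""
    return re.search(pattern, value) is not None


def node_in_iha_list(stdout, hostname):
    """Check whether the quoted, lowercased hostname appears in stdout."""
    return search_test(stdout.lower(), '"' + hostname.lower() + '"')
```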
Change-Id: Ie2c348df46519765d43330523e15d1861ed9a7b4
Closes-Bug: #1876897
This change adds two new parameters to manage vPMEM configuration
option and PMEM namespaces on the host server.
* `NovaPMEMMappings` sets Nova's `pmem_namespaces` conf option.
* `NovaPMEMNamespaces` creates PMEM namespaces on the host
using the `ndctl` tool and the ansible tripleo_nvdimm role.
Depends-On: https://review.opendev.org/717158
Change-Id: I270cb624a33a739aa06bba5e4faee4b01fb3cfb3
Closes-Bug: #1870455
They were disabled until the native podman healthcheck was integrated in
tripleo-ansible and it finally merged; so we can remove that safeguard
and it should be working.
Change-Id: I03361c33e54f0c8e71b420b144464ccb29a1ca4e
There is no need to bind mount /etc/ssh/ssh_known_hosts for all the
containers. It's only useful for nova_compute and nova_libvirt
containers where live migration needs that file to work.
Change-Id: I9765dedf43d2c6765922eafaa9d5791ce488b41f
The systemd healthchecks are moving away, so we can use the native
podman healthchecks interface.
See I37508cd8243999389f9e17d5ea354529bb042279 for the whole context.
This patch does the following:
- Migrate the healthcheck checks to use podman inspect instead of
systemd service status.
- Force the tasks to not run, because we first need
https://review.opendev.org/#/c/720061 to merge
Once https://review.opendev.org/#/c/720061 is merged, we'll remove the
condition workaround and also migrate to unify the way containers are
checked; and use the role in tripleo-validations.
Depends-On: https://review.opendev.org/720283
Change-Id: I7172d81d305ac8939bee5e7f64960b0a9fea8627
Id5503ed274bd5dc0c5365cc994de7e5cdcbc2fb6 is failing with permission
denied on rhel8 due to a selinux denial.
Change-Id: If7a565cdb14282261125d4e32488bb9c5ebc504e
Related-bug: #1869020