2913 Commits

Author SHA1 Message Date
Michele Baldessari
bad716070a Switch HA containers to k8s-file log-driver and make it a parameter
Currently in puppet-tripleo for the HA container we hardcode the following:
 options => "--user=root --log-driver=journald -e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS${tls_priorities_real}",

Since at least podman had some changes in terms of supported driver
backends (and bugs) it's best if we make this configurable. While we're
at it we should also switch to k8s-file as a driver when podman is being
used which is what all other containers are using. When docker is the
default container_cli we will stick to journald as usual.
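
The driver selection described above can be sketched in Python as
follows. This is an illustration only, not the puppet-tripleo code: the
helper name `ha_container_options` and its parameters are hypothetical,
and only the option string and the podman/docker defaults are taken from
this commit message.

```python
from typing import Optional


def ha_container_options(container_cli: str,
                         log_driver: Optional[str] = None,
                         tls_priorities: str = "") -> str:
    """Build the option string for an HA container (hypothetical helper)."""
    if log_driver is None:
        # Defaults from the commit message: k8s-file for podman,
        # journald when docker is the container CLI.
        log_driver = "k8s-file" if container_cli == "podman" else "journald"
    return (f"--user=root --log-driver={log_driver} "
            f"-e KOLLA_CONFIG_STRATEGY=COPY_ALWAYS{tls_priorities}")
```

Passing an explicit `log_driver` overrides the per-CLI default, which is
the point of turning the hardcoded value into a parameter.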

Tested this on a Train environment and successfully verified that
we still see the correct logs in /var/log/containers/.../...

Change-Id: I5b1483826f816d11a064a937d59f9a8f468315a5
Closes-Bug: #1853517
2019-11-22 11:36:37 +01:00
Zuul
0b024f58f5 Merge "Remove neutron wrappers" 2019-11-21 23:38:35 +00:00
Martin Magr
0e0b5a10f5 Synchronize connection configuration for metrics
Currently there are cases where collectd tries to connect to QDR
on a different interface than the one QDR is listening on.

This patch makes sure that those services are always in sync. It also
changes the interior configuration to work based on IPs instead
of hostnames.

Change-Id: Ia865bef9daf5b7e92b1b8c3712113416c9c8c176
2019-11-21 12:15:11 +01:00
Zuul
d02be09bda Merge "Improve stonith leves idempotency." 2019-11-20 18:43:07 +00:00
Giulio Fidente
1fc65d1d1b Ceph Grafana should not be exposed by HAProxy as a public service
We need to set ceph_grafana_vip on a network which is routable
and yet is not the public network where the OpenStack APIs are exposed:
the service is *not* public.

Change-Id: I7d636c4513317162ec4b49aa12d88a959bf5c537
2019-11-18 13:22:57 +00:00
Emilien Macchi
afe7cecb59 Remove neutron wrappers
With I2feb9e81bc40e44cb2c7a2972366fa4b16590227, we don't need the
wrappers managed by Puppet anymore, everything is deployed by Ansible.

Blueprint: safe-side-containers
Depends-On: I2feb9e81bc40e44cb2c7a2972366fa4b16590227

Change-Id: I890fff9c7ead7e72fd4fe3a58b4ffce2e315b916
2019-11-15 12:22:02 +00:00
Martin Magr
b66ee38e82 Fill sslProfile only when it is defined
Currently, when it is not defined, the config contains sslProfile: undef,
which makes interior QDR communication malfunction.

Change-Id: I62edac42204b28d9a81789723b331c79aa3358a6
2019-11-12 20:48:54 +01:00
Luca Miccini
7c30bd791f Improve stonith leves idempotency.
This commit improves the way stonith levels are set up and their
resiliency against redeployments by introducing a stonith_levels
custom fact that collects the current stonith levels defined for
the specific server, so we can compare against the desired number
of levels defined in hiera.

If these do not match (for example if there are additional levels
that are no longer necessary), the clean up step also introduced
by this commit takes care of deleting the ones no longer necessary.
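
The comparison between the fact and the hiera value can be sketched in
Python as follows. This is only an illustration of the idea: the function
name `levels_to_remove` is hypothetical, and it assumes levels are
numbered contiguously from 1, which the commit message does not state.

```python
from typing import List


def levels_to_remove(current_levels: int, desired_levels: int) -> List[int]:
    """Return the stonith level numbers that exceed the desired count.

    Assumes levels are numbered 1..current_levels (an assumption made
    for this sketch only).
    """
    if current_levels <= desired_levels:
        # Nothing to clean up: no extra levels are defined.
        return []
    return list(range(desired_levels + 1, current_levels + 1))
```

When the current and desired counts already match, the list is empty and
no cleanup is needed, which is what makes a redeployment idempotent.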

Change-Id: Ifae73ac2bf4481d0a11e89c0ea0916e85dd2db1d
2019-11-12 13:03:18 +01:00
Zuul
18a9016b7a Merge "Fix upper case checks for SRIOV interface" 2019-10-25 19:47:09 +00:00
Zuul
72ccfcb217 Merge "ovn-dbs-bundle: Prepare for supporting new OVN version with separarte run dirs" 2019-10-25 19:47:08 +00:00
Martin Schuppert
f58dfd55ab Include ::nova::pci to nova api profile
pci_alias was removed from nova::api with
c3e5c7480f03949a824165349642b59a6077ec5d
We need to include ::nova::pci for the nova api service to have
pci/aliases configured in nova api.

Closes-Bug: #1849797

Change-Id: I5258028ff636e8a6287468499dd6974f6c7f6f6f
2019-10-25 10:56:56 +02:00
Zuul
f554d9662e Merge "Only configure libvirt-guests if enabled" 2019-10-24 19:47:55 +00:00
Zuul
5b85544787 Merge "Add configurable monitor timeouts for ovn dbs" 2019-10-23 23:35:06 +00:00
Martin Schuppert
9d3af781e7 Only configure libvirt-guests if enabled
If NovaResumeGuestsStateOnHostBoot is not true, there is no
reason to configure libvirt-guests.

Also a customized libvirt-guests systemd unit file is added
via [1] which already has the dependencies configured. Therefore
there is no need to manage them via puppet again.

Related-Bug: #1849264
Depends-On: https://review.opendev.org/690016

[1] https://github.com/openstack/tripleo-heat-templates/blob/master/deployment/nova/nova-compute-container-puppet.yaml#L777

Change-Id: Ic127937ad83eebb47e34c5f36c826f8dcf1048a9
2019-10-23 16:28:44 +00:00
Numan Siddique
0a7bf3a07c ovn-dbs-bundle: Prepare for supporting new OVN version with separarte run dirs
This patch prepares the ground for using the latest OVN. OVN is split from
openvswitch and it has its own code repo. After the split, OVN has its
own run dir (/var/run/ovn), db dir (/etc/ovn/), log dir (/var/logs/ovn)
and datadir - /usr/share/ovn/scripts.

With this patch, running either the older version (2.11) or the new
version (2.12) is supported without any issues. The host directories are
mounted accordingly so that there is no impact when OVN is updated.

Change-Id: I5d778cbeb2863ec0fe649799863752e8eb16492f
Signed-off-by: Numan Siddique <nusiddiq@redhat.com>
2019-10-23 20:12:49 +05:30
053d643cb7 Update master for stable/train
Add file to the reno documentation build to show release notes for
stable/train.

Use pbr instruction to increment the minor version number
automatically so that master versions are higher than the versions on
stable/train.

Change-Id: I4ea46c9bb7b55ef6382cf9d52038e7d26d66e6eb
Sem-Ver: feature
2019-10-21 14:19:10 +00:00
Kamil Sambor
15e21010a8 Add configurable monitor timeouts for ovn dbs
Under pressure, the default monitor timeout value of 20 seconds is not
enough to prevent unnecessary failovers of the ovn-dbs pacemaker resource.
Spawning a few VMs at the same time could lead to unnecessary
movements of the master DB, then re-connections of ovn-controllers (slaves
are read-only), further peaks of load on the DBs, and in the end to a
snowball effect. This value is now configurable via dbs_timeout in
tripleo::profile::pacemaker::ovn_dbs_bundle and is set to 60s by default.

Change-Id: Ib95c6b7614631eed264d42e6cf61672b705e7893
Signed-off-by: Kamil Sambor <ksambor@redhat.com>
2019-10-21 14:59:09 +02:00
Emilien Macchi
c97dbd1946 Prepare RC + stable/train
Change-Id: I5ad516f1bf08369045d7729abafedb11e4cfc541
2019-10-18 11:18:19 -04:00
yogananth subramanian
f5daa76982 Fix upper case checks for SRIOV interface
Allow using upper case names for SRIOV interface names.

Fixes bug 1848483

Change-Id: I2d2cb42d87371f5807a4527eef22425416d4a774
2019-10-17 23:14:44 +05:30
Zuul
5f1b0010f4 Merge "Revert "Add support to configure token caching in keystone"" 2019-10-16 05:42:02 +00:00
63dd90aacc Revert "Add support to configure token caching in keystone"
Changing cache/enabled=False by default has dropped performance.
keystone local cache also got disabled with this.

This reverts commit 469d432195d1f5b5e15ce72ce1624d4ed4447e4e.


Depends-On: https://review.opendev.org/#/c/688770/
Closes-Bug: #1847585
Change-Id: I2af70755746f3fc3eb10eba2188ad2772704d988
2019-10-15 17:53:40 +00:00
Zuul
a98b9c0212 Merge "Allow the IHA OCF and fencing resource to be moved to the nova service user" 2019-10-15 16:37:36 +00:00
Zuul
11c9839df8 Merge "pacemaker: add support for Hash vs List in container environment" 2019-10-15 08:09:46 +00:00
Michele Baldessari
7e78ebdc0f Deep merge hiera keys for mysqld_options
Currently when adding some tuning options via hiera, galera won't start because
overriding even a single mysql option will reset the whole key in the hash. So
for example, when adding:
    tripleo::profile::base::database::mysql::mysql_server_options:
      mysqld:
        # MySQL InnoDB equally divided in 1GB instances
        innodb_buffer_pool_instances: 2
        # Query network write timeout raised to 120 seconds
        net_write_timeout: 120
        # Query network read timeout raised to 120 seconds
        net_read_timeout: 120
        # MySQL connection timeout set to 8 hours
        connect_timeout: 28800

Things will break because all the wsrep options that are normally set will be
overridden and galera will refuse to start.
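
The difference between replacing the whole key and deep merging it can be
sketched in Python as follows. This is an illustration, not the hiera or
Puppet implementation: `deep_merge` is a hypothetical helper, and the
default values shown (including `wsrep_on`) are placeholders chosen for
the example.

```python
def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Recursively merge overrides into defaults without mutating either."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are hashes: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars (or mismatched types): the override wins.
            merged[key] = value
    return merged


# Placeholder defaults standing in for the normally set options.
defaults = {"mysqld": {"wsrep_on": "ON", "net_write_timeout": 60}}
overrides = {"mysqld": {"net_write_timeout": 120}}

shallow = {**defaults, **overrides}    # the whole 'mysqld' hash is replaced
deep = deep_merge(defaults, overrides)  # only the overridden option changes
```

With the shallow merge the `wsrep_on` entry disappears from `mysqld`,
while the deep merge keeps it and only updates `net_write_timeout`.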

Tested by passing the above hiera keys and observing the deploy complete
successfully and the settings correctly applied to galera/mysql on the overcloud.

Change-Id: I30f03bc8eb81db0243c137d4af08924adeebc951
Closes-Bug: #1848060
2019-10-14 19:19:51 +02:00
Emilien Macchi
f8d9dfb497 pacemaker: add support for Hash vs List in container environment
We are transitioning from an array to a hash for the container
environment of each container:
I894f339cdf03bc2a93c588f826f738b0b851a3ad

This is mainly to make it consumable by Ansible later, where the
podman_container module needs a dict instead of a list.

This patch just changes the default, and also adds support for a Hash
instead of a List, while still supporting the List.
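
Accepting both forms can be sketched in Python as follows. This is an
illustration only: `normalize_environment` is a hypothetical helper, and
it assumes that list entries are "KEY=value" strings, which this commit
message does not spell out.

```python
from typing import Dict, List, Union


def normalize_environment(env: Union[Dict[str, str], List[str]]) -> Dict[str, str]:
    """Return the container environment as a dict, whatever form it came in."""
    if isinstance(env, dict):
        # Already a hash: return a copy so the caller's data is untouched.
        return dict(env)
    result: Dict[str, str] = {}
    for entry in env:
        # Split on the first '=' only, so values may contain '=' themselves.
        key, _, value = entry.partition("=")
        result[key] = value
    return result
```

Normalizing to a dict at the boundary keeps the rest of the code working
on a single representation while both input forms remain supported.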

Change-Id: I4e53a4a3464940660473bcbe74e30507a69a4019
2019-10-11 17:57:34 -04:00
Michele Baldessari
066a360ee5 Allow the IHA OCF and fencing resource to be moved to the nova service user
Currently both nova evacuate and fence compute in the Instance HA
setup of tripleo use the keystone admin user in order to query nova,
evacuate instances, disable/enable the nova-compute service and
call the nova force-down API.

With this patch we introduce the keystone_tenant parameter which is
needed when moving to the nova service user as it is different than
keystone_admin in that case.

Tested as follows:
1. Deployed a normal unpatched OSP13 with IHA
2. Run a redeploy with the following addition:
parameter_defaults:
  ExtraConfig:
    tripleo::profile::base::pacemaker::instance_ha::keystone_password: "%{hiera('nova::keystone::authtoken::password')}"
    tripleo::profile::base::pacemaker::instance_ha::keystone_admin: 'nova'
    tripleo::profile::base::pacemaker::instance_ha::keystone_tenant: 'service'
3. Observe the following:
3.1. Both the fence_compute and nova evacuate resources have updated attributes
3.2. IHA still works correctly

Change-Id: If6b19ad05e0f91425f93a1c123947e92cf2ba949
2019-10-11 22:18:43 +02:00
Zuul
06e901c215 Merge "Remove Tacker service" 2019-10-10 22:59:40 +00:00
Zuul
89ea5ef2c4 Merge "Workaround for /etc/pki/CA/certs/vnc.crt not present" 2019-10-10 22:59:24 +00:00
Grzegorz Grasza
2c241e3934 Workaround for /etc/pki/CA/certs/vnc.crt not present
When doing an upgrade to TLS Everywhere, vnc.crt is not always created
by the time the getcert command exits (even though it is run with the
-w flag). Puppet then ignores the instruction to change the file
permissions, resulting in an error at a later stage, when podman
tries to mount the file onto a container.

Change-Id: I0e0009d57cd1c90f8ae28a2cfc9337ecf8c75112
2019-10-08 17:44:54 +00:00
Zuul
4dee772a2b Merge "Add support to configure token caching in keystone" 2019-10-07 23:31:01 +00:00
Zuul
be1135c827 Merge "Configuration changes to support Qdr-mesh topology." 2019-10-07 17:09:21 +00:00
Zuul
d9a94dd694 Merge "Fix missing PXE directories for Conductor" 2019-10-04 08:10:39 +00:00
Bogdan Dobrelya
2f69faf666 Fix missing PXE directories for Conductor
When the Ironic Conductor class is called, it expects the
PXE directories to exist. That is only the case for step 4.
However, the conductor class is also invoked for step 3
and the db sync case. For that case, also include
the missing ironic::pxe class to ensure the PXE directories
are created.

Closes-Bug: #1845222

Change-Id: I394f56ba9b213c75378bdf21999d23509632523c
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
2019-10-03 09:47:57 +02:00
Wes Hayutin
8753a47f8a remove tripleo-ci-centos-7-scenario010-multinode-oooq-container
non-voting and failing.
removing

Change-Id: I75f7c7cdf4b086ea7dcdaa26009f6eda654801d1
2019-10-02 20:27:50 -06:00
Nagasai Vinaykumar Kapalavai
4cb50d58a5 Configuration changes to support Qdr-mesh topology.
These changes update router mode to edge or interior
depending on the node type. The mesh topology is
formed on controller nodes and remaining nodes will
connect to this topology.

Co-Authored-By: Martin Magr <mmagr@redhat.com>

Change-Id: I195dbf70f490984cedb32d48fe09f4562cbf94fa
2019-10-02 15:49:11 +02:00
Luca Miccini
72487d804e Add a configurable delay to Nova Evacuate calls
In case /var/lib/nova/instances resides on NFS we have seen migrations
failing with 'Failed to get "write" lock - Is another process using the
image' errors.

This has been tracked down to grace/lease timeouts not having expired
before attempting the migration/evacuate, so in these cases it might be
desirable to delay the nova evacuate call to give the storage time to
release the locks.

Related resource-agents change: https://review.opendev.org/#/c/684777/

Change-Id: I5ec6a5b0c66579e068e811f49aae10a5f406158a
Resolves: rhbz#1740069
2019-10-02 07:33:26 +00:00
Zuul
52c877f89e Merge "Use correct paths to configure ovn dbs certs" 2019-10-02 00:20:08 +00:00
Zuul
479398b4f2 Merge "Be able to set pcs resource op defaults" 2019-10-01 17:57:39 +00:00
Zuul
ba36ed1ee2 Merge "Add 'ipversion' to firewall/rule.pp" 2019-10-01 11:21:24 +00:00
Michele Baldessari
dfe3c07720 Be able to set pcs resource op defaults
Tested by adding the following hiera key to the deployment:
parameter_defaults:
  ExtraConfig:
    tripleo::profile::base::pacemaker::resource_op_defaults:
      timeout_test:
        name: 'timeout'
        value: '60s'

And correctly obtained:
[root@controller-0 tripleo]# pcs resource op defaults
timeout: 60

This allows an operator to raise global timeouts (which might be needed
in case podman performance turns out to be problematic)

Depends-On: https://review.opendev.org/664606

Change-Id: Ifa75eb9274705ea4b1c530b22659e4e106681250
2019-10-01 06:25:15 +00:00
Zuul
fb56cd0f84 Merge "Support networking-ansible-ml2 coordination" 2019-10-01 03:41:50 +00:00
Zuul
c6b992da66 Merge "Add collectd-sensubility configuration" 2019-09-25 20:36:35 +00:00
Harald Jensås
7264c75c37 Add 'ipversion' to firewall/rule.pp
Add the possibility to add 'ipversion' to the firewall
rule manifest.

Closes-Bug: #1845153
Change-Id: Id872c55cfc6b958fef3ccda2d923f821a1fe6a13
2019-09-25 18:36:44 +00:00
Zuul
ebe6f1a300 Merge "Update log-driver value for podman" 2019-09-25 08:35:41 +00:00
Zuul
69626b0575 Merge "Add unit tests for manila manifests" 2019-09-25 04:55:30 +00:00
Zuul
58bdf67b7e Merge "Use memcached for token caching in manila authtoken" 2019-09-25 04:55:28 +00:00
Kamil Sambor
ad7d818e20 Use correct paths to configure ovn dbs certs
Change-Id: Id68fb8a79a04f7c864b103ef354698725a5dc64c
2019-09-23 16:54:01 +02:00
Cédric Jeanneret
0976e4eeb3 Update log-driver value for podman
Depending on the podman version, "json-file" is set to noop and makes
podman crash (true for at least podman 1.4.1), while newer versions
re-add this json-file as an alias to k8s-file (true since 1.4.3).

Ensuring we're using k8s-file will prevent issues depending on the
podman version.

Relates to https://bugzilla.redhat.com/show_bug.cgi?id=1754416
Closes-Bug: #1844856

Change-Id: I70eba8af06741ed81173689a03c4867421917cd6
2019-09-23 13:49:31 +02:00
Zuul
f494131c2d Merge "Use memcached for token caching in authtoken for telemetry services" 2019-09-23 03:30:09 +00:00
Zuul
10310d4dfd Merge "Support deploying multiple Cinder Pure Storage backends" 2019-09-21 20:59:12 +00:00