We don't need these implementations anymore because fluentd support was
already removed from tripleo-heat-templates.
Depends-on: https://review.opendev.org/#/c/668851/
Change-Id: If8bca34b9893fc49f598e8c86cd45bc55848363f
With I2feb9e81bc40e44cb2c7a2972366fa4b16590227, we don't need the
wrappers managed by Puppet anymore, everything is deployed by Ansible.
Blueprint: safe-side-containers
Depends-On: I2feb9e81bc40e44cb2c7a2972366fa4b16590227
Change-Id: I890fff9c7ead7e72fd4fe3a58b4ffce2e315b916
Depending on the podman version, "json-file" is set to noop and makes
podman crash (true for at least podman 1.4.1), while newer versions
re-add json-file as an alias to k8s-file (true since 1.4.3).
Ensuring we're using k8s-file will prevent issues depending on the
podman version.
Relates to https://bugzilla.redhat.com/show_bug.cgi?id=1754416
Closes-Bug: #1844856
Change-Id: I70eba8af06741ed81173689a03c4867421917cd6
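As a rough illustration of pinning the driver, a launcher could build the
logging arguments explicitly. This is only a sketch: the helper name, the
container name, the image name and the ".log" file naming are hypothetical;
only the --log-driver and --log-opt path= options belong to podman itself.

```python
def podman_log_args(container_name, log_dir="/var/log/containers/stdouts"):
    """Return podman CLI arguments that pin the k8s-file log driver.

    The per-container file name below is an assumption for illustration.
    """
    return [
        "--log-driver", "k8s-file",
        "--log-opt", f"path={log_dir}/{container_name}.log",
    ]


# Hypothetical container and image names, used only to show the full command.
cmd = (
    ["podman", "run", "--detach", "--name", "example"]
    + podman_log_args("example")
    + ["example-image"]
)
print(" ".join(cmd))
```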
This patch adds the possibility to configure collectd-exec to execute
collectd-sensubility and to configure this extension.
Change-Id: Ieb5042603ff76fd22f867a17a853bb1ec6a744f2
logrotate.pp should support dateext and related parameters.
With this change, the filename of a rotated file can be easily distinguished
by its rotation date.
Change-Id: I798304a472df41b86a88611c97c2c99131faa0ad
In order to get more complete container logging, we now enable
file logging for the podman containers.
This will output container stdout/stderr in a file located in the
new /var/log/containers/stdouts location.
This follows the other efforts already made with paunch[1] and
docker-puppet.py (now named container-puppet.py)[2]
Notes:
- podman supports only "json-file", which allows pushing log files to the
location we want via the "path" log option
- docker doesn't have the "path" log option and pushes its logs to
/var/lib/docker/containers/ID/ID-json.log, which is unusable since it's
destroyed upon container removal.
[1] https://review.openstack.org/635437
[2] https://review.openstack.org/635438
Change-Id: Ibaa8bca52ea2f68afa1effc989b04d2e6213813a
...so the wrappers' logs can be found via the host's journalctl
Closes-bug: #1821794
Change-Id: I4174e6d5852a6939e71d4113a547cf3dc25b9f47
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
The current implementation of the collect_gnocchi plugin configuration blocks
the usage of other python-based plugins. This patch leverages puppet-collectd
classes properly to enable configuration of more than one python-based
plugin together.
Change-Id: I248859bf0e4b70e3a057e96b5fb74be64f4008ed
In order to prevent the removal of unwanted files, the "find"
commands in the post-rotate have been removed, in favor of a
tmpwatch call (see Depends-On patch).
Depends-On: https://review.openstack.org/641608
Change-Id: Ideaecc7559664684a8665292f77a385a87224582
Moving mlnx interfaces to switchdev mode in sriov resets the interface,
may change its name, and requires an ifup to get back the previous
configuration.
So add a udev rule and an ifup command to preserve the interface name
and its configuration.
Change-Id: Ib4f384da344344f9e2ec666b0d8dbae441f24568
Closes-Bug: 1816710
Currently we spawn haproxy with: ip netns exec ${NETNS}
/usr/sbin/haproxy -Ds $ARGS.
The reason for that was that with -Ds we keep a process in the foreground:
-Ds Start in systemd daemon mode, keeping a process in foreground.
Since haproxy 1.8 removed the
haproxy-systemd-wrapper it also removed the '-Ds' option. In order to
keep things running in the foreground we can just switch to using '-Ws',
which is the master-worker mode with systemd support that keeps the
process in the foreground.
This commit keeps backward compatibility with current HAProxy to ease
the transition to new HAProxy.
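The backward-compatible choice between the two options could be sketched as
follows. The function name is hypothetical and the version string is assumed
to be plain dotted numbers (for example "1.8.4"); how the wrapper actually
detects the HAProxy version is not shown here.

```python
def foreground_flag(haproxy_version):
    """Pick the HAProxy option that keeps a process in the foreground.

    Assumes a dotted numeric version string such as "1.5.18" or "1.8.4".
    """
    major_minor = tuple(int(part) for part in haproxy_version.split(".")[:2])
    # '-Ds' is gone as of 1.8, so newer versions get '-Ws' instead.
    return "-Ws" if major_minor >= (1, 8) else "-Ds"


for version in ("1.5.18", "1.8.4"):
    print(version, foreground_flag(version))
```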
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: Ia914de9b3438976d24bf09ad680e806a0fb6644e
When configuring vf-lag, it will not work properly if the switchdev
capability is configured while some VFs are bound.
So remove all the binding code, as it's not needed anymore.
Closes-Bug: 1809733
Change-Id: I135cef33bece6fd31363e093e53617caac413ce0
Neutron services fail with the error below when running with podman (0.12.1)
and container-selinux (2.77):
relabel failed "/run/netns": operation not supported
Until this is fixed in podman/container-selinux, temporarily remove the selinux
relabel on /run/netns.
Change-Id: I596074fcc2318ebb3d7efb0128a2b25527e19808
Partial-Bug: #1809218
Adapt wrapper containers for podman, which has no socket available.
Add container_cli parameter for base neutron class, default to docker.
Possible values: podman/docker (default). It is used by the wrappers
tooling to issue CLI commands to the host containers system.
Deprecate bind_socket so it does nothing for podman CLI.
Additionally, add debug triggers so that the wrapper scripts' messages
are captured in the wrapper containers' stdout.
Do not stop and remove the existing container before launching a new
one. Allow the neutron parent process to control the process life
cycle. However, make the wrapper containers clean up any exited
containers after their main process is terminated by the neutron parent
process. Additionally, if a name is already taken by a container,
give the new one a unique name and leave all the smooth transitioning
work to the parent neutron process and to that clean-up logic
in the wrapper.
Closes-Bug: #1799484
Change-Id: Ib3c41a8bee349856d21f360595e41a9eafd79323
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
This reimplements commit 67a7dc70f2885b7db2a42bc28c25ece0bbeba3e4.
Copytruncate becomes a default for containerized logrotate. The
solution based on signals processing goes away.
As long as key deployment framework components heat-engine and
mistral-engine do not tolerate SIGHUP, copytruncate should be used.
There are more OpenStack services, like neutron-server and nova-scheduler,
that cannot handle SIGHUP nicely yet.
Nor can we fall back to the approach that predates the containerization of
services, for the following reasons:
* We cannot and should not use the restart command in postrotate as it
was before containerization of services. For that a container needs
to be privileged and granted a docker socket bind-mount, which is a
total security antipattern and defeats the very purpose of
containerization. Things may change with future adoption of Podman
and/or kubelet control plane though. If/when that happens, we might
consider an option for postrotate to terminate a process with
SIGTERM, to have the process instantly respawned via its systemd
unit/kubelet restart policies.
* Individual services' logrotate configs are worth nothing while logs are
still being handled by a central logrotation container running crond. And
it needs to remain centralized, as individual containers neither
run crond nor contain logrotate, and lightweight containers following
12-factor app recommendations should not do anything like that. Nor can
the host logrotate/crond rotate logs for containers, as we do not and
should not install the required packages on the host, but only in
containers. See also the spec [0] explaining the reasoning better.
All of that makes copytruncate a global choice for log rotation of
containerized services, as we just cannot be sure whether a service foo
*really* processes SIGHUP correctly. We leave that option for
future implementation in the hope things get fixed eventually, as well
as the aforementioned systemd/kubelet option, or the option to provide
stdout only logging [0] and let the logrotate thing go.
[0] https://review.openstack.org/#/c/462900
Closes-Bug: #1795411
Related-Bug: #1276694
Change-Id: Ibdad7859a389d0ff37bbf7bfd9f4c521a05a5ea1
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
Lsof +L1 locates unlinked and open files and does not work for
logrotate, neither with copytruncate nor w/o that option.
Instead, find the *.X (X is a number) files held open and notify the
processes owning them to take appropriate actions and reopen new log files
to stop writing to the rotated files.
The actions to be taken by such processes are:
* For httpd processes, use USR1 to gracefully reload
* For neutron-server, restart the container as it cannot process
HUP signal well (LP bug #1276694, LP bug #1780139).
* For nova-compute, restart the container as it cannot process
HUP signal well (LP bug #1276694, LP bug #1715374).
* For other processes, use HUP to reload
This also fixes the filter to match logfiles ending with *err,
like rabbitmq startup errors log.
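The per-process actions listed above can be summarized in a small dispatch
sketch. The function name and the tuple it returns are hypothetical and only
restate the rules from this message; the real postrotate script is shell.

```python
# Processes that cannot handle HUP well and need a container restart instead.
RESTART_ONLY = {"neutron-server", "nova-compute"}


def reopen_action(process_name):
    """Return the (kind, signal) pair used to make a process reopen its logs."""
    if process_name == "httpd":
        return ("signal", "USR1")  # graceful reload
    if process_name in RESTART_ONLY:
        return ("restart", None)   # restart the whole container
    return ("signal", "HUP")       # default reload


for name in ("httpd", "neutron-server", "nova-compute", "example-service"):
    print(name, reopen_action(name))
```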
Closes-Bug: #1780139
Closes-Bug: #1785659
Closes-Bug: #1715374
Change-Id: I5110426aa26e5fce7ebb4d80d8a2082cbf80519c
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
It is possible to configure bond over two virtual functions
for the vms in case of using mellanox interfaces.
Change-Id: Iaeee31a9edaefec25498a734cac6eda389c38ec5
The change I5029a4b9c76268455812696290aaf82f1a0c2c23 caused a
regression: the filter stopped working and matches nothing.
The postscript command is executed inside of the logrotate-crond
container, like this:
()[root@5c78303fb8c2 /]# /sbin/lsof -nPs +L1 +D /var/log/containers
2>&1 | awk '/\S+\s+[0-9]+\s.*\/var\/log\/.*\(deleted\)/ {print}'
neutron-s ... /var/log/neutron/server.log.1 (deleted)
httpd .../var/log/httpd/keystone_wsgi_admin_access.log.1 (deleted)
So you can see that the real path that should be examined for deleted
(open and unlinked) logs is /var/log/ and not /var/log/containers. The
latter is only used to apply the logrotation over the bind-mounted
host path. It cannot affect any host logs outside of
/var/log/containers, so the change
I5029a4b9c76268455812696290aaf82f1a0c2c23 needs to be partially
discarded.
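To make the path mismatch concrete, here is a Python rendering of the awk
filter. The sample lsof output is made up for illustration (process, PID and
file names are hypothetical); the first pattern mirrors the awk expression
quoted in this message, the second one shows a filter restricted to
/var/log/containers.

```python
import re

# Same expression as the awk filter: command, PID, then a deleted /var/log file.
DELETED_LOG = re.compile(r"\S+\s+[0-9]+\s.*/var/log/.*\(deleted\)")
# A filter tied to the host-side path, which never shows up inside the container.
CONTAINERS_ONLY = re.compile(r"/var/log/containers/.*\(deleted\)")

# Hypothetical lsof output as seen from inside the logrotate-crond container.
SAMPLE = """\
COMMAND    PID USER  FD  TYPE DEVICE SIZE/OFF NLINK NODE NAME
httpd     4321 root  9w  REG  253,0  0        0     1234 /var/log/httpd/example.log.1 (deleted)
httpd     4321 root  10w REG  253,0  120      1     1235 /var/log/httpd/example.log
"""


def matching_lines(pattern, lsof_output):
    """Return the lsof lines that the given compiled pattern matches."""
    return [line for line in lsof_output.splitlines() if pattern.search(line)]


print(matching_lines(DELETED_LOG, SAMPLE))
print(matching_lines(CONTAINERS_ONLY, SAMPLE))
```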
Additionally, send USR1 instead of HUP to reload httpd in containers
gracefully. That part follows-up the reverted
I15fa0eab1625ac63fd57b6a6d5cd22a6ac85f221 as we want to keep that
change and it fits the subject bug scope as well.
Change-Id: Ibb017463b0fbbccda035aeb1fff5f6998bbf2d1e
Closes-Bug: #1776533
Related-Bug: #1785659
Copytruncate cannot fix the postrotate filter for lsof searching for
deleted (unlinked and open) files. Instead, copytruncate makes the
filter match nothing, as files are never deleted after rotation
happens.
This reverts commit 67a7dc70f2885b7db2a42bc28c25ece0bbeba3e4.
Change-Id: I8a73819b4aa45813cbac310452b348681496032a
Use copytruncate and 'hourly' log rotation by default. Increase the
default max number of rotated files to 336, which corresponds to 14
days, so that the default retention period stays as is.
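The figure of 336 follows from simple arithmetic, sketched here with a
hypothetical helper name:

```python
HOURS_PER_DAY = 24


def rotations_to_keep(retention_days, rotations_per_day):
    """Number of rotated files needed to cover the retention period."""
    return retention_days * rotations_per_day


# 14 days of hourly rotation.
print(rotations_to_keep(14, HOURS_PER_DAY))  # 336
```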
With the copytruncate option enabled, logs should be hourly rotated to
decrease disk IO load when copying log files around. The default
maxsize of 10M is also better maintained with frequent rotations within a
day, so log files will not become unexpectedly huge
at the end of it.
W/o copytruncate, the containerized logrotate sends no signals to
processes, as files are only renamed and not unlinked. That makes the
file-deletion-based filter fail until the default period of 14
days expires. To fix that non-copytruncate case, post-rotate always
sends HUP (USR1 for httpd) signals to all processes holding open files
in the /var/log/containers host path. That also makes all services
reloaded hourly (there is still a random splay applied by cron though)
as a side effect.
With copytruncate ON, each rotation ensures the old log files will also be
deleted, so only affected services will be reloaded.
Additionally, send USR1 instead of HUP to reload httpd in containers
gracefully.
Closes-Bug: #1785659
Change-Id: I15fa0eab1625ac63fd57b6a6d5cd22a6ac85f221
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
This is necessary in ha because if we let the puppet module generate
the rndc key it will be different on all controllers and they won't
be able to talk to each other.
Change-Id: I4f030cd419511be43e9e4189dbc4418d5a6c6c61
The correct values for auth modes for Gnocchi are 'basic' and 'keystone'.
This patch changes the 'simple' usage to 'basic'. Note that without this rename
the deployment still works: when 'simple' is used the parameter is not written
to the config file, so python-collectd-gnocchi falls back to 'basic', which
is its implicit default.
Change-Id: I05632137ed12c59a41a5219189c431983935d461
So currently the logrotate_crond container has a few issues:
A) In the postrotate it matches pids multiple times and sends SIGHUPs multiple times to processes:
======== /var/log/messages =====
Jun 3 09:01:15 overcloud-controller-0 logrotate-crond: kill -HUP 1575
Jun 3 09:01:15 overcloud-controller-0 rsyslogd: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1575" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jun 3 09:01:15 overcloud-controller-0 logrotate-crond: kill -HUP 1575
Jun 3 09:01:15 overcloud-controller-0 rsyslogd: [origin software="rsyslogd" swVersion="8.24.0" x-pid="1575" x-info="http://www.rsyslog.com"] rsyslogd was HUPed
Jun 3 09:01:15 overcloud-controller-0 logrotate-crond: kill -HUP 1575
Jun 3 09:01:15 overcloud-controller-0 rsyslogd: [origin software="rsyslogd
...
Adding sort -u in the pipeline of the postrotate script takes care of
that.
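The effect of the sort -u step can be mimicked in a few lines. This is a
sketch with a hypothetical function name; unlike sort -u it orders the PIDs
numerically, which does not matter for the deduplication itself.

```python
def unique_pids(pids):
    """Collapse repeated PIDs so each process is signalled only once."""
    return sorted(set(pids), key=int)


# The same PID matched three times, as in the log excerpt above.
print(unique_pids(["1575", "1575", "1575"]))  # ['1575']
```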
B) The logrotate_crond container should not rotate logs for services
running on the host outside of containers (i.e. rsyslog has its own
/etc/logrotate.d/rsyslog rules). Doing so violates the principle of
least surprise.
Using 'lsof ..+D /var/log/containers' takes care of this as we won't
match any non containerized processes
C) The find command matches older files to be deleted but the SIGHUP is
never sent so we actually can end up in a situation where we remove a
file but the new one never gets created because the service does not get
a SIGHUP signal:
ls -la /var/log/containers/httpd/*/*
-rw-r--r--. 1 root root 52046652 May 29 14:10 /var/log/containers/httpd/aodh-api/aodh_wsgi_access.log.1
-rw-r--r--. 1 root root 0 May 24 19:14 /var/log/containers/httpd/aodh-api/aodh_wsgi_error.log
-rw-r--r--. 1 root root 5894 May 24 19:14 /var/log/containers/httpd/aodh-api/error_log
-rw-r--r--. 1 root root 50755274 May 29 14:10 /var/log/containers/httpd/cinder-api/cinder_wsgi_access.log.1
-rw-r--r--. 1 root root 4138 May 25 11:58 /var/log/containers/httpd/cinder-api/cinder_wsgi_error.log
-rw-r--r--. 1 root root 5894 May 24 19:13 /var/log/containers/httpd/cinder-api/error_log
Using 'lsof ..+D /var/log/containers' fixes this case as well because
now we correctly match the processes that have a deleted file that is
open and we send a proper SIGHUP to them.
Tested by doing the following:
1) Logging rotation of containerized services (B, C)
1.1) Stopped the keystone container
1.2) Made the /var/log/container/keystone/keystone.log file 21M large
1.3) Started the keystone container and observed that it was logging
correctly to /var/log/container/keystone/keystone.log
1.4) Inside the logrotate_crond container we ran the following:
/usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf
1.5) We observed correct log rotation and keystone was notified via
SIGHUP and started logging correctly:
-rw-r--r--. 1 42425 42425 21628706 Jun 13 08:43 keystone.log.1
-rw-r--r--. 1 42425 42425 999 Jun 13 08:43 keystone.log
2) No SIGHUP to host processes (A)
2.1) stopped rsyslog on the host and made one of its log files > 10M:
-rw-r--r--. 1 root root 28M Jun 13 08:59 /var/log/messages
2.2) restart rsyslog
2.3) Ran the logrotation inside the container
/usr/sbin/logrotate -s /var/lib/logrotate/logrotate-crond.status /etc/logrotate-crond.conf
2.4) Observed that no SIGHUP was sent to rsyslog on the host
Co-Authored-By: Damien Ciabrini <dciabrin@redhat.com>
Change-Id: I5029a4b9c76268455812696290aaf82f1a0c2c23
Closes-Bug: #1776533
Neutron uses namespaces with different prefixes depending on
configuration and the nature of the resource. This patch changes the
wrappers to use the "ip netns identify" command to determine the target
namespace for the sidecar instead of trying to guess from the command
line options.
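A minimal sketch of the lookup, assuming the iproute2 "ip" binary is on the
PATH and the caller has sufficient privileges. The function names are
hypothetical, and the real wrappers are shell scripts rather than Python.

```python
import subprocess


def identify_cmd(pid):
    """Build the 'ip netns identify' command line for a given PID."""
    return ["ip", "netns", "identify", str(pid)]


def namespace_of(pid):
    """Return the named network namespace of a process, or None if it has none."""
    result = subprocess.run(
        identify_cmd(pid), capture_output=True, text=True, check=True
    )
    return result.stdout.strip() or None


print(identify_cmd(1234))
```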
Change-Id: If58bb9dabebf201b592fb450a663ae2f24374e00
Closes-Bug: #1773823
Right now the default stunnel.conf log level is set at 'notice'
which, when we deploy redis, fills up the logs with the following
messages:
May 09 14:18:36 controller-1.redhat.local dockerd-current[19810]: 2018.05.09 14:18:36 LOG5[1:139972682520320]: connect_blocking: connected 127.0.0.1:6379
May 09 14:18:36 controller-1.redhat.local dockerd-current[19810]: 2018.05.09 14:18:36 LOG5[1:139972682520320]: Service [redis] connected remote server from 127.0.0.1:60412
May 09 14:18:36 controller-1.redhat.local stunnel[41495]: LOG5[1:139972682409728]: Service [redis] accepted connection from 172.17.1.21:60770
May 09 14:18:36 controller-1.redhat.local dockerd-current[19810]: 2018.05.09 14:18:36 LOG5[1:139972682409728]: Service [redis] accepted connection from 172.17.1.21:60770
May 09 14:18:36 controller-1.redhat.local stunnel[41495]: LOG5[1:139972682409728]: connect_blocking: connected 127.0.0.1:6379
May 09 14:18:36 controller-1.redhat.local dockerd-current[19810]: 2018.05.09 14:18:36 LOG5[1:139972682409728]: connect_blocking: connected 127.0.0.1:6379
May 09 14:18:36 controller-1.redhat.local stunnel[41495]: LOG5[1:139972682409728]: Service [redis] connected remote server from 127.0.0.1:60418
May 09 14:18:36 controller-1.redhat.local dockerd-current[19810]: 2018.05.09 14:18:36 LOG5[1:139972682409728]: Service [redis] connected remote server from 127.0.0.1:60418
Those messages are from the haproxy healthcheck. Let's move the
default debug config to warning, which will suppress the above messages.
Closes-Bug: #1770180
Change-Id: I93bd0048e85864fa9e62dc38c3575ec7b48e5df5
Set the logrotate maxage parameter to purge_after_days
as well.
Rework additional retention rules of files in
/var/log/containers in the containerized logrotate
postrotate script. The rules are based on any of the
listed criteria met:
* time of last access of contents (atime) exceeds
purge_after_days,
* time of last modification of contents (mtime) exceeds
purge_after_days,
* time of last modification of the inode (metadata, ctime)
exceeds purge_after_days.
Forcibly purge expired files with each containerized
logrotate run triggered via cron. Note that the files creation
time (the Birth attribute) is not taken into account as it
cannot be accessed normally by system operators (depends on FS
type). Retention policies based on the creation time must
be managed elsewhere.
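The "any criterion met" rule can be sketched as below. This is a
simplification with hypothetical names: it compares raw timestamps in
seconds and does not reproduce the day rounding that a find-based
implementation would apply.

```python
import os
import time

SECONDS_PER_DAY = 86400


def is_expired(atime, mtime, ctime, now, purge_after_days=14):
    """True if any of atime, mtime or ctime is older than purge_after_days."""
    limit = purge_after_days * SECONDS_PER_DAY
    return any(now - stamp > limit for stamp in (atime, mtime, ctime))


def file_is_expired(path, purge_after_days=14):
    """Apply the same rule to a file on disk."""
    st = os.stat(path)
    return is_expired(
        st.st_atime, st.st_mtime, st.st_ctime, time.time(), purge_after_days
    )
```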
Related-Bug: #1771543
Change-Id: I9afa22f7dd344a29747206b286520a76d70d704b
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
After purge_after_days, which defaults to 14, forcibly remove
any rotated and compressed logs of containerized services
in /var/log/containers. This overrides any related
containerized logrotate configuration used for
containerized services.
Allow altering the rotation interval for log files managed
via containerized logrotate. Defaults to 'daily'
and rotate 14 (days).
Use sharedscripts to clean up files in the postrotate
script only once.
Additionally, to enforce GDPR compliance of log files
in /var/log/containers, put them under logrotate management
(minsize 1) and always compress. Prohibit the size option
as it does not honor time-based constraints required by
GDPR. Forcibly remove all files but those rotated and
compressed logs, via the postscript section.
Partial-bug: #1771543
Change-Id: Id8e4717a5ecda53bc9cd39f1c2efaa80b56bd45e
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
The neutron agents use subprocesses like dnsmasq and keepalived as part
of their implementation. Running these "subprocesses" in separate
containers prevents dataplane breakages/unnecessary failover on agent
container restart.
Also amends docker daemon options to allow including additional unix
domain sockets to bind to the docker daemon. The paths can be mounted by
containers that launch containers instead of mounting /run/docker.sock.
This avoids issues if the docker daemon is restarted while the containers
are running.
Related-Bug: #1749209
Change-Id: Icd4c24ac686d957391548a04722266cefc1bce27
This allows us to force a TLS version for stunnel, which we
set to TLSv1.2. This ensures that we're compliant with FedRamp,
which requires a minimum version of TLSv1.1.
Unfortunately, using the "option" key didn't work in the configuration
as was tried in a previous commit. This option would have only
disabled the versions we set, instead of only allowing one, like
"sslVersions" does. This seems to be the only alternative we have at
the moment.
Related-Bug: #1754368
Change-Id: I353f893ee5dcc265269704e23f65aa0460724078
Disable curl globbing to allow Swift ringbuilder to upload to IPv6
addresses. Also disable globbing in the other places curl
is used.
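For illustration, a curl invocation with globbing switched off might be
assembled like this, so that the square brackets of an IPv6 literal are
passed through untouched. The helper name, URL and file path are
hypothetical; --globoff, --silent and --upload-file are regular curl options.

```python
def curl_upload_cmd(url, path):
    """Build a curl upload command with URL globbing disabled."""
    return ["curl", "--globoff", "--silent", "--upload-file", path, url]


# Hypothetical IPv6 endpoint and ring file.
print(curl_upload_cmd("http://[fd00::1]:8080/v1/example", "/tmp/example.ring.gz"))
```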
Change-Id: Iba51cc75bea26b775f790849f0b466a6528ee627
Closes-Bug: #1757118
Perl is missing in kolla containers. Replace it with
awk.
bz: #1553077
Closes-Bug: #1756343
Change-Id: Ie51bd1fa08d7690ac76a01ee2c558e86fb52bb2d
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
On some machines the devlink and ethtool commands fail to run
unless their full paths are given.
Closes-Bug: #1745821
Change-Id: If2f7c7a46fb1b52cce9ffbfa31a3161fc07f1334