A container may take some time to stop its running process. In that
case the process ends up being stopped with SIGTERM or SIGKILL, and
conmon returns the actual process exit code.
Since that exit code isn't 0, systemd marks the unit as "failed".
Note that 137 is, actually, SIGKILL (137-128 = 9) and 143 is SIGTERM
(143-128 = 15).
While systemd accepts an actual SIGTERM by default, it doesn't recognize
the 143 exit status. We therefore have to list this status code explicitly.
Also, by default, SIGKILL isn't accepted as a valid, successful exit status.
This change will need to be backported down to stable/ussuri. It is
the equivalent of Iffcfc8bd18a999ae6921a4131d40241df40050f1.
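
The 128 + N arithmetic above can be sketched in a few lines of Python.
This is only an illustration of the convention, not code from the patch;
the helper name describe_exit_status is hypothetical.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Describe an exit status, decoding the 128 + N signal convention."""
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            # signal.Signals maps a signal number to its symbolic name.
            name = signal.Signals(signum).name
        except ValueError:
            return f"killed by unknown signal {signum}"
        return f"killed by {name} (signal {signum})"
    return f"exited with code {status}"


if __name__ == "__main__":
    for code in (0, 137, 143):
        print(code, "->", describe_exit_status(code))
```

With statuses 137 and 143 listed in the unit's SuccessExitStatus=
setting, systemd would treat both outcomes as a clean stop.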
When the main PID (i.e., conmon) of a container is killed for some
reason, systemd won't execute the ExecStop command.
Podman currently doesn't detect this failure and still considers the
container to be running, which causes a failure when systemd tries to
restart it.
This patch introduces an ExecStopPost setting into the systemd unit
files so that the stop operation is executed even when a container
fails because its main process was killed. The stale container should
be cleaned up by the ExecStopPost task before systemd tries to restart it.
Note that a similar change has already been introduced in the "podman
generate systemd" command.
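
As a rough illustration of the idea, the sketch below renders a unit
fragment in which ExecStopPost repeats the stop command. It is not the
template shipped by this patch: the function name, the /usr/bin/podman
path, the timeout and the remaining settings are assumptions, and the
result is not a complete, deployable unit.

```python
def render_container_unit(name: str, stop_timeout: int = 10) -> str:
    """Render an illustrative systemd unit fragment for a container.

    Assumption: podman lives in /usr/bin and the container already exists.
    """
    stop_cmd = f"/usr/bin/podman stop -t {stop_timeout} {name}"
    lines = [
        "[Unit]",
        f"Description={name} container",
        "",
        "[Service]",
        "Restart=always",
        f"ExecStart=/usr/bin/podman start {name}",
        # ExecStop only runs when the service is stopped while it is still up.
        f"ExecStop={stop_cmd}",
        # ExecStopPost also runs after the main process died unexpectedly,
        # so a stale container is stopped before systemd restarts the unit.
        f"ExecStopPost={stop_cmd}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_container_unit("keystone"))
```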
Sometimes actions taken as part of a container's entrypoint script
should become pre-stop actions. Allow such configurations to be
composed for managed containers.
Signed-off-by: Bogdan Dobrelya <email@example.com>
Now that Podman natively supports healthchecks, let's use them, which
will reduce our footprint in how we consume Podman.
Using native healthcheck brings a few benefits:
- Fewer Ansible tasks to manage the systemd resources, so deployment
should be slightly faster.
- Leverage features in the container tooling directly rather than in
tripleo.
This patch does the following:
- Fix the podman arguments for healthcheck options in the
podman_container module, transparently for the end-user. Indeed, the
args are "health-*".
- Remove the management of timers and healthcheck services and their
- New playbook "healthcheck_cleanup" to clean up previous systemd
healthchecks if they exist.
- Update molecule default testing to test if new healthchecks work fine.
- Update the role manual for healthcheck usage.
This patch should be transparent for the end-users except that the
systemd healthchecks won't exist anymore:
Instead of running "systemctl status tripleo_keystone_healthcheck.timer",
we would run "podman healthcheck run keystone" or check the
output of "podman inspect keystone".
The documentation has also been updated in the role manual.
It requires at least Podman 1.6, the version this patch has been tested
with.
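
To illustrate the "health-*" argument naming mentioned above, here is a
minimal sketch that turns a dictionary of healthcheck options into
podman command-line flags. It is not the podman_container module code;
the function name, the option keys and the example command path are
assumptions made for the example.

```python
# Option keys (hypothetical) mapped onto podman's --health-* flags.
HEALTH_OPTIONS = ("cmd", "interval", "retries", "start_period", "timeout")


def healthcheck_args(options: dict) -> list:
    """Build "--health-*" flags from a dict of healthcheck options."""
    args = []
    for key in HEALTH_OPTIONS:
        if key in options:
            flag = "--health-" + key.replace("_", "-")
            args.append(f"{flag}={options[key]}")
    unknown = sorted(set(options) - set(HEALTH_OPTIONS))
    if unknown:
        raise ValueError(f"unsupported healthcheck options: {unknown}")
    return args


if __name__ == "__main__":
    # The healthcheck command path is only an example value.
    print(healthcheck_args({"cmd": "/openstack/healthcheck", "interval": "30s"}))
```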
Separate the creation of systemd files & service restarts so we don't
call systemd too many times, which makes the deployment faster.
It also uses a new filter that reads the registered data to figure out
which systemd files changed and therefore which containers need a
restart.
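
The filter idea can be sketched as follows. This is not the filter
added by the patch: the function name is hypothetical, and it assumes
the registered loop results carry the container name in "item" and the
usual "changed" flag.

```python
def containers_needing_restart(results: list) -> list:
    """Return the names of containers whose systemd file changed.

    Assumption: `results` is the "results" list of a registered Ansible
    loop, where each entry holds the container name in "item".
    """
    names = []
    for result in results:
        if not result.get("changed", False):
            continue
        name = result.get("item")
        # Keep the first occurrence only, preserving the original order.
        if name is not None and name not in names:
            names.append(name)
    return names


if __name__ == "__main__":
    registered = [
        {"item": "keystone", "changed": True},
        {"item": "nova_api", "changed": False},
        {"item": "glance_api", "changed": True},
    ]
    print(containers_needing_restart(registered))
```

Only the containers returned by such a filter would then be restarted,
in a single pass, instead of calling systemd once per unit file.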
Some containers don't have the "default" user set to root (which is
good). This leads the healthcheck_port() function to return a message
because the non-root user isn't allowed to call the "ss" command as
itself.
Ensuring we're running the healthchecks as root will also allow us to
stop duplicating some commands, making them faster and smaller for the
This was discovered and discussed on Red Hat bugzilla first, then ported
This patch is the port of I2e49d4dd5b385237f4f79929c70365424f6fa22d to
tripleo-ansible "container-manage" role.
All roles that have a hyphen in their name need to be renamed to use an
underscore. This change creates a symlink for each role using its
original name, which will ensure we maintain compatibility with
the rest of the TripleO stack. This is being done because roles with
hyphens are no longer valid within collections.
A temporary PBR update has been made to accommodate all of the symlinks
to the legacy role names.
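
The symlink layout described above can be sketched as below. This is not
the tooling used by the change: the function name and the role name in
the demo are hypothetical, and the sketch assumes, as a simplification,
that every role directory with an underscore had a hyphenated legacy
name.

```python
import os
import tempfile


def link_legacy_role_names(roles_dir: str) -> list:
    """Create hyphenated symlinks pointing at underscore role directories."""
    created = []
    for entry in sorted(os.listdir(roles_dir)):
        path = os.path.join(roles_dir, entry)
        # Skip existing symlinks, plain files and names without underscores.
        if os.path.islink(path) or not os.path.isdir(path) or "_" not in entry:
            continue
        legacy = entry.replace("_", "-")
        legacy_path = os.path.join(roles_dir, legacy)
        if not os.path.lexists(legacy_path):
            # Relative target, so the link survives moving the roles directory.
            os.symlink(entry, legacy_path)
            created.append(legacy)
    return created


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as roles:
        os.mkdir(os.path.join(roles, "container_manage"))
        print(link_legacy_role_names(roles))
```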
Signed-off-by: Kevin Carter <firstname.lastname@example.org>