postgresql: Optimize restart behavior

* add preStop hook to trigger Fast Shutdown
* disable readiness probe by default

When Kubernetes terminates a pod, the container runtime typically sends
a SIGTERM signal to pid 1 in each container [0]. PostgreSQL interprets
SIGTERM as a request to do a "Smart Shutdown" [1]: it stops accepting
new connections and then waits for every existing session to disconnect
on its own. With long-lived clients this can take minutes (often
exhausting the termination grace period), and for that whole time new
connections are refused.
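
For reference, the signal-to-mode mapping described in [1] can be
sketched as a small shell helper. The function name pg_shutdown_signal
is hypothetical and exists only for illustration; the signal
assignments themselves come from the PostgreSQL server-shutdown
documentation.

```shell
# Hypothetical helper: print the signal name that requests a given
# PostgreSQL shutdown mode.
pg_shutdown_signal() {
  case "$1" in
    smart)     echo TERM ;;  # wait for all sessions to disconnect
    fast)      echo INT ;;   # abort open transactions, exit promptly
    immediate) echo QUIT ;;  # abort without a clean shutdown
    *)
      echo "unknown shutdown mode: $1" >&2
      return 1
      ;;
  esac
}
```

As a usage example, kill -s "$(pg_shutdown_signal fast)" 1 would send
SIGINT to pid 1, assuming pid 1 is the PostgreSQL server process.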

Now that postgresql has a single replica, this behavior is undesirable.
If we kill the pod (e.g. in an upgrade), we probably want it to come
back as soon as possible.

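This change adds a preStop hook that sends SIGINT to pid 1 (the
PostgreSQL server) in order to trigger a "Fast Shutdown", which aborts
open transactions and disconnects clients promptly instead of waiting
for them. In addition, the readiness probe is disabled by default,
since it adds no value in a single-replica scenario: there is no other
pod to receive traffic while this one is marked not ready.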
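
Deployments that still want the readiness probe can presumably turn it
back on with a values override. A minimal sketch, assuming the probe
settings live under pod.probes.server.postgresql.readiness (inferred
from the arguments passed to the probe template) and using a
hypothetical release name and chart path:

```shell
# Hypothetical release name and chart path; adjust to the deployment.
RELEASE="postgresql"
CHART="./postgresql"

# Assumed values path, inferred from the probe template arguments
# (component "server", container "postgresql", type "readiness").
READINESS_FLAG="pod.probes.server.postgresql.readiness.enabled=true"

# Print the command for review instead of running it directly.
echo "helm upgrade ${RELEASE} ${CHART} --set ${READINESS_FLAG}"
```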

0: https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-termination
1: https://www.postgresql.org/docs/9.6/server-shutdown.html

Change-Id: Ib5f3d2a49e55332604c91f9a011e87d78947dbef
Phil Sphicas 2020-10-23 07:16:39 +00:00
parent a10699c4e0
commit c43331d67a
3 changed files with 9 additions and 2 deletions

@@ -15,7 +15,7 @@ apiVersion: v1
 appVersion: v9.6
 description: OpenStack-Helm PostgreSQL
 name: postgresql
-version: 0.1.4
+version: 0.1.5
 home: https://www.postgresql.org
 sources:
   - https://github.com/postgres/postgres

@@ -191,6 +191,13 @@ spec:
             - /tmp/start.sh
 {{ dict "envAll" . "component" "server" "container" "postgresql" "type" "liveness" "probeTemplate" (include "livenessProbeTemplate" . | fromYaml) | include "helm-toolkit.snippets.kubernetes_probe" | trim | indent 10 }}
 {{ dict "envAll" . "component" "server" "container" "postgresql" "type" "readiness" "probeTemplate" (include "readinessProbeTemplate" . | fromYaml) | include "helm-toolkit.snippets.kubernetes_probe" | trim | indent 10 }}
+          lifecycle:
+            preStop:
+              exec:
+                command:
+                  - bash
+                  - -c
+                  - kill -INT 1
           volumeMounts:
             - name: pod-tmp
               mountPath: /tmp

@@ -98,7 +98,7 @@ pod:
             timeoutSeconds: 5
             failureThreshold: 10
         readiness:
-          enabled: true
+          enabled: false
           params:
             initialDelaySeconds: 30
             timeoutSeconds: 5