Extend startuptime in collectd's pmon config file

The maintenance Process Monitor (pmond) hits a spawn timeout
while restarting or recovering the collectd process after a
failure on Debian.

The startuptime in collectd's current pmon config file is 3 secs.

Collectd version 5.8.1 spawn time in CentOS is very quick, ~1 sec
 log example: collectd spawned in 1.000 secs
 centos rpm : collectd-5.8.1-4.el7.x86_64

Collectd version 5.12.0 spawn time in Debian is 3-6 seconds
 log example: collectd spawned in 6.000 secs (also 3,4,5 secs)
 debian pkg : base-bullseye.lst:collectd  5.12.0-7

It seems the new version of collectd in the Debian environment
takes longer to spawn, exceeding the current 3 second timeout.

This update extends the collectd pmon config file startuptime
to 10 seconds to account for this version and environment change.

Test Plan:

PASS: Verify collectd process manual restart with new startuptime
      on both CentOS and Debian, and on both vbox and real hw
PASS: Verify collectd process kill recovery with new startuptime
      on both CentOS and Debian, and on both vbox and real hw

Regression:

PASS: Verify soak of collectd process restart 50+ loops
PASS: Verify soak of collectd process recovery 50+ loops
PASS: Verify process manual restart of all pmon monitored processes
PASS: Verify process kill recovery of all pmon monitored processes

Story: 2009968
Task: 45917
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: I9849e5cca5d5a3c216ffb26a788f310c2820a984
Eric MacDonald
2022-07-30 20:37:50 +00:00
parent 237b3856f2
commit ad6c1383c6


@@ -9,7 +9,7 @@ interval = 5 ; number of seconds to wait between restarts
 debounce = 10 ; number of seconds that a process needs to remain
 ; running before degrade is removed and retry count
 ; is cleared.
-startuptime = 3 ; Seconds to wait after process start before starting the debounce monitor
+startuptime = 10 ; Seconds to wait after process start before starting the debounce monitor
 mode = passive ; Monitoring mode: passive (default) or active
 ; passive: process death monitoring (default: always)
 ; active : heartbeat monitoring, i.e. request / response messaging
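
The reasoning behind the new value can be checked with a small sketch. It is not part of pmond or of this change. The helper names parse_pmon_conf and spawn_times_exceeding are hypothetical, the spawn times are the ones quoted in the commit message log examples, and the strict "greater than" comparison is an assumption, since how pmond compares spawn time against startuptime is not shown here.

```python
def parse_pmon_conf(text):
    """Parse 'key = value ; comment' lines into a dict of strings.

    Hypothetical helper: handles only the simple line shape seen in the hunk.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.split(";", 1)[0].strip()  # drop the trailing ';' comment
        if "=" not in line:
            continue  # comment-only or blank line
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def spawn_times_exceeding(startuptime, observed_secs):
    """Return observed spawn times longer than startuptime.

    Assumption: a spawn counts as timed out only when strictly longer.
    """
    return [t for t in observed_secs if t > startuptime]


OLD_CONF = "startuptime = 3 ; Seconds to wait after process start"
NEW_CONF = "startuptime = 10 ; Seconds to wait after process start"

# Spawn times quoted in the commit message (Debian, collectd 5.12.0).
DEBIAN_SPAWN_SECS = [3.0, 4.0, 5.0, 6.0]

old_limit = float(parse_pmon_conf(OLD_CONF)["startuptime"])
new_limit = float(parse_pmon_conf(NEW_CONF)["startuptime"])

too_slow_old = spawn_times_exceeding(old_limit, DEBIAN_SPAWN_SECS)
too_slow_new = spawn_times_exceeding(new_limit, DEBIAN_SPAWN_SECS)

print("over old limit:", too_slow_old)
print("over new limit:", too_slow_new)
```

Under these assumptions, several of the Debian spawn times exceed the old 3 second window while none exceed the new 10 second window, which leaves some headroom above the slowest observed 6 second spawn.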