Extend startuptime in collectd's pmon config file
The maintenance Process Monitor (pmond) hits a spawn timeout when
restarting the collectd process, or recovering it after a failure,
on Debian.
The startuptime in collectd's current pmon config file is 3 seconds.
Collectd version 5.8.1 spawn time in CentOS is very quick, ~1 sec
  log example: collectd spawned in 1.000 secs
  centos rpm : collectd-5.8.1-4.el7.x86_64
Collectd version 5.12.0 spawn time in Debian is 3-6 seconds
  log example: collectd spawned in 6.000 secs (also 3,4,5 secs)
  debian pkg : base-bullseye.lst:collectd 5.12.0-7
It seems the new version of collectd in the Debian environment
takes longer to spawn, exceeding the current 3 second timeout.
This update extends the collectd pmon config file startuptime
to 10 seconds to account for this version and environment change.
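
The headroom argument above can be checked with a short sketch. It is
illustrative only and not part of pmond: the [process] section name, the
SAMPLE_CONF text and the startuptime_margin helper are assumptions made
for this example, and the spawn times are the ones quoted in this message.

```python
import configparser

# Hypothetical pmon-style config text; the [process] section name is an
# assumption for this sketch. ';' starts an inline comment.
SAMPLE_CONF = """\
[process]
interval = 5 ; number of seconds to wait between restarts
debounce = 10 ; seconds a process must keep running before retries clear
startuptime = {startuptime} ; seconds to wait after process start
mode = passive ; monitoring mode: passive (default) or active
"""


def startuptime_margin(conf_text, observed_spawn_secs):
    """Return startuptime minus the slowest observed spawn time.

    A negative margin means at least one observed spawn took longer
    than the configured startuptime.
    """
    parser = configparser.ConfigParser(inline_comment_prefixes=(";",))
    parser.read_string(conf_text)
    startuptime = parser.getint("process", "startuptime")
    return startuptime - max(observed_spawn_secs)


# Spawn times (seconds) quoted in this commit message.
DEBIAN_SPAWN_SECS = [3, 4, 5, 6]
CENTOS_SPAWN_SECS = [1]

old_margin = startuptime_margin(SAMPLE_CONF.format(startuptime=3), DEBIAN_SPAWN_SECS)
new_margin = startuptime_margin(SAMPLE_CONF.format(startuptime=10), DEBIAN_SPAWN_SECS)
print(f"old margin: {old_margin} s, new margin: {new_margin} s")
```

With the old value the slowest Debian spawn overshoots startuptime by
3 seconds; with the new value it leaves 4 seconds of headroom.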
Test Plan:
PASS: Verify collectd process manual restart with new startuptime in
      both CentOS and Debian and on both vbox and real hw
PASS: Verify collectd process kill recovery with new startuptime in
      both CentOS and Debian and on both vbox and real hw
Regression:
PASS: Verify soak of collectd process restart 50+ loops
PASS: Verify soak of collectd process recovery 50+ loops
PASS: Verify process manual restart of all pmon monitored processes
PASS: Verify process kill recovery of all pmon monitored processes
Story: 2009968
Task: 45917
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
Change-Id: I9849e5cca5d5a3c216ffb26a788f310c2820a984
@@ -9,7 +9,7 @@ interval = 5 ; number of seconds to wait between restarts
 debounce = 10 ; number of seconds that a process needs to remain
 ; running before degrade is removed and retry count
 ; is cleared.
-startuptime = 3 ; Seconds to wait after process start before starting the debounce monitor
+startuptime = 10 ; Seconds to wait after process start before starting the debounce monitor
 mode = passive ; Monitoring mode: passive (default) or active
 ; passive: process death monitoring (default: always)
 ; active : heartbeat monitoring, i.e. request / response messaging