StarlingX Bare Metal and Node Management, Hardware Maintenance
Eric MacDonald a42301c19b Make successful pmon-restart clear failed restarts count
The pmon-restart service, through a call to respawn_process,
increments that process's restarts counter but does not clear
that counter after a successful restart.

So, each pmon-restart mistakenly contributes to that process's
failure count. This has the effect of pre-loading that process's
restart counter by one for every pmon-restart of that process.

The effect is best described by example.
Say a process is pmon-restart'ed 4 times during one day, which
increments that process's restart counter to 4. Assuming its
conf file specifies a threshold of 3, it has already exceeded
its threshold. Then, if that process experiences a real failure
even days later, pmon will immediately take the severity action
because the failure threshold has already been exceeded.

This update ensures a process's restart counter is cleared
after a successful pmon-restart operation, in the process pid
registration phase of recovery.
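
The counter behavior described above can be illustrated with a small
toy model. This is only a sketch in Python, not the pmon source: the
MonitoredProcess class, register_pid, handle_failure and the
clear_on_success flag are hypothetical names invented for this
illustration, and "exceeded" is assumed to mean the counter is greater
than the threshold.

```python
from dataclasses import dataclass


@dataclass
class MonitoredProcess:
    """Toy stand-in for a monitored process (hypothetical fields)."""
    name: str
    threshold: int      # restarts tolerated before the severity action
    restarts: int = 0   # restart counter


def respawn_process(proc: MonitoredProcess) -> None:
    """Model of a respawn: every respawn increments the restart counter."""
    proc.restarts += 1


def register_pid(proc: MonitoredProcess, clear_on_success: bool) -> None:
    """Model of the pid registration phase that ends a successful recovery.

    clear_on_success=False models the old behavior (counter kept),
    clear_on_success=True models the behavior after this update.
    """
    if clear_on_success:
        proc.restarts = 0


def pmon_restart(proc: MonitoredProcess, clear_on_success: bool) -> None:
    """Model of a successful pmon-restart: respawn, then register the pid."""
    respawn_process(proc)
    register_pid(proc, clear_on_success)


def handle_failure(proc: MonitoredProcess) -> bool:
    """Model of a real failure; returns True if the threshold is exceeded
    and the severity action would be taken (assumed rule: counter > threshold)."""
    respawn_process(proc)
    return proc.restarts > proc.threshold


# Old behavior: 4 pmon-restarts pre-load the counter past a threshold of 3.
old = MonitoredProcess("demo", threshold=3)
for _ in range(4):
    pmon_restart(old, clear_on_success=False)
assert old.restarts == 4
assert handle_failure(old)          # severity action on the first real failure

# Updated behavior: the counter is cleared after each successful restart.
new = MonitoredProcess("demo", threshold=3)
for _ in range(4):
    pmon_restart(new, clear_on_success=True)
assert new.restarts == 0
assert not handle_failure(new)      # first real failure counts as 1 of 3
```

In this model only real failures accumulate toward the threshold once
the counter is cleared at pid registration, which is the effect the
update aims for.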

Test Plan:

PASS: Verify pmon-restart continues to work.
PASS: Verify proper thresholding of failed process following
      many pmon-restart operations.
PEND: Verify pmon-restart and process failure automated test script
      against this update. 5 loops, all processes.

Change-Id: Ib01446f2e053846cd30cb0ca0e06d7c987cdf581
Closes-Bug: 1853330
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2019-11-21 14:58:28 +00:00
api-ref/source Clean up and standardize landing pages 2019-01-09 09:34:38 -08:00
bsp-files Merge "Fix packaging of OPAE FPGA drivers" 2019-10-28 14:22:39 +00:00
devstack Add redfish support detection to maintenance 2019-08-19 14:03:37 +00:00
doc Fix the error links for metal docs 2019-07-03 09:20:25 -04:00
installer Configurable Host HTTP/HTTPS Port Binding 2019-02-06 16:04:07 -06:00
inventory Merge "fix the spelling mistakes" 2019-10-07 14:41:20 +00:00
kickstart Add openSUSE OBS Artifacts for Maintenance services 2019-09-20 09:18:54 -05:00
mtce Make successful pmon-restart clear failed restarts count 2019-11-21 14:58:28 +00:00
mtce-common Add urlencoding to ip address for redfish requests 2019-11-15 12:15:49 -05:00
mtce-compute Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
mtce-control Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
mtce-storage Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
python-inventoryclient Update openSUSE OBS artifacts to build MTCE packages 2019-10-01 11:07:10 -05:00
releasenotes Update config for release notes to include project name 2019-02-05 14:14:17 -08:00
.gitignore Update tox.ini files to use stein constraints 2019-06-25 13:20:35 -04:00
.gitreview OpenDev Migration Patch 2019-04-19 19:52:33 +00:00
.zuul.yaml Turn off devstack as a zuul job 2019-10-17 12:58:08 -05:00
CONTRIBUTORS.wrs StarlingX open source release updates 2018-05-31 07:36:43 -07:00
LICENSE StarlingX open source release updates 2018-05-31 07:36:43 -07:00
README.rst Followup opendev cleanup and test jobs 2019-04-22 16:42:03 +00:00
centos_build_layer.cfg Build layering, add layer build config file 2019-10-15 19:19:45 +08:00
centos_iso_image.inc Remove Resource Monitor ; aka rmon, from the load 2019-03-19 16:12:38 -04:00
centos_pkg_dirs SysInv Decoupling: Create Inventory Service 2018-12-06 13:17:35 -05:00
test-requirements.txt pep8 job enable and fix pep8 reported issue 2018-09-06 09:45:51 +08:00
tox.ini Update tox.ini files to use stein constraints 2019-06-25 13:20:35 -04:00

README.rst

metal

StarlingX Bare Metal Management