metal/mtce
Eric MacDonald 6d0cc6a2a8 Prevent early active monitoring of compute processes in AIO
The commit shown below introduced a main loop audit that
mistakenly registers subfunction processes that are still in the
'polling' state, waiting for /var/run/.compute_config_complete,
during unlock enable.

Doing so inadvertently changes the process's monitor FSM stage
from 'Poll' to 'Manage' before configuration is complete.

Since config is not complete, the hbsClient has not initialized
its socket interface and is unable to service active monitoring
requests. This leads to quorum failure and watchdog reboot.

commit 537935bb0c
Author: Eric MacDonald <eric.macdonald@windriver.com>
Date:   Mon Jul 9 08:36:22 2018 -0400
Reorder process restart operations to prevent pmond futex deadlock

The Fix: Don't run the audit for processes that are in the
'polling' state (waiting for /var/run/.compute_config_complete).
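
The guard described above can be sketched roughly as follows. This is
an illustrative model only, not the pmond code: the Stage enum, the
Process record, the audit() function and the process names are all
hypothetical, and the real change is in the C++ sources under src.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Stage(Enum):
    # Hypothetical stand-ins for the monitor FSM stages named in the commit.
    POLL = auto()    # still waiting for the config-complete flag file
    MANAGE = auto()  # process is under active monitoring


@dataclass
class Process:
    name: str
    stage: Stage
    registered: bool = False


def audit(processes):
    """Register unregistered processes, skipping any still in POLL.

    Returns the names registered by this pass.
    """
    registered_now = []
    for proc in processes:
        # The fix: leave polling processes alone so their stage is not
        # moved to MANAGE before configuration is complete.
        if proc.stage is Stage.POLL:
            continue
        if not proc.registered:
            proc.registered = True
            registered_now.append(proc.name)
    return registered_now
```

With one process in POLL and one in MANAGE, a pass of audit() registers
only the second and leaves the first untouched until it leaves the
polling state on its own.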

Test Plan:

Provision AIO, verify no quorum failure and inspect logs for
correct behavior.

Change-Id: I179c78309517a34285783ee99bbb3d699915cb83
Closes-Bug: 1804318
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2018-11-21 10:04:00 -05:00
..
centos Prevent early active monitoring of compute processes in AIO 2018-11-21 10:04:00 -05:00
src Prevent early active monitoring of compute processes in AIO 2018-11-21 10:04:00 -05:00
PKG-INFO Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00