metal/mtce-common/src/common
Eric MacDonald da7b2e94f1 Modify Mtce Reinstall FSM to first power-off BMC provisioned hosts
This update only applies to servers that support and are provisioned
for a Board Management Controller (BMC).

The BMC of some servers silently rejects the 'set next boot device'
command while the server is executing BIOS.

When the BMC is provisioned, the current reinstall algorithm starts by
detecting the power state of the target server. If the power is off,
it first powers the server on and then proceeds to 'set next boot device'
to pxe, followed by a reset. When the server starts out powered off, the
timing of these operations is such that the server is in BIOS when the
'set next boot device' command is issued.

This update modifies the host reinstall algorithm to first power off
the server, then set the next boot device while the server is confirmed
to be powered off, and only then power it on. This ensures the server
receives and handles the 'set next boot device' command properly.
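
The difference in ordering can be illustrated with a minimal sketch. This
is not the mtce FSM code: FakeBmc, reinstall_old and reinstall_new are
hypothetical names, and the model simply assumes that a BMC ignores
'set next boot device' whenever the server is in BIOS.

```cpp
#include <string>

// Hypothetical stand-in for a BMC-managed server; not the mtce API.
struct FakeBmc {
    bool powered_on = false;
    bool in_bios = false;            // assumed true right after power-on or reset
    std::string next_boot = "disk";  // one-shot boot device override

    void power_on()  { powered_on = true;  in_bios = true; }
    void power_off() { powered_on = false; in_bios = false; }
    void reset()     { if (powered_on) in_bios = true; }

    // Models the silent rejection: no error is reported, but the
    // request has no effect while the server is executing BIOS.
    void set_next_boot(const std::string& dev) {
        if (!in_bios) next_boot = dev;
    }

    // BIOS finishes and boots from the selected device; the one-shot
    // override is then cleared.
    std::string boot() {
        in_bios = false;
        std::string dev = next_boot;
        next_boot = "disk";
        return dev;
    }
};

// Previous ordering: power on if needed, set boot device, reset.
std::string reinstall_old(FakeBmc& bmc) {
    if (!bmc.powered_on) bmc.power_on();  // server enters BIOS here
    bmc.set_next_boot("pxe");             // may be silently dropped
    bmc.reset();
    return bmc.boot();
}

// New ordering: power off, confirm, set boot device, power on.
std::string reinstall_new(FakeBmc& bmc) {
    bmc.power_off();
    if (bmc.powered_on) return "failed";  // power-off not confirmed
    bmc.set_next_boot("pxe");             // server is off, so not in BIOS
    bmc.power_on();
    return bmc.boot();
}
```

In this model, a host that starts powered off boots from disk with the
previous ordering, while the new ordering boots from pxe regardless of
the initial power state.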

This update also fixes a race condition between the bmc_handler and
power_handler by moving the final power state update in the power
handler to the power done phase.

Test Plan:

Verify all new reinstall failure path handling via fault insertion testing
Verify reinstall of powered off host
Verify reinstall of powered on host
Verify reinstall of Wildcat server with ipmi
Verify reinstall of Supermicro server with ipmi and redfish
Verify reinstall of Ironpass server with ipmi
Verify reinstall of WolfPass server with redfish and ipmi
Verify reinstall of Dell server with ipmi

Over 30 reinstalls were performed across all server types, with initial
power on and off using both ipmi and redfish (where supported).

Change-Id: Iefb17e9aa76c45f2ceadf83f23b1231ae82f000f
Closes-Bug: 1862065
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-02-12 15:44:26 +00:00
..
Makefile Add redfish support detection to maintenance 2019-08-19 14:03:37 +00:00
alarmUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
alarmUtil.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
bmcUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
bmcUtil.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
fitCodes.h Add alarm retry support to maintenance alarm handling daemon 2019-10-07 09:07:49 -04:00
hostClass.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostClass.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostUtil.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
httpUtil.cpp MTCE: reading BMC passwords from Barbican secret storage. 2019-02-14 09:04:46 -05:00
httpUtil.h Remove all nova and libvirt files from mtce-common 2019-03-19 15:23:36 -05:00
ipmiUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
ipmiUtil.h Redfish support for Sensor Monitoring in hwmond 2019-09-12 01:56:42 +08:00
jsonUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
jsonUtil.h Remove all nova and libvirt files from mtce-common 2019-03-19 15:23:36 -05:00
keyClass.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
keyClass.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
logMacros.h fix spelling error 2019-11-15 14:11:52 +08:00
msgClass.cpp Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
msgClass.h Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
nlEvent.cpp Add 50 byte hostname support to maintenance 2019-07-12 12:20:08 +00:00
nlEvent.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
nodeBase.cpp Modify Mtce Reinstall FSM to first power-off BMC provisioned hosts 2020-02-12 15:44:26 +00:00
nodeBase.h Modify Mtce Reinstall FSM to first power-off BMC provisioned hosts 2020-02-12 15:44:26 +00:00
nodeEvent.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeEvent.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeMacro.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeTimers.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
nodeTimers.h Add redfish power/reset/reinstall bmc support to maintenance 2019-09-26 15:59:35 -04:00
nodeUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
nodeUtil.h Make Mtce system mode scan case in-sensitive 2019-05-06 19:14:14 +00:00
pingUtil.cpp Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
pingUtil.h Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
redfishUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
redfishUtil.h Add redfish power/reset/reinstall bmc support to maintenance 2019-09-26 15:59:35 -04:00
regexUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
regexUtil.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
returnCodes.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
secretUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
secretUtil.h Improve BMC password first fetch handling in hwmon 2019-09-17 18:57:08 +00:00
threadUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
threadUtil.h Enable protocol switch between ipmi and redfish for hwmon 2019-09-22 22:28:30 -04:00
timeUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
timeUtil.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
tokenUtil.cpp Remove references to ceilometer in maintenance 2019-04-30 14:28:12 -04:00
tokenUtil.h MTCE: reading BMC passwords from Barbican secret storage. 2019-02-14 09:04:46 -05:00