metal/mtce-common/src/common
Eric MacDonald 55d5f43edb Fix heartbeat messaging when interface is set to 'lo'
The maintenance heartbeat service should not use multicast
messaging over an 'lo' interface. In IPv6 this leads to
socket failures, log flooding and the inability to detect
and report pmond process failure.

To fix that, this update:
 - configures pulse messaging to unicast for monitored
   networks configured as 'lo'.
 - prevents heartbeating over the cluster network if it and
   the management network are both configured on the 'lo'
   interface.
 - improves logging to avoid flooding in the presence of
   socket setup or access errors.
 - stops logging netlink events (interface state changes)
   on unmonitored network interfaces.
 - maintains heartbeat disabled state until the management
   network is up.
 - modifies hbsAgent socket failure handling and its pmon
   conf file so that a persistent socket failure during
   startup is alarmed as an hbsAgent process failure.
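
The first two items above can be sketched roughly as follows. This is an
illustrative sketch only, not the code from this change: every type and
function name below (PulseMode, HeartbeatPlan, pulse_mode_for,
plan_heartbeat) is hypothetical, and interfaces are assumed to be
identified by their plain name string.

```cpp
#include <cassert>
#include <string>

// All names here are hypothetical and exist only for illustration;
// they are not taken from mtce-common.
enum class PulseMode { Multicast, Unicast };

struct HeartbeatPlan {
    PulseMode mgmnt_mode;      // pulse mode for the management network
    PulseMode cluster_mode;    // pulse mode for the cluster network
    bool      cluster_enabled; // false when cluster heartbeat is skipped
};

// Loopback interface name as it appears in the commit message.
static const std::string LOOPBACK_IFACE = "lo";

// Pulse messaging falls back to unicast on a loopback interface.
PulseMode pulse_mode_for(const std::string& iface) {
    return (iface == LOOPBACK_IFACE) ? PulseMode::Unicast
                                     : PulseMode::Multicast;
}

// Decide per-network pulse modes, and skip cluster heartbeating when
// the management and cluster networks are both configured on 'lo'.
HeartbeatPlan plan_heartbeat(const std::string& mgmnt_iface,
                             const std::string& cluster_iface) {
    HeartbeatPlan plan;
    plan.mgmnt_mode      = pulse_mode_for(mgmnt_iface);
    plan.cluster_mode    = pulse_mode_for(cluster_iface);
    plan.cluster_enabled = !(mgmnt_iface == LOOPBACK_IFACE &&
                             cluster_iface == LOOPBACK_IFACE);
    return plan;
}

// Usage example: an all-in-one style setup with both networks on 'lo'.
void example_usage() {
    HeartbeatPlan plan = plan_heartbeat("lo", "lo");
    assert(plan.mgmnt_mode == PulseMode::Unicast);
    assert(!plan.cluster_enabled);
}
```

Keeping the decision in one pure function makes the 'lo' special cases
easy to check in isolation from any socket setup.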

Test Plan:

PASS: Verify logging over system install and socket errors
PASS: Verify unicast messaging when cluster is set to 'lo'
PASS: Verify no cluster network heartbeat when it and mgmnt
      are set to 'lo'.

Regression:

PASS: Verify heartbeat messaging and cluster info
PASS: Verify pmond process failure alarm management
PASS: Verify heartbeat failure detection and graceful recovery
PASS: Verify AIO SX IPv6 system install and run
PASS: Verify AIO DX IPv6 system install and run
PASS: Verify Standard IPv6 system install and run
PASS: Verify Storage system IPv6 install and run
PASS: Verify Storage system IPv4 install and run
PASS: Verify MNFA handling in IPv6 storage system

Change-Id: I5a2a0b2dee0c690617c4e0b0e2ab8b1172b2dc49
Closes-Bug: 1884585
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2020-06-26 14:16:41 +00:00
Makefile Add redfish support detection to maintenance 2019-08-19 14:03:37 +00:00
alarmUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
alarmUtil.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
bmcUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
bmcUtil.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
fitCodes.h Add mtcAgent socket initialization failure retry handling. 2020-04-01 19:24:22 +00:00
hostClass.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostClass.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
hostUtil.h Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
httpUtil.cpp MTCE: reading BMC passwords from Barbican secret storage. 2019-02-14 09:04:46 -05:00
httpUtil.h Remove all nova and libvirt files from mtce-common 2019-03-19 15:23:36 -05:00
ipmiUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
ipmiUtil.h Redfish support for Sensor Monitoring in hwmond 2019-09-12 01:56:42 +08:00
jsonUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
jsonUtil.h Remove all nova and libvirt files from mtce-common 2019-03-19 15:23:36 -05:00
keyClass.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
keyClass.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
logMacros.h fix spelling error 2019-11-15 14:11:52 +08:00
msgClass.cpp Fix mtce-common build error with gcc-8.2.1 2020-04-03 14:49:09 +08:00
msgClass.h Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
nlEvent.cpp Fix heartbeat messaging when interface is set to 'lo' 2020-06-26 14:16:41 +00:00
nlEvent.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
nodeBase.cpp Modify Mtce Reinstall FSM to first power-off BMC provisioned hosts 2020-02-12 15:44:26 +00:00
nodeBase.h Fix heartbeat messaging when interface is set to 'lo' 2020-06-26 14:16:41 +00:00
nodeEvent.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeEvent.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeMacro.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
nodeTimers.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
nodeTimers.h Add redfish power/reset/reinstall bmc support to maintenance 2019-09-26 15:59:35 -04:00
nodeUtil.cpp Prevent pmond process recovery when system is not running 2020-06-15 11:09:47 -04:00
nodeUtil.h Prevent pmond process recovery when system is not running 2020-06-15 11:09:47 -04:00
pingUtil.cpp Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
pingUtil.h Fix BMC access loss handling 2020-01-03 09:34:37 -05:00
redfishUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
redfishUtil.h Add redfish power/reset/reinstall bmc support to maintenance 2019-09-26 15:59:35 -04:00
regexUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
regexUtil.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
returnCodes.h Refactor infrastructure network in mtce code 2019-04-18 09:32:41 -04:00
secretUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
secretUtil.h Improve BMC password first fetch handling in hwmon 2019-09-17 18:57:08 +00:00
threadUtil.cpp Refactor BMC provisioning in Maintenance 2019-12-09 09:39:49 -05:00
threadUtil.h Enable protocol switch between ipmi and redfish for hwmon 2019-09-22 22:28:30 -04:00
timeUtil.cpp Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
timeUtil.h Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
tokenUtil.cpp Remove references to ceilometer in maintenance 2019-04-30 14:28:12 -04:00
tokenUtil.h MTCE: reading BMC passwords from Barbican secret storage. 2019-02-14 09:04:46 -05:00