metal/mtce/src/scripts
Eric MacDonald 14bb67789e Add pxeboot network mtcAlive messaging to Maintenance
The introduction of the new pxeboot network requires maintenance
to verify and report on messaging failures over that network.

Towards that, this update introduces periodic mtcAlive messaging
between the mtcAgent and mtcClient.

Test Plan:

PASS: Verify install and provision each system type with a mix
             of networking modes ; ethernet, bond and vlan
             - AIO SX, AIO DX, AIO DX plus
             - Standard System 2+1
             - Storage System 2+1+1
PASS: Verify feature with physical on management interface
PASS: Verify feature with vlan on management interface
PASS: Verify feature with bonded management interface
PASS: Verify feature with bonded vlans on management interface
PASS: Verify in bonded cases handling with 2, 1 or no slaves found
PASS: Verify mgmt-combined or separate cluster-host network
PASS: Verify mtcClient pxeboot interface address learning
             - for worker and storage nodes       ; dhcp leases file
             - for controller nodes before unlock ; dhcp leases file
             - for controller nodes after unlock  ; static from ifcfg
             - from controller within 10 seconds of process restart
PASS: Verify mtcAgent pxeboot interface address learning from
             dnsmasq.hosts file
PASS: Verify pxeboot mtcAlive initiation, handling, loss detection
             and recovery
PASS: Verify success and failure handling of all new pxeboot ip
             address learning functions ;
             - dhcp - all system node installs.
             - dnsmasq.hosts - active controller for all hosts.
             - interfaces.d - controller's mtcClient pxeboot address.
             - pxeboot req mtcAlive - mtcAgent mtcAlive request message.
PASS: Verify mtcClient pxeboot network 'mtcAlive request' and 'reboot'
             command handling for ethernet, vlan and bond configs.
PASS: Verify mtcAlive sequence number monitoring, out-of-sequence
             detection, handling and logging.
PASS: Verify pxeboot rx socket binding and non-blocking attribute
PASS: Verify mtcAgent handling stress soaking of sustained incoming
             500+ msgs/sec ; batch handling and logging.
PASS: Verify mtcAgent and mtcClient pxeboot tx and rx socket messaging,
             failure recovery handling and logging.
PASS: Verify pxeboot receiver is not set up on the oam interface on
             controller-0 first install until after initial config
             is complete.
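
As an illustration of the sequence number monitoring item in the test plan above, here is a minimal Python sketch of out-of-sequence detection. It is not taken from the mtce source (which is C++). The class name `SeqMonitor`, its `check` method and the returned labels are hypothetical, and the sketch assumes only that each mtcAlive message carries a sequence number that increases by one per message for a given host.

```python
class SeqMonitor:
    """Hypothetical helper: track the last sequence number seen per host."""

    def __init__(self):
        # hostname -> last sequence number received from that host
        self.last_seen = {}

    def check(self, hostname, seq):
        """Classify an incoming sequence number, then record it."""
        prev = self.last_seen.get(hostname)
        self.last_seen[hostname] = seq
        if prev is None:
            return "first"            # nothing to compare against yet
        if seq == prev + 1:
            return "in-sequence"      # the expected next message
        if seq > prev + 1:
            return "gap"              # one or more messages were missed
        return "out-of-sequence"      # duplicate or older than the last one


monitor = SeqMonitor()
for seq in (1, 2, 5, 3):
    print("controller-1", seq, monitor.check("controller-1", seq))
```

A real monitor would probably also have to handle counter wrap-around and a sequence restart after a process restart, both of which this sketch ignores.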

Regression:

PASS: Verify mtcAgent/mtcClient online and offline state management
PASS: Verify mtcAgent/mtcClient command handling
      - over management network
      - over cluster-host network
PASS: Verify mtcClient interface chain log for all iface types
      - bond    : vlan123 -> pxeboot0 (802.3ad 4) -> enp0s8 and enp0s9
      - vlan    : vlan123 -> enp0s8
      - ethernet: enp0s8
PASS: Verify mtcAgent/mtcClient handling and logging including debug
      logging for standard operations
      - node install and unlock
      - node lock and unlock
      - node reinstall, reboot, reset
PASS: Verify graceful recovery handling of heartbeat loss failure.
      - node reboot
      - management interface down
PASS: Verify systemcontroller and subcloud install with dc-libvirt
PASS: Verify no log flooding, coredumps, memory leaks

Story: 2010940
Task: 49541
Change-Id: Ibc87b85e3e0e07c3b8c40b5291bd3372506fbdfb
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2024-03-28 15:28:27 +00:00
55-crash-dump-manager.preset Update crashDumpMgr to source config from envfile 2023-10-06 23:06:54 +00:00
collect_bmc.sh Add maintenance BMC info collect script 2020-10-15 15:41:51 -04:00
config Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
config.service Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
crash-dump-manager Update crashDumpMgr to source config from envfile 2023-10-06 23:06:54 +00:00
crash-dump-manager.service Update crashDumpMgr to source config from envfile 2023-10-06 23:06:54 +00:00
crash-dump-manager_envfile Update crashDumpMgr to source config from envfile 2023-10-06 23:06:54 +00:00
crashdump.logrotate Update crashDumpMgr to source config from envfile 2023-10-06 23:06:54 +00:00
dmemchk.sh Remove Resource Monitor ; aka rmon, from the load 2019-03-19 16:12:38 -04:00
goenabled Add LSB headers to mtce service scripts 2019-08-29 11:20:14 -05:00
goenabled.service De-branding in starlingx/metal: Titanium Cloud -> StarlingX 2020-04-03 07:58:25 +02:00
hbs-query [Trivial Fix] fix typos in docstrings 2019-02-21 14:46:06 +08:00
hbsAgent Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
hbsClient Fix failing mtce services on Debian 2022-01-14 10:50:09 -03:00
hbsClient.conf Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
hbsClient.service De-branding in starlingx/metal: Titanium Cloud -> StarlingX 2020-04-03 07:58:25 +02:00
hwclock.service Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
hwclock.sh Fix remaining failing mtce services on Debian 2022-01-25 12:10:39 -03:00
mgmtlinkup Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
mtc.conf Add pxeboot network mtcAlive messaging to Maintenance 2024-03-28 15:28:27 +00:00
mtc.ini Change compute node to worker node personality 2018-12-13 13:08:48 -05:00
mtcAgent mtcAgent: Run in active mode 2022-09-13 21:38:50 +00:00
mtcClient Fix remaining failing mtce services on Debian 2022-01-25 12:10:39 -03:00
mtcClient.conf Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
mtcClient.service De-branding in starlingx/metal: Titanium Cloud -> StarlingX 2020-04-03 07:58:25 +02:00
mtcTest Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
mtce.logrotate Modify mtce daemon log rotation config files 2021-04-07 20:47:54 +00:00
mtce.syslog Setup mtce logfile config 2020-10-16 10:43:01 -04:00
mtcinit Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
mtclog Fix failing mtce services on Debian 2022-01-14 10:50:09 -03:00
mtclog.service De-branding in starlingx/metal: Titanium Cloud -> StarlingX 2020-04-03 07:58:25 +02:00
mtclogd.conf Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
runservices Add LSB headers to mtce service scripts 2019-08-29 11:20:14 -05:00
runservices.service De-branding in starlingx/metal: Titanium Cloud -> StarlingX 2020-04-03 07:58:25 +02:00
sched_trace Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
sensor_hp360_v1_ilo_v4.profile Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
sensor_hp380_v1_ilo_v4.profile Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
sensor_integration_profile.README Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
sensor_quanta_v1_ilo_v4.profile Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
store_trace Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
stress_ras.sh Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
stress_swact.sh Decouple Guest-server/agent from stx-metal 2018-09-18 17:15:08 -04:00
wipedisk Add mpath support to wipedisk script 2023-04-10 17:10:22 -03:00