Eric MacDonald 2cb728678f Fix mtcAgent AIO simplex subfunction failure handling over unlock
The mtcAgent is seen to get stuck handling a subfunction failure
detected over self (controller-0) unlock of an AIO simplex controller.

It gets stuck reporting that it is already handling the failure,
but it isn't.

log flooding: 'controller-0 already handling force full enable'

This issue only exists in AIO simplex when the subfunction enable
handler detects the failure. This issue was introduced by the
following update:

Remove Start Host Service Launch in mtcAgent & enhance fault detection
https://opendev.org/starlingx/metal/commit/6106051f1c

Test Plan:

PASS: Verify an AIO simplex self unlock subfunction failure leads to
      'degrade' state with 'enable failure' alarm.
PASS: Verify same issue for the standby controller leads to
      'failure' state with 'enable failure' alarm.

Regression:

PASS: Verify spontaneous unhealthy active controller is degraded.
PASS: Verify spontaneous unhealthy standby controller is failed.

Closes-Bug: 2119449
Change-Id: I5ab5e6d85906f1923a0828211dbf94d2f82e73f8
Signed-off-by: Eric MacDonald <eric.macdonald@windriver.com>
2025-08-05 10:14:06 -04:00

metal

The starlingx/metal repository handles StarlingX Bare Metal Management [1].

This repository is not intended to be developed standalone, but rather as part of the StarlingX Source System, which is defined by the StarlingX manifest [2].

References


  1. https://docs.starlingx.io/api-ref/metal

  2. https://opendev.org/starlingx/manifest.git

Description
StarlingX Bare Metal and Node Management, Hardware Maintenance
Languages
C++ 83.1%
Shell 10.1%
Python 3.3%
C 2.5%
Makefile 1%