Hediberto C Silva e8cbdd0f7b Preserving the current states of Ceph SM services
The ceph-osd and ceph-mon SM services have been presenting a state
mismatch in some network recovery scenarios: SM expects them to be
disabled, but they are reported as enabled-active based on Ceph
availability.

This change saves the current SM state in the
/var/run/ceph/.sm-ceph-mon-state and /var/run/ceph/.sm-ceph-osd-state
files on Start and Stop actions, and returns disabled only if
the saved state is Stopped. In all other cases, the ceph-init-wrapper
script processes the status as before.
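
The save-and-check logic described above can be sketched as follows.
This is a minimal illustration, not the code from this change: the
function names, the STATE_DIR override and the "Started" value are
assumptions; only the file paths and the "Stopped" value come from
this message.

```shell
#!/bin/sh
# Hypothetical sketch. STATE_DIR defaults to the directory named in the
# commit message; the override exists only to ease testing.
STATE_DIR="${STATE_DIR:-/var/run/ceph}"

# Record the last action requested for a service (ceph-mon or ceph-osd).
# "Started" is an assumed value; "Stopped" is the one the message names.
save_sm_state() {
    service="$1"
    state="$2"
    printf '%s\n' "$state" > "${STATE_DIR}/.sm-${service}-state"
}

# Succeed only when a saved state exists and is exactly "Stopped".
# A missing file or any other value means "process status as before".
sm_state_is_stopped() {
    service="$1"
    state_file="${STATE_DIR}/.sm-${service}-state"
    [ -f "$state_file" ] && [ "$(cat "$state_file")" = "Stopped" ]
}
```

A status handler could call sm_state_is_stopped first and report
disabled when it succeeds, falling through to the existing status
checks otherwise.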

In the ceph-init-wrapper script, the saved state is updated only when
the script is called by SM, which is determined from the parent process.
This prevents external uses of the script, such as manual interventions
and PMON calls, from affecting the behavior of SM, which is only used
in AIO-DX setups.
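
One way to check the caller from a shell script is to look up the
command name of the parent process. The sketch below is an assumption
about how such a check might look, not the code from this change: the
function names and the "sm" process name are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch. The SM process name is an assumption.
SM_PROCESS_NAME="${SM_PROCESS_NAME:-sm}"

# Print the command name of a process (defaults to this shell's parent).
# Uses /proc on Linux and falls back to ps elsewhere.
parent_command_name() {
    pid="${1:-$PPID}"
    if [ -r "/proc/${pid}/comm" ]; then
        cat "/proc/${pid}/comm"
    else
        ps -o comm= -p "$pid"
    fi
}

# Succeed only when the inspected process is named like SM.
called_by_sm() {
    [ "$(parent_command_name "$@")" = "$SM_PROCESS_NAME" ]
}
```

With a guard like this, the state file would be written only when
called_by_sm succeeds, so manual runs and PMON calls leave it untouched.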

In addition, the flag used by the ceph-storage-network script has been
renamed to follow the naming pattern of the other flags.

This change is part of a solution to avoid the scenario where
there is no active controller after a network recovery. From a storage
perspective, all services now respond according to SM requests.
Fixes for SM, mtcClient or other services still need to be
addressed in future investigations.

Test Plan:
  PASS: Fresh install for AIO-SX and AIO-DX.
  PASS: In virtual environments, simulate a switch failure by shutting
        down all interfaces on both controllers at the same time and
        check that the Ceph service states do not mismatch.
  PASS: On AIO-DX, execute host-swact operations successfully.
  PASS: On AIO-DX, simulate a BMC shutdown of the standby controller and
        check that the Ceph service states are correct after booting.
  PASS: On AIO-DX, simulate a BMC shutdown of the active controller and
        check that the Uncontrolled Swact happens successfully and
        that the Ceph service states are correct after booting.
  PASS: On AIO-DX, simulate a DOR scenario by shutting down both
        controllers at the same time. Check that the Ceph service
        states are correct after booting.

Closes-bug: 2122117

Change-Id: Iafc0e30a441b1975ccfb98c16c4b30a53383d83e
Signed-off-by: Hediberto C Silva <Hediberto.CavalcantedaSilva@windriver.com>
2025-09-05 09:28:40 -03:00

integ

StarlingX Integration

Description
StarlingX Integration and packaging
Readme 60 MiB
Languages
JavaScript 31.7%
Shell 27.2%
Python 17.3%
Perl 9.4%
Makefile 5.7%
Other 8.6%