e8cbdd0f7bbfc9d4c8651a4b63b827fa8d3310d8
The ceph-osd and ceph-mon SM services have been presenting a state
mismatch in some network recovery scenarios: they are expected to be
disabled by the SM, but are reported as enabled-active based on Ceph
availability.
This change saves the current SM state in the
/var/run/ceph/.sm-ceph-mon-state and /var/run/ceph/.sm-ceph-osd-state
files on Start and Stop actions, and returns disabled only if the
saved state is Stopped. In other cases, the ceph-init-wrapper script
processes the status as before.
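A minimal sketch of the saved-state logic described above, not the actual ceph-init-wrapper code. The function names, the "Started" state string, the "fallback" placeholder, and the STATE_DIR override are assumptions made for illustration; only the file paths and the "Stopped" state come from this change.

```shell
#!/bin/bash
# Hypothetical sketch: names below are illustrative, not taken from the change.
# STATE_DIR defaults to the directory named in the commit message.
STATE_DIR="${STATE_DIR:-/var/run/ceph}"

# Record the last action requested by SM for a daemon ("mon" or "osd").
save_sm_state() {
    local daemon="$1" state="$2"
    printf '%s\n' "${state}" > "${STATE_DIR}/.sm-ceph-${daemon}-state"
}

# Succeed only when a state file exists and its content is exactly "Stopped".
sm_state_is_stopped() {
    local file="${STATE_DIR}/.sm-ceph-$1-state"
    [ -f "${file}" ] && [ "$(cat "${file}")" = "Stopped" ]
}

# Report "disabled" when SM last stopped the daemon; in every other case
# (missing file or any other state) defer to the pre-existing status check,
# represented here by the "fallback" placeholder.
report_status() {
    if sm_state_is_stopped "$1"; then
        echo "disabled"
    else
        echo "fallback"
    fi
}
```

With this shape, a missing or unreadable state file never forces a disabled result, so the status path behaves as before unless SM explicitly requested a stop.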
In the ceph-init-wrapper script, the saved state is updated only
when the script is called by the SM, which is determined from the
parent process. This prevents other uses of the script, such as manual
interventions and PMON calls, from affecting the behavior of the SM,
which is only used in AIO-DX setups.
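The parent-process check can be sketched as follows. This is an illustration under stated assumptions, not the code from this change: the function names are hypothetical, and the SM process name "sm" is an assumption that may differ on a real system.

```shell
#!/bin/bash
# Hypothetical sketch: SM_PROCESS_NAME and the function names are assumptions.
SM_PROCESS_NAME="${SM_PROCESS_NAME:-sm}"

# Print the command name of the process that invoked this script.
parent_command() {
    ps -o comm= -p "${PPID}" 2>/dev/null
}

# Succeed only when the caller is SM. An explicit name may be passed as $1
# (useful for testing); otherwise the real parent process is inspected.
called_by_sm() {
    local parent="${1:-$(parent_command)}"
    [ "${parent}" = "${SM_PROCESS_NAME}" ]
}

# Run the given command only when SM is the caller, so manual runs and
# PMON calls leave the saved state untouched.
update_state_if_sm() {
    local parent="$1"
    shift
    if called_by_sm "${parent}"; then
        "$@"
    fi
}
```

For example, `update_state_if_sm "" echo Stopped` prints nothing when run from an interactive shell, because the parent process is the shell rather than SM.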
In addition, the flag used by the ceph-storage-network script has been
renamed to maintain a pattern across created flags.
This change is part of a solution to avoid the scenario where
there is no active controller after a network recovery. From a storage
perspective, all services are responding accordingly to SM requests.
Other solutions for SM, mtcClient or other services need to be
addressed in future investigations.
Test Plan:
PASS: Fresh install for AIO-SX and AIO-DX.
PASS: In virtual environments, simulate a switch failure by shutting
down all interfaces on both controllers at the same time and check
that the Ceph service states are not mismatched.
PASS: On AIO-DX, execute host-swact operations successfully.
PASS: On AIO-DX, simulate a BMC shutdown of the standby controller and
check that the Ceph service states are correct after booting.
PASS: On AIO-DX, simulate a BMC shutdown of the active controller and
check that the Uncontrolled Swact happens successfully and
that the Ceph services are correct after booting.
PASS: On AIO-DX, simulate a DOR scenario by shutting down both
controllers at the same time. Check that the Ceph services are
correct after booting.
Closes-bug: 2122117
Change-Id: Iafc0e30a441b1975ccfb98c16c4b30a53383d83e
Signed-off-by: Hediberto C Silva <Hediberto.CavalcantedaSilva@windriver.com>