app-rook-ceph/stx-rook-ceph-helm/files/rook-mon-exit.sh
Robert Church 41adbba935 Enable optional AIO-DX floating monitor
This will enable integration of the floating monitor chart into the
application with:
- SM service monitor changes:
  - Add and remove floating monitor placement labels in the start/stop
    functions. This ensures that, when SM is transitioning activity,
    the labels align with the active controller.
  - The stop function will delete the pod to force a reschedule.
  - The status function will detect the presence of the DRBD mounted
    filesystem and adjust the labeling accordingly in case start/stop
    functions did not label as desired.
- application plugin changes:
  - Add constants support for 'rook-ceph-floating-monitor' helmrelease
  - Provide initial utility functions to detect if the DRBD controller
    filesystem is enabled and if the floating monitor is assigned (via a
    helm user override)
  - Add a new function to get the IP family from the cluster-pod network
    to set overrides and determine the IPv4/IPv6 static address
  - Update the ceph cluster plugin to use a new utility function for
    detecting the IP family
  - Add the floating monitor helm plugin to generate the ip_family and
    static ip_address based on that family. Initial support provided for
    the cluster-pod network
  - Update the lifecycle plugin to optionally remove the floating
    monitor helm release on application remove
- application metadata
  - disable the 'rook-ceph-floating-monitor' chart by default
- FluxCD manifest changes
  - Change helmrepository API to v1 to clean up an error
  - Add manifests for the 'rook-ceph-floating-monitor' helm release
  - Temporarily set deletionPropagation in the rook-ceph-cluster, the
    rook-ceph-provisioner and rook-ceph-floating-monitor helmreleases to
    provide more predictable delete behavior
  - Update rook-ceph-cluster-static-overrides.yaml to add network
    defaults and disable the host network as the default provider. This
    was done to avoid port conflicts with the floating monitor. The
    cluster-pod network will now be the network used for the ceph
    cluster and its pods

Enable monitor at runtime:
 - system helm-override-list rook-ceph -l
 - system helm-override-show rook-ceph rook-ceph-floating-monitor \
     rook-ceph
 - system helm-override-update rook-ceph rook-ceph-floating-monitor \
     rook-ceph  --set assigned="true"
 - system helm-override-show rook-ceph rook-ceph-floating-monitor \
     rook-ceph
 - system application-apply rook-ceph

Disable monitor at runtime:
 - system helm-override-list rook-ceph -l
 - system helm-override-show rook-ceph rook-ceph-floating-monitor \
     rook-ceph
 - system helm-override-update rook-ceph rook-ceph-floating-monitor \
     rook-ceph --set assigned="false"
 - system helm-override-show rook-ceph rook-ceph-floating-monitor \
     rook-ceph
 - system application-apply rook-ceph
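
The enable and disable sequences above differ only in the value passed to
assigned. A minimal sketch of a wrapper follows, assuming the system CLI is
on the PATH and credentials are already sourced. The function name
set_floating_mon and the DRY_RUN variable are hypothetical and not part of
this commit. By default the commands are only printed.

```shell
#!/bin/bash
# Hypothetical helper (not part of this commit): replay the override steps
# from the commit message for a given state.
# DRY_RUN=1 (default) prints the commands; DRY_RUN=0 executes them.
set_floating_mon() {
    local state="$1"
    local runner="echo"

    case "${state}" in
        true|false) ;;
        *)
            echo "usage: set_floating_mon <true|false>" >&2
            return 1
            ;;
    esac

    if [ "${DRY_RUN:-1}" = "0" ]; then
        runner=""
    fi

    # Same three commands as in the runtime procedure above
    ${runner} system helm-override-update rook-ceph rook-ceph-floating-monitor \
        rook-ceph --set "assigned=${state}"
    ${runner} system helm-override-show rook-ceph rook-ceph-floating-monitor \
        rook-ceph
    ${runner} system application-apply rook-ceph
}
```

Running set_floating_mon true prints the enable sequence for review, and
DRY_RUN=0 set_floating_mon false would run the disable sequence.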

Future Improvements:
- Pick up the desired network from the storage backend (cluster-pod,
  cluster-host, etc.) and
  - update _get_ip_family() to use this value
  - update _get_static_floating_mon_ip() to get address pool range and
    calculate an appropriate static IP address for the monitor
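
The IP family detection referred to above lives in the application plugin
and is not shown in this file. As a rough illustration of the idea only, a
shell sketch could classify a network CIDR as follows. The function name
get_ip_family and the IPv4/IPv6 return strings are assumptions made for
this example, not the plugin's actual interface.

```shell
#!/bin/bash
# Hypothetical sketch: classify an address or CIDR as IPv4 or IPv6.
get_ip_family() {
    # Strip an optional /prefix-length suffix
    local addr="${1%%/*}"

    case "${addr}" in
        *:*)
            # Any colon means an IPv6 literal (including IPv4-mapped forms)
            echo "IPv6"
            ;;
        *.*.*.*)
            echo "IPv4"
            ;;
        *)
            echo "unknown"
            return 1
            ;;
    esac
}
```

This check is purely syntactic and does not validate the address. A real
implementation would more likely rely on an IP address library.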

Test Plan:
PASS - Pkg build + ISO generation
PASS - Successful AIO-DX Installation
PASS - Initial Rook deployment without floating monitor.
PASS - Initial Rook deployment with floating monitor.
PASS - Runtime override enable of Rook floating monitor + reapply
PASS - Runtime override disable of Rook floating monitor + reapply

Change-Id: Ie1ff75481b6c2f0d9d34eb228d3019465e36bc1e
Depends-On: https://review.opendev.org/c/starlingx/config/+/926374
Story: 2011066
Task: 50838
Signed-off-by: Robert Church <robert.church@windriver.com>
2024-08-15 12:54:12 -05:00


#!/bin/bash
#
# Copyright (c) 2020 Intel Corporation, Inc.
# Copyright (c) 2024 Wind River Systems, Inc.
#
# SPDX-License-Identifier: Apache-2.0
#
RETVAL=0
DRBD_MOUNT="/var/lib/ceph/mon-float"
DRBD_MAJ_DEV_NUM="147"
REQUEST_TIMEOUT='5s'
################################################################################
# Start Action
################################################################################
function start {
    # Add label for pod scheduling
    # NOTE: Because SM and k8s can be restarted independently the k8s API may not
    # be available at the time of the start action. Don't fail. Confirm label is
    # applied in the status check
    kubectl --kubeconfig=/etc/kubernetes/admin.conf \
        --request-timeout ${REQUEST_TIMEOUT} \
        label node $(hostname) \
        ceph-mon-float-placement=enabled

    RETVAL=0
}
################################################################################
# Stop Action
################################################################################
function stop {
    # Remove label to prevent pod scheduling
    # NOTE: Because SM and k8s can be restarted independently the k8s API may not
    # be available at the time of the stop action. Don't fail. Confirm label is
    # removed in the status check
    kubectl --kubeconfig=/etc/kubernetes/admin.conf \
        --request-timeout ${REQUEST_TIMEOUT} \
        label node $(hostname) \
        ceph-mon-float-placement-

    # Get floating monitor pod running on this node
    POD=$(kubectl --kubeconfig=/etc/kubernetes/admin.conf \
        --request-timeout ${REQUEST_TIMEOUT} \
        get pod -n rook-ceph \
        -l app="rook-ceph-mon,mon=float" --no-headers=true \
        --field-selector=spec.nodeName=$(hostname) \
        -o=custom-columns=NAME:.metadata.name)

    # Is there a floating monitor here?
    if [ -n "${POD}" ]; then
        # delete detected pod to force a reschedule
        kubectl --kubeconfig=/etc/kubernetes/admin.conf \
            --request-timeout ${REQUEST_TIMEOUT} \
            delete pod -n rook-ceph \
            ${POD}
    fi

    RETVAL=0
}
################################################################################
# Status Action
################################################################################
function status {
    # Status is based on whether this host is labeled correctly to run the
    # floating monitor

    # Is this host labeled for the floating monitor?
    NODE_LABELED=$(kubectl --kubeconfig=/etc/kubernetes/admin.conf \
        --request-timeout ${REQUEST_TIMEOUT} \
        get nodes \
        -l ceph-mon-float-placement --no-headers=true \
        --field-selector=metadata.name=$(hostname) \
        -o=custom-columns=NAME:.metadata.name)

    # Is the DRBD-backed filesystem mounted on this host? Match the full
    # major number (up to the ':') so that e.g. 1470 is not taken for 147
    if mountpoint -d "${DRBD_MOUNT}" | grep -q "^${DRBD_MAJ_DEV_NUM}:"; then
        # Mounted here: make sure the placement label is present
        if [ -z "${NODE_LABELED}" ]; then
            kubectl --kubeconfig=/etc/kubernetes/admin.conf \
                --request-timeout ${REQUEST_TIMEOUT} \
                label node $(hostname) \
                ceph-mon-float-placement=enabled
        fi
        RETVAL=0
    else
        # Not mounted here: make sure the placement label is absent
        if [ -n "${NODE_LABELED}" ]; then
            kubectl --kubeconfig=/etc/kubernetes/admin.conf \
                --request-timeout ${REQUEST_TIMEOUT} \
                label node $(hostname) \
                ceph-mon-float-placement-
        fi
        RETVAL=1
    fi
}
################################################################################
# Main Entry
################################################################################
case "$1" in
    start)
        start
        ;;
    stop)
        stop
        ;;
    restart)
        stop
        start
        ;;
    status)
        status
        ;;
    *)
        echo "usage: $0 { start | stop | status | restart }"
        exit 1
        ;;
esac
exit $RETVAL
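
The status action above reconciles two observations: whether the DRBD
filesystem is mounted on this host and whether the node already carries the
placement label. A minimal sketch of that decision as a pure function, which
can be exercised without a cluster, is given below. The function name
label_action and its add/remove/none outputs are hypothetical and are not
used by the script itself.

```shell
#!/bin/bash
# Hypothetical helper: print the label action the status check would take.
#   $1: output of `mountpoint -d <dir>` (MAJOR:MINOR, empty if not mounted)
#   $2: node name if the placement label is present, empty otherwise
#   $3: expected DRBD major device number (defaults to 147, as in the script)
label_action() {
    local devno="$1"
    local labeled="$2"
    local major="${3:-147}"

    if [ "${devno%%:*}" = "${major}" ]; then
        # DRBD filesystem is mounted here: the label should be present
        if [ -z "${labeled}" ]; then
            echo "add"
        else
            echo "none"
        fi
    else
        # Not mounted here: the label should be absent
        if [ -n "${labeled}" ]; then
            echo "remove"
        else
            echo "none"
        fi
    fi
}
```

For example, label_action "147:0" "" prints add, which corresponds to the
branch in status that applies ceph-mon-float-placement=enabled and sets
RETVAL=0.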