monasca-thresh: Allow topology check and removal in Storm

This patch adds a script to the monasca-thresh image that can be used
to check whether a topology exists in Storm and, optionally, kill it.

This is part of the fix for a bug in kolla-ansible where topologies
were run locally instead of being submitted to Storm.  The topology
check script runs when KOLLA_BOOTSTRAP is set and exits kolla_start
if the topology already exists.  If TOPOLOGY_REPLACE is also set, the
existing topology is killed instead, so that it can be replaced.
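
The decision described above can be sketched as follows.  This is a
minimal illustration, not the script from the patch: bootstrap_action
is a hypothetical helper, and the POSIX ${TOPOLOGY_REPLACE+set}
expansion stands in for the bash-only test used in the patch.  Both
treat a variable that is set but empty as set.

```shell
# Hypothetical helper (not in the patch): print the action the bootstrap
# check would take. $1 is "true" if the topology was found in Storm.
bootstrap_action() {
    if [ "$1" != "true" ]; then
        echo "submit"            # nothing to replace, fall through
    elif [ -n "${TOPOLOGY_REPLACE+set}" ]; then
        echo "kill-then-submit"  # variable is set, even if empty
    else
        echo "skip"              # topology already exists, exit 0
    fi
}

unset TOPOLOGY_REPLACE
bootstrap_action true     # prints: skip
TOPOLOGY_REPLACE=""
bootstrap_action true     # prints: kill-then-submit
bootstrap_action false    # prints: submit
```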

The topology name and various timeouts may be customized.  If the
new environment variables are not set, existing behavior is unchanged.
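
As a rough sketch of how such retry settings interact, the following
generic loop retries a command a configurable number of times.  The
names wait_for, WAIT_RETRIES and WAIT_DELAY are hypothetical and only
mirror the STORM_WAIT_* variables; the patch additionally wraps each
attempt in timeout(1).

```shell
# Hypothetical helper (not in the patch): retry a command up to
# WAIT_RETRIES times, sleeping WAIT_DELAY seconds after each failure.
WAIT_RETRIES=${WAIT_RETRIES:-24}   # default mirrors STORM_WAIT_RETRIES
WAIT_DELAY=${WAIT_DELAY:-5}        # default mirrors STORM_WAIT_DELAY

wait_for() {
    attempt=1
    while [ "$attempt" -le "$WAIT_RETRIES" ]; do
        if "$@"; then
            return 0
        fi
        echo "Attempt $attempt of $WAIT_RETRIES failed" >&2
        sleep "$WAIT_DELAY"
        attempt=$((attempt + 1))
    done
    return 1
}

# Usage: a command that succeeds on the first attempt returns at once.
wait_for true && echo "available"
```

With the defaults, a service that never answers is given up on after
24 attempts, each followed by a 5 second pause.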

Partial-Bug: #1808805
Change-Id: If8f0730031435dda4235b7f2d2c23e5f5f767f87
Scott Shambarger 2021-05-23 20:34:46 -07:00
parent ea6ceef686
commit 0a410a5460
4 changed files with 111 additions and 1 deletions

@@ -66,8 +66,14 @@ RUN cd /monasca-common-source/java \
# Overwrite the script inherited from Storm
COPY extend_start.sh /usr/local/bin/kolla_extend_start
# Add bootstrap script
COPY topology_bootstrap.sh /usr/local/bin/topology_bootstrap
RUN touch /usr/local/bin/kolla_monasca_extend_start \
-    && chmod 755 /usr/local/bin/kolla_extend_start /usr/local/bin/kolla_monasca_extend_start
+    && chmod 755 /usr/local/bin/kolla_extend_start \
+        /usr/local/bin/kolla_monasca_extend_start \
+        /usr/local/bin/topology_bootstrap
{% block monasca_thresh_footer %}{% endblock %}
{% block footer %}{% endblock %}

@@ -42,3 +42,9 @@ if [[ $(ls -Ab ${MONASCA_WORKER_DIR}) != "" ]]; then
fi
. /usr/local/bin/kolla_monasca_extend_start
# Bootstrap and exit if KOLLA_BOOTSTRAP variable is set. This catches all cases
# of the KOLLA_BOOTSTRAP variable being set, including empty.
if [[ "${!KOLLA_BOOTSTRAP[@]}" ]]; then
    . /usr/local/bin/topology_bootstrap
fi

@@ -0,0 +1,90 @@
#!/bin/bash
# This script should be sourced by kolla_extend_start when bootstrapping
#
# Optional env(<default>):
#   TOPOLOGY_NAME("monasca-thresh") - topology name to check
#   TOPOLOGY_KILL_TIMEOUT(5) - secs to wait for topology kill
#   STORM_WAIT_RETRIES(24) - retries to check for storm
#   STORM_WAIT_TIMEOUT(20) - secs to wait for storm list
#   STORM_WAIT_DELAY(5) - secs between storm list attempts
#
# - If topology exists, then:
#   a) if TOPOLOGY_REPLACE is set, the existing topology is killed
#      and script falls through (topology may be added)
#   b) otherwise script exits with 0 (topology already exists)
# - If topology doesn't exist, script falls through (topology may be added)
# - If storm cannot be reached, or kill fails, script exits with 1
TOPOLOGY_NAME=${TOPOLOGY_NAME:-monasca-thresh}
TOPOLOGY_KILL_TIMEOUT=${TOPOLOGY_KILL_TIMEOUT:-5}
# defaults from monasca-thresh
STORM_WAIT_RETRIES=${STORM_WAIT_RETRIES:-24}
STORM_WAIT_TIMEOUT=${STORM_WAIT_TIMEOUT:-20}
STORM_WAIT_DELAY=${STORM_WAIT_DELAY:-5}
STORM="/opt/storm/bin/storm"
echo "Waiting for storm to become available..."
success="false"
for i in $(seq "$STORM_WAIT_RETRIES"); do
    if timeout "$STORM_WAIT_TIMEOUT" "$STORM" list; then
        echo "Storm is available, continuing..."
        success="true"
        break
    else
        echo "Connection attempt $i of $STORM_WAIT_RETRIES failed"
        sleep "$STORM_WAIT_DELAY"
    fi
done
if [ "$success" != "true" ]; then
    echo "Unable to connect to Storm! Exiting..."
    sleep 1
    exit 1
fi
locate_topology() { # <topology>
    echo "Searching for topology $1 in the storm"
    topologies=$("$STORM" list | awk '/-----/,0{if (!/-----/)print $1}')
    found="false"
    for topology in $topologies; do
        if [ "$topology" = "$1" ]; then
            echo "Found storm topology with name: $topology"
            found="true"
            break
        fi
    done
}
# search for existing topology
locate_topology "$TOPOLOGY_NAME"
if [ "$found" = "true" ]; then
    if [[ ! "${!TOPOLOGY_REPLACE[@]}" ]]; then
        echo "Topology $TOPOLOGY_NAME found, submission not necessary"
        exit 0
    fi
    echo "Topology replacement requested, killing old one..."
    "$STORM" kill "$TOPOLOGY_NAME" -w "$TOPOLOGY_KILL_TIMEOUT"
    echo "Wait $TOPOLOGY_KILL_TIMEOUT secs for topology to reap its artifacts..."
    sleep "$TOPOLOGY_KILL_TIMEOUT"
    for i in $(seq "$STORM_WAIT_RETRIES"); do
        locate_topology "$TOPOLOGY_NAME"
        [ "$found" != "true" ] && break
        echo "... wait some more..."
        sleep "$STORM_WAIT_DELAY"
    done
    if [ "$found" = "true" ]; then
        echo "Unable to kill existing topology, giving up..."
        exit 1
    fi
    echo "Topology successfully killed, continuing..."
else
    echo "Topology not found, continuing..."
fi
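
The awk filter in locate_topology can be tried in isolation.  The
sketch below feeds it sample text standing in for `storm list` output;
the sample layout (a header row, a dashed separator, then one row per
topology) is an assumption, and list_topology_names is a hypothetical
wrapper that does not exist in the patch.

```shell
# Sample text standing in for `storm list` output (assumed layout).
sample_output='Some startup log line
Topology_name        Status     Num_tasks  Num_workers  Uptime_secs
-------------------------------------------------------------------
monasca-thresh       ACTIVE     10         2            120
other-topology       ACTIVE     4          1            30'

# Same awk program as in locate_topology: the range starts at the dashed
# separator and never ends (end pattern 0 is always false); within it,
# the first field of every line other than the separator is printed.
list_topology_names() {
    awk '/-----/,0{if (!/-----/)print $1}'
}

printf '%s\n' "$sample_output" | list_topology_names
# prints:
# monasca-thresh
# other-topology
```

Anything printed before the separator, such as log lines or the column
header, is ignored, which is why the script matches on the separator
rather than skipping a fixed number of lines.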

@@ -0,0 +1,8 @@
---
fixes:
  - |
    Adds a bootstrap check to the monasca-thresh container. When
    KOLLA_BOOTSTRAP is set, the container checks whether the topology
    is already submitted to Storm; when TOPOLOGY_REPLACE is also set,
    an existing topology is killed so that it can be replaced. The
    topology name and various timeouts may be customized.
    `LP#1808805 <https://launchpad.net/bugs/1808805>`__