vault-armada-app

History

Michel Thebeau dc79220541 add health query timeout for HA This change causes vault-manager to not pause for long periods when a configured vault server is not responsive. Use curl --connect-timeout for queries to vault server /sys/health. During HA recovery it is known that the server is non-responsive, so vault-manager should not wait the default time, which is 60s or 5m depending on the google search result. It is observed that vault-manager appears to hang for long periods during HA recovery. Watching the $PVCDIR/pods.txt confirms that vault-manager is inactive for minutes at a time. This changes the default behavior to timeout within 2 seconds during the HA recovery scenario. In addition to not waiting, the vault-manager log will show the 'sealed' status as empty string when the query times-out. Test Plan: PASS - vault ha 3 replicas PASS - vault 1 replica PASS - kubectl exec kill vault process PASS - kubectl delete vault pod PASS - short network downtime PASS - long network downtime PASS - rates including 1, 5 PASS - wait intervals including 0, 1, 3, 15 PASS - kubectl delete 2 vault pods PASS - kubectl delete 3 (all) vault pods Story: 2010393 Task: 47701 Change-Id: I4fd916033f6dd5210078126abb065393d25851cd Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>	2023-03-23 11:52:37 -04:00
..
debian	add health query timeout for HA	2023-03-23 11:52:37 -04:00
vault-helm	add health query timeout for HA	2023-03-23 11:52:37 -04:00

Michel Thebeau dc79220541 add health query timeout for HA

This change causes vault-manager to not pause for long periods when a
configured vault server is not responsive.

Use curl --connect-timeout for queries to vault server /sys/health.
During HA recovery it is known that the server is non-responsive, so
vault-manager should not wait the default time, which is 60s or 5m
depending on the google search result.

It is observed that vault-manager appears to hang for long periods
during HA recovery. Watching the $PVCDIR/pods.txt confirms that
vault-manager is inactive for minutes at a time.  This changes the
default behavior to timeout within 2 seconds during the HA recovery
scenario.

In addition to not waiting, the vault-manager log will show the 'sealed'
status as empty string when the query times-out.

Test Plan:
PASS - vault ha 3 replicas
PASS - vault 1 replica
PASS - kubectl exec kill vault process
PASS - kubectl delete vault pod
PASS - short network downtime
PASS - long network downtime
PASS - rates including 1, 5
PASS - wait intervals including 0, 1, 3, 15
PASS - kubectl delete 2 vault pods
PASS - kubectl delete 3 (all) vault pods

Story: 2010393
Task: 47701

Change-Id: I4fd916033f6dd5210078126abb065393d25851cd
Signed-off-by: Michel Thebeau <michel.thebeau@windriver.com>

2023-03-23 11:52:37 -04:00

debian

add health query timeout for HA

2023-03-23 11:52:37 -04:00

vault-helm

add health query timeout for HA

2023-03-23 11:52:37 -04:00