ironic/releasenotes/notes/graceful_shutdown_wait-9a62627714b86726.yaml
Steve Baker 6a9e319fbe On rpc service stop, wait for node reservation release
Instead of clearing existing reservations at the beginning of
del_host, wait for the tasks holding them to run to completion. The
conductor keeps checking for remaining reservations until it exits for
one of these reasons:
- All reservations for this conductor are released
- CONF.graceful_shutdown_timeout has elapsed
- The process manager (systemd, kubernetes) sends SIGKILL after the
  configured graceful period

Because the default values of [DEFAULT]graceful_shutdown_timeout and
[conductor]heartbeat_timeout are the same (60s), no other conductor
will claim a node as an orphan until this conductor exits.
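
The wait described above can be sketched roughly as follows. This is a
minimal illustration, not Ironic's actual del_host code: the function
name, the count_reserved callback and the poll_interval parameter are
all hypothetical, and the SIGKILL case is not modelled because a process
cannot handle that signal.

```python
import time
from typing import Callable


def wait_for_reservations(count_reserved: Callable[[], int],
                          timeout: float = 60.0,
                          poll_interval: float = 1.0,
                          clock: Callable[[], float] = time.monotonic,
                          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll until no reservations remain or the timeout elapses.

    count_reserved is a hypothetical callback returning how many nodes
    are still reserved by this conductor. Returns True if every
    reservation was released in time, False if the timeout was reached.
    """
    deadline = clock() + timeout
    while True:
        if count_reserved() == 0:
            # All reservations for this conductor are released.
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            # The graceful shutdown timeout has elapsed.
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, remaining))
```

The timeout default of 60 seconds mirrors the default of
[DEFAULT]graceful_shutdown_timeout mentioned above; a real conductor
would read the value from its configuration instead.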

Change-Id: Ib8db915746228cd87272740825aaaea1fdf953c7
2023-02-27 11:10:31 +13:00


---
features:
  - |
    On shutdown, the conductor will wait for at most
    ``[DEFAULT]graceful_shutdown_timeout`` seconds for existing node lock
    reservations to clear. Previously, lock reservations were cleared
    immediately, which in some cases would result in nodes going into a
    failed state.
upgrade:
  - |
    ``[DEFAULT]graceful_shutdown_timeout`` defaults to 60s. Systemd
    ``TimeoutStopSec`` defaults to 90s. Kubernetes
    ``terminationGracePeriodSeconds`` defaults to 30s. It is recommended to
    align the value of ``[DEFAULT]graceful_shutdown_timeout`` with the
    graceful timeout of the process manager of the conductor process, so
    that the process manager does not send SIGKILL before the wait
    completes.