ironic/releasenotes/notes/force-out-hung-ipmitool-process-519c7567bcbaa882.yaml
Ilya Etingof 9efb9e313d Kill misbehaving ipmitool process
We can't trust ipmitool to terminate in time. We may have to kill
the process if it's running for longer than we asked it to.

On the other hand, abrupt IPMI exchange termination is said to be
dangerous to the state of the BMC being managed. Therefore this patch
only kills a timed-out IPMI "power status" call.

To kill a hung `ipmitool`, we inject a time-capped `popen.wait` call
before the uncapped `popen.communicate` is called internally. If the
wait times out, we kill the stuck `ipmitool` process and move on.
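
The wait-then-kill idea can be sketched with Python's standard
`subprocess` module. This is a minimal illustration, not the code from
the patch: the helper name `run_capped`, its parameters, and the
sleeping child used as a stand-in for a hung `ipmitool` are all
assumptions made for the example.

```python
import subprocess
import sys


def run_capped(cmd, timeout, kill_on_timeout=True):
    """Run cmd and kill it if it outlives `timeout` seconds.

    Hypothetical helper for illustration; not part of ironic.
    """
    popen = subprocess.Popen(cmd, stdout=subprocess.PIPE,
                             stderr=subprocess.PIPE)
    killed = False
    try:
        # Time-capped wait, issued before the uncapped communicate() below.
        popen.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        if kill_on_timeout:
            popen.kill()
            killed = True
    # With kill_on_timeout=False this blocks until the child exits on its own.
    out, err = popen.communicate()
    return popen.returncode, out, err, killed


# Stand-in for a hung ipmitool: a child that sleeps far longer than the cap.
hung = [sys.executable, "-c", "import time; time.sleep(30)"]
hung_rc, hung_out, hung_err, hung_killed = run_capped(hung, timeout=0.5)

# A well-behaved child exits within the cap and is left alone.
quick = [sys.executable, "-c", "print('on')"]
quick_rc, quick_out, quick_err, quick_killed = run_capped(quick, timeout=10)
```

Waiting on a child whose output is piped can stall if the child fills
the pipe buffer before exiting; that is tolerable here only because the
output of a power status query is small.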

Story: 2004449
Task: 28127
Change-Id: I7e1eafb334fe3a3337926aca27c14fe559ce0e39
2018-12-05 09:19:59 +01:00


---
fixes:
- |
  Kill the ``ipmitool`` process invoked by ironic to read a node's power
  state if the process does not exit after the configured timeout expires.
  It appears to be quite common for ``ipmitool`` to run for five minutes
  (with current ironic defaults) once it hits a non-responsive bare metal
  node. This could slow down the management of other nodes due to
  exhaustion of periodic task slots. The new behaviour is enabled by
  default, but can be disabled via the ``[ipmi]kill_on_timeout`` ironic
  configuration option.