adec0f6f01
I've seen a situation where heartbeats managed to completely saturate the conductor workers, so that no API requests could come through that required interaction with the conductor (i.e. everything other than reads). Add periodic tasks for a large (thousands) number of nodes, and you get a completely locked up Ironic.

This change reserves 5% (configurable) of the threads for API requests. This is done by splitting one executor into two, of which the latter is only used by normal _spawn_worker calls and only when the former is exhausted. This allows an operator to apply a remediation, e.g. abort some deployments or outright power off some nodes.

Partial-Bug: #2038438
Change-Id: Iacc62d33ffccfc11694167ee2a7bc6aad82c1f2f
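The split described in the commit message can be illustrated with a minimal sketch. It is not the Ironic implementation: it uses only the Python standard library, and the names `SplitWorkerPool`, `_BoundedPool`, `NoFreeWorker`, `reserved_percent` and `allow_reserved` are hypothetical. It assumes that a full pool rejects new work rather than queueing it, and that only API-style callers are allowed to fall back to the reserved pool.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class NoFreeWorker(Exception):
    """Raised when no worker slot is available for a task."""


class _BoundedPool:
    """A thread pool that rejects work instead of queueing it."""

    def __init__(self, size):
        self._slots = threading.Semaphore(size)
        self._executor = ThreadPoolExecutor(max_workers=size)

    def try_submit(self, func, *args):
        # Take a slot without blocking; None means the pool is exhausted.
        if not self._slots.acquire(blocking=False):
            return None
        try:
            future = self._executor.submit(func, *args)
        except BaseException:
            self._slots.release()
            raise
        # Free the slot once the task finishes, whatever its outcome.
        future.add_done_callback(lambda _f: self._slots.release())
        return future


class SplitWorkerPool:
    """One logical pool split into a main part and a reserved part."""

    def __init__(self, total_workers=100, reserved_percent=5):
        # Hypothetical sizing rule: at least one reserved worker.
        reserved = max(1, total_workers * reserved_percent // 100)
        self._main = _BoundedPool(total_workers - reserved)
        self._reserved = _BoundedPool(reserved)

    def spawn(self, func, *args, allow_reserved=False):
        # Always try the main pool first.
        future = self._main.try_submit(func, *args)
        # Only callers flagged as critical may use the reserved pool,
        # and only once the main pool is exhausted.
        if future is None and allow_reserved:
            future = self._reserved.try_submit(func, *args)
        if future is None:
            raise NoFreeWorker()
        return future
```

In this sketch, heartbeats and periodic tasks would call `spawn(...)` with the default `allow_reserved=False`, while API requests would pass `allow_reserved=True`, so a flood of background work can fill the main pool but never the reserved slots.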
7 lines
234 B
YAML
---
fixes:
  - |
    Each conductor now reserves a small proportion of its worker threads (5%
    by default) for API requests and other critical tasks. This ensures that
    the API stays responsive even under extreme internal load.