91e29079a0
There is concern over the ability of compute nodes to reasonably determine which events should count against their consecutive build failures. Since a compute node may erroneously disable itself in response to mundane or otherwise intentional user-triggered events, this patch adds a scheduler weigher that considers the build failure counter and can negatively weigh hosts with recent failures. This avoids taking computes fully out of rotation, instead treating them as less likely to be picked for a subsequent scheduling operation. This introduces a new conf option to control this weight. The default is set high to maintain the existing behavior of picking nodes that are not experiencing high failure rates, and resetting the counter as soon as a single successful build occurs. This is a minimal visible change from the existing behavior with default configuration. The rationale behind the default value for this weigher comes from the values likely to be generated by its peer weighers. The RAM and Disk weighers will increase the score by the number of available megabytes of memory (range in thousands) and disk (range in millions). The default value of 1000000 for the build failure weigher will cause competing nodes with similar amounts of available disk and a small (less than ten) number of failures to become less desirable than those without, even with many terabytes of available disk. Change-Id: I71c56fe770f8c3f66db97fa542fdfdf2b9865fb8 Related-Bug: #1742102 |
||
---|---|---|
.. | ||
notes | ||
source |
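
The arithmetic behind the default value in the commit message can be illustrated with a small sketch. This is a simplified model that follows the commit message's description only (free megabytes of RAM and disk add to a host's score, each recent build failure subtracts the multiplier). It is not the scheduler's actual code: the `Host` class and the `score` and `rank` helpers are hypothetical names introduced here, and the real weighing pipeline may combine weigher outputs differently.

```python
from dataclasses import dataclass

# Assumed value: mirrors the default of 1000000 quoted in the commit message.
BUILD_FAILURE_MULTIPLIER = 1_000_000


@dataclass
class Host:
    """Hypothetical stand-in for a compute host's scheduling state."""
    name: str
    free_ram_mb: int
    free_disk_mb: int
    failed_builds: int = 0


def score(host: Host, multiplier: int = BUILD_FAILURE_MULTIPLIER) -> int:
    """Simplified score: free resources minus a penalty per recent failure."""
    return host.free_ram_mb + host.free_disk_mb - multiplier * host.failed_builds


def rank(hosts, multiplier: int = BUILD_FAILURE_MULTIPLIER):
    """Return hosts ordered from most to least desirable."""
    return sorted(hosts, key=lambda h: score(h, multiplier), reverse=True)


hosts = [
    Host("healthy", free_ram_mb=32_000, free_disk_mb=2_000_000),
    Host("flaky", free_ram_mb=64_000, free_disk_mb=4_000_000, failed_builds=3),
]

for h in rank(hosts):
    print(h.name, score(h))
```

In this model the host with three recent failures ranks below the healthy one even though it has twice the free disk, yet it stays in the candidate list. Passing `multiplier=0` removes the penalty, and the ordering falls back to free resources alone.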