Don't treat failed requirement jobs as ready

When we decide whether it's time to request nodes for a job which
requires something from another build, if the job providing the
requirement has failed, we currently say that our job is ready.
This will cause us to submit a node request which we will never
use.

That's because the act of checking whether the requirement is ready
has the side effect of marking our job as failed (since our
requirement failed).  So the next time we go through the loop,
we'll see that failure and ignore the job from then on (never
accepting or returning the requested nodeset).

This can lead to a leak of nodes due to idle node requests.

This situation was not detected in the tests because they used empty
nodesets, and so no node requests were made for them.

To correct this, use non-empty nodesets in the relevant tests and
also indicate that the requirements are not ready in the case that
the providing job failed.  This will cause us to skip requesting nodes
in the first iteration of the loop; the resulting failure state then
prevents a node request in subsequent iterations.

Change-Id: Ib6e7d81f2c7129b78cdba3957c9f5b46939004db
James E. Blair 2021-08-31 09:53:38 -07:00
parent f33e4c798b
commit 9fe8be43a2
2 changed files with 5 additions and 1 deletions

@@ -35,6 +35,10 @@
 - job:
     name: base
     parent: null
+    nodeset:
+      nodes:
+        - name: controller
+          label: controller
     run: playbooks/base.yaml
 
 - job:

@@ -2960,7 +2960,7 @@ class QueueItem(object):
             fakebuild.result = 'FAILURE'
             self.addBuild(fakebuild)
             self.setResult(fakebuild)
-            ret = True
+            ret = False
         return ret
 
     def findJobsToRun(self, semaphore_handler):
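
The loop behaviour described in the commit message can be illustrated with a small standalone simulation. This is only a sketch under stated assumptions, not Zuul's actual scheduler code: the `Job` and `Scheduler` classes, the `ready_on_failure` switch, and the `outstanding_requests` list are all hypothetical names made up for this example. The only thing taken from the change itself is the idea that the readiness check marks the job as failed as a side effect and then returns either True (old) or False (new).

```python
from dataclasses import dataclass, field


@dataclass
class Job:
    # Hypothetical stand-in for a job that requires something from another build.
    name: str
    failed: bool = False          # set as a side effect of the readiness check
    node_requested: bool = False


@dataclass
class Scheduler:
    # Hypothetical stand-in for the loop that decides when to request nodes.
    provider_failed: bool
    ready_on_failure: bool        # True = old behaviour, False = behaviour after this change
    outstanding_requests: list = field(default_factory=list)

    def requirements_ready(self, job):
        # If the providing job failed, mark our job as failed (the side effect)
        # and report readiness according to the behaviour being simulated.
        if self.provider_failed:
            job.failed = True
            return self.ready_on_failure
        return True

    def process(self, job):
        # A failed job is ignored from then on, so a request made earlier
        # is never accepted or returned.
        if job.failed:
            return
        if self.requirements_ready(job) and not job.node_requested:
            job.node_requested = True
            self.outstanding_requests.append(job.name)


def run(ready_on_failure, iterations=3):
    """Run the loop a few times and return the requests left outstanding."""
    job = Job("consumer")
    sched = Scheduler(provider_failed=True, ready_on_failure=ready_on_failure)
    for _ in range(iterations):
        sched.process(job)
    return sched.outstanding_requests


print(run(ready_on_failure=True))   # old behaviour: ['consumer']
print(run(ready_on_failure=False))  # new behaviour: []
```

With the old return value, the first pass submits a request and every later pass skips the job, leaving the request idle. With the new return value, no request is ever submitted for the failed job.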