Don't treat failed requirement jobs as ready

When we decide whether it's time to request nodes for a job which requires something from another build, if the job providing the requirement has failed, we currently say that our job is ready. This will cause us to submit a node request which we will never use.

That's because the act of checking whether the requirement is ready has the side effect of marking our job as failed (since our requirement failed). So the next time we go through the loop, we'll see that failure and ignore the job from then on (never accepting or returning the requested nodeset). This can lead to a leak of nodes due to idle node requests.

This situation is not detected in the tests because we used empty nodesets for these tests, and so no node requests were made for them.

To correct this, use non-empty nodesets in the relevant tests and also indicate that the requirements are not ready in the case that the providing job failed. This will cause us to skip requesting nodes in the first iteration of the loop; the resulting failure state will then prevent any request in subsequent iterations.

Change-Id: Ib6e7d81f2c7129b78cdba3957c9f5b46939004db
2021-08-31 09:53:38 -07:00 · 9fe8be43a2
parent f33e4c798b
commit 9fe8be43a2
2 changed files with 5 additions and 1 deletions
--- a/tests/fixtures/layouts/provides-requires.yaml
+++ b/tests/fixtures/layouts/provides-requires.yaml
@@ -35,6 +35,10 @@
 - job:
     name: base
     parent: null
+    nodeset:
+      nodes:
+        - name: controller
+          label: controller
     run: playbooks/base.yaml

 - job:
--- a/zuul/model.py
+++ b/zuul/model.py
@@ -2960,7 +2960,7 @@ class QueueItem(object):
             fakebuild.result = 'FAILURE'
             self.addBuild(fakebuild)
             self.setResult(fakebuild)
-            ret = True
+            ret = False
         return ret
 
     def findJobsToRun(self, semaphore_handler):
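
The effect of the one-line change in zuul/model.py can be illustrated with a toy model of the scheduling loop described in the commit message. This is only a sketch under stated assumptions: `ToyItem`, `process`, `simulate`, and the `ready_on_failure` flag are hypothetical names invented for illustration and are not part of Zuul's code; the real logic lives in `QueueItem`.

```python
class RequirementsError(Exception):
    """Raised when the job providing a requirement has failed."""


class ToyItem:
    """Hypothetical stand-in for a queue item (not Zuul's QueueItem)."""

    def __init__(self, ready_on_failure):
        # True models the old behavior (ret = True),
        # False models the fixed behavior (ret = False).
        self.ready_on_failure = ready_on_failure
        self.failed_jobs = set()
        self.node_requests = []

    def _check_provider(self, job):
        # In this sketch the providing job has always failed.
        raise RequirementsError(f"provider for {job} failed")

    def job_requirements_ready(self, job):
        try:
            self._check_provider(job)
            ret = True
        except RequirementsError:
            # Side effect: the dependent job is recorded as failed.
            self.failed_jobs.add(job)
            ret = self.ready_on_failure
        return ret

    def process(self, job):
        # One pass of the loop: failed jobs are ignored from then on.
        if job in self.failed_jobs:
            return
        if self.job_requirements_ready(job):
            # Nothing ever accepts or returns this request.
            self.node_requests.append(job)


def simulate(ready_on_failure, passes=3):
    """Run several loop passes and return the node requests submitted."""
    item = ToyItem(ready_on_failure)
    for _ in range(passes):
        item.process("consumer")
    return item.node_requests
```

In this model, `simulate(True)` leaves one node request behind for a job that is already marked failed, while `simulate(False)` submits none, which mirrors the leak and the fix described above.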