From 9fe8be43a2a3bf36ba1d8318ec2984515018e13b Mon Sep 17 00:00:00 2001
From: "James E. Blair"
Date: Tue, 31 Aug 2021 09:53:38 -0700
Subject: [PATCH] Don't treat failed requirement jobs as ready

When we decide whether it's time to request nodes for a job which
requires something from another build, if the job providing the
requirement has failed, we currently say that our job is ready.

This will cause us to submit a node request which we will never use.
That's because the act of checking whether the requirement is ready
has the side effect of marking our job as failed (since our
requirement failed).  So the next time we go through the loop, we'll
see that failure and ignore the job from then on (never accepting or
returning the requested nodeset).  This can lead to a leak of nodes
due to idle node requests.

This situation is not detected in the tests because we used empty
nodesets for these tests, and so no node requests were made for them.

To correct this, use non-empty nodesets in the relevant tests and
also indicate that the requirements are not ready in the case that
the providing job failed.  This will cause us to skip requesting
nodes in the first iteration of the loop, then the resulting failure
state will avoid that in subsequent iterations.

Change-Id: Ib6e7d81f2c7129b78cdba3957c9f5b46939004db
---
 tests/fixtures/layouts/provides-requires.yaml | 4 ++++
 zuul/model.py                                 | 2 +-
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/tests/fixtures/layouts/provides-requires.yaml b/tests/fixtures/layouts/provides-requires.yaml
index 3b35408d01..17b17bab15 100644
--- a/tests/fixtures/layouts/provides-requires.yaml
+++ b/tests/fixtures/layouts/provides-requires.yaml
@@ -35,6 +35,10 @@
 - job:
     name: base
     parent: null
+    nodeset:
+      nodes:
+        - name: controller
+          label: controller
     run: playbooks/base.yaml
 
 - job:
diff --git a/zuul/model.py b/zuul/model.py
index 288a4c157c..4834543892 100644
--- a/zuul/model.py
+++ b/zuul/model.py
@@ -2960,7 +2960,7 @@ class QueueItem(object):
             fakebuild.result = 'FAILURE'
             self.addBuild(fakebuild)
             self.setResult(fakebuild)
-            ret = True
+            ret = False
         return ret
 
     def findJobsToRun(self, semaphore_handler):
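
Reviewer note: the loop behaviour described in the commit message can be illustrated with a small toy model. This is only a sketch and is not Zuul code. `ToyJob`, `ToyScheduler`, `requirements_ready`, `process`, and `leaked_requests` are hypothetical names invented for this example. It assumes, as the commit message describes, that the readiness check marks the job as failed as a side effect and that failed jobs are skipped on later passes.

```python
from dataclasses import dataclass, field


@dataclass
class ToyJob:
    # Hypothetical stand-in for a job that requires something from another build.
    name: str
    failed: bool = False           # set as a side effect of the readiness check
    nodes_requested: bool = False


@dataclass
class ToyScheduler:
    # Value the readiness check returns when the providing job has failed:
    # True models the behaviour before the patch, False the behaviour after it.
    ready_on_failed_requirement: bool
    outstanding_requests: list = field(default_factory=list)

    def requirements_ready(self, job, provider_failed):
        if provider_failed:
            # Side effect: our own job is marked failed because its
            # requirement failed.
            job.failed = True
            return self.ready_on_failed_requirement
        return True

    def process(self, job, provider_failed):
        if job.failed:
            # Failed jobs are ignored from now on, so a request submitted
            # earlier is never accepted or returned.
            return
        if not job.nodes_requested and self.requirements_ready(job, provider_failed):
            job.nodes_requested = True
            self.outstanding_requests.append(job.name)


def leaked_requests(ready_on_failed_requirement, iterations=3):
    """Run a few passes of the toy loop and return the requests left idle."""
    sched = ToyScheduler(ready_on_failed_requirement)
    job = ToyJob("consumer")
    for _ in range(iterations):
        sched.process(job, provider_failed=True)
    return sched.outstanding_requests
```

In this model, `leaked_requests(True)` leaves one idle request behind, while `leaked_requests(False)` leaves none. With `False`, the first pass skips the request and the failure state keeps later passes from reaching the check at all.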