It is intended to match the existing pattern used by BuildSet.
Usage is currently evenly split between NodeSet and Nodeset.
Change-Id: Iab96ed90a9ada0cb4709c0bb8b04923ab0c765e8
When a request is either fulfilled or failed, pass it through to
the scheduler, which will accept the request (which, in the case
of a failure, means deleting it) and pass it on to the pipeline
manager, which will set the result of the requesting job to
NODE_FAILURE and cause any sub-jobs to be SKIPPED.
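As a rough illustration of that hand-off, a minimal sketch (every
class, method, and attribute name below is assumed; only the
NODE_FAILURE and SKIPPED results come from this change):

    # Sketch only: the Fake* classes stand in for the real scheduler
    # and pipeline manager described above.

    class FakeRequest:
        def __init__(self, job, failed=False):
            self.job = job
            self.failed = failed

    class FakePipelineManager:
        def __init__(self):
            self.results = {}

        def onNodesProvisioned(self, request, child_jobs):
            if request.failed:
                # The requesting job fails with NODE_FAILURE and its
                # sub-jobs are skipped.
                self.results[request.job] = 'NODE_FAILURE'
                for child in child_jobs:
                    self.results[child] = 'SKIPPED'
            else:
                self.results[request.job] = 'NODES_READY'

    class FakeScheduler:
        def __init__(self, manager):
            self.manager = manager

        def onNodesProvisioned(self, request, child_jobs=()):
            # Accept the request whether it was fulfilled or failed;
            # for a failed request, accepting means deleting it.
            self.acceptRequest(request)
            self.manager.onNodesProvisioned(request, child_jobs)

        def acceptRequest(self, request):
            pass  # placeholder for the ZooKeeper unlock/delete step

    manager = FakePipelineManager()
    scheduler = FakeScheduler(manager)
    scheduler.onNodesProvisioned(FakeRequest('compile', failed=True),
                                 child_jobs=['unit-tests'])
    assert manager.results == {'compile': 'NODE_FAILURE',
                               'unit-tests': 'SKIPPED'}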
Adjust the request algorithm to only request nodes for jobs that
are ready to run. The current behavior requests nodes for all jobs
in a build set as soon as possible, but that has two downsides.
First, it may request and return nodes more aggressively than
necessary: if you have chosen to create a job tree, you *probably*
don't want to tie up nodes until they are actually needed. That is
a grey area, though, and we may want to adjust that behavior or
make it configurable later. Second, and more pressing here, it
makes the logic of when to return nodes *very* complicated (since
SKIPPED jobs are represented by fake builds, there is no good
opportunity to return their nodes).
This seems like a good solution for now; if we want to make the
node request behavior more aggressive in the future, we can work
out a better model for knowing when to return nodes.
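For illustration, a minimal sketch of the ready-to-run filter (the
job-tree representation below is an assumption, not Zuul's model):

    # Illustrative sketch (not Zuul's actual code) of requesting
    # nodes only for jobs whose parents have completed, instead of
    # for the whole build set up front.

    def jobs_ready_to_run(job_tree, results):
        """Yield jobs with no result yet whose parents all succeeded."""
        for job, parents in job_tree.items():
            if job in results:
                continue
            if all(results.get(p) == 'SUCCESS' for p in parents):
                yield job

    # job -> list of parent jobs it depends on (hypothetical tree)
    job_tree = {'compile': [],
                'unit-tests': ['compile'],
                'deploy': ['unit-tests']}
    results = {'compile': 'SUCCESS'}

    # Only 'unit-tests' is ready; 'deploy' waits, so its nodes are
    # not requested (and tied up) until they are actually needed.
    assert list(jobs_ready_to_run(job_tree, results)) == ['unit-tests']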
Change-Id: Ideab6eb5794a01d5c2b70cb87d02d61bb3d41cce
Check that at the end of every test, there are no outstanding
nodepool requests and no locked nodes.
Move final state assertions into the tearDown method so that
they run right after the end of the test but before any
cleanup handlers are called (which can interfere with the
assertion checking by, say, deleting the zookeeper tree we
are trying to check). Move the cleanup in test_webapp to
tearDown so that it ends the paused job that the tests in
that class use before the assertion check.
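The final-state check might look roughly like this, assuming a test
base class with a kazoo client at self.zk.client and Nodepool-style
paths (the actual helper may differ):

    # Sketch of the assertion moved into tearDown; paths and the
    # self.zk.client attribute are assumptions.

    import unittest

    class ZuulTestCase(unittest.TestCase):
        def tearDown(self):
            # Assert final state before cleanup handlers run, so the
            # ZooKeeper tree is still there to be inspected.
            self.assertFinalState()
            super().tearDown()

        def assertFinalState(self):
            client = self.zk.client
            # No outstanding node requests...
            if client.exists('/nodepool/requests'):
                self.assertEqual(
                    [], client.get_children('/nodepool/requests'))
            # ...and no locked nodes.
            if client.exists('/nodepool/nodes'):
                for node_id in client.get_children('/nodepool/nodes'):
                    lock_path = '/nodepool/nodes/%s/lock' % node_id
                    if client.exists(lock_path):
                        self.assertEqual(
                            [], client.get_children(lock_path))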
Fix some bugs uncovered by this testing:
* Two typos.
* When we re-launch a job, we need a new nodeset, so make sure
to remove the nodeset from the buildset after the build
completes if we are going to retry the build (see the sketch
after this list).
* Always report build results to the scheduler even for non-current
buildsets so that it can return used nodes for aborted builds.
* Have the scheduler return the nodeset for a completed build rather
than the pipeline manager, to avoid the edge case where a build
result is returned after a reconfiguration that removes the pipeline
(and therefore there is no longer a manager to return the nodeset).
* When canceling jobs, return nodesets for any jobs which do not yet
have builds (such as jobs which have nodes but have not yet
launched).
* Return nodes for skipped jobs.
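As referenced in the re-launch item above, a sketch of the
nodeset-on-retry handling (names below are assumptions; only the
behavior is taken from this change):

    # Sketch: remove the nodeset from the build set when retrying,
    # so the relaunched build requests fresh nodes.

    class FakeBuildSet:
        def __init__(self):
            self.nodesets = {}          # job name -> nodeset

        def getJobNodeSet(self, job_name):
            return self.nodesets.get(job_name)

        def removeJobNodeSet(self, job_name):
            self.nodesets.pop(job_name, None)

    def on_build_completed(build_set, job_name, nodepool, retry):
        nodeset = build_set.getJobNodeSet(job_name)
        if nodeset is not None:
            # Always return the used nodes to nodepool...
            nodepool.returnNodeSet(nodeset)
        if retry:
            # ...and forget the old nodeset so the relaunched build
            # requests a new one instead of reusing stale nodes.
            build_set.removeJobNodeSet(job_name)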
Normalize the debug messages in nodepool.py.
Change-Id: I32f6807ac95034fc2636993824f4a45ffe7c59d8
We ended up using classes from model.py instead to better interact
with the rest of Zuul, so remove these.
Change-Id: I4fa4a06b27d9ef6cc7f7878f29a92aafd7ffe9d1
When canceling a running or requested job, cancel outstanding
nodepool requests or return unused nodes.
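Roughly, with all names below assumed:

    # Sketch of the cancel path: a pending node request is canceled
    # outright, while nodes already delivered but never used are
    # returned to nodepool.

    def cancel_job_nodes(nodepool, request, nodeset, build_started):
        if request is not None and not request.fulfilled:
            nodepool.cancelRequest(request)
        elif nodeset is not None and not build_started:
            nodepool.returnNodeSet(nodeset)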
Change-Id: I77f8869b9d751ccd6c9f398ed03ef5ac482cc204
While we immediately lock a node given to us by nodepool, we delay
setting the node to 'in-use' until we actually request that the job
be launched so that if we end up canceling the job before it is
run, we might return the node unused to nodepool.
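A sketch of the two steps using kazoo directly (paths and states
follow Nodepool's conventions; the helpers themselves are assumed):

    # zk is assumed to be a started kazoo KazooClient.

    import json

    def lock_node(zk, node_id):
        # Lock the node as soon as nodepool hands it to us, but leave
        # its state untouched so it can still be returned unused.
        lock = zk.Lock('/nodepool/nodes/%s/lock' % node_id)
        lock.acquire(blocking=True, timeout=10)
        return lock

    def mark_node_in_use(zk, node_id):
        # Only flip the node to 'in-use' at the moment the job launch
        # is actually requested.
        path = '/nodepool/nodes/%s' % node_id
        data, _ = zk.get(path)
        node = json.loads(data.decode('utf8'))
        node['state'] = 'in-use'
        zk.set(path, json.dumps(node).encode('utf8'))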
Change-Id: I2d2c0f9cdb4c199f2ed309e7b0cfc62e071037fa
Nodesets are collections of nodes, but they, and the Node objects
they contain, have a dual role representing both the configuration
of a job and the actual nodes returned from nodepool. So that
multiple jobs sharing the same nodeset configuration don't end up
sharing the same actual nodes, create a copy of a job's nodeset
for use in interacting with nodepool/zk.
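The copy-on-use idea in miniature (these simplified classes are
illustrative, not the actual model.py definitions):

    # Sketch: the configured NodeSet stays pristine; each job
    # execution works on its own copy.

    class Node:
        def __init__(self, name, image):
            self.name = name
            self.image = image
            self.id = None      # filled in from the node nodepool returns

    class NodeSet:
        def __init__(self):
            self.nodes = []

        def addNode(self, node):
            self.nodes.append(node)

        def copy(self):
            new = NodeSet()
            for n in self.nodes:
                new.addNode(Node(n.name, n.image))
            return new

    configured = NodeSet()
    configured.addNode(Node('controller', 'ubuntu-xenial'))

    # Each job execution gets its own copy to hold real node ids.
    per_job = configured.copy()
    per_job.nodes[0].id = '0000000042'
    assert configured.nodes[0].id is None   # configuration untouched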
Change-Id: I83f503ba22fc3f92b8c90b15ccfb6b07dc3c4709
I frequently add something like this ad-hoc when debugging a test.
Make it a convenient function that's easy to add when needed, and
also run it at the completion of every test so a developer can
easily survey the logs to see what happened.
Change-Id: I3d3810f51245d92855f086b875edfd52bdd86983
* When a test timeout occurs, output the state debug information at
error level so that it shows up in all logs.
* Add some more info to that output.
* Further restrict the (often not useful) chatty gear logs by default.
Change-Id: Ib275441172c5b1598593d0931cef0168d02e521d
Add a fake nodepool that immediately and successfully fulfills all
requests, but actually uses the Nodepool ZooKeeper API.
Update the Zuul Nodepool facade to use the Nodepool ZooKeeper API.
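Such a fake might look roughly like this with kazoo (the request
schema and paths are assumptions modeled on Nodepool's layout):

    # Sketch only; not the actual fake nodepool in the test suite.

    import json
    from kazoo.client import KazooClient

    class FakeNodepool:
        REQUEST_ROOT = '/nodepool/requests'

        def __init__(self, hosts, chroot):
            self.client = KazooClient(hosts=hosts + chroot)
            self.client.start()
            self.client.ensure_path(self.REQUEST_ROOT)
            # Re-run the fulfillment pass whenever the request list
            # changes.
            self.client.ChildrenWatch(self.REQUEST_ROOT, self._process)

        def _process(self, requests):
            for request_id in requests:
                path = '%s/%s' % (self.REQUEST_ROOT, request_id)
                data, _ = self.client.get(path)
                request = json.loads(data.decode('utf8'))
                if request.get('state') == 'requested':
                    # Immediately mark the request fulfilled.
                    request['state'] = 'fulfilled'
                    self.client.set(
                        path, json.dumps(request).encode('utf8'))

        def stop(self):
            self.client.stop()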
Change-Id: If7859f0c6531439c3be38cc6ca6b699b3b5eade2
Add a requirement on kazoo and add a Zookeeper chroot to the test
infrastructure.
This is based on similar code in Nodepool.
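The per-test chroot setup is roughly as follows (a sketch in the
spirit of the Nodepool fixtures; the class and defaults below are
assumptions):

    import uuid
    from kazoo.client import KazooClient

    class ChrootedZooKeeperFixture:
        def __init__(self, host='127.0.0.1', port=2181):
            self.chroot = '/test-' + uuid.uuid4().hex
            # Create the chroot node with an un-chrooted client first...
            root = KazooClient(hosts='%s:%s' % (host, port))
            root.start()
            root.ensure_path(self.chroot)
            root.stop()
            # ...then hand the test a client confined to that chroot.
            self.client = KazooClient(
                hosts='%s:%s%s' % (host, port, self.chroot))
            self.client.start()

        def cleanup(self):
            self.client.stop()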
Change-Id: Ic05386aac284c5542721fa3dcb1cd1c8e52d4a1f
This makes the tenant name required in the URL, which I believe is desired
for now, but maybe there will be an all-tenant view or a default tenant
in the future.
Note that the setUp method for this test suite actually had a bug:
it approved change A twice rather than approving B.
Change-Id: I25bd11088719e8b3465563863123d93eff16030e
Story: 2000773
Task: 3433
Change I6d7d8d0f7e49a11e926667fbe772535ebdd35e89 erroneously altered
test_rerun_on_abort to match the observed behavior.
Change I6e64ef03cbb10ce858b22d6a4590f58ace0a5332 restored the values
in the test but then erroneously changed the accounting system to
match the observed behavior.
The actual problem is that this test also exercises the 'attempts' job
attribute, and does so by specifying it in a custom test configuration.
The test was failing because that attribute, which instructs zuul to
attempt to run a job for a non-default number of retries, was not being
read.
It was not being read because the test was still using the old configuration
loading scheme. It updated the "layout_file" which isn't a thing anymore
and asked the scheduler to reload. The scheduler *did* reload, but it
simply reloaded the same configuration. The solution to this is to either
create a new configuration, or, in this case, since the additional
configuration needed is compatible with the configuration used by the
test's siblings, simply add it to the active config file for the test.
Once the test is loading the correct configuration, one can observe that
the 'attempts' attribute was not added to the validator. That is corrected
in this change as well.
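The validator side amounts to accepting the attribute in the job
schema; a simplified voluptuous sketch (not the actual configloader
schema):

    import voluptuous as vs

    # Hypothetical, trimmed-down job schema; only 'attempts' is the
    # point here.
    job_schema = vs.Schema({
        vs.Required('name'): str,
        'attempts': int,    # retries allowed; previously not accepted
        'nodes': [dict],
    }, extra=vs.ALLOW_EXTRA)

    # A job requesting a non-default number of attempts now validates:
    job_schema({'name': 'project-test1', 'attempts': 4})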
With all of this complete, the test passes in its original form with
no modifications to the job retry accounting system.
Change-Id: Icf6d697cbae0166bc516faf5b7e60cac05885ab0
This test was added to verify a fix involving a deadlock in the old
jenkins launcher, which has been removed. I don't believe it adds
coverage otherwise.
Change-Id: I9ba5a24af5accff41057a7634baf4320a4afca48
Story: 2000773
Task: 3430
Minimal changes are needed: just a translation of the old layout to
the new format and adjustments for the tenant API differences.
Change-Id: I3563fd1998dcc16426d665d50e26644b45198be0
Story: 2000773
Task: 3429
The layout simply needed to be translated to the new format. The test
itself still functions without modification.
Change-Id: Ibdeef6e3a303faa6e67be8a2f7ed71b8529ecaf3
Story: 2000773
Task: 3428
These are emitted when the command socket is being shut down and are
only used to wake the thread. They should not be accepted as commands.
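Assuming the wake-up arrives as an empty command (an assumption about
the mechanism, not stated above), the processing loop just needs to
skip it; roughly:

    import queue

    class CommandProcessor:
        def __init__(self):
            self.commands = queue.Queue()
            self._running = True
            self.handled = []

        def stop(self):
            self._running = False
            self.commands.put('')      # only wakes the loop below

        def run(self):
            while self._running:
                command = self.commands.get()
                if not command:
                    # Wake-up during shutdown, not a real command.
                    continue
                self.handled.append(command)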
Change-Id: I0e7b30cbe60f5a96daec3697f24b9a97516027e2
Ie26fdc29c07430ebfb3df8be8ac1786d63d7e0fe re-implemented the retry
logic for v3 and started counting at 1, but the checking logic still
in place from pre-v3 expected the count to start at 0, not 1.
Change-Id: I6e64ef03cbb10ce858b22d6a4590f58ace0a5332
Story: 2000827
Task: 3421