Separate the 4 driver interfaces into abstract interface classes,
and also add an abstract driver class. Make the existing driver
implementations inherit from these as appropriate.
This should help clearly express which methods a given driver
needs to implement.
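The shape this gives the code might look like the following sketch; the interface and method names here (onLoad, getEventFilters) are illustrative stand-ins, not the actual Zuul signatures:

```python
import abc


class ConnectionInterface(abc.ABC):
    # Hypothetical method; the real interface has more.
    @abc.abstractmethod
    def onLoad(self):
        """Perform any start-up tasks for the connection."""


class TriggerInterface(abc.ABC):
    @abc.abstractmethod
    def getEventFilters(self, trigger_conf):
        """Return event filters for the given trigger configuration."""


class Driver(abc.ABC):
    # Every driver identifies itself by the name used in config files.
    @property
    @abc.abstractmethod
    def name(self):
        pass


class GerritDriver(Driver, ConnectionInterface, TriggerInterface):
    """A driver declares which interfaces it supports by inheritance."""
    name = 'gerrit'

    def onLoad(self):
        return 'gerrit connection loaded'

    def getEventFilters(self, trigger_conf):
        return []
```

A driver that omits one of the abstract methods fails at instantiation time rather than at first use, which is what makes the required methods explicit.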
This gets both tests running again, but the second fails because of a
behavior change in v3 which should be addressed separately.
This change, while substantial, is mostly organizational.
Currently, connections, sources, triggers, and reporters are
discrete concepts, and yet are related by virtue of the fact that
the ConnectionRegistry is used to instantiate each of them. The
method used to instantiate them is called "_getDriver", in
recognition that behind each "trigger", etc., which appears in
the config file, there is a class in the zuul.trigger hierarchy
implementing the driver for that trigger. Connections also
specify a "driver" in the config file.
In this change, we redefine a "driver" as a single class that
organizes related connections, sources, triggers and reporters.
The connection, source, trigger, and reporter interfaces still
exist. A driver class is responsible for indicating which of
those interfaces it supports and instantiating them when asked to do so.
Zuul instantiates a single instance of each driver class it knows
about (currently hardcoded, but in the future, we will be able to
easily ask entrypoints for these). That instance will be
retained for the life of the Zuul server process.
When Zuul is (re-)configured, it asks the driver instances to
create new connection, source, trigger, reporter instances as
necessary. For instance, a user may specify a connection that
uses the "gerrit" driver, and the ConnectionRegistry would call
getConnection() on the Gerrit driver instance.
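The reconfiguration flow could be sketched as follows; the getConnection method name matches the text above, while the ConnectionRegistry internals shown here are simplified guesses:

```python
class GerritConnection:
    def __init__(self, name, config):
        self.name = name
        self.config = config


class GerritDriver:
    name = 'gerrit'

    def getConnection(self, name, config):
        # Called by the ConnectionRegistry for each configured
        # connection that names this driver.
        return GerritConnection(name, config)


class ConnectionRegistry:
    def __init__(self, drivers):
        # One long-lived instance per driver class, retained for the
        # life of the server process.
        self.drivers = {d.name: d for d in drivers}
        self.connections = {}

    def configure(self, conn_configs):
        # On (re-)configuration, ask the matching driver instance to
        # create each connection.
        for name, config in conn_configs.items():
            driver = self.drivers[config['driver']]
            self.connections[name] = driver.getConnection(name, config)


registry = ConnectionRegistry([GerritDriver()])
registry.configure(
    {'review': {'driver': 'gerrit', 'server': 'review.example.org'}})
```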
This is done for two reasons: first, it allows us to organize all
of the code related to interfacing with an external system
together. All of the existing connection, source, trigger, and
reporter classes are moved as follows:
zuul.connection.FOO -> zuul.driver.FOO.FOOconnection
zuul.source.FOO -> zuul.driver.FOO.FOOsource
zuul.trigger.FOO -> zuul.driver.FOO.FOOtrigger
zuul.reporter.FOO -> zuul.driver.FOO.FOOreporter
For instance, all of the code related to interfacing with Gerrit
is now in zuul.driver.gerrit.
Second, the addition of a single, long-lived object associated
with each of these systems allows us to better support some types
of interfaces. For instance, the Zuul trigger maintains a list
of events it is required to emit -- this list relates to a tenant
as a whole rather than individual pipelines or triggers. The
timer trigger maintains a single scheduler instance for all
tenants, but must be able to add or remove cron jobs based on an
individual tenant being reconfigured. The global driver instance
for each of these can be used to accomplish this.
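The timer case might look like this hypothetical sketch, where the single driver instance owns state that spans tenants (a plain dict stands in for the real scheduler):

```python
class TimerDriver:
    """Because the driver object outlives any one (re-)configuration,
    it can own cross-tenant state such as a single scheduler's table
    of cron jobs. This is an illustrative model, not Zuul's code."""

    def __init__(self):
        # tenant name -> list of cron specs; stands in for a shared
        # scheduler instance used by all tenants.
        self._jobs = {}

    def reconfigure(self, tenant, cron_specs):
        # Replace only this tenant's jobs; other tenants' entries in
        # the shared scheduler are untouched.
        self._jobs[tenant] = list(cron_specs)

    def unconfigure(self, tenant):
        self._jobs.pop(tenant, None)

    def jobs(self):
        return {t: list(s) for t, s in self._jobs.items()}
```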
As a result of using the driver interface to create new
connection, source, trigger and reporter instances, the
connection setup in ConnectionRegistry is much simpler, and can
easily be extended with entrypoints in the future.
The existing tests of connections, sources, triggers, and
reporters which only tested that they could be instantiated and
have names have been removed, as there are functional tests which
cover that behavior.
When a request is either fulfilled or failed, pass it through to
the scheduler which will accept the request (which means deleting
it in the case of a failure) and pass it on to the pipeline manager
which will set the result of the requesting job to NODE_FAILURE
and cause any sub-jobs to be SKIPPED.
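The handoff described above can be sketched as follows; the constant names follow the text (NODE_FAILURE, SKIPPED), while the data shapes are illustrative:

```python
NODE_FAILURE = 'NODE_FAILURE'
SKIPPED = 'SKIPPED'


def on_request_complete(request, dependents, results):
    """Scheduler-side handling of a fulfilled or failed node request.

    'request' is a dict with 'job' and 'fulfilled' keys; 'dependents'
    maps a job to its sub-jobs (hypothetical shapes).
    """
    if request['fulfilled']:
        # Accept the nodes; the pipeline manager can launch the job.
        results[request['job']] = 'READY'
    else:
        # Accepting a failed request amounts to deleting it; the
        # requesting job gets NODE_FAILURE and its sub-jobs are
        # skipped.
        results[request['job']] = NODE_FAILURE
        for child in dependents.get(request['job'], []):
            results[child] = SKIPPED
    return results
```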
Adjust the request algorithm to only request nodes for jobs that
are ready to run. The current behavior requests all jobs for a
build set asap, but that has two downsides: it may request and
return nodes more aggressively than necessary (if you have chosen
to create a job tree, you *probably* don't want to tie up nodes
until they are actually needed). However, that's a grey area,
and we may want to adjust or make that behavior configurable later.
More pressing here is that it makes the logic of when to return
nodes *very* complicated (since SKIPPED jobs are represented by
fake builds, there is no good opportunity to return their nodes).
This seems like a good solution for now, and if we want to make
the node request behavior more aggressive in the future, we can
work out a better model for knowing when to return nodes.
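The "only request nodes for jobs that are ready to run" rule might reduce to something like this sketch (the helper name and data shapes are invented for illustration):

```python
def jobs_ready_to_request(job_parents, completed):
    """Return jobs whose parent jobs have all succeeded and which
    therefore need node requests now.

    'job_parents' maps each job to its parents in the job tree;
    'completed' maps finished jobs to their results.
    """
    ready = []
    for job, parents in job_parents.items():
        if job in completed:
            continue
        # A job with no parents is always ready; otherwise every
        # parent must have succeeded.
        if all(completed.get(p) == 'SUCCESS' for p in parents):
            ready.append(job)
    return ready
```

A skipped child never reaches the request stage at all, which is what removes the need to return nodes for fake builds.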
Check that at the end of every test, there are no outstanding
nodepool requests and no locked nodes.
Move final state assertions into the tearDown method so that
they run right after the end of the test but before any
cleanup handlers are called (which can interfere with the
assertion checking by, say, deleting the zookeeper tree we
are trying to check). Move the cleanup in test_webapp to
tearDown so that it ends the paused job that the tests in
that class use before the assertion check.
Fix some bugs uncovered by this testing:
* Two typos.
* When we re-launch a job, we need a new nodeset, so make sure
to remove the nodeset from the buildset after the build
completes if we are going to retry the build.
* Always report build results to the scheduler even for non-current
buildsets so that it can return used nodes for aborted builds.
* Have the scheduler return the nodeset for a completed build rather
than the pipeline manager to avoid the edge case where a build
result is returned after a configuration that removes the pipeline
(and therefore, there is no longer a manager to return the nodeset).
* When canceling jobs, return nodesets for any jobs which do not yet
have builds (such as jobs which have nodes but have not yet launched).
* Return nodes for skipped jobs.
Normalize the debug messages in nodepool.py.
While we immediately lock a node given to us by nodepool, we delay
setting the node to 'in-use' until we actually request that the job
be launched so that if we end up canceling the job before it is
run, we might return the node unused to nodepool.
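That lock-now, mark-in-use-later lifecycle could be modeled like this (function and attribute names are illustrative, not Zuul's):

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.state = 'ready'
        self.locked = False


def accept_nodes(nodes):
    # Lock immediately when nodepool hands us the nodes, but leave
    # their state as 'ready'.
    for node in nodes:
        node.locked = True
    return nodes


def use_nodes(nodes):
    # Only when we actually request the job launch do we mark the
    # nodes 'in-use'.
    for node in nodes:
        node.state = 'in-use'


def return_nodes(nodes):
    # If the job was canceled before launch, the nodes go back to
    # nodepool unused, still in the 'ready' state.
    for node in nodes:
        node.locked = False
```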
Nodesets are collections of nodes, but they, and the Node objects
they contain, have a dual role representing both the configuration
of a job and the actual nodes returned from nodepool. So that
multiple jobs sharing the same nodeset configuration don't end up
sharing the same actual nodes, create a copy of a job's nodeset
for use in interacting with nodepool/zk.
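A deep copy is enough to split the configuration role from the live-node role; a sketch under assumed class shapes (Node, NodeSet, external_id are simplified stand-ins):

```python
import copy


class Node:
    def __init__(self, name, image):
        self.name = name
        self.image = image
        self.external_id = None  # filled in from nodepool at runtime


class NodeSet:
    def __init__(self, nodes):
        self.nodes = nodes

    def copy(self):
        # Each job gets its own copy for the actual nodepool request,
        # so jobs sharing a nodeset configuration never share real
        # nodes.
        return copy.deepcopy(self)


config_nodeset = NodeSet([Node('controller', 'ubuntu')])
request_nodeset = config_nodeset.copy()
request_nodeset.nodes[0].external_id = 'node-0001'
```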
I frequently add something like this ad-hoc when debugging a test.
Make it a convenient function that's easy to add when needed, and
also run it at the completion of every test so a developer can
easily survey the logs to see what happened.
* When a test timeout occurs, output the state debug information at
error level so that it shows up in all logs.
* Add some more info to that output.
* Further restrict the (often not useful) chatty gear logs by default.
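The convenience function might look like this sketch; the logger name and output format are assumptions, with the level parameter allowing the ERROR-on-timeout behavior noted above:

```python
import logging

log = logging.getLogger('zuul.test')  # hypothetical logger name


def print_state(builds, node_requests, level=logging.DEBUG):
    """One-shot survey of test state; run at the end of every test,
    and at ERROR level on timeout so it shows up in all logs."""
    log.log(level, 'All builds:')
    for build in builds:
        log.log(level, '  %s', build)
    log.log(level, 'Outstanding node requests:')
    for req in node_requests:
        log.log(level, '  %s', req)
```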
Add a fake nodepool that immediately successfully fulfills all
requests, but actually uses the Nodepool ZooKeeper API.
Update the Zuul Nodepool facade to use the Nodepool ZooKeeper API.
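In-memory, the fake's behavior reduces to something like this (the real fake reads and writes the actual Nodepool ZooKeeper tree; this sketch only models the immediate-fulfillment behavior, with invented dict shapes):

```python
class FakeNodepool:
    """Stand-in that immediately fulfills every node request."""

    def __init__(self):
        self._next = 0

    def fulfill(self, request):
        # Satisfy every requested node type with a fresh 'ready' node.
        nodes = []
        for label in request['node_types']:
            nodes.append({'id': '%04d' % self._next,
                          'label': label,
                          'state': 'ready'})
            self._next += 1
        request['state'] = 'fulfilled'
        request['nodes'] = nodes
        return request
```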