Zuul's new rate limiting feature would ignore jobs already started on
items that were later moved outside the window (due to the window
shrinking). Zuul dealt correctly with these jobs and changes when they
slid back into the window, but it did so inefficiently. Go back to
processing the entire
queue but check if each individual item is actionable before preparing
refs on it or starting jobs.
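A minimal sketch of that approach, with hypothetical names rather than
Zuul's actual internals: every item is visited on each pass, but work is
only started on items inside the window.

    # Illustrative only; these names are not Zuul's real classes or methods.
    def process_queue(items, window_size):
        """Walk the whole queue, but only start work on items in the window."""
        started = []
        for position, item in enumerate(items):
            if position < window_size:  # is this item actionable?
                started.append(item)    # stand-in for preparing refs / launching jobs
            # items outside the window are still visited, so they are picked
            # up again as soon as the window grows
        return started

    print(process_queue(['A', 'B', 'C', 'D'], window_size=2))  # ['A', 'B']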
Change-Id: Ib76a68f9023652205003e0d164a78b8f67956adf
This was causing a problem with window sizes on reconfiguration because
the ChangeQueue objects were persisting across the reload via the local
reference inside of QueueItem. Instead of adding more complexity to
reset those on reEnqueue, drop that and instead find the change queue
via the change's project when needed.
Also fix the fact that the QueueItem pipeline reference was not being
updated (it was set to None before a re-enqueue but then not set to
the new pipeline value).
Change-Id: I7f7050bfec985972ad7a1bc89da02d7b0753b798
When a change reports failure, reduce the total number of changes that
will be tested concurrently in the pipeline queue the reported change
belonged to. Increase the number of changes that will be tested when
changes report success. This implements simple rate limiting which
should reduce resource thrash when tests are unstable.
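A hypothetical illustration of that rate limiting (the real increment,
decrement, and floor values are Zuul configuration details; only the
direction of the adjustment is taken from the description above):

    class Window:
        def __init__(self, size=20, floor=3):
            self.size = size
            self.floor = floor

        def record_success(self):
            self.size += 1                               # test more changes concurrently

        def record_failure(self):
            self.size = max(self.floor, self.size // 2)  # back off while tests are unstable

    w = Window()
    w.record_failure()
    print(w.size)  # 10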
Change-Id: Id092446c83649b3916751c4e4665d2adc75d0458
When doing a doPromote we should keep the enqueue times of items.
However, the teardown and rebuild of the queues means that items
are fully destroyed and created anew.
Change the addChange API call so that it takes an optional enqueue
time to set on the item, which it can do internally once the new
item has been created.
This should address the issue where enqueue_times get lost when
we do a promote of the queue.
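A sketch of the API change (signatures and structures are illustrative,
not the real scheduler code):

    import time

    class Pipeline:
        def __init__(self):
            self.items = []

        def addChange(self, change, enqueue_time=None):
            # preserve the original time when re-enqueuing via promote,
            # otherwise stamp the item now
            if enqueue_time is None:
                enqueue_time = time.time()
            item = {'change': change, 'enqueue_time': enqueue_time}
            self.items.append(item)
            return item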
Now with unit tests! (Also ensured that if I removed the scheduler
change these tests failed, so the test is testing a correct thing)
Change-Id: I4dce23d00128280c7add922ca9ef5018b57d1cf3
We want to report on "changes" triggered by a timer. Since those
aren't really changes, they are represented by a class called
NullChange which carries as much of the interface for a Change
object as possible (not much) so that they can be enqueued in
pipelines.
If a NullChange is reportable, then pretty much any kind of
change should be considered reportable. So remove the flag that
indicates that NullChanges and Refs are not reportable.
Only attempt to format a change report if there is an action
defined for that pipeline (in case the default formatting process
attempts to access a change attribute that is inappropriate).
Remove checks that try to avoid formatting or sending a report
based on attributes of the change (which are no longer relevant).
Add a test for using a timer trigger with an smtp reporter.
Add validation of the attributes that an smtp reporter can use in
the layout file.
Allow the operator to configure a Subject for smtp reports.
Change-Id: Icd067a7600c2922a318b9ade11b3946df4e53065
It'll help us to determine whether we're running a version of zuul that has
added support for some new feature.
The version_string() from pbr's version_info is used to collect the
zuul version.
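For reference, this is roughly how pbr exposes that string (assuming pbr
is installed and the 'zuul' package metadata is available):

    from pbr.version import VersionInfo

    print(VersionInfo('zuul').version_string())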
Change-Id: Id451f15538258ab2dec8e3e8f000cff4a8b7b20d
Currently we can filter pipeline triggers by email address but not
username. This is an issue for users that have no email address,
such as Jenkins.
This patch adds a new filter "username_filter" to the gerrit trigger
section.
Change-Id: I66680ab7e9e5ff49466269175c8fb54aef30e016
Provide the short name of a project (anything after the last '/') to
project templates as the variable 'name'. If 'openstack/nova' invoked
a template, the variable 'name' would automatically be set to 'nova'
within the template.
Ideally this means that most template invocations in OpenStack's layout
will not need any variables defined.
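A minimal sketch of deriving that variable:

    def short_name(project):
        return project.split('/')[-1]

    print(short_name('openstack/nova'))  # nova
    print(short_name('nova'))            # nova (no '/' present)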
Change-Id: I596744917c30c92041b8ea5b1f518d50fb64e59b
Accept multiple template invocations per project, and also allow
adding individual jobs to a project that uses templates.
Change-Id: I6c668dd434c12bec96b9a27afd9fd2eca7a11d0a
Takes one or more changes and promotes them to the head of the queue.
Also, change the command line syntax for the enqueue command to accept
change IDs in the form 'change,patchset' in order to match the syntax
of promote, as well as be potentially more compatible with future
triggers.
Change-Id: Ic7ded9587c68217c060328bf4c3518e32fe659e3
Create a new management event queue to handle external requests
for actions that are not related to triggers or results.
Use this for reconfiguration events. Subsequent patches will use
it for other kinds of events.
Change-Id: Ia018d9a7acc35aaf615ca85f75c9f4c630a5287f
When combining change queues, given 3 projects that were transitively
connected by shared jobs, depending on the order of processing, it
was possible for them not to be combined. To correct this, repeat
the combining operation until the resulting set can be combined no
further.
In order to make the test (and actual usage) behavior more deterministic,
the list of projects returned by the pipeline is now sorted by name.
A test is added for this.
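An illustrative fixed-point version of the combining step; queues are
modelled here as plain sets that are merged when they overlap, which
stands in for "share jobs":

    def combine(queues):
        queues = [set(q) for q in queues]
        changed = True
        while changed:                       # repeat until nothing more combines
            changed = False
            for i in range(len(queues)):
                for j in range(i + 1, len(queues)):
                    if queues[i] & queues[j]:
                        queues[i] |= queues.pop(j)
                        changed = True
                        break
                if changed:
                    break
        return [sorted(q) for q in queues]

    print(combine([{'a', 'b'}, {'c', 'd'}, {'b', 'c'}]))  # [['a', 'b', 'c', 'd']]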
Change-Id: If1386cad4118257efee9aa9918ad12a626927038
Add a command line client called 'zuul' that supports one command
to start with: 'enqueue'. It allows an operator (one with access
to the gearman server) to enqueue an arbitrary change in a specified
pipeline. It uses gearman to communicate with the Zuul server, which
now has an added RPC listener component to answer such requests via
gearman.
Add tests for the client RPC interface.
Raise an exception if a Gerrit query does not produce a change. Unlike
events from Gerrit, user (or admin) submitted events over the RPC bus
are more likely to reference invalid changes. To validate those, the
Gerrit trigger will raise an exception (and remove from its cache) changes
which prove to be invalid.
Change-Id: Ife07683a736c15f4db44a0f9881f3f71b78716b2
So that the status display and logs read better.
Also, include the Zuul ref in JSON output so that the status
screen can do something intelligent with unprocessed items (also
it's an important bit of info).
Change-Id: I1429344917856edfaeefa4e9655c2dad8081cae5
If A is the head in A <- B <- C, and B failed, then C would be
correctly reparented to A. Then if A failed, B and C would be
restarted, but C would not be reparented back to B. This is
because the check around moving a change short-circuited if
there was no change ahead (which is the case if C is behind A
and A reports).
The solution to this is to still perform the move check even if
there is no change currently ahead (so that if there is a NNFI
change ahead, the current change will be moved behind it). This
effectively means we should remove the "not item ahead" part of
the conditional around the move.
This part of the conditional serves two additional purposes --
to make sure that we don't dereference an attribute on item_ahead
if it is None, and also to ensure that the NNFI algorithm is not
applied to independent queues.
So the fix moves that part of the conditional out so that we can
safely reference the needed attributes if there is a change ahead,
and also makes explicit that we ignore the situation if we are
working on an independent change queue.
This also adds a test that failed (at the indicated position) with
the previous code.
Change-Id: I4cf5e868af7cddb7e95ef378abb966613ac9701c
Update the scheduler algorithm to NNFI -- Nearest Non-Failing Item.
A stateless description of the algorithm is that jobs for every
item should always be run based on the repository state(s) set by
the nearest non-failing item ahead of it in its change queue.
This means that should an item fail (for any reason -- failure to
merge, a merge conflict, or job failure) changes after it will
have their builds canceled and restarted with the assumption that
the failed change will not merge but the nearest non-failing
change ahead will merge.
This should mean that dependent queues will always be running
jobs and no longer need to wait for a failing change to merge or
not merge before restarting jobs.
This removes the dequeue-on-conflict behavior because there is
now no cost to keeping an item that can not merge in the queue.
The documentation and associated test for this are removed.
This also removes the concept of severed heads because a failing
change at the head will not prevent other changes from proceeding
with their tests. If the jobs for the change at the head run
longer than following changes, it could still impact them while
it completes, but the reduction in code complexity is worth this
minor de-optimization.
The debugging representation of QueueItem is changed to make it
more useful.
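A stripped-down illustration of the NNFI lookup (not Zuul's data model;
each item only carries a 'failing' flag here):

    def nearest_non_failing_ahead(queue, index):
        for i in range(index - 1, -1, -1):
            if not queue[i]['failing']:
                return queue[i]
        return None  # nothing ahead; build directly on the branch tip

    queue = [{'name': 'A', 'failing': True},
             {'name': 'B', 'failing': False},
             {'name': 'C', 'failing': False}]
    base = nearest_non_failing_ahead(queue, 2)
    print(base['name'] if base else 'branch tip')  # B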
Change-Id: I0d2d416fb0dd88647490ec06ed69deae71d39374
Allows multiple reports per patchset to be sent to pluggable
destinations. These are configurable per pipeline and, if not
specified, default to the legacy behaviour of reporting back only
to gerrit.
Having multiple reporting methods means only certain
success/failure/start parameters will apply to certain reporters.
Reporters are listed as keys under each of those actions.
This means that each key under success/failure/start is a reporter and the
dictionaries under those are sent to the reporter to deal with.
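A sketch of that dispatch (reporter names and parameters below are
examples, not the actual configuration keys):

    class GerritReporter:
        def report(self, change, params):
            print('gerrit: %s -> %s' % (change, params))

    def report(action_config, reporters, change):
        # each key under success/failure/start names a reporter; its value
        # is handed to that reporter unchanged
        for name, params in action_config.items():
            reporters[name].report(change, params)

    report({'gerrit': {'verified': 1}}, {'gerrit': GerritReporter()}, 'I123')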
Change-Id: I80d7539772e1485d5880132f22e55751b25ec198
The conditional that did a 'git remote update' for ref-updated
events (which otherwise don't affect the Merger) was wrong.
Change-Id: Icb2596df023279442613e10e13104a3621d867d9
Revert "Fix checkout when preparing a ref"
This reverts commit 6eeb24743a.
Revert "Don't reset the local repo"
This reverts commit 96ee718c4b.
Revert "Fetch specific refs on ref-updated events"
This reverts commit bfd5853957.
Change-Id: I50ae4535e3189350d3cc3a7527f89d5cb8eec01d
If an exception was received during a report, _reportItem would
erroneously indicate that it had been reported without error.
If a merge was expected, isMerged would be called which may then
raise a further exception which would stop queue processing.
Instead, set the default return value for _reportItem to True
because trigger.report returns a true value on error. This will
cause the change to be marked as reported (with a value of ERROR),
the merge check skipped, and the change will be quickly removed
from the pipeline.
Change-Id: I08b7cee486111200ac9857644d478727c635908d
If a change is removed outside of the main process method (eg
it is superseded), stats were not reported. Report them in that
case.
Change-Id: I9e753599dc3ecdf0d4bffc04f4515c04d882c8be
The current behavior is that, for every event, we run
'git remote update origin', which is quite a bit of overhead and
doesn't match what the comments say should be happening. The goal
is to ensure that when new tags arrive, we have them locally in
our repo. It's also not a bad idea for us to keep up with remote
branch movements.
This updates the event pre-processor to fetch the ref for each
ref-updated event as they are processed. This is much faster than
the git remote update that was happening before. It also adds
a git remote update to the repo initialization step so that when
Zuul starts, it will pick up any remote changes since it last ran.
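A hedged sketch of both steps using GitPython (it assumes a clone
already exists at repo_path; this is not the actual merger code):

    import git

    def fetch_updated_ref(repo_path, ref):
        repo = git.Repo(repo_path)
        repo.git.fetch('origin', ref)   # only the ref named in the event

    def initialize_repo(repo_path):
        repo = git.Repo(repo_path)
        repo.git.remote('update')       # catch up on anything missed while Zuul was down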
Change-Id: I671bb43eddf41c7403de53bb4a223762101adc3c
We can more closely approximate Gerrit's behavior by using the
'resolve' git merge strategy. Make that the default, and leave
the previous behavior ('git merge') as an option. Also, finish
and correct the partially implemented plumbing for other merge
strategies (including cherry-pick).
(Note the previous unfinished implementation attempted to mimic
Gerrit's option names; the new implementation does not, but rather
documents the alignment. It's not a perfect translation anyway,
and this gives us more room to support other strategies not
currently supported by Gerrit).
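Roughly what the strategy selection amounts to, sketched with GitPython
(option names here are illustrative, not Zuul's actual configuration
values):

    import git

    def merge_change(repo_path, strategy='resolve'):
        repo = git.Repo(repo_path)
        if strategy == 'cherry-pick':
            repo.git.cherry_pick('FETCH_HEAD')
        else:
            # 'resolve' approximates Gerrit's behaviour; plain 'git merge'
            # (the recursive strategy) remains available
            repo.git.merge('FETCH_HEAD', s=strategy)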
Change-Id: Ie1ce4fde5980adf99bba69a5aa1d4e81026db676
We have a graph on our status page showing all of the jobs Zuul
launched, but it's built from more than 1000 graphite keys which
is a little inefficient. Add a key for convenience that rolls
up all of the job completions in a pipeline, so that such a graph
can be built with only about 10 keys.
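A hypothetical example of emitting such a rollup key alongside the
per-job key (the exact key names Zuul uses may differ):

    import statsd

    client = statsd.StatsClient('localhost', 8125)

    def record_completion(pipeline, job):
        client.incr('zuul.pipeline.%s.job.%s' % (pipeline, job))
        client.incr('zuul.pipeline.%s.all_jobs' % pipeline)  # the rollup key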
Change-Id: Ie6dbcca68c8a118653effe90952c7921a9de9ad1
If job_name_in_report is false, the 'name' variable may be
undefined. This makes sure it is always defined.
Change-Id: Ie544ccdf1661e08e2aa4c8055999f16e20d7584b
Add the ability for Zuul to accept inputs from multiple trigger
sources simultaneously.
Pipelines are associated with exactly one trigger, which must now
be named in the configuration file.
Co-Authored-By: Monty Taylor <mordred@inaugust.com>
Change-Id: Ief2b31a7b8d85d30817f2747c1e2635f71ea24b9
For every job completed, record the result of that job separately
to statsd. For successful and failed jobs, record the runtimes
of the jobs separately by result (others are not interesting).
Also, substitute '_' for '.' in job names in statsd keys.
This is backwards-incompatible with current statsd keys.
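A sketch of that accounting (key names are illustrative):

    import statsd

    client = statsd.StatsClient('localhost', 8125)

    def record_job(pipeline, job_name, result, runtime_ms):
        job_key = job_name.replace('.', '_')  # '.' is the statsd key separator
        base = 'zuul.pipeline.%s.job.%s.%s' % (pipeline, job_key, result)
        client.incr(base)
        if result in ('SUCCESS', 'FAILURE'):
            # runtimes are only recorded for jobs that ran to a clear verdict
            client.timing(base, runtime_ms)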
Change-Id: I7b6152bcc7ea5ce6e37bf90ed41aee89baa29309
Add an option to the syntax validator to test that jobs
referenced in the layout are defined in a file. Creating the
file with the list of jobs is left as an exercise for the user.
Change-Id: Iceb74440cb004e9ebe6fc08a4eedf7715de2d485
Metajobs were being applied in dict key order, which meant that
if more than one metajob matched a job, the actual attributes
applied were non-deterministic. This was compounded by the fact
that attributes of each metajob were being strictly copied to
the real job, so the attributes of the second metajob would always
completely replace the first.
Instead, keep metajobs in config file order, and only copy attributes
that are non-null. Boolean attributes are still last-wins, and
so must be set explicitly by each matching metajob.
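An illustration of "only copy attributes that are non-null", applied in
config-file order (attribute names are just examples):

    def apply_metajobs(metajobs, job):
        for meta in metajobs:             # preserves config file order
            for attr, value in meta.items():
                if value is not None:
                    job[attr] = value     # later matches win for set attributes
        return job

    metas = [{'branch': 'stable', 'voting': None},
             {'branch': None, 'voting': False}]
    print(apply_metajobs(metas, {}))      # {'branch': 'stable', 'voting': False}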
Change-Id: Ie255658719d5ded1663c3513dae1fc297ce357c4
Each pipeline was maintaining the trigger cache only based on
its own current usage, which means that pipelines were clearing
the cache of changes currently being used by other pipelines.
This considers all changes in use in all pipelines when maintaining
the cache.
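A sketch of the fix (data structures are illustrative): the set of
relevant changes is computed across every pipeline before anything is
trimmed from the cache.

    def maintain_cache(pipelines, cache):
        relevant = set()
        for pipeline in pipelines:
            relevant |= set(pipeline['changes'])
        for change in list(cache):
            if change not in relevant:
                del cache[change]

    cache = {'I1': 'cached', 'I2': 'cached'}
    maintain_cache([{'changes': ['I1']}, {'changes': []}], cache)
    print(cache)  # {'I1': 'cached'}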
Change-Id: I3ab14c69acd80ecc613b63628c837511594744d0
Reviewed-on: https://review.openstack.org/36699
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Tested-by: Jenkins
Allow, e.g., jobs in a gate pipeline to take precedence over
jobs in a check pipeline.
Change-Id: Idf91527704cc75b00a336291f91b908286f8e630
Reviewed-on: https://review.openstack.org/36552
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Change-Id: I9ac6e301d361b7bac8561a1f468c40be23e1ddcc
Reviewed-on: https://review.openstack.org/36281
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Change-Id: I69563ee47dd6f3777a52b67999ff1a03247f1e1e
Reviewed-on: https://review.openstack.org/35324
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Store the results of the configuration (pipelines, jobs, and all)
in a new Layout object. Return such an object from the parseConfig
method in the scheduler. This is a first step to reloading the
configuration on the fly -- it supports holding multiple
configurations in memory at once.
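A rough sketch of what such a Layout could hold (fields are
illustrative, not the real attributes):

    class Layout:
        def __init__(self):
            self.pipelines = {}  # name -> Pipeline
            self.jobs = {}       # name -> Job
            self.projects = {}   # name -> Project

    # parseConfig() would build and return one of these, so two Layouts
    # can coexist while a reconfiguration is prepared.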
Change-Id: Ide56cddecbdbecdc4ed77b917d0b9bb24b1753d5
Reviewed-on: https://review.openstack.org/35323
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Make the scheduler idempotent. The idea is that after any event,
the scheduler should be able to run and examine the state of every
item in the queue and act accordingly. This is a change from the
current state where most events are dealt with in context. This
should ease maintenance as it should facilitate reasoning about
the different actions Zuul might take -- centralizing major
decisions into one function.
Also add a new class QueueItem, which represents a Change(ish)
in a queue. Currently, Change objects themselves are placed
in the queue, which conflates information about a change (for
instance, its number and patchset) with information about the
processing of that change in the queue (e.g., the build
history, current build set, merge status, etc.).
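A sketch of that separation (fields are illustrative): the Change
describes what Gerrit knows, while the QueueItem describes Zuul's
processing of it.

    class QueueItem:
        def __init__(self, change):
            self.change = change           # number, patchset, project, ...
            self.item_ahead = None         # position in the change queue
            self.current_build_set = None  # builds for the current attempt
            self.reported = False          # whether a report has been sent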
Change objects are now cached, which should reduce the number of
queries to Gerrit (except the current algorithm to update them is
very naive and queries Gerrit again on any event relating to a
change). Changes are expired from the cache when they are neither in
a pipeline nor related to any change currently in a pipeline.
There are now two things that need to be asserted at the end of
each test, so use addCleanup in setUp to call a method that
performs those assertions after the test method completes. Also,
move the existing shutdown method to use addCleanup as well,
because testr experts say that's a best practice.
Change-Id: Id2bf4c484c9e681456c69d99787e7a5b3a247690
Reviewed-on: https://review.openstack.org/34653
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
Otherwise they can prevent exiting on timeout which confuses testr.
Change-Id: I239ab46f44fd09fe6b69fb70fdf4043e3c1daa67
Reviewed-on: https://review.openstack.org/35321
Reviewed-by: Monty Taylor <mordred@inaugust.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
It is possible the host system does not have git properly configured,
which results in merge failures because the git client complains. For
example:
GitCommandError: 'git merge FETCH_HEAD' returned exit status 128:
*** Please tell me who you are.
Run
git config --global user.email "you@example.com"
git config --global user.name "Your Name"
to set your account's default identity.
Omit --global to set the identity only in this repository.
Now we can pass user.name and user.email settings to git, if configured
to do so.
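A sketch of applying those settings with GitPython when they are present
in the configuration (option handling here is an assumption, not the
exact implementation):

    import git

    def configure_identity(repo_path, email=None, name=None):
        repo = git.Repo(repo_path)
        writer = repo.config_writer()  # writes the repository-local config
        if email:
            writer.set_value('user', 'email', email)
        if name:
            writer.set_value('user', 'name', name)
        writer.release()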
Change-Id: I896194d8d1f5334026954b02f3a1a8dd82bed2ac
Signed-off-by: Paul Belanger <paul.belanger@polybeacon.com>
Reviewed-on: https://review.openstack.org/29015
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
report times as 1h 22m 08s
only report hours if the job runs longer than an hour
only report minutes if the job runs longer than a minute
don't zero pad whatever the leading time unit is, making it a
little easier to scan and see differences.
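A small sketch of those formatting rules:

    def format_duration(seconds):
        hours, rem = divmod(int(seconds), 3600)
        minutes, secs = divmod(rem, 60)
        if hours:
            return '%dh %02dm %02ds' % (hours, minutes, secs)
        if minutes:
            return '%dm %02ds' % (minutes, secs)
        return '%ds' % secs

    print(format_duration(4928))  # 1h 22m 08s
    print(format_duration(75))    # 1m 15s
    print(format_duration(8))     # 8s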
Change-Id: Ibb58be233fdef1bbdf4e90a83731d43eb0be47f1
Reviewed-on: https://review.openstack.org/28143
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: matthew-wagoner <matthew.wagoner@hp.com>
Reviewed-by: Antoine Musso <hashar@free.fr>
Approved: James E. Blair <corvus@inaugust.com>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
The ability to configure success and failure URL patterns (cf.
'success-pattern' and 'failure-pattern') obsoletes the need to
guess-by-fetching an appropriate link for the build status, which can be
extremely expensive. (Wikimedia's Zuul instance makes three HTTP requests per
invocation -- 'testReport', which 302s to 'testReport/', which 404s, and then
'consoleFull', which often runs to hundreds of kilobytes.)
Also corrects a small typo in README.rst.
Change-Id: Ib222f544c98253152a5e787ec0cdf28fa2d80cf6
Reviewed-on: https://review.openstack.org/28128
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
On setups where Zuul ends up triggering hundreds of projects, you end up
with projects using roughly the same pipelines/jobs. Whenever one wants
to add a job to all the similar projects, each project has to be edited
one by one.
To save some precious time, this patch introduces the concept of project
templates. It lets you define a set of pipelines and attached jobs,
though the job names can be passed parameters defined on a per-project
basis. Thus, updating similar projects is all about editing a single
template.
A basic example is provided in the documentation.
The voluptuous schema has been updated. It does check whether all
parameters are properly passed to a template but does NOT check whether
the resulting job names exist.
The parameter expansion in templates is borrowed from Jenkins Job
Builder (deep_format function). It has been tweaked to also expand
dictionary keys.
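A minimal stand-in for that expansion, assuming str.format()-style
{param} markers (the borrowed deep_format may differ in details): it
recurses into lists and dictionaries and expands keys as well as values.

    def deep_format(obj, params):
        if isinstance(obj, dict):
            return dict((deep_format(k, params), deep_format(v, params))
                        for k, v in obj.items())
        if isinstance(obj, list):
            return [deep_format(i, params) for i in obj]
        if isinstance(obj, str):
            return obj.format(**params)
        return obj

    print(deep_format({'{name}-merge': ['{name}-docs']}, {'name': 'nova'}))
    # {'nova-merge': ['nova-docs']}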
Layout test plan:
$ nosetests -m layout --nocapture
Test layout file validation ...
<...>
bad_template1.yaml
required key not provided @
data['projects'][0]['template']['project']
bad_template2.yaml
extra keys not allowed @
data['projects'][0]['template']['extraparam']
good_template1.yaml
ok
<...>
$
A basic test has been added to verify whether a project-template properly
triggers its tests:
$ nosetests --nocapture \
tests/test_scheduler.py:testScheduler.test_job_from_templates_launched
Test whether a job generated via a template can be launched ... ok
----------------------------------------------------------------------
Ran 1 test in 0.863s
OK
$
Change-Id: Ib82e4719331c204de87fbb4b20c198842b7e32f4
Reviewed-on: https://review.openstack.org/21881
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: James E. Blair <corvus@inaugust.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
In a config test, the scheduler is created without a config object.
Handle that case gracefully in the pipeline constructor.
Add a test that runs the configtest.
Change-Id: Id59b3194aff7b5347fa04fe1cc29b1069d2a3f7f
Reviewed-on: https://review.openstack.org/27585
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins
The new format will be:
"http://logs.example.com/6/1/gate/project2-merge/2 : SUCCESS in 00:00:00"
And the option is enabled by default.
Change-Id: Ib50c4948ea0a7b552d46a0e72ecb6c1a9609a771
Reviewed-on: https://review.openstack.org/27570
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: Jeremy Stanley <fungi@yuggoth.org>
Tested-by: Jenkins
Add an additional job parameter, 'file', that will cause that
job to only run if the change touches files that match the
specification.
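An illustrative version of that check (Zuul's real matcher and its exact
semantics may differ; the patterns here are assumed to be regular
expressions):

    import re

    def job_matches_files(file_patterns, changed_files):
        if not file_patterns:
            return True                   # no restriction configured
        return any(re.match(p, f)
                   for p in file_patterns for f in changed_files)

    print(job_matches_files(['^docs/.*$'], ['docs/index.rst']))     # True
    print(job_matches_files(['^docs/.*$'], ['zuul/scheduler.py']))  # False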
Change-Id: I8c8fd3d029e02e338fd1dd266443b9ac56c0e5ac
Reviewed-on: https://review.openstack.org/23710
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Reviewed-by: Monty Taylor <mordred@inaugust.com>
Approved: James E. Blair <corvus@inaugust.com>
Tested-by: Jenkins