While debugging ioctl failures, one of the things we explored was
whether this magic number might no longer be correct. It turns out it
is correct, but documenting the source of this value may aid future
When we open the ioctl file to run the openafs setpag syscall,
we previously used the high-level open method, which apparently
issues an unwanted TCGETS ioctl which crashes the program with
a kernel error under certain versions of python+openafs+linux
(3.10.6, 1.8.9, 5.15.0, respectively).
Switch to a low-level open to avoid this call.
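A minimal sketch of the low-level approach, assuming the ioctl is issued
with fcntl.ioctl on a raw descriptor. The helper name and its parameters
are hypothetical, and the real path and request number are not shown
here. Python's high-level open() probes the descriptor with isatty()
while choosing a buffering mode, which on Linux is typically implemented
as a TCGETS ioctl; os.open() returns a bare descriptor without that probe.

```python
import fcntl
import os


def ioctl_via_raw_fd(path, request, arg=0):
    """Open `path` with os.open() and issue a single ioctl on it.

    Hypothetical helper: os.open() hands back a bare file descriptor,
    so no isatty() probe runs before our own ioctl.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        return fcntl.ioctl(fd, request, arg)
    finally:
        # Always release the descriptor, even if the ioctl raises.
        os.close(fd)
```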
The set autohold api endpoint incorrectly handled supplied values
such that if the user supplied a change without a ref it would always
use the default ref (.*). This corrects the case handling and
This changes the bell icon at the top to link directly to the
Currently, the bell icon opens a notification drawer which shows
one rendition of a list of config errors, and if the user clicks
on any error, it navigates to the config-errors page which shows
Rather than having two ways of viewing config errors, let's just
This updates the config-errors page to show errors in table format
along with a filter bar.
This allows users to see all of the errors at a glance, or filter
by project or error type.
The table rows show a summary of error information, and each row
can be expanded to show the full error message.
Administrators may not realize until it's too late that it would
have been better for them to save copies of project keys in case of
a disaster impacting the keystore in ZooKeeper. Zuul provides the
tools necessary to facilitate this, so document them prominently in
the places newcomers are likely to look when first installing or
learning to operate the software.
It can be confusing to know what merge modes are considered valid for a
project when Zuul tells you: "Merge mode foo is not supported by project
bar." Update this error message to include the list of merge modes that
Zuul considers to be valid for the project to reduce confusion.
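As an illustration only, the improved message could be assembled like
this. The function name and the exact wording after the first sentence
are assumptions, not the text Zuul actually emits.

```python
def unsupported_merge_mode_message(mode, project, supported_modes):
    """Build an error message that also names the accepted merge modes."""
    # Sort so the message is stable regardless of input ordering.
    valid = ", ".join(sorted(supported_modes))
    return (
        f"Merge mode {mode} is not supported by project {project}. "
        f"Supported modes: {valid}."
    )
```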
This is the first in a series of changes to improve the usability
of the web view of config errors. The end goal is to be able to
display them in a more structured manner. A secondary goal is to
eventually add warnings (eg, deprecation warnings) which is
really only feasible if we have structured presentation of
This change does the following:
* Adds severity and error names to existing configuration errors
* And makes them available via the config-errors API endpoint
* Reduces the call sites for the error accumulator
* Unifies the calling convention for the accumulator
(we stop passing in Exception objects)
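The list above can be sketched roughly as follows. All class, method,
and field names are hypothetical and do not reflect Zuul's actual
accumulator; the point is a single calling convention that takes plain
strings (no Exception objects) and records a name and severity per error.

```python
from dataclasses import asdict, dataclass, field

SEVERITY_ERROR = "error"
SEVERITY_WARNING = "warning"


@dataclass
class ConfigError:
    name: str
    severity: str
    message: str


@dataclass
class ErrorAccumulator:
    errors: list = field(default_factory=list)

    def add(self, name, message, severity=SEVERITY_ERROR):
        # One entry point for all call sites; callers pass strings only.
        self.errors.append(ConfigError(name, severity, message))

    def to_api(self):
        # Plain dicts, suitable for serializing from an API endpoint.
        return [asdict(error) for error in self.errors]
```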
This change fixes a bug related to circular dependency resolution where
non-cycle changes could be enqueued between changes of the same cycle.
This violated the invariant assumption that changes of the same
dependency cycle are enqueued in sequence. This could cause the pipeline
processor to loop indefinitely under certain conditions.
The idea behind this fix is to treat all unprocessed dependencies of
other changes in the same cycle as if they were direct dependencies of
the current change. This way, we try to enqueue the dependencies of any
change in the cycle ahead of the whole cycle.
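A simplified sketch of that idea, not the pipeline manager's actual
code. The function name, the `needs` mapping, and the `processed` set
are assumptions made for illustration; changes are represented as
plain strings.

```python
def dependencies_to_enqueue_first(change, cycle, needs, processed):
    """Return the changes to enqueue ahead of the whole cycle.

    `cycle` is the set of changes in the same dependency cycle as
    `change`, `needs` maps each change to the changes it depends on,
    and `processed` holds changes that were already handled.
    """
    members = set(cycle) | {change}
    ahead = []
    for member in sorted(members):
        for dep in needs.get(member, ()):
            # Cycle members are enqueued together, so only outside,
            # not-yet-processed dependencies need to go first.
            if dep in members or dep in processed or dep in ahead:
                continue
            ahead.append(dep)
    return ahead
```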
We render ANSI colors in 3 places:
* Streaming console
* Build task summary tab
* Build console tab
(We do not render ANSI colors in the Build log tab.)
In the streaming console, the default color scheme is white-on-black
which is probably what most devs expect when they render color output.
The other two locations maintain the black-on-white Patternfly default
scheme. This means that some colors (eg yellow) are difficult to read.
In order to accommodate cases where ANSI colors are used, the
color scheme of the preformatted sections in the Build task summary or
Build console tabs is changed to white on black iff an ANSI color
sequence is detected. Otherwise, it remains the Patternfly default
of black on white.
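The detection step can be sketched as below. The web UI itself is not
written in Python; this is only an illustration of the rule, and the
regular expression and scheme names are assumptions.

```python
import re

# Matches "Select Graphic Rendition" escape sequences such as "\x1b[33m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def color_scheme_for(text):
    """Pick white-on-black only when the text contains an ANSI sequence."""
    if ANSI_SGR.search(text):
        return "white-on-black"
    return "black-on-white"
```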
Change I3824af6149bf27c41a8d895fc682236bd0d91f6b intended to refresh
builds from ZK only when necessary. Before that change we would
refresh them only if they did not have a result (because once they
have a result they won't change). That change should have narrowed the
condition so that, in the cases where we don't have a result yet, we
still only refresh if the build has changed.
In other words, that change should have used an "and" instead of
Logically, if builds can't change after they have a result, then
the check of whether we have a result is not necessary. So rather
than change the operator, we can just drop the build.result check
altogether and rely on the object update check. This has the effect
of making the code more future proof as well, in that we remove
the assumption that the build will never change after receiving a
This change also surfaced a bug in the original implementation:
because refreshing the Build objects happens inside the deserialize
method of the BuildSet object, the BuildSet has not actually updated
its build_versions variable from ZK yet, which means our comparisons
in shouldRefreshBuild were using outdated data. To correct, we now
pass in the newly deserialized value. And the same for Jobs.
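A rough sketch of the resulting check, assuming versions are compared
for inequality and that the freshly deserialized mapping is passed in
explicitly. The signature is hypothetical and differs from the real
shouldRefreshBuild.

```python
def should_refresh_build(build_uuid, known_version, new_build_versions):
    """Refresh when the newly read version differs from the one we hold.

    `new_build_versions` is the mapping just deserialized from ZK; it is
    passed in because the owning object has not stored it yet. There is
    no check of the build result at all.
    """
    return new_build_versions.get(build_uuid) != known_version
```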
The reason for this is that containers for zuul services need to run
privileged in order to successfully run bwrap. We currently only expect
users to run the executor as privileged, and the new bwrap execution
checks have broken other services as a result. (Other services load the
bwrap system because it is a normal zuul driver and all drivers are
loaded by all services.)
This works around the problem by adding a check_bwrap flag to connection
setup and only setting it to true on the executor. A better longer-term
follow-up would be to only instantiate the bwrap driver on the executor in
the first place. This can probably be accomplished by overriding the
ZuulApp configure_connections method in the executor and dropping bwrap
creation in ZuulApp.
Temporarily stop running the quick-start job since it's apparently not
using speculative images.
A new version of pyjwt was released which alters an internal call
it makes to urllib which we have mocked in the unit tests. Our
mock must be updated to match.
The previous version would call urlopen with a string url argument,
newer versions call it with a Request object (urllib accepts both).
This change updates the mock to accept both.
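A minimal sketch of a mock that tolerates both call styles. The class
name and canned response are hypothetical and are not the test
fixture's real names.

```python
import io
import urllib.request


class FakeUrlopen:
    """Stand-in for urllib.request.urlopen that records requested URLs."""

    def __init__(self, body=b"{}"):
        self.body = body
        self.urls = []

    def __call__(self, url, *args, **kwargs):
        # Older callers pass a str; newer ones pass a Request object.
        if isinstance(url, urllib.request.Request):
            url = url.full_url
        self.urls.append(url)
        # A file-like object is enough for callers that only .read().
        return io.BytesIO(self.body)
```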
There has also been a recent alembic release which alters the function
signature of alter_column so that most arguments are now required to
be keyword arguments. This causes one of our migrations to fail
since it was passing the new type in as a positional argument.
Notably, the type was not the next positional argument in the
signature ("nullable" is), so this explains why we had two
"change patchset to string" migrations: the first one did not actually
This change alters the migration to do what it effectively did rather
than what it was intended to do. The later migration continues to
correct the error.
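To illustrate the failure mode, here is a hypothetical stand-in
function. It is NOT alembic's real alter_column and only mirrors the
shape described above: two leading positional parameters followed by
keyword-only options. The table, column, and type values are
placeholders.

```python
def alter_column(table_name, column_name, *, nullable=None, type_=None):
    """Hypothetical stand-in with keyword-only options (not alembic's API)."""
    return {"table": table_name, "column": column_name,
            "nullable": nullable, "type_": type_}


def call_positionally():
    # A third positional argument is rejected once options are keyword-only.
    try:
        alter_column("some_table", "some_column", "String")
    except TypeError:
        return "rejected"
    return "accepted"
```

With the older, fully positional style of signature, that third value
would have been bound to the next positional parameter rather than to
the type, which matches the behavior described above.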
Newer bwrap has added the ability to disable additional nested user
namespace creation from within the bwrap execution context. Take advantage
of this feature in Zuul if we are able to in order to fortify Zuul's
In particular we need two conditions to take advantage of this. 1) bwrap
must be new enough to support the feature (>=0.8.0) and 2) we must be
running with user namespaces enabled. We explicitly check for both
conditions and add the appropriate invocation flags to bwrap when the
conditions are met.
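The two checks could be combined roughly as follows. The helper names
are hypothetical, the version string format is assumed to look like
"bubblewrap 0.8.0", and the flag name --disable-userns is an assumption
about which bwrap option is meant, since it is not named above.

```python
import re

MIN_BWRAP_VERSION = (0, 8, 0)


def parse_bwrap_version(output):
    """Parse text such as 'bubblewrap 0.8.0' into a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def userns_hardening_flags(version_output, userns_enabled):
    """Return extra bwrap flags only when both conditions are met."""
    version = parse_bwrap_version(version_output)
    if version is None or version < MIN_BWRAP_VERSION:
        return []
    if not userns_enabled:
        return []
    # Assumed flag name; see the note above.
    return ["--disable-userns"]
```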
Even after increasing the grace time for Github app installation tokens
to 5min we were still seeing exceptions related to expired app tokens.
Upon further investigation it turned out that the current grace time had
no effect at all since we passed the *adjusted* expiry time to the
Github client, which takes it at face value and raises an exception if
the expiry time is exceeded.
To fix this we'll store the original expiry time in the token cache and
pass that directly to the Github client. We then adjust the cutoff time
by the 5min grace time when checking if a token should still be
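A minimal sketch of that arrangement, assuming a simple in-memory
cache. The class and method names are hypothetical; the key point is
that the stored expiry is the original one and the grace time is
applied only to the cutoff used in the check.

```python
import datetime

GRACE = datetime.timedelta(minutes=5)


class TokenCache:
    def __init__(self):
        self._tokens = {}

    def store(self, key, token, expiry):
        # Keep the expiry exactly as issued; do not subtract the grace time.
        self._tokens[key] = (token, expiry)

    def get(self, key, now):
        """Return (token, original_expiry), or None if a new token is needed."""
        entry = self._tokens.get(key)
        if entry is None:
            return None
        token, expiry = entry
        # Apply the grace time only to the cutoff used for this check.
        if now >= expiry - GRACE:
            return None
        return token, expiry
```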
Use the `completeBuild()` method for reporting the result on transient
errors during repo update instead of the old `sendWorkComplete()` API.
  2023-04-28 12:29:49,173 ERROR zuul.AnsibleJob: [e: ...] [build: ...] Exception while executing job
  Traceback (most recent call last):
    File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 1144, in do_execute
    File "/opt/zuul/lib/python3.10/site-packages/zuul/executor/server.py", line 1345, in _execute
  AttributeError: 'FrozenJob' object has no attribute 'sendWorkComplete'
This updates the Gerrit driver to match the pattern in the GitHub
driver where, instead of specifying individual trigger
requirements such as "require-approvals", a complete ref
filter (a la "requirements") can be embedded in the trigger
The "require-approvals" and "reject-approvals" attributes are
deprecated in favor of the new approach.
Additionally, all require filters in Gerrit are now available as
And finally, the Gerrit filters are updated to return
FalseWithReason so that log messages are more useful, and the
GitHub filters are updated to improve the language, avoid
apostrophes for ease of grepping, and match the new Gerrit
This mimics a useful feature of the Gerrit driver and allows users
to configure pipelines that trigger on events but only if certain
conditions of the PR are met.
Unlike the Gerrit driver, this embeds the entire require/reject
filter within the trigger filter (the trigger filter has-a require
or reject filter). This makes the code simpler and is easier for
users to configure. If we like this approach, we should migrate the
Gerrit driver as well, and perhaps the other drivers.
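The has-a relationship can be sketched as follows. The class names,
the dict-based event and change shapes, and the status strings are all
hypothetical and much simpler than the driver's real filter classes.

```python
class RequireFilter:
    """Matches a change only if it carries all required statuses."""

    def __init__(self, required_statuses=()):
        self.required_statuses = set(required_statuses)

    def matches(self, change):
        return self.required_statuses <= set(change.get("statuses", ()))


class TriggerFilter:
    """Matches an event type and, optionally, an embedded require filter."""

    def __init__(self, event_types, require=None):
        self.event_types = set(event_types)
        # The trigger filter owns a complete require filter (has-a).
        self.require = require

    def matches(self, event, change):
        if event["type"] not in self.event_types:
            return False
        if self.require is not None and not self.require.matches(change):
            return False
        return True
```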
The "require-status" attribute already existed, but was undocumented.
This documents it, adds backwards-compat handling for it, and
Some documentation typos are also corrected.