The following race was observed:

1) Several hours before the error, an event caused a change to be queried and added to the cache.
2) The change was enqueued in a pipeline for a while and therefore stayed in the relevant set.
3) The change was removed from the pipelines.
4) A cache prune process started shortly before the error and calculated the relevant set (the change was not in this set) and also the changes that were last modified > 1 hour ago (the change was in this set). This combination means the entry is subject to pruning.
5) The cache cleanup starts slowly deleting changes (this takes about 3 minutes).
6) An event arrives for the change. Gerrit is queried and the updated change is inserted into the cache.
7) The cache cleanup method gets around to deleting the change from the cache.
8) Subsequent queue processes can't find the change in the cache and raise an exception.

Or, in fewer words, the change was updated between the decision time for the deletion and the deletion itself.

The kazoo delete method takes a version argument which will alert us if the znode it would delete is of a different version than specified. If we remember the version of the cache entry from when we decide to delete it, we can avoid the race by ensuring that the deleted znode hasn't been updated since our decision. This change implements that.

The 'recursive' parameter is removed since it causes the version check to always pass. There are no children under the cache entry, so it's not necessary. It was likely only added to simplify the case where we delete a node which is already deleted (NoNodeError). To account for that, we handle that exception explicitly.

Change-Id: Ica840225fd52585a29452c80d90a4aa5e7763c8a
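The version-checked delete described in the commit message above can be sketched as follows. This is a minimal illustration, not Zuul's actual code: `FakeStore`, `prune`, and the `/cache/...` paths are hypothetical, and the two exception classes are local stand-ins named after `kazoo.exceptions.BadVersionError` and `kazoo.exceptions.NoNodeError`. The in-memory store only models the relevant semantics (a node's version increases on every update, and a delete with a specific version fails if the node has changed) so that the example runs without a ZooKeeper server.

```python
class BadVersionError(Exception):
    """Local stand-in for kazoo.exceptions.BadVersionError."""


class NoNodeError(Exception):
    """Local stand-in for kazoo.exceptions.NoNodeError."""


class FakeStore:
    """Hypothetical in-memory model of versioned znodes (not a real client)."""

    def __init__(self):
        self._nodes = {}  # path -> (data, version)

    def set(self, path, data):
        # A new node starts at version 0; each update bumps the version.
        _, version = self._nodes.get(path, (None, -1))
        self._nodes[path] = (data, version + 1)

    def get_version(self, path):
        if path not in self._nodes:
            raise NoNodeError(path)
        return self._nodes[path][1]

    def exists(self, path):
        return path in self._nodes

    def delete(self, path, version=-1):
        # As in ZooKeeper, version -1 matches any version.
        if path not in self._nodes:
            raise NoNodeError(path)
        if version != -1 and self._nodes[path][1] != version:
            raise BadVersionError(path)
        del self._nodes[path]


def prune(store, candidates):
    """Delete each (path, version) pair unless it changed after the decision."""
    deleted = []
    for path, version in candidates:
        try:
            store.delete(path, version=version)
            deleted.append(path)
        except BadVersionError:
            # The entry was updated after we decided to delete it: keep it.
            pass
        except NoNodeError:
            # The entry is already gone: nothing to do.
            pass
    return deleted


store = FakeStore()
store.set("/cache/change-1", b"old")
store.set("/cache/change-2", b"old")

# Decision time: remember the version of every entry we intend to prune.
candidates = [(p, store.get_version(p))
              for p in ("/cache/change-1", "/cache/change-2")]
# Also include an entry that has already disappeared.
candidates.append(("/cache/change-3", 0))

# While the slow cleanup is running, an event refreshes change-2.
store.set("/cache/change-2", b"new")

deleted = prune(store, candidates)
print(deleted)
```

With an unconditional delete, the refreshed `change-2` entry would have been removed as well; remembering the version at decision time is what lets the cleanup skip it.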
Zuul
Zuul is a project gating system.
The latest documentation for Zuul v3 is published at: https://zuul-ci.org/docs/zuul/
If you are looking for the Edge routing service named Zuul that is related to Netflix, it can be found here: https://github.com/Netflix/zuul
If you are looking for the JavaScript testing tool named Zuul, it can be found here: https://github.com/defunctzombie/zuul
Getting Help
There are two Zuul-related mailing lists:
- zuul-announce: A low-traffic announcement-only list to which every Zuul operator or power-user should subscribe.
- zuul-discuss: General discussion about Zuul, including questions about how to use it, and future development.
You will also find Zuul developers in the #zuul channel on Freenode IRC.
Contributing
To browse the latest code, see: https://opendev.org/zuul/zuul
To clone the latest code, use: git clone https://opendev.org/zuul/zuul
Bugs are handled at: https://storyboard.openstack.org/#!/project/zuul/zuul
Suspected security vulnerabilities are most appreciated if first reported privately following any of the supported mechanisms described at https://zuul-ci.org/docs/zuul/user/vulnerabilities.html
Code reviews are handled by Gerrit at https://review.opendev.org
After creating a Gerrit account, use git review to submit patches. Example:
# Do your commits
$ git review
# Enter your username if prompted
Join #zuul on Freenode to discuss development or usage.
License
Zuul is free software. Most of Zuul is licensed under the Apache License, version 2.0. Some parts of Zuul are licensed under the General Public License, version 3.0. Please see the license headers at the tops of individual source files.
Python Version Support
Zuul requires Python 3. It does not support Python 2.
Since Zuul uses Ansible to drive CI jobs, Zuul can run tests anywhere Ansible can, including Python 2 environments.