. display in 68x40 .. display in 88x24
dissolve
Test Slide
images/testslide.ans
Preshow
images/cursor.ans images/cursor2.ans
Zuul
images/title.ans
Red Hat
images/redhat.ans
OpenStack
images/openstack.ans
OpenDev
"most insane CI infrastructure I've ever been a part of"
-- Alex Gaynor
"like the SpaceX of CI"
-- Emily Dunham
Zuul
images/zuul.ans
What Zuul does
- "Speculative Future State"
- multiple repositories
- integrated deliverable
- gated commits
- testing like deployment
Developer Process In a Nutshell
- Code Review (yay Gerrit!) - nobody has direct commit/push access
- Gated Commits - nobody has submit permission
- Every change gated on Code Analysis, Unit Tests, Functional Tests and End to End Integration Tests
- Run all tests (at least) twice:
  - on patchset upload
  - between change approval and merge
Developer Workflow
- Who has submitted a patch?
- Who wants to?
- (Who is here because the name of this talk is weird?)
   Hack              Review              Test
 =========         ==========         ==========
            push              approve
       +-------------+    +-------------+
       |             |    |             |
+------+--+       +--v----+--+       +--v-------+
|         |       |          |       |          |
| $EDITOR |       |  Gerrit  |       |   Zuul   |
|         |       |          |       |          |
+------^--+       +--+----^--+       +--+-------+
       |             |    |             |
       +-------------+    +-------------+
            clone             submit
Gerrit
Explain: a patch is uploaded, Zuul runs, and the test results are displayed in Gerrit. This is all the interface to Zuul that users need to see.
Switch to the actual Gertty screenshot.
Also show the Zuul status page.
But Zuul is doing a lot of work behind the scenes; if you look closer, this is what you see:
images/color-gertty.ans
Gerrit Installation
- 60G RAM
- 16 VCPU
- 8x git replicas running Gitea
- 2.13 (sssh, don't tell Luca)
Zuul in a nutshell
- Listens for code events
- Prepares appropriate job config and git repo states
- Allocates nodes for test jobs
- Pushes git repo states to nodes
- Runs user-defined Ansible playbooks
- Collects/reports results
- Potentially merges change
All in Service of Gating
No Tests / Manual Tests
- No test automation exists or ...
- Developer runs test suite before pushing code
- Prone to developer skipping tests for "trivial" changes
- Doesn't scale organizationally
Periodic Testing
- Developers push changes directly to shared branch
- CI system runs tests from time to time - report if things still work
- "Who broke the build?"
- Leads to hacks like the NVIE branching model
Post-Merge Testing
- Developers push changes directly to shared branch
- CI system is triggered by push - reports if push broke something
- Frequently batched / rolled up
- Easier to diagnose which change broke things
- Reactive - the bad changes are already in
Pre-Review Testing
- Changes are pushed to code review (Gerrit Change, GitHub PR, etc)
- CI system is triggered by code review change creation
- Test results inform review decisions
- Proactive - testing code before it lands
- Reviewers can get bored waiting for tests
- Only tests code as written, not potential result of merging code
Gating
- Changes are pushed to code review
- Gating system is triggered by code review approval
- Gating system merges code IFF tests pass
- Proactive - testing code before it lands
- Future state resulting from merge of code is tested
- Reviewers can fire-and-forget safely
Mix and Match
- Zuul supports all of those modes
- Zuul users frequently combine them
  - Run pre-review (check) and gating (gate) on each change
  - Post-merge/post-tag for release/publication automation
  - Periodic for catching bitrot
Check Jobs
- Run on patchset upload
- Verify patch as written
- Avoid wasting reviewer time on broken changes
check pipeline
- pipeline:
    name: check
    manager: independent
    precedence: low
    require:
      gerrit:
        open: True
        current-patchset: True
    trigger:
      gerrit:
        - event: patchset-created
        - event: change-restored
        - event: comment-added
          comment: (?i)^(Patch Set [0-9]+:)?( [\w\\+-]*)*(\n\n)?\s*recheck
        - event: comment-added
          require-approval:
            - Verified: [-1, -2]
              username: zuul
          approval:
            - Workflow: 1
    success:
      gerrit:
        Verified: 1
    failure:
      gerrit:
        Verified: -1
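To see which review comments the `recheck` trigger above would accept, the pattern can be tried with Python's `re` module. This is a sketch, not Zuul's own matching code; it assumes the pattern is applied from the start of the comment text, and the sample comments are made up.

```python
import re

# Pattern copied verbatim from the comment-added trigger in the check pipeline.
RECHECK = re.compile(r"(?i)^(Patch Set [0-9]+:)?( [\w\\+-]*)*(\n\n)?\s*recheck")

def is_recheck(comment):
    """True if the comment would re-trigger the check pipeline.

    Assumption: the pattern is anchored at the start of the comment.
    """
    return RECHECK.match(comment) is not None

# Hypothetical comments, shaped like Gerrit's "Patch Set N:" messages.
samples = [
    "recheck",
    "Patch Set 3:\n\nrecheck",
    "Patch Set 3: Code-Review+1\n\nRecheck",
    "please recheck",
    "Patch Set 2:\n\nLooks good",
]
results = {text: is_recheck(text) for text in samples}
```

The optional prefix groups exist because Gerrit prepends the patch set number and any votes to the comment body, so a bare `^recheck` would miss most real comments.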
Gate Triggering in Gerrit
- Gate jobs run between Code Review Approval and Merging
- "Workflow" Label in Gerrit
- Approvers have ability to vote +1 in Workflow
- Nobody sees the Submit button
- Zuul runs Gate jobs on Workflow+1 - clicks Submit on Success
gate pipeline
- pipeline:
    name: gate
    manager: dependent
    post-review: True
    require:
      gerrit:
        open: True
        current-patchset: True
        approval:
          - Workflow: 1
    trigger:
      gerrit:
        - event: comment-added
          approval:
            - Workflow: 1
    start:
      gerrit:
        Verified: 0
    success:
      gerrit:
        Verified: 2
        submit: true
    failure:
      gerrit:
        Verified: -2
Submit Hook?
- We'd love to have a hook point between Submit and Merge ...
Multi-repository integration
- Multiple source repositories are needed for deliverable
- Future state to be tested is the future state of all involved repos
To test proposed future state
- Get tip of each project. Merge appropriate change(s). Test.
- Changes must be serialized, otherwise state under test is invalid.
- Integrated deliverable repos share serialized queue
Speculative Execution
- Correct parallel processing of serialized future states
- Create virtual serial queue of changes for each deliverable
- Assume each change will pass its tests
- Test successive changes with previous changes applied to starting state
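The virtual serial queue can be sketched as follows. This is an illustration only, with hypothetical names; it assumes a single branch tip and represents a state as the list of items merged on top of it.

```python
def speculative_states(tip, queue):
    """Map each queued change to the state its jobs run against.

    Every change is tested as if all changes ahead of it had already
    passed and merged, so all jobs can start at the same time.
    """
    states = {}
    ahead = []
    for change in queue:
        ahead.append(change)
        states[change] = [tip] + list(ahead)
    return states

# Hypothetical queue of three approved changes.
states = speculative_states("tip", ["A", "B", "C"])
```

Here `C` is tested with `A` and `B` applied, even though neither has merged yet. If the optimistic assumption holds, all three merge after one round of testing instead of three.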
Nearest Non-Failing Change
(aka 'The Jim Blair Algorithm')
- If a change fails, move it aside
- Cancel all test jobs behind it in the queue
- Reparent queue items on the nearest non-failing change
- Restart tests with new state
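A minimal sketch of the steps above, assuming the queue is a simple ordered list and `None` stands for the branch tip. The function name and return shape are hypothetical and this is not Zuul's implementation.

```python
def after_failure(queue, failed):
    """Reparent the queue after the changes in `failed` are moved aside.

    Returns (parents, restart):
      parents: {change: the nearest non-failing change ahead of it, or None}
      restart: changes behind a failure, whose jobs are cancelled and rerun
    """
    parents = {}
    restart = []
    nearest_ok = None
    seen_failure = False
    for change in queue:
        if change in failed:
            seen_failure = True
            continue  # moved aside; nothing is based on it any more
        parents[change] = nearest_ok
        if seen_failure:
            restart.append(change)  # its tested state included the failure
        nearest_ok = change
    return parents, restart

# Hypothetical queue where B fails.
parents, restart = after_failure(["A", "B", "C", "D"], failed={"B"})
```

`C` is reparented onto `A`. `D` keeps `C` as its parent but still restarts, because the state it was testing contained `B`. `A` is ahead of the failure and is left alone.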
Zuul Simulation
pan
- todo
images/zsim-00.ans
Zuul Simulation
cut
- todo
images/zsim-01.ans
Zuul Simulation
cut
- todo
images/zsim-02.ans
Zuul Simulation
cut
- todo
images/zsim-03.ans
Zuul Simulation
cut
- todo
images/zsim-04.ans
Zuul Simulation
cut
- todo
images/zsim-05.ans
Zuul Simulation
cut
- todo
images/zsim-06.ans
Zuul Simulation
cut
- todo
images/zsim-07.ans
Zuul Simulation
cut
- todo
images/zsim-08.ans
Zuul Simulation
cut
- todo
images/zsim-09.ans
Zuul Simulation
cut
- todo
images/zsim-10.ans
Zuul Simulation
cut
- todo
images/zsim-11.ans
Zuul Simulation
cut
- todo
images/zsim-12.ans
Zuul Simulation
cut
- todo
images/zsim-13.ans
Zuul Simulation
cut
- todo
images/zsim-14.ans
Zuul Simulation
cut
- todo
images/zsim-15.ans
Zuul Simulation
cut
- todo
images/zsim-16.ans
Zuul Simulation
cut
- todo
images/zsim-17.ans
Zuul Simulation
cut
- todo
images/zsim-18.ans
Zuul Simulation
cut
- todo
images/zsim-19.ans
Zuul Simulation
cut
- todo
images/zsim-20.ans
Zuul Simulation
cut
- todo
images/zsim-21.ans
Zuul Simulation
cut
- todo
images/zsim-22.ans
Explicit Cross-Project Dependencies
- Developers can mark changes as being dependent
- Depends-On: footer - in commit or PR
- Zuul uses depends-on when constructing virtual serial queue
- Will not merge changes in gate before depends-on changes
- Works cross-repo AND cross-source
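One way to picture how declared dependencies constrain merge order is a topological sort. The sketch below uses the standard-library `graphlib` module (Python 3.9 or later). The change labels are hypothetical, and Zuul's real queue construction is more involved than this.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

def merge_order(depends_on):
    """Order changes so each one comes after the changes it depends on.

    depends_on maps a change to the set of changes named in its
    Depends-On footers. Raises CycleError for circular dependencies.
    """
    return list(TopologicalSorter(depends_on).static_order())

def is_circular(depends_on):
    """True if the declared dependencies cannot be serialized."""
    try:
        merge_order(depends_on)
    except CycleError:
        return True
    return False

# Hypothetical labels: a change in one repo depending on a change in another.
deps = {
    "change-in-zuul": {"change-in-checks-plugin"},
    "change-in-checks-plugin": set(),
}
order = merge_order(deps)
```

A pair of changes that depend on each other has no valid serialization, which is why `is_circular` reports it rather than picking an order.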
Cross Source
trigger:
  gerrit:
    - event: patchset-created
    - event: change-restored
    - event: comment-added
      comment: (?i)^(Patch Set [0-9]+:)?( [\w\\+-]*)*(\n\n)?\s*recheck
  github:
    - event: pull_request
      action:
        - opened
        - changed
        - reopened
    - event: pull_request
      action: comment
      comment: (?i)^\s*recheck\s*$
start:
  github:
    status: pending
    comment: false
success:
  gerrit:
    Verified: 1
  github:
    status: 'success'
failure:
  gerrit:
    Verified: -1
  github:
    status: 'failure'
Cross Source
- Explicit Dependency between projects from different sources
- Change to Zuul that depends on change to Gerrit
commit 737d61c116ff5f32770ef72e2dd82a031ab32591
Author: James E. Blair <jeblair@redhat.com>
Date:   Mon Aug 19 14:58:20 2019 -0700

    Add Support for Gerrit Checks Plugin

    Depends-On: https://gerrit-review.googlesource.com/c/plugins/checks/+/232079
    Change-Id: I8e5903f4429c5a1273a6120e0d09c57169e8f938
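The footer in the commit message above can be pulled out with a small regular expression. This is a sketch with a pattern written for this example; it is not the pattern Zuul itself uses.

```python
import re

# Hypothetical pattern: one Depends-On footer per line, value is a single token.
DEPENDS_ON = re.compile(r"^[ \t]*Depends-On:[ \t]*(\S+)[ \t]*$",
                        re.MULTILINE | re.IGNORECASE)

def depends_on(message):
    """Return every Depends-On target found in a commit message."""
    return DEPENDS_ON.findall(message)

# Message body taken from the commit shown above.
message = """Add Support for Gerrit Checks Plugin

Depends-On: https://gerrit-review.googlesource.com/c/plugins/checks/+/232079
Change-Id: I8e5903f4429c5a1273a6120e0d09c57169e8f938
"""
targets = depends_on(message)
```

Because the target is a full URL rather than a bare change number, the same footer can point at a change hosted on a different code review system.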
Lock Step Changes
- Circular Dependencies are not supported on purpose
- Rolling upgrades across interdependent services
- HOWEVER - many valid use cases (go/rust/c++) - support expected
Live Configuration Changes
Zuul is a distributed system, with a distributed configuration.
- tenant:
name: openstack
source:
gerrit:
config-projects:
- opendev/project-config
untrusted-projects:
- zuul/zuul-jobs
- zuul/zuul
- zuul/nodepool
- ansible/ansible
- openstack/openstacksdk
Zuul Startup
- Read config file
Zuul Startup
- Read config file
- Ask mergers for branches of each repo
images/startup1.ans
Zuul Startup
- Read config file
- Ask mergers for branches of each repo
- Ask mergers for .zuul.yaml for each branch of each repo
images/startup2.ans
When .zuul.yaml Changes
- Zuul looks for changes to .zuul.yaml
- Asks mergers for updated content
- Splices into configuration used for that change
- Works with cross-repo dependencies ("This change depends on a change to the job definition")
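The splice can be pictured as an overlay on a copy of the running configuration. This is a rough sketch under stated assumptions: configuration is keyed by (repo, branch), and the dict layout, function name, and playbook paths are all hypothetical.

```python
import copy

def speculative_config(running, changed_files):
    """Build the configuration used for one change.

    running:       {(repo, branch): parsed .zuul.yaml content}
    changed_files: same shape, for every .zuul.yaml touched by the
                   change or by a change it depends on
    The running configuration itself is left untouched.
    """
    layout = copy.deepcopy(running)
    layout.update(changed_files)
    return layout

running = {
    ("zuul/zuul-jobs", "master"): [{"job": {"name": "tox", "run": "playbooks/tox.yaml"}}],
    ("zuul/zuul", "master"): [{"project": {"check": {"jobs": ["tox"]}}}],
}
# Hypothetical change that edits the job definition.
changed = {
    ("zuul/zuul-jobs", "master"): [{"job": {"name": "tox", "run": "playbooks/tox-v2.yaml"}}],
}
layout = speculative_config(running, changed)
```

Other changes in flight keep seeing the running configuration until this one actually merges.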
Zuul Architecture
Zuul comprises several services (mostly Python 3)
- zuul-scheduler
- zuul-executor
- zuul-merger
- zuul-web
- zuul-fingergw
- zuul-dashboard (javascript/react)
- zuul-proxy (c++)
- nodepool-launcher
- nodepool-builder
- RDBMS
- Gearman
- Zookeeper
Where Does Job Content Run?
Nodepool
- A separate service that works very closely with Zuul
- Zuul requires Nodepool but Nodepool can be used independently
- Creates and destroys zero or more node resources
- Resources can include VMs, containers, COE contexts or bare metal
- Static driver for allocating pre-existing nodes to jobs
- Optionally periodically builds images and uploads to clouds
Nodepool Launcher
Where build nodes should come from
- OpenStack
- Static
- Kubernetes Pod
- Kubernetes Namespace
- AWS
In work / coming soon:
- Azure
- GCE
Jobs
- Define node types needed from nodepool
- Define which ansible playbooks to run
- Jobs may be defined centrally or in the repo being tested
- Jobs have contextual variants that simplify configuration
- Job definitions support inheritance
Job Libraries
- Jobs are defined in git repos
- Naturally directly sharable between Zuul installations
- https://opendev.org/zuul/zuul-jobs
Simple Job
- job:
    name: tox
    pre-run: playbooks/setup-tox.yaml
    run: playbooks/tox.yaml
    post-run: playbooks/fetch-tox-output.yaml
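Inheritance affects the order in which playbooks run: a child's pre-run playbooks run after its parent's, its post-run playbooks run before its parent's, and a child's `run` replaces the parent's. The sketch below models that ordering in Python. The `tox` entry mirrors the job above; the child job and its playbook paths are hypothetical.

```python
JOBS = {
    "tox": {
        "parent": None,
        "pre-run": ["playbooks/setup-tox.yaml"],
        "run": "playbooks/tox.yaml",
        "post-run": ["playbooks/fetch-tox-output.yaml"],
    },
    # Hypothetical child job, not part of the slide.
    "tox-example-child": {
        "parent": "tox",
        "pre-run": ["playbooks/example-pre.yaml"],
        "post-run": ["playbooks/example-post.yaml"],
    },
}

def playbook_order(name, jobs=JOBS):
    """Return the playbooks for a job in the order they would run."""
    chain = []
    while name is not None:
        chain.append(jobs[name])
        name = jobs[name]["parent"]
    chain.reverse()  # root ancestor first, requested job last
    pre = [p for job in chain for p in job.get("pre-run", [])]
    # The most specific job that defines `run` wins.
    run = next(job["run"] for job in reversed(chain) if "run" in job)
    post = [p for job in reversed(chain) for p in job.get("post-run", [])]
    return pre + [run] + post

order = playbook_order("tox-example-child")
```

The nesting means a parent's setup is in place before the child adds to it, and the child cleans up before the parent collects results.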
What about job content?
- Written in Ansible
- Ansible is excellent at running one or more tasks in one or more places
- The answer to "how do I" is almost always "Ansible"
Checks Plugin
- Learned about checks plugin in Gothenburg
- Added support for checks plugin to Zuul
- Expanded HTTP support in Zuul's Gerrit driver
- Connected OpenDev Zuul to Gerrit's Gerrit
OpenDev Zuul Verifies Checks Plugins Changes Using Checks API
https://gerrit-review.googlesource.com/c/plugins/checks/+/245796
https://opendev.org/zuul/project-config/src/branch/master/zuul.d/pipelines.yaml#L183-L235
Future Work
- Robot comments instead of inline comments
- https://review.opendev.org/#/c/682601/4/.zuul.yaml
- Sub-checks
Sub-checks
- Zuul jobs are speculative future content
- https://review.opendev.org/#/c/630406/ vs https://review.opendev.org/#/c/690511
- One checker per project per zuul pipeline
- Sub-checks is a spec for dynamic runtime checks
- Zuul team collaborating on sub-checks spec
Using Zuul on Gerrit's Gerrit
- Sub-checks implemented in Gerrit and Zuul
- Add GCE support for Nodepool
- Work with Luca to run GCE-enabled Zuul
Who Is Running Zuul?
- Zuul has been in production for OpenStack for 7 years (in OpenStack VMs)
Also running at:
- Volvo
- BMW (control plane in OpenShift)
- GoDaddy (control plane in Kubernetes)
- GoodMoney (control plane in EKS, adding GKE)
- Le Bon Coin
- Easystack
- Western Digital
- TungstenFabric
- Huawei OpenLab
- IBM
- Red Hat
- others ...
Zuul as a Service: https://vexxhost.com/solutions/managed-zuul/
Important Links
- https://opendev.org/inaugust/inaugust.com/raw/branch/master/src/zuulv3/gus2019.rst
- https://opendev.org/zuul
- https://zuul-ci.org/
- https://zuul-ci.org/docs/zuul
- https://zuul-ci.org/docs/zuul-jobs/
- https://docs.openstack.org/infra/openstack-zuul-jobs/
- freenode:#zuul
Questions
images/questions.ans
Presentty
pan
Presentty
- Console presentations written in reStructuredText
- Cross-fade, pan, tilt, cut transitions
- Figlet, cowsay!
- https://pypi.python.org/pypi/presentty