. display in 68x24 .. display in 88x24
dissolve
Test Slide
images/testslide.ans
Preshow
images/cursor.ans images/cursor2.ans
Zuul
images/title.ans
This Talk
git clone https://opendev.org/inaugust/inaugust.com
cd inaugust.com/src/zuulv3
- Then:
cat overview.rst
- Or:
pip install presentty
presentty overview.rst
Red Hat
images/redhat.ans
Ansible
images/ansible.ans
OpenDev
"most insane CI infrastructure I've ever been a part of"
-- Alex Gaynor
"like the SpaceX of CI"
-- Emily Dunham
Zuul
images/zuul.ans
What Zuul Does
- "Speculative Future State"
- gated changes
- one or more git repositories
- integrated deliverable
- testing like deployment
Underlying Philosophy
- All changes flow through code review
- Changes only land if they pass all tests
- End-to-end integration testing is essential
- Computers are cheaper than humans
Ramifications of Philosophy
- No direct push access for anyone
- Software should be installable from source
- Testing should be automated and repeatable
- Developers write tests with their patches
- Code always works
Getting to Gating
No Tests / Manual Tests
- No test automation exists or ...
- Developer runs test suite before pushing code
- Prone to developer skipping tests for "trivial" changes
- Doesn't scale organizationally
Periodic Testing
- Developers push changes directly to shared branch
- CI system runs tests from time to time - reports if things still work
- "Who broke the build?"
- Leads to hacks like NVIE model
Post-Merge Testing
- Developers push changes directly to shared branch
- CI system is triggered by push - reports if push broke something
- Frequently batched / rolled up
- Easier to diagnose which change broke things
- Reactive - the bad changes are already in
Pre-Review Testing
- Changes are pushed to code review (Gerrit Change, GitHub PR, etc)
- CI system is triggered by code review change creation
- Test results inform review decisions
- Proactive - testing code before it lands
- Reviewers can get bored waiting for tests
- Only tests code as written, not potential result of merging code
Gating
- Changes are pushed to code review
- Gating system is triggered by code review approval
- Gating system merges code IFF tests pass
- Proactive - testing code before it lands
- Future state resulting from merge of code is tested
- Reviewers can fire-and-forget safely
Mix and Match
- Zuul supports all of those modes
- Zuul users frequently combine them
- Run pre-review (check) and gating (gate) on each change
- Post-merge/post-tag for release/publication automation
- Periodic for catching bitrot
Multi-repository integration
- Multiple source repositories are needed for deliverable
- Future state to be tested is the future state of all involved repos
To test proposed future state
- Get tip of each project. Merge appropriate change(s). Test.
- Changes must be serialized, otherwise state under test is invalid.
- Integrated deliverable repos share serialized queue
Speculative Execution
- Correct parallel processing of serialized future states
- Create virtual serial queue of changes for each deliverable
- Assume each change will pass its tests
- Test successive changes with previous changes applied to starting state
Nearest Non-Failing Change
(aka 'The Jim Blair Algorithm')
- If a change fails, move it aside
- Cancel all test jobs behind it in the queue
- Reparent queue items on the nearest non-failing change
- Restart tests with new state
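The queue behaviour on the last two slides can be sketched in a few lines of Python. This is a toy model, not Zuul's scheduler: the function names, the change labels "A" to "D" and the "tip" placeholder are all made up for illustration, and it only handles a single failure.

```python
def speculative_states(tip, queue):
    """Map each queued change to the stack of changes it is tested with.

    Every change assumes that all changes ahead of it will pass.
    """
    states = {}
    ahead = []
    for change in queue:
        ahead = ahead + [change]
        states[change] = [tip] + ahead
    return states


def handle_failure(queue, failed):
    """Move a failed change aside (hypothetical helper).

    Returns the new queue plus the changes that were behind the failure;
    their jobs are cancelled and restarted on the new state.
    """
    index = queue.index(failed)
    new_queue = queue[:index] + queue[index + 1:]
    restarted = queue[index + 1:]
    return new_queue, restarted


queue = ["A", "B", "C", "D"]
before = speculative_states("tip", queue)

# B fails: C and D are reparented on A, the nearest non-failing change.
queue, restarted = handle_failure(queue, "B")
after = speculative_states("tip", queue)

print(before["C"])  # ['tip', 'A', 'B', 'C']
print(after["C"])   # ['tip', 'A', 'C']
print(restarted)    # ['C', 'D']
```

Change A is untouched by the failure, so its running jobs keep going; only the items behind B pay the restart cost.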
Zuul Simulation
pan
- todo
images/zsim-00.ans
Zuul Simulation
cut
- todo
images/zsim-01.ans
Zuul Simulation
cut
- todo
images/zsim-02.ans
Zuul Simulation
cut
- todo
images/zsim-03.ans
Zuul Simulation
cut
- todo
images/zsim-04.ans
Zuul Simulation
cut
- todo
images/zsim-05.ans
Zuul Simulation
cut
- todo
images/zsim-06.ans
Zuul Simulation
cut
- todo
images/zsim-07.ans
Zuul Simulation
cut
- todo
images/zsim-08.ans
Zuul Simulation
cut
- todo
images/zsim-09.ans
Zuul Simulation
cut
- todo
images/zsim-10.ans
Zuul Simulation
cut
- todo
images/zsim-11.ans
Zuul Simulation
cut
- todo
images/zsim-12.ans
Zuul Simulation
cut
- todo
images/zsim-13.ans
Zuul Simulation
cut
- todo
images/zsim-14.ans
Zuul Simulation
cut
- todo
images/zsim-15.ans
Zuul Simulation
cut
- todo
images/zsim-16.ans
Zuul Simulation
cut
- todo
images/zsim-17.ans
Zuul Simulation
cut
- todo
images/zsim-18.ans
Zuul Simulation
cut
- todo
images/zsim-19.ans
Zuul Simulation
cut
- todo
images/zsim-20.ans
Zuul Simulation
cut
- todo
images/zsim-21.ans
Zuul Simulation
cut
- todo
images/zsim-22.ans
Lock Step Changes
- Circular Dependencies are not supported on purpose
- Rolling upgrades across interdependent services
- HOWEVER - many valid use cases (go/rust/c++) - support will be coming
Live Configuration Changes
Zuul is a distributed system, with a distributed configuration.
  - tenant:
      name: openstack
      source:
        gerrit:
          config-repos:
            - opendev/project-config
          project-repos:
            - zuul/zuul-jobs
            - zuul/zuul
            - zuul/nodepool
            - ansible/ansible
            - openstack/openstacksdk
Zuul Startup
- Read config file
Zuul Startup
- Read config file
- Ask mergers for branches of each repo
images/startup1.ans
Zuul Startup
- Read config file
- Ask mergers for branches of each repo
- Ask mergers for .zuul.yaml for each branch of each repo
images/startup2.ans
When .zuul.yaml Changes
- Zuul looks for changes to .zuul.yaml
- Asks mergers for updated content
- Splices into configuration used for that change
- Works with cross-repo dependencies
  ("This change depends on a change to the job definition")
Explicit Cross-Project Dependencies
- Developers can mark changes as being dependent
- Depends-On: footer - in commit or PR
- Zuul uses depends-on when constructing virtual serial queue
- Will not merge changes in gate before depends-on changes
- Works cross-repo AND cross-source
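As a rough illustration of the footer convention, the sketch below pulls Depends-On lines out of a commit message. The regular expression and the `find_depends_on` name are assumptions made for this example, not Zuul's own parser; the URL is the one used on the following slides.

```python
import re

# Assumed footer shape: "Depends-On: <url>" alone on a line.
DEPENDS_ON = re.compile(
    r"^Depends-On:[ \t]*(\S+)[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def find_depends_on(message):
    """Return every Depends-On target found in a commit or PR message."""
    return DEPENDS_ON.findall(message)


message = """Add plumbing for openstacksdk config

Needed before ironic calls can move to openstacksdk.

Depends-On: https://review.openstack.org/643601
"""

targets = find_depends_on(message)
print(targets)  # ['https://review.openstack.org/643601']
```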
Depends-On Example
- Service 'nova' talks to service 'ironic'
- Currently using 'python-ironicclient'
- Want to replace python-ironicclient with openstacksdk:
- https://review.openstack.org/643664
- Need some plumbing in nova first:
- https://review.openstack.org/642899
- That change "Depends-On" a change to openstacksdk
Depends-On Example - openstacksdk
- In openstacksdk, need a new method to extract config differently
- https://review.openstack.org/643601
- The nova plumbing change adds this:
Depends-On: https://review.openstack.org/643601
Depends-On Example - keystoneauth
- openstacksdk uses 'keystoneauth' library to make REST calls
- Config extraction change wants a new helper method in keystoneauth
- https://review.openstack.org/644251
- openstacksdk change adds:
Depends-On: https://review.openstack.org/644251
Depends-On Example - In the Gate
- When Zuul prepares git repos for the Ironic nova change:
- Tip of nova, plus nova plumbing change, plus nova ironic change
- Tip of openstacksdk, plus config method change
- Tip of keystoneauth, plus helper method change
- Developers iterate on the nova service change
- BEFORE finalizing and releasing keystoneauth and openstacksdk changes
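One way to picture how the repo states above are assembled is to walk the dependency chain and group the result by repository. This is a simplified sketch: the `resolve` helper is hypothetical, the edge from 643664 to 642899 is an assumption (the slides only say the plumbing is needed first), and it assumes there are no circular dependencies.

```python
# change -> changes it depends on (numbers taken from the slides)
depends_on = {
    "643664": ["642899"],  # nova ironic change needs nova plumbing (assumed edge)
    "642899": ["643601"],  # nova plumbing Depends-On openstacksdk
    "643601": ["644251"],  # openstacksdk Depends-On keystoneauth
}

repo_of = {
    "643664": "nova",
    "642899": "nova",
    "643601": "openstacksdk",
    "644251": "keystoneauth",
}


def resolve(change, depends_on, ordered=None):
    """List a change after everything it depends on (acyclic input only)."""
    if ordered is None:
        ordered = []
    for dep in depends_on.get(change, []):
        resolve(dep, depends_on, ordered)
    if change not in ordered:
        ordered.append(change)
    return ordered


order = resolve("643664", depends_on)

# Group the ordered changes into the stack applied on top of each repo tip.
stacks = {}
for change in order:
    stacks.setdefault(repo_of[change], []).append(change)

print(order)   # ['644251', '643601', '642899', '643664']
print(stacks)
```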
Zuul Architecture
We used to call "microservices" "distributed"
- Zuul is composed of several services (mostly python3)
- zuul-scheduler
- zuul-executor
- zuul-merger
- zuul-web
- zuul-dashboard (javascript/react)
- zuul-fingergw
- zuul-proxy (c++)
- RDBMS
- Gearman
- Zookeeper
- Nodepool
Zuul Architecture
images/architecture.ans
Where Does Job Content Run?
Nodepool
- A separate program that works very closely with Zuul
- Zuul requires Nodepool but Nodepool can be used independently
- Creates and destroys zero or more node resources
- Resources can include VMs, Containers, COE contexts or Bare Metal
- Static driver for allocating pre-existing nodes to jobs
- Optionally periodically builds images and uploads to clouds
Nodepool Launcher
Where build nodes should come from
- OpenStack
- Static
- Kubernetes Pod
- Kubernetes Namespace
- AWS
In work / coming soon:
- Azure
- GCE
What about job content?
- Written in Ansible
- Ansible is excellent at running one or more tasks in one or more places
- The answer to "how do I" is almost always "Ansible"
What Zuul Does
- Listens for code events
- Prepares appropriate job config and git repo states
- Requests nodes for test jobs from Nodepool
- Runs user-defined Ansible playbooks with nodes in an inventory
- Collects/reports results
- Potentially merges change
Jobs
- Jobs define test node needs
- Metadata defined in Zuul's configuration
- Execution content in Ansible
- Jobs may be defined centrally or in the repo being tested
- Jobs have contextual variants that simplify configuration
Job
  - job:
      name: base
      parent: null
      description: |
        The base job for Zuul.
      timeout: 1800
      nodeset:
        nodes:
          - name: primary
            label: ubuntu-bionic
      pre-run: playbooks/base/pre.yaml
      post-run:
        - playbooks/base/post-ssh.yaml
        - playbooks/base/post-logs.yaml
      secrets:
        - site_logs
Simple Job
  - job:
      name: tox
      pre-run: playbooks/setup-tox.yaml
      run: playbooks/tox.yaml
      post-run: playbooks/fetch-tox-output.yaml
Simple Job Inheritance
  - job:
      name: tox-py36
      parent: tox
      vars:
        tox_envlist: py36
Inheritance Works Like An Onion
- pre-run playbooks run in order of inheritance
- run playbook of job runs
- post-run playbooks run in reverse order of inheritance
- If pre-run playbooks fail, job is re-tried
- All post-run playbooks run - as far as pre-run playbooks got
Inheritance Example
For tox-py36 job
- base pre-run playbooks/base/pre.yaml
- tox pre-run playbooks/setup-tox.yaml
- tox run playbooks/tox.yaml
- tox post-run playbooks/fetch-tox-output.yaml
- base post-run playbooks/base/post-ssh.yaml
- base post-run playbooks/base/post-logs.yaml
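The onion ordering can be reproduced with a short Python sketch. The dictionary layout and the `playbook_order` function are invented for illustration and are not Zuul internals; the sketch also assumes that `tox` implicitly inherits from `base`, and it ignores retries and failures.

```python
jobs = {
    "base": {
        "parent": None,
        "pre-run": ["playbooks/base/pre.yaml"],
        "run": None,
        "post-run": [
            "playbooks/base/post-ssh.yaml",
            "playbooks/base/post-logs.yaml",
        ],
    },
    "tox": {
        "parent": "base",  # assumed: implicit parent is the base job
        "pre-run": ["playbooks/setup-tox.yaml"],
        "run": "playbooks/tox.yaml",
        "post-run": ["playbooks/fetch-tox-output.yaml"],
    },
    "tox-py36": {"parent": "tox", "pre-run": [], "run": None, "post-run": []},
}


def playbook_order(jobs, name):
    """Return playbooks in execution order for the named job."""
    chain = []
    while name is not None:
        chain.append(jobs[name])
        name = jobs[name]["parent"]
    chain.reverse()  # outermost ancestor first

    # pre-run: outside in; run: nearest job that defines one; post-run: inside out
    pre = [p for job in chain for p in job["pre-run"]]
    run = next(job["run"] for job in reversed(chain) if job["run"])
    post = [p for job in reversed(chain) for p in job["post-run"]]
    return pre + [run] + post


order = playbook_order(jobs, "tox-py36")
for playbook in order:
    print(playbook)
```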
Simple Job Variant
  - job:
      name: tox-py27
      branches: stable/mitaka
      nodeset:
        nodes:
          - name: ubuntu-trusty
            label: ubuntu-trusty
Nodesets for Multi-node Jobs
  - nodeset:
      name: ceph-cluster
      nodes:
        - name: controller
          label: centos-7
        - name: compute1
          label: fedora-28
        - name: compute2
          label: fedora-28
      groups:
        - name: ceph-osd
          nodes:
            - controller
        - name: ceph-monitor
          nodes:
            - controller
            - compute1
            - compute2
Multi-node Job
- nodesets are provided to Ansible for jobs in inventory
  - job:
      name: ceph-multinode
      nodeset: ceph-cluster
      run: playbooks/install-ceph.yaml

- Creates ansible inventory:

  controller ansible_host=1.2.3.4
  compute1 ansible_host=1.2.3.5
  compute2 ansible_host=1.2.3.6

  [ceph-osd]
  controller

  [ceph-monitor]
  controller
  compute1
  compute2
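To make the nodeset-to-inventory mapping concrete, here is a minimal Python sketch that renders the INI-style inventory shown on the slide. The `render_inventory` function is hypothetical, the addresses are the placeholder values from the slide, and the file Zuul actually writes may differ in format.

```python
nodeset = {
    "name": "ceph-cluster",
    "nodes": [
        {"name": "controller", "label": "centos-7"},
        {"name": "compute1", "label": "fedora-28"},
        {"name": "compute2", "label": "fedora-28"},
    ],
    "groups": [
        {"name": "ceph-osd", "nodes": ["controller"]},
        {"name": "ceph-monitor", "nodes": ["controller", "compute1", "compute2"]},
    ],
}

# Placeholder addresses; in practice these come from the allocated nodes.
addresses = {
    "controller": "1.2.3.4",
    "compute1": "1.2.3.5",
    "compute2": "1.2.3.6",
}


def render_inventory(nodeset, addresses):
    """Render hosts first, then one [group] section per nodeset group."""
    lines = [
        f"{node['name']} ansible_host={addresses[node['name']]}"
        for node in nodeset["nodes"]
    ]
    for group in nodeset.get("groups", []):
        lines.append("")
        lines.append(f"[{group['name']}]")
        lines.extend(group["nodes"])
    return "\n".join(lines)


inventory = render_inventory(nodeset, addresses)
print(inventory)
```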
Multi-node Ceph Job Content
  - hosts: all
    roles:
      - install-ceph

  - hosts: ceph-osd
    roles:
      - start-ceph-osd

  - hosts: ceph-monitor
    roles:
      - start-ceph-monitor

  - hosts: all
    roles:
      - do-something-interesting
Project With Central and Local Config
  # In opendev.org/openstack-infra/project-config:
  - project:
      name: openstack/nova
      templates:
        - openstack-tox-jobs

  # In opendev.org/openstack/nova/.zuul.yaml:
  - project:
      check:
        jobs:
          - nova-placement-functional-devstack
zuul-jobs standard library
- https://opendev.org/openstack-infra/zuul-jobs
- Repo containing general purpose job definitions
- Add the git repo directly to a local Zuul config
Project with Job Dependencies
  - project:
      release:
        jobs:
          - build-artifacts
          - upload-tarball:
              dependencies: build-artifacts
          - upload-pypi:
              dependencies: build-artifacts
          - notify-mirror:
              dependencies:
                - upload-tarball
                - upload-pypi
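The dependency graph in this release pipeline implies an execution order, which can be sketched with the standard library's `graphlib` (Python 3.9 or later). The `job_waves` helper is invented for this example and says nothing about how Zuul schedules jobs internally; it only shows which jobs become runnable together.

```python
from graphlib import TopologicalSorter

# job -> jobs it depends on, taken from the project definition above
dependencies = {
    "build-artifacts": [],
    "upload-tarball": ["build-artifacts"],
    "upload-pypi": ["build-artifacts"],
    "notify-mirror": ["upload-tarball", "upload-pypi"],
}


def job_waves(dependencies):
    """Group jobs into waves; each wave can start once the previous one is done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = job_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

The two upload jobs land in the same wave, so they can run in parallel once build-artifacts succeeds.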
Secrets
- Inspired by Kubernetes Secrets API
- Projects can add named encrypted secrets to their .zuul.yaml file
- Jobs can request to use secrets by name
- Jobs using secrets are not reconfigured speculatively
- Secrets can only be used by the same project they are defined in
- Public key per project:
{{ zuul_url }}/{{ tenant }}/{{ project }}.pub
Secret Example (note, no admins had to enable this)
  # In opendev.org/openstack/loci/.zuul.yaml:
  - secret:
      name: loci_docker_login
      data:
        user: loci-username
        password: !encrypted/pkcs1-oaep
          - gUEX4eY3JAk/Xt7Evmf/hF7xr6HpNRXTibZjrKTbmI4QYHlzEBrBbHey27Pt/eYvKKeKw
            hk8MDQ4rNX7ZK1v+CKTilUfOf4AkKYbe6JFDd4z+zIZ2PAA7ZedO5FY/OnqrG7nhLvQHE
            5nQrYwmxRp4O8eU5qG1dSrM9X+bzri8UnsI7URjqmEsIvlUqtybQKB9qQXT4d6mOeaKGE
            5h6Ydkb9Zdi4Qh+GpCGDYwHZKu1mBgVK5M1G6NFMy1DYz+4NJNkTRe9J+0TmWhQ/KZSqo
            4ck0x7Tb0Nr7hQzV8SxlwkaCTLDzvbiqmsJPLmzXY2jry6QsaRCpthS01vnj47itoZ/7p
            taH9CoJ0Gl7AkaxsrDSVjWSjatTQpsy1ub2fuzWHH4ASJFCiu83Lb2xwYts++r8ZSn+mA
            hbEs0GzPI6dIWg0u7aUsRWMOB4A+6t2IOJibVYwmwkG8TjHRXxVCLH5sY+i3MR+NicR9T
            IZFdY/AyH6vt5uHLQDU35+5n91pUG3F2lyiY5aeMOvBL05p27GTMuixR5ZoHcvSoHHtCq
            7Wnk21iHqmv/UnEzqUfXZOque9YP386RBWkshrHd0x3OHUfBK/WrpivxvIGBzGwMr2qAj
            /AhJsfDXKBBbhGOGk1u5oBLjeC4SRnAcIVh1+RWzR4/cAhOuy2EcbzxaGb6VTM=
Secret Example
  # In opendev.org/openstack/loci/.zuul.yaml:
  - job:
      name: publish-loci-cinder
      parent: loci-cinder
      post-run: playbooks/push
      secrets:
        - loci_docker_login

  # In opendev.org/openstack/loci/playbooks/push.yaml:
  - hosts: all
    tasks:
      - include_vars: vars.yaml
      - name: Push project to DockerHub
        block:
          - command: docker login -u {{ loci_docker_login.user }} -p {{ loci_docker_login.password }}
            no_log: True
          - command: docker push openstackloci/{{ project }}:{{ branch }}-{{ item.name }}
            with_items: "{{ distros }}"
Speculative Container Images
- Gating applied to continuously deployed container images
- Build and test images that depend on other images
- Build and test deployments comprising multiple images
- Without publishing to final location
- Publish the actual image that was built in the gate
Zuul is not New
- Has been in Production for OpenStack for Six Years
- Zuul is now a top-level effort of OpenStack Foundation
- Zuul v3 is the first release where non-OpenStack is a first-class use case
OpenDev - Largest Known Zuul
- 2KJPH (2,000 jobs per hour)
- Build Nodes from 16 Regions of 5 Public and 3 Private OpenStack Clouds
- Rackspace, Internap, OVH, Vexxhost, CityCloud
- Linaro (ARM), Limestone, Packethost
- 10,000 changes merged per month
Not just for OpenStack
- BMW (control plane in OpenShift)
- GoDaddy (control plane in private Kubernetes)
- GoodMoney (control plane in EKS, adding GKE)
- Le Bon Coin
- Easystack
- TungstenFabric
- OpenLab
- Red Hat
- others ...
Code Review Systems
- Gerrit
- GitHub (Public and Enterprise)
In work / coming soon:
- Pagure
- Gitea
Commonly Requested:
- GitLab
- Bitbucket
Support for non-git
- Nope
- helix4git may work for perforce, but is untested
Installation of Software
Ways to Install Zuul
- Containers: https://hub.docker.com/_/zuul/
- Windmill: http://opendev.org/openstack/windmill
- Software Factory: https://softwarefactory-project.io/
- Puppet: http://opendev.org/openstack-infra/puppet-zuul
Zuul Containers
- Published on every commit
- Application/Process containers
- Config / Data should be bind-mounted in
zuul/zuul-executor
- In k8s, zuul-executor must be run privileged
- Uses bubblewrap for unprivileged sandboxing
- Restriction may be lifted in the future
Release Management
- Zuul is Continuously Delivered and Deployed upstream
- Some users deploy Zuul with Zuul
- Releases are tagged from code run for OpenDev
- There is no intent to have a 'stable' release
- 'stable' is a synonym for "old and buggy"
zuul/zuul-scheduler
- SPOF
- We're working on it - HA/Distributed scheduler is coming
- Recommend running scheduler from tags
Quick Start
- docker-compose
https://zuul-ci.org/docs/zuul/admin/quick-start.html
Important Links
- https://zuul-ci.org/
- https://zuul-ci.org/docs/zuul
- https://zuul-ci.org/docs/zuul-jobs/
- freenode:#zuul
- https://opendev.org/zuul (https://git.zuul-ci.org/cgit/zuul)
Questions
images/questions.ans
Presentty
pan
Presentty
- Console presentations written in reStructuredText
- Cross-fade, pan, tilt, cut transitions
- https://pypi.python.org/pypi/presentty