. display in 68x24 .. display in 88x24
dissolve
Test Slide
images/testslide.ans
Zuul v3 for Gating
Who the heck am I?
- mordred on freenode
- @e_monty on twitter
- I like scuba diving, smoking meat and long walks on the beach ...
Red Hat
- I work for Red Hat in the CTO Office as the Chief Architect for CI/CD
images/redhat.ans
OpenStack
- I work on OpenStack.
- I sit on the Board of Directors. I was on the Technical Committee
images/openstack.ans
OpenStack Infra
- My primary technical role with OpenStack is working on the OpenStack CI system.
"most insane CI infrastructure I've ever been a part of"
-- Alex Gaynor
"OpenStack Infra are like the SpaceX of CI"
-- Emily Dunham
tl;dr
- multi repo
- integrated deliverable
- gated commits
- open tooling
- nobody is special
- there is no Dana, only Zuul
OpenStack Is
- Federated
- Distributed
- Large
- Open
- Not Alone
Federated
- Hundreds of involved companies
- No 'main' company
- "Decisions are made by those who show up"
- Union of priorities/use cases
Impact of being Federated
- No company can appoint humans to project positions
- The project cannot fire anyone
- Variable background of contributors
- Heavy reliance on consensus-oriented democracy
Distributed
- There is no office
- Contributor base is global
- Multitude of contributor backgrounds
Impact of being Distributed
- Constantly at odds with American Exceptionalism
- Tooling must empower all contributors, regardless of background, skill level or cultural context
- Heavy preference for text-based communication
- Cannot assume US-centric needs or solutions
Large numbers of
- Contributors (~2k in any given 6 month period)
- Changes
- Code Repositories (1904 as of this morning)
OpenStack Scale Comparison
- 2KJPH (2,000 jobs per hour)
- Build Nodes from 13 Regions of 5 Public and 2 Private OpenStack Clouds
- Rackspace, Internap, OVH, Vexxhost, CityCloud and Linaro, Limestone
- 10,000 changes merged per month
By comparison, our friends at the amazing project Ansible received 13,000 changes and had merged 8,000 of them in its first 4 years.
Four Opens
- Open Source (we don't hold back Enterprise features, we don't cripple things)
- Open Design (design process open to all, decisions are not made inside company doors)
- Open Development (public source code, public code review, all code is reviewed and gated)
- Open Community (lazy consensus, democratic leadership from participants, public logged meetings in IRC, public archived mailing lists)
Nobody is Special
- No dictators
- Aggressively egalitarian
- No "pay for play"
Free Software Needs Free Tools
Free Software needs Free Tools
-- Benjamin Mako Hill
Fifth Open - Four Opens Applied to our Infrastructure
- All tools must be Open Source
- Any external services must be Open Source
- Strongly avoid single-vendor
All Tools are Open Source
No Developer is ever required to use a proprietary tool to work on OpenStack.
Sixth Open - Four Opens Applied to Operations
- Ops driven by git/code-review - not by humans running commands
- Run as many things CD as possible
- Infrastructure team operates the same as the project
- Core reviewer status and root access are earned
- Human-initiated ops actions (running commands, clicking a UI) are a bug
- Keys/secrets are not Open :)
We're Not Alone
- Dependencies (libvirt/kvm/xen, mysql/pg, rabbit, python/javascript, ceph/gluster, ansible/salt/puppet/chef, ovs/odl)
- Adjacencies (kubernetes, ansible, terraform, opnfv, spinnaker)
- Vendors (plugins, products, services, distros)
Developer Process In a Nutshell
- Code Review - nobody has direct commit/push access
- 3rd-Party CI for vendors
- Gated Commits
@jessfraz
dear @github,
hope you are well, love ya so much
can I get a "squash and merge once
CI passes if no other changes are
pushed" button
k thx
Gated Commits
   Hack              Review              Test
 =========         ==========         ==========
            push              approve
       +-------------+    +-------------+
       |             |    |             |
+------+--+       +--v----+--+       +--v-------+
|         |       |          |       |          |
| $EDITOR |       |  Gerrit  |       |   Zuul   |
|         |       |          |       |          |
+------^--+       +--+----^--+       +--+-------+
       |             |    |             |
       +-------------+    +-------------+
            clone              merge
Gating
Every change proposed for a repository is tested before it merges.
Co-gating
Changes to a set of repositories merge monotonically such that each change is tested with the current state of all other related repositories before it merges.
Parallel Co-gating
Changes are serialized such that each change is tested with all of the changes ahead of it to satisfy the gating requirement while being able to run tests for multiple changes simultaneously.
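A minimal Python sketch of the parallel co-gating idea above. The function names are hypothetical and `passes` is a stand-in for running the real jobs; this is not Zuul's implementation. `speculative_states` lists what each queued change is tested with, and `gate` computes which changes end up merging when a failing change is dropped and the changes behind it are retested without it.

```python
from typing import Callable, List, Sequence, Tuple


def speculative_states(queue: Sequence[str]) -> List[Tuple[str, ...]]:
    """For each queued change, list the changes it is tested with.

    Every change is tested on top of all changes ahead of it, so all of
    these states can be built and tested at the same time.
    """
    return [tuple(queue[: i + 1]) for i in range(len(queue))]


def gate(queue: Sequence[str],
         passes: Callable[[Tuple[str, ...]], bool]) -> List[str]:
    """Return the changes that merge, in merge order.

    A change merges only if the tests pass with every change that merged
    ahead of it. A failing change is dropped from the queue, so the
    changes behind it are evaluated without it.
    """
    merged: List[str] = []
    for change in queue:
        candidate = tuple(merged) + (change,)
        if passes(candidate):
            merged.append(change)
    return merged


if __name__ == "__main__":
    queue = ["A", "B", "C"]
    print(speculative_states(queue))
    # Hypothetical test result: anything that includes change B fails.
    print(gate(queue, lambda state: "B" not in state))
```

In this sketch the optimistic case costs one test run per change, all in parallel; only a failure forces the changes behind it to be retested.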
Approve and move on
- Reviewers approve changes whenever
- Automation correctly deals with the rest
Zuul
images/zuul.ans
Zuul
- Multi-repo parallel co-gating engine
- When to run
- Where to run it
- With what git states
- How to respond to results
Zuul Architecture
images/architecture.ans
Not just for OpenStack
- Zuul v3 is in production for OpenStack (in OpenStack VMs)
Also running at:
- BMW (control plane in OpenShift)
- GoDaddy (control plane in Kubernetes)
- Red Hat
- OpenContrail
- OpenLab
- others ...
Zuul is not a general purpose automation framework
Zuul in a nutshell
- Listens for code events
- Prepares appropriate job config and git repo states
- Allocates nodes for test jobs
- Pushes git repo states to nodes
- Runs user-defined Ansible playbooks
- Collects/reports results
- Potentially merges change
Zuul Simulation
pan
- That was a lot of words - let's walk through it one step at a time
- Here we have two git repos, called nova and keystone, and their current HEAD state
images/zsim-00.ans
Zuul Simulation
cut
- A change is approved for Nova
images/zsim-01.ans
Zuul Simulation
cut
- Zuul starts running jobs for it
- The tests will test the current state of nova and keystone PLUS this nova change
images/zsim-02.ans
Zuul Simulation
cut
- A change is approved for Keystone
images/zsim-03.ans
Zuul Simulation
cut
- The tests will test the current state of nova and keystone PLUS this nova change
images/zsim-04.ans
Zuul Simulation
cut
- todo
images/zsim-05.ans
Zuul Simulation
cut
- todo
images/zsim-06.ans
Zuul Simulation
cut
- todo
images/zsim-07.ans
Zuul Simulation
cut
- todo
images/zsim-08.ans
Zuul Simulation
cut
- todo
images/zsim-09.ans
Zuul Simulation
cut
- todo
images/zsim-10.ans
Zuul Simulation
cut
- todo
images/zsim-11.ans
Zuul Simulation
cut
- todo
images/zsim-12.ans
Zuul Simulation
cut
- todo
images/zsim-13.ans
Zuul Simulation
cut
- todo
images/zsim-14.ans
Zuul Simulation
cut
- todo
images/zsim-15.ans
Zuul Simulation
cut
- todo
images/zsim-16.ans
Zuul Simulation
cut
- todo
images/zsim-17.ans
Zuul Simulation
cut
- todo
images/zsim-18.ans
Zuul Simulation
cut
- todo
images/zsim-19.ans
Zuul Simulation
cut
- todo
images/zsim-20.ans
Zuul Simulation
cut
- todo
images/zsim-21.ans
Zuul Simulation
cut
- todo
images/zsim-22.ans
Jobs
- Jobs run on nodes from nodepool (static or dynamic)
- Metadata defined in Zuul's configuration
- Execution content in Ansible (with live streaming!)
- Jobs may be defined centrally or in the repo being tested
- Jobs have contextual variants that simplify configuration
Shared Job Configs
- Job config repos are all in git
- Designed to support directly sharing job configurations
- git.zuul-ci.org/zuul-jobs repo is a 'standard library' to be directly shared between zuul installations
Job
- job:
    name: base
    parent: null
    description: |
      The base job for Zuul.
    timeout: 1800
    nodeset:
      nodes:
        - name: primary
          label: centos-7
    pre-run: playbooks/base/pre.yaml
    post-run:
      - playbooks/base/post-ssh.yaml
      - playbooks/base/post-logs.yaml
    secrets:
      - site_logs
Simple Job
- job:
    name: tox
    pre-run: playbooks/setup-tox.yaml
    run: playbooks/tox.yaml
    post-run: playbooks/fetch-tox-output.yaml
- job:
    name: tox-py27
    parent: tox
    vars:
      tox_envlist: py27
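How a child job such as tox-py27 picks up its parent's playbooks can be sketched in Python. This is an illustrative model, not Zuul's code: the `JOBS` dictionary and `resolve` function are hypothetical, `tox` is given no parent here for simplicity, and the ordering assumed is that pre-run playbooks run parent first while post-run playbooks run child first.

```python
from typing import Any, Dict, List, Optional

# Hypothetical in-memory copy of the two jobs from the slide.
JOBS: Dict[str, Dict[str, Any]] = {
    "tox": {
        "parent": None,  # simplification for this sketch
        "pre-run": ["playbooks/setup-tox.yaml"],
        "run": "playbooks/tox.yaml",
        "post-run": ["playbooks/fetch-tox-output.yaml"],
    },
    "tox-py27": {
        "parent": "tox",
        "vars": {"tox_envlist": "py27"},
    },
}


def resolve(name: str, jobs: Dict[str, Dict[str, Any]] = JOBS) -> Dict[str, Any]:
    """Flatten a job and its ancestors into one effective definition."""
    chain: List[Dict[str, Any]] = []
    current: Optional[str] = name
    while current is not None:
        chain.append(jobs[current])
        current = jobs[current].get("parent")
    chain.reverse()  # oldest ancestor first

    resolved: Dict[str, Any] = {"pre-run": [], "run": None, "post-run": [], "vars": {}}
    for job in chain:
        # Parent pre-run playbooks come before the child's.
        resolved["pre-run"].extend(job.get("pre-run", []))
        # Child post-run playbooks come before the parent's.
        resolved["post-run"] = list(job.get("post-run", [])) + resolved["post-run"]
        # The nearest definition of "run" wins.
        if "run" in job:
            resolved["run"] = job["run"]
        # Child variables override parent variables.
        resolved["vars"].update(job.get("vars", {}))
    return resolved


if __name__ == "__main__":
    print(resolve("tox-py27"))
```

The child only states what differs (its name, parent and one variable); everything else comes from the parent.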
Playbooks
- Jobs run Ansible playbooks
- Playbooks may be defined centrally or in the repo being tested
- Playbooks can use roles from current or other Zuul repos (or Galaxy, coming soon)
- Playbooks are run on the zuul-executor using bubblewrap https://github.com/projectatomic/bubblewrap
- Playbooks are not allowed to execute content on 'localhost'
Devstack-gate / Tempest Playbook
# devstack-gate / tempest playbook
- hosts: all
  roles:
    - setup-multinode-networking
    - partition-swap
    - configure-mirrors
    - run-devstack
    - run-tempest
Simple Shell Playbook
- hosts: controller
  tasks:
    - shell: ./run_tests.sh
Test Like Production
If you use Ansible for deployment, your test and deployment processes and playbooks are the same
What if you don't use Ansible?
OpenStack Infra Control Plane (currently) uses Puppet
# In git.openstack.org/openstack-infra/project-config/roles/legacy-install-afs-with-puppet/tasks/main.yaml
- name: Install puppet
  shell: ./install_puppet.sh
  args:
    chdir: "{{ ansible_user_dir }}/src/git.openstack.org/openstack-infra/system-config"
  environment:
    # Skip setting up pip, our images have already done this.
    SETUP_PIP: "false"
  become: yes
- name: Copy manifest
  copy:
    src: manifest.pp
    dest: "{{ ansible_user_dir }}/manifest.pp"
- name: Run puppet
  puppet:
    manifest: "{{ ansible_user_dir }}/manifest.pp"
  become: yes
Cross-Project Example Problem
- User reports bug in shade - auto_ip is not discovering their NAT properly
- Two fixes, one to detection algorithm, one to config override
- Config override requires adding support to os-client-config
- Once support is added to os-client-config, it can be consumed in shade
- How do we integration test this without releasing os-client-config?
Cross-Project Dependencies
Testing or gating dependencies (including jobs) manually specified by developers
shade https://review.openstack.org/513913/
Add unittest tips jobs
Change-Id: I5b411be5c5aa43535fa89a51d6099aadd7a8ea60
os-client-config https://review.openstack.org/513915
Add shade-tox-tips jobs
Change-Id: Ie3e9a4deca1d74b94e810e87e130706fe15fe2c9
Depends-On: https://review.openstack.org/513913/
os-client-config https://review.openstack.org/513751/
Added nat_source flag for networks
Change-Id: I3d8dd6d734a1013d2d4a43e11c3538c3a345820b
shade https://review.openstack.org/#/c/513914
Add support for configured NAT source variable
Change-Id: I4b50c2323a487b5ce90f9d38a48be249cfb739c5
Depends-On: https://review.openstack.org/513751/
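The Depends-On footers above are plain lines in the commit message, so pulling them out is straightforward. A small Python sketch, assuming one URL per footer line; the regular expression and the function name `depends_on` are illustrative and not taken from Zuul's source.

```python
import re
from typing import List

# One footer per line: "Depends-On: <url>". Assumed format for this sketch.
DEPENDS_ON = re.compile(
    r"^Depends-On:[ \t]*(\S+)[ \t]*$", re.IGNORECASE | re.MULTILINE
)


def depends_on(commit_message: str) -> List[str]:
    """Return every Depends-On target in a commit message, in order."""
    return DEPENDS_ON.findall(commit_message)


if __name__ == "__main__":
    message = (
        "Add shade-tox-tips jobs\n"
        "\n"
        "Change-Id: Ie3e9a4deca1d74b94e810e87e130706fe15fe2c9\n"
        "Depends-On: https://review.openstack.org/513913/\n"
    )
    print(depends_on(message))
```

Because the target is a full URL, the same footer can point at a change in another repository or another code review system.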
shade: Add unittest tips jobs
- In git.openstack.org/openstack-infra/shade/.zuul.yaml:
- job:
    name: shade-tox-py27-tips
    parent: openstack-tox-py27
    description: |
      Run tox python 27 unittests against master of important libs
    required-projects:
      - openstack-infra/shade
      - openstack/keystoneauth
      - openstack/os-client-config
- job:
    name: shade-tox-py35-tips
    parent: openstack-tox-py35
    description: |
      Run tox python 35 unittests against master of important libs
    required-projects:
      - openstack-infra/shade
      - openstack/keystoneauth
      - openstack/os-client-config
shade: Add unittest tips project-template
- In git.openstack.org/openstack-infra/shade/.zuul.yaml:
- project-template:
    name: shade-tox-tips
    check:
      jobs:
        - shade-tox-py27-tips
        - shade-tox-py35-tips
    gate:
      jobs:
        - shade-tox-py27-tips
        - shade-tox-py35-tips
shade: Add unittest tips project-template to project
- In git.openstack.org/openstack-infra/shade/.zuul.yaml:
- project:
    templates:
      - publish-to-pypi
      - publish-openstack-sphinx-docs
      - shade-tox-tips
os-client-config: Add shade-tox-tips jobs
- In git.openstack.org/openstack/os-client-config/.zuul.yaml:
- project:
    templates:
      - shade-tox-tips
os-client-config: Add nat_source flag for networks
diff --git a/os_client_config/cloud_config.py b/os_client_config/cloud_config.py
index 2e97629..d1a6983 100644
--- a/os_client_config/cloud_config.py
+++ b/os_client_config/cloud_config.py
@@ -581,3 +581,10 @@ class CloudConfig(object):
                 if net['nat_destination']:
                     return net['name']
         return None
+
+    def get_nat_source(self):
+        """Get network used for NAT source."""
+        for net in self.config['networks']:
+            if net.get('nat_source'):
+                return net['name']
+        return None
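To see what the new method returns, here is a self-contained Python sketch. The `CloudConfig` class below is a minimal stand-in written for this example, not the real os-client-config class; only the `get_nat_source` body is copied from the diff, and the network names are made up.

```python
class CloudConfig(object):
    """Minimal stand-in holding only the config dictionary."""

    def __init__(self, config):
        self.config = config

    def get_nat_source(self):
        """Get network used for NAT source."""
        for net in self.config['networks']:
            if net.get('nat_source'):
                return net['name']
        return None


if __name__ == "__main__":
    # Hypothetical network list: only "public" is flagged as NAT source.
    cloud = CloudConfig({
        'networks': [
            {'name': 'private'},
            {'name': 'public', 'nat_source': True},
        ],
    })
    print(cloud.get_nat_source())
```

The method returns the name of the first network flagged with nat_source, or None when no network carries the flag.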
shade: Add support for configured NAT source variable
Zuul 10-21 13:57
Patch Set 5: Verified-1
Build failed.
openstack-tox-pep8 SUCCESS in 2m 29s
openstack-tox-py27 FAILURE in 2m 34s
build-openstack-releasenotes SUCCESS in 2m 47s
openstack-tox-py35 FAILURE in 2m 41s
openstack-tox-cover POST_FAILURE in 3m 52s (non-voting)
build-openstack-sphinx-docs SUCCESS in 2m 57s
shade-tox-py27-tips SUCCESS in 3m 18s
shade-tox-py35-tips SUCCESS in 2m 28s
OpenStack GitHub Support for Cross Community Testing
- OpenStack does not use GitHub, but other people do
- GitHub App "OpenStack Zuul"
- App added to GitHub project by project admin
- Project added to OpenStack's main.yaml
- Test interactions between OpenStack and important adjacent communities
- https://github.com/ansible/ansible/pull/20974
Cross Source Dependencies
shade https://review.openstack.org/539563
Shift voting flag and test_matrix_branch for ansible-devel job
Change-Id: Ic9d3983de641dbe618c65b2cbf2dcfa3686575df
ansible https://github.com/ansible/ansible/pull/34925
continue fact gathering even without dmidecode
ansible https://github.com/ansible/ansible/pull/20974
Make a generalized OpenStack cloud constructor
Depends-On: https://review.openstack.org/539563
Depends-On: https://github.com/ansible/ansible/pull/34925
Nodesets for Multi-node Jobs
- nodeset:
    name: ceph-cluster
    nodes:
      - name: controller
        label: centos-7
      - name: compute1
        label: fedora-26
      - name: compute2
        label: fedora-26
    groups:
      - name: ceph-osd
        nodes:
          - controller
      - name: ceph-monitor
        nodes:
          - controller
          - compute1
          - compute2
Multi-node Job
- nodesets are provided to Ansible for jobs in inventory
- job:
    name: ceph-multinode
    nodeset: ceph-cluster
    run: playbooks/install-ceph.yaml
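A rough Python sketch of how the ceph-cluster nodeset could map onto Ansible inventory groups. The `NODESET` dictionary mirrors the slide; the `inventory_groups` function is hypothetical, and the inventory Zuul really writes also carries connection details and variables that are left out here.

```python
from typing import Any, Dict, List

# The ceph-cluster nodeset from the slide, as plain Python data.
NODESET: Dict[str, Any] = {
    "name": "ceph-cluster",
    "nodes": [
        {"name": "controller", "label": "centos-7"},
        {"name": "compute1", "label": "fedora-26"},
        {"name": "compute2", "label": "fedora-26"},
    ],
    "groups": [
        {"name": "ceph-osd", "nodes": ["controller"]},
        {"name": "ceph-monitor", "nodes": ["controller", "compute1", "compute2"]},
    ],
}


def inventory_groups(nodeset: Dict[str, Any]) -> Dict[str, List[str]]:
    """Map each group name to its hosts; "all" holds every node."""
    groups: Dict[str, List[str]] = {
        "all": [node["name"] for node in nodeset["nodes"]],
    }
    for group in nodeset.get("groups", []):
        groups[group["name"]] = list(group["nodes"])
    return groups


if __name__ == "__main__":
    for group, hosts in inventory_groups(NODESET).items():
        print(group, hosts)
```

A playbook can then target `hosts: ceph-osd` or `hosts: all` without knowing which cloud the nodes came from.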
Multi-node Ceph Job Content
- hosts: all
  roles:
    - install-ceph
- hosts: ceph-osd
  roles:
    - start-ceph-osd
- hosts: ceph-monitor
  roles:
    - start-ceph-monitor
- hosts: all
  roles:
    - do-something-interesting
Projects
- Projects are git repositories
- Specify a set of jobs for each pipeline
- Golang-style git repo naming has been adopted:
zuul@ubuntu-xenial:~$ find /home/zuul/src -mindepth 3 -maxdepth 3 -type d
/home/zuul/src/git.openstack.org/openstack-infra/shade
/home/zuul/src/git.openstack.org/openstack/keystoneauth
/home/zuul/src/git.openstack.org/openstack/os-client-config
/home/zuul/src/github.com/ansible/ansible
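The layout in the listing above is simply the source root joined with the hostname-qualified project name. A short Python sketch; the helper names are hypothetical and the default root is the one shown in the listing.

```python
import posixpath
from typing import Tuple


def split_canonical_name(canonical_name: str) -> Tuple[str, str]:
    """Split "host/org/repo" into (host, "org/repo")."""
    host, _, project = canonical_name.partition("/")
    return host, project


def checkout_path(canonical_name: str, src_root: str = "/home/zuul/src") -> str:
    """Directory where a project is checked out on a test node."""
    return posixpath.join(src_root, canonical_name)


if __name__ == "__main__":
    print(split_canonical_name("git.openstack.org/openstack-infra/shade"))
    print(checkout_path("github.com/ansible/ansible"))
```

Including the hostname keeps projects from different sources apart even when they share an organisation and repository name.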
Project with Job Dependencies
# In git.openstack.org/openstack-infra/project-config:
- project:
    name: openstack/nova
    release:
      jobs:
        - build-artifacts
        - upload-tarball:
            dependencies: build-artifacts
        - upload-pypi:
            dependencies: build-artifacts
        - notify-mirror:
            dependencies:
              - upload-tarball
              - upload-pypi
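The job dependencies above form a small graph. The Python sketch below uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later) to show which jobs could start together; it illustrates the ordering only and is not how Zuul schedules jobs. `RELEASE_JOBS` and `run_waves` are names made up for this example.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Each job mapped to the jobs it depends on, as in the release pipeline above.
RELEASE_JOBS: Dict[str, Set[str]] = {
    "build-artifacts": set(),
    "upload-tarball": {"build-artifacts"},
    "upload-pypi": {"build-artifacts"},
    "notify-mirror": {"upload-tarball", "upload-pypi"},
}


def run_waves(jobs: Dict[str, Set[str]]) -> List[List[str]]:
    """Group jobs into waves; all jobs in one wave can run in parallel."""
    sorter = TopologicalSorter(jobs)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        waves.append(ready)
        sorter.done(*ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(run_waves(RELEASE_JOBS), start=1):
        print(number, wave)
```

With the jobs from the slide, the two uploads land in the same wave after build-artifacts, and notify-mirror runs last.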
Secrets
- Inspired by Kubernetes Secrets API
- Projects can add named encrypted secrets to their .zuul.yaml file
- Jobs can request to use secrets by name
- Jobs using secrets are not reconfigured speculatively
- Secrets can only be used by the same project they are defined in
- Public key per project:
{{ zuul_url }}/tenant/{{ tenant }}/key/{{ project }}.pub
Secret Example (note, no admins had to enable this)
# In git.openstack.org/openstack/loci/.zuul.yaml:
- secret:
name: loci_docker_login
data:
user: !encrypted/pkcs1-oaep
- r8Nbpq5olmfLF035BZ/CUoFLIdhvBi/49KuochOAHbvns+xMiho3C7MEFzYDqJX3IhHde
BICYOgK7qnyINOIZL2e7pl75rEdHQwJjSFUMkpdY6wEP7f9hpolj9xVp0ifHUVQqPHMRn
zoPFd8MEAHxH5GLmc2SWJ98E/QUqGltxBi1YRSZoCcNtq3tHFK5Y+xQlLhIseJ2HkpDs6
YXOGP9Qt4Va6sdyBcA90H+apSAcYA3Duu962ySZQAsYNui/3NQq3gLA+OZeyTJtcrh4hj
Rb5dBnDWfSrMpxdNkbPXXgbQaxO3T0L4jbaOF8VKEsiI9olBrOeV2M9ddYJjSsHGj4XR8
4vwS0+doB7np93fujiDuHVgdG8R40NW2GznyKRlRtzAORla7Mzw1Y1MokcUyY6p1LlLLl
wUuWYCCEuRciOPhZXQ2u42qju/zrK2/dPnO8HfUINSrN0WbNq14ZwPpbj0ro02oGPbtwu
OTw1z+N0Nc+GuLWlwYJGYM/z0UnvDR3WEBc2kXbVev9w4n0cB3RyphML2PDZZWbw8tjnX
h1VsAOJ0Qo4qq1K/ft95ypd+vtjkfepEgHEBmJNwutJa9IHAkGfrkO9VkpUTPpfffnPwz
d0/zaaadNl6MLQUSutRwY23YIIbv+fmukxw2vnJmvn6abkBlMya7KgtifwNA8c=
password: !encrypted/pkcs1-oaep
- gUEX4eY3JAk/Xt7Evmf/hF7xr6HpNRXTibZjrKTbmI4QYHlzEBrBbHey27Pt/eYvKKeKw
hk8MDQ4rNX7ZK1v+CKTilUfOf4AkKYbe6JFDd4z+zIZ2PAA7ZedO5FY/OnqrG7nhLvQHE
5nQrYwmxRp4O8eU5qG1dSrM9X+bzri8UnsI7URjqmEsIvlUqtybQKB9qQXT4d6mOeaKGE
5h6Ydkb9Zdi4Qh+GpCGDYwHZKu1mBgVK5M1G6NFMy1DYz+4NJNkTRe9J+0TmWhQ/KZSqo
4ck0x7Tb0Nr7hQzV8SxlwkaCTLDzvbiqmsJPLmzXY2jry6QsaRCpthS01vnj47itoZ/7p
taH9CoJ0Gl7AkaxsrDSVjWSjatTQpsy1ub2fuzWHH4ASJFCiu83Lb2xwYts++r8ZSn+mA
hbEs0GzPI6dIWg0u7aUsRWMOB4A+6t2IOJibVYwmwkG8TjHRXxVCLH5sY+i3MR+NicR9T
IZFdY/AyH6vt5uHLQDU35+5n91pUG3F2lyiY5aeMOvBL05p27GTMuixR5ZoHcvSoHHtCq
7Wnk21iHqmv/UnEzqUfXZOque9YP386RBWkshrHd0x3OHUfBK/WrpivxvIGBzGwMr2qAj
/AhJsfDXKBBbhGOGk1u5oBLjeC4SRnAcIVh1+RWzR4/cAhOuy2EcbzxaGb6VTM=
Secret Example
# In git.openstack.org/openstack/loci/.zuul.yaml:
- job:
    name: publish-loci-cinder
    parent: loci-cinder
    post-run: playbooks/push
    secrets:
      - loci_docker_login
# In git.openstack.org/openstack/loci/playbooks/push.yaml:
- hosts: all
  tasks:
    - include_vars: vars.yaml
    - name: Push project to DockerHub
      block:
        - command: docker login -u {{ loci_docker_login.user }} -p {{ loci_docker_login.password }}
          no_log: True
        - command: docker push openstackloci/{{ project }}:{{ branch }}-{{ item.name }}
          with_items: "{{ distros }}"
What's Next?
- MQTT publisher
- node providers
- kubernetes
- OCI/docker
- Mac Stadium (for our Ansible friends)
- ec2/gce/azure
- native container/kubernetes job execution
Important Links
- https://zuul-ci.org/
- https://git.zuul-ci.org/cgit/zuul
- https://zuul-ci.org/docs/zuul
- https://zuul-ci.org/docs/zuul-jobs/
- https://docs.openstack.org/infra/manual/zuulv3.html
- https://docs.openstack.org/infra/openstack-zuul-jobs/
- https://storyboard.openstack.org/#!/project/679
- https://storyboard.openstack.org/#!/board/41
- freenode:#zuul
Questions
images/questions.ans
Presentty
pan
Presentty
- Console presentations written in reStructuredText
- Cross-fade, pan, tilt, cut transitions
- Figlet, cowsay!
- https://pypi.python.org/pypi/presentty