inaugust.com/src/gating/gating.rst


dissolve

Test Slide

images/testslide.ans

Zuul v3 for Gating

Who the heck am I?

  • mordred on freenode
  • @e_monty on twitter
  • I like scuba diving, smoking meat and long walks on the beach ...

Red Hat

  • I work for Red Hat in the CTO Office as the Chief Architect for CI/CD

images/redhat.ans

OpenStack

  • I work on OpenStack.
  • I sit on the Board of Directors. I was on the Technical Committee

images/openstack.ans

OpenStack Infra

  • My primary technical role with OpenStack is working on the OpenStack CI system.
"most insane CI infrastructure I've ever been a part of"

  -- Alex Gaynor

"OpenStack Infra are like the SpaceX of CI"

  -- Emily Dunham

tl;dr

  • multi repo
  • integrated deliverable
  • gated commits
  • open tooling
  • nobody is special
  • there is no Dana, only Zuul

OpenStack Is

  • Federated
  • Distributed
  • Large
  • Open
  • Not Alone

Federated

  • Hundreds of involved companies
  • No 'main' company
  • "Decisions are made by those who show up"
  • Union of priorities/use cases

Impact of being Federated

  • No company can appoint humans to project positions
  • The project cannot fire anyone
  • Variable background of contributors
  • Heavy reliance on consensus-oriented democracy

Distributed

  • There is no office
  • Contributor base is global
  • Multitude of contributor backgrounds

Impact of being Distributed

  • Constantly at odds with American Exceptionalism
  • Tooling must empower all contributors, regardless of background, skill level or cultural context
  • Heavy preference for text-based communication
  • Cannot assume US-centric needs or solutions

Large numbers of

  • Contributors (~2k in any given 6 month period)
  • Changes
  • Code Repositories (1904 as of this morning)

OpenStack Scale Comparison

  • 2KJPH (2,000 jobs per hour)
  • Build Nodes from 13 Regions of 5 Public and 2 Private OpenStack Clouds
  • Rackspace, Internap, OVH, Vexxhost, CityCloud and Linaro, Limestone
  • 10,000 changes merged per month

OpenStack Scale Comparison

  • 2KJPH (2,000 jobs per hour)
  • Build Nodes from 13 Regions of 5 Public and 2 Private OpenStack Clouds
  • Rackspace, Internap, OVH, Vexxhost, CityCloud and Linaro, Limestone
  • 10,000 changes merged per month

By comparison, our friends at the amazing Ansible project received 13,000 changes and merged 8,000 of them in its first 4 years.

Four Opens

  • Open Source (we don't hold back Enterprise features, we don't cripple things)
  • Open Design (design process open to all, decisions are not made inside company doors)
  • Open Development (public source code, public code review, all code is reviewed and gated)
  • Open Community (lazy consensus, democratic leadership from participants, public logged meetings in IRC, public archived mailing lists)

Nobody is Special

  • No dictators
  • Aggressively egalitarian
  • No "pay for play"

Free Software Needs Free Tools

"Free Software needs Free Tools"

  -- Benjamin Mako Hill

Fifth Open - Four Opens Applied to our Infrastructure

  • All tools must be Open Source
  • Any external services must be Open Source
  • Strongly avoid single-vendor

All Tools are Open Source

No Developer is ever required to use a proprietary tool to work on OpenStack.

Sixth Open - Four Opens Applied to Operations

  • Ops driven by git/code-review - not by humans running commands
  • Run as many things CD as possible
  • Infrastructure team operates the same as the project
  • Core reviewer status and root access are earned
  • Human-initiated ops actions (running commands, clicking a UI) are a bug
  • Keys/secrets are not Open :)

We're Not Alone

  • Dependencies (libvirt/kvm/xen, mysql/pg, rabbit, python/javascript, ceph/gluster, ansible/salt/puppet/chef, ovs/odl)
  • Adjacencies (kubernetes, ansible, terraform, opnfv, spinnaker)
  • Vendors (plugins, products, services, distros)

Developer Process In a Nutshell

  • Code Review - nobody has direct commit/push access
  • 3rd-Party CI for vendors
  • Gated Commits

@jessfraz

dear @github,

hope you are well, love ya so much

can I get a "squash and merge once CI passes if no other changes are pushed" button

k thx

Gated Commits

   Hack             Review              Test
=========         ==========         ==========

            push              approve
       +-------------+    +-------------+
       |             |    |             |
+------+--+       +--v----+--+       +--v-------+
|         |       |          |       |          |
| $EDITOR |       |  Gerrit  |       |   Zuul   |
|         |       |          |       |          |
+------^--+       +--+----^--+       +--+-------+
       |             |    |             |
       +-------------+    +-------------+
            clone              merge

Gating

Every change proposed for a repository is tested before it merges.

Co-gating

Changes to a set of repositories merge monotonically such that each change is tested with the current state of all other related repositories before it merges.

Parallel Co-gating

Changes are serialized such that each change is tested with all of the changes ahead of it to satisfy the gating requirement while being able to run tests for multiple changes simultaneously.
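
The definition above can be sketched as a toy model. This is not Zuul's implementation: it computes only the outcome of gating an ordered queue, whereas a real engine starts all the test runs at once and restarts the ones behind a failure. The names `gate`, `passes` and `fake_jobs`, and the change labels, are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def gate(queue: Sequence[str],
         passes: Callable[[Tuple[str, ...]], bool]) -> Tuple[List[str], List[str]]:
    """Return (merged, rejected) for an ordered queue of approved changes.

    `passes` receives the speculative state (the changes ahead that are still
    expected to merge, plus the change under test) and reports job success.
    """
    merged: List[str] = []
    rejected: List[str] = []
    for change in queue:
        # Each change is tested on top of every change ahead of it.
        speculative_state = tuple(merged) + (change,)
        if passes(speculative_state):
            merged.append(change)
        else:
            # A failing change is dropped, so changes behind it are
            # evaluated without it.
            rejected.append(change)
    return merged, rejected


def fake_jobs(state: Tuple[str, ...]) -> bool:
    # Hypothetical failure: keystone-2 breaks once nova-1 is present.
    return not ("nova-1" in state and "keystone-2" in state)


merged, rejected = gate(["nova-1", "keystone-2", "nova-3"], fake_jobs)
print(merged, rejected)
```

In this run nova-3 is still accepted, because it is evaluated against nova-1 only once keystone-2 has been removed from the queue.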

Approve and move on

  • Reviewers approve changes whenever
  • Automation correctly deals with the rest

Zuul

images/zuul.ans

Zuul

  • Multi-repo parallel co-gating engine
  • When to run
  • Where to run it
  • With what git states
  • How to respond to results

Zuul Architecture

images/architecture.ans

Not just for OpenStack

  • Zuul v3 is in production for OpenStack (in OpenStack VMs)

Also running at:

  • BMW (control plane in OpenShift)
  • GoDaddy (control plane in Kubernetes)
  • Red Hat
  • OpenContrail
  • OpenLab
  • others ...

Zuul is not a general purpose automation framework

Zuul in a nutshell

  • Listens for code events
  • Prepares appropriate job config and git repo states
  • Allocates nodes for test jobs
  • Pushes git repo states to nodes
  • Runs user-defined Ansible playbooks
  • Collects/reports results
  • Potentially merges change

Zuul Simulation

pan

  • That was a lot of words - let's walk through it one step at a time
  • Here we have two git repos, called nova and keystone, and their current HEAD state

images/zsim-00.ans

Zuul Simulation

cut

  • A change is approved for Nova

images/zsim-01.ans

Zuul Simulation

cut

  • Zuul starts running jobs for it
  • The tests will test the current state of nova and keystone PLUS this nova change

images/zsim-02.ans

Zuul Simulation

cut

  • A change is approved for Keystone

images/zsim-03.ans

Zuul Simulation

cut

  • The tests will test the current state of nova and keystone PLUS the nova change ahead of it PLUS this keystone change

images/zsim-04.ans

Zuul Simulation

cut

  • todo

images/zsim-05.ans

Zuul Simulation

cut

  • todo

images/zsim-06.ans

Zuul Simulation

cut

  • todo

images/zsim-07.ans

Zuul Simulation

cut

  • todo

images/zsim-08.ans

Zuul Simulation

cut

  • todo

images/zsim-09.ans

Zuul Simulation

cut

  • todo

images/zsim-10.ans

Zuul Simulation

cut

  • todo

images/zsim-11.ans

Zuul Simulation

cut

  • todo

images/zsim-12.ans

Zuul Simulation

cut

  • todo

images/zsim-13.ans

Zuul Simulation

cut

  • todo

images/zsim-14.ans

Zuul Simulation

cut

  • todo

images/zsim-15.ans

Zuul Simulation

cut

  • todo

images/zsim-16.ans

Zuul Simulation

cut

  • todo

images/zsim-17.ans

Zuul Simulation

cut

  • todo

images/zsim-18.ans

Zuul Simulation

cut

  • todo

images/zsim-19.ans

Zuul Simulation

cut

  • todo

images/zsim-20.ans

Zuul Simulation

cut

  • todo

images/zsim-21.ans

Zuul Simulation

cut

  • todo

images/zsim-22.ans

Jobs

  • Jobs run on nodes from nodepool (static or dynamic)
  • Metadata defined in Zuul's configuration
  • Execution content in Ansible (with live streaming!)
  • Jobs may be defined centrally or in the repo being tested
  • Jobs have contextual variants that simplify configuration

Shared Job Configs

  • Job config repos are all in git
  • Designed to support directly sharing job configurations
  • git.zuul-ci.org/zuul-jobs repo is a 'standard library' to be directly shared between zuul installations

Job

- job:
    name: base
    parent: null
    description: |
      The base job for Zuul.
    timeout: 1800
    nodeset:
      nodes:
        - name: primary
          label: centos-7
    pre-run: playbooks/base/pre.yaml
    post-run:
      - playbooks/base/post-ssh.yaml
      - playbooks/base/post-logs.yaml
    secrets:
      - site_logs

Simple Job

- job:
    name: tox
    pre-run: playbooks/setup-tox.yaml
    run: playbooks/tox.yaml
    post-run: playbooks/fetch-tox-output.yaml

- job:
    name: tox-py27
    parent: tox
    vars:
      tox_envlist: py27
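
A rough sketch of how a child job such as tox-py27 could be flattened against its parent. It assumes, as I understand Zuul's behaviour, that pre-run playbooks nest parent first, post-run playbooks nest child first, a child's run replaces the parent's, and vars are merged. The helper names `as_list` and `resolve` are hypothetical. Unlike Zuul, a job with no `parent` key is treated here as a root rather than inheriting a default base job.

```python
from typing import Any, Dict, List


def as_list(value: Any) -> List[str]:
    """Playbook attributes may be a single string or a list; normalise to a list."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def resolve(name: str, jobs: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Flatten a job and its ancestors into one effective definition."""
    job = jobs[name]
    parent_name = job.get("parent")
    if parent_name is None:
        base = {"pre-run": [], "run": [], "post-run": [], "vars": {}}
    else:
        base = resolve(parent_name, jobs)
    return {
        # Parent setup runs before child setup.
        "pre-run": base["pre-run"] + as_list(job.get("pre-run")),
        # A child's run playbook replaces the inherited one.
        "run": as_list(job.get("run")) or base["run"],
        # Child cleanup runs before parent cleanup.
        "post-run": as_list(job.get("post-run")) + base["post-run"],
        # Child vars override parent vars with the same key.
        "vars": {**base["vars"], **job.get("vars", {})},
    }


JOBS = {
    "tox": {
        "pre-run": "playbooks/setup-tox.yaml",
        "run": "playbooks/tox.yaml",
        "post-run": "playbooks/fetch-tox-output.yaml",
    },
    "tox-py27": {"parent": "tox", "vars": {"tox_envlist": "py27"}},
}

effective = resolve("tox-py27", JOBS)
print(effective)
```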

Playbooks

  • Jobs run Ansible playbooks
  • Playbooks may be defined centrally or in the repo being tested
  • Playbooks can use roles from current or other Zuul repos (or Galaxy, coming soon)
  • Playbooks are run on the zuul-executor using bubblewrap https://github.com/projectatomic/bubblewrap
  • Playbooks are not allowed to execute content on 'localhost'

Devstack-gate / Tempest Playbook

# devstack-gate / tempest playbook
hosts: all
roles:
  - setup-multinode-networking
  - partition-swap
  - configure-mirrors
  - run-devstack
  - run-tempest

Simple Shell Playbook

- hosts: controller
  tasks:
    - shell: ./run_tests.sh

Test Like Production

If you use Ansible for deployment, your test and deployment processes and playbooks are the same

What if you don't use Ansible?

OpenStack Infra Control Plane (currently) uses Puppet

# In git.openstack.org/openstack-infra/project-config/roles/legacy-install-afs-with-puppet/tasks/main.yaml
- name: Install puppet
  shell: ./install_puppet.sh
  args:
    chdir: "{{ ansible_user_dir }}/src/git.openstack.org/openstack-infra/system-config"
  environment:
    # Skip setting up pip, our images have already done this.
    SETUP_PIP: "false"
  become: yes

- name: Copy manifest
  copy:
    src: manifest.pp
    dest: "{{ ansible_user_dir }}/manifest.pp"

- name: Run puppet
  puppet:
    manifest: "{{ ansible_user_dir }}/manifest.pp"
  become: yes

Cross-Project Example Problem

  • User reports bug in shade - auto_ip is not discovering their NAT properly
  • Two fixes, one to detection algorithm, one to config override
  • Config override requires adding support to os-client-config
  • Once support is added to os-client-config, it can be consumed in shade
  • How do we integration test this without releasing os-client-config?

Cross-Project Dependencies

Testing or gating dependencies (including jobs) manually specified by developers

shade: Add unittest tips jobs

  • In git.openstack.org/openstack-infra/shade/.zuul.yaml:
- job:
    name: shade-tox-py27-tips
    parent: openstack-tox-py27
    description: |
      Run tox python 27 unittests against master of important libs
    required-projects:
      - openstack-infra/shade
      - openstack/keystoneauth
      - openstack/os-client-config

- job:
    name: shade-tox-py35-tips
    parent: openstack-tox-py35
    description: |
      Run tox python 35 unittests against master of important libs
    required-projects:
      - openstack-infra/shade
      - openstack/keystoneauth
      - openstack/os-client-config

shade: Add unittest tips project-template

  • In git.openstack.org/openstack-infra/shade/.zuul.yaml:
- project-template:
    name: shade-tox-tips
    check:
      jobs:
        - shade-tox-py27-tips
        - shade-tox-py35-tips
    gate:
      jobs:
        - shade-tox-py27-tips
        - shade-tox-py35-tips

shade: Add unittest tips project-template to project

  • In git.openstack.org/openstack-infra/shade/.zuul.yaml:
- project:
    templates:
      - publish-to-pypi
      - publish-openstack-sphinx-docs
      - shade-tox-tips

os-client-config: Add shade-tox-tips jobs

  • In git.openstack.org/openstack/os-client-config/.zuul.yaml:
- project:
    templates:
      - shade-tox-tips

os-client-config: Add nat_source flag for networks

diff --git a/os_client_config/cloud_config.py b/os_client_config/cloud_config.py
index 2e97629..d1a6983 100644
--- a/os_client_config/cloud_config.py
+++ b/os_client_config/cloud_config.py
@@ -581,3 +581,10 @@ class CloudConfig(object):
             if net['nat_destination']:
                 return net['name']
         return None
+
+    def get_nat_source(self):
+        """Get network used for NAT source."""
+        for net in self.config['networks']:
+            if net.get('nat_source'):
+                return net['name']
+        return None
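
To show what the new method returns, here is a minimal stand-in that copies only `get_nat_source` from the diff above. `CloudConfigSketch` and the network names are hypothetical; the real `CloudConfig` class does considerably more.

```python
from typing import Any, Dict, Optional


class CloudConfigSketch:
    """Hypothetical stand-in exposing only the method added in the diff."""

    def __init__(self, config: Dict[str, Any]) -> None:
        self.config = config

    def get_nat_source(self) -> Optional[str]:
        """Get network used for NAT source."""
        for net in self.config['networks']:
            # .get() tolerates networks that do not set the flag at all.
            if net.get('nat_source'):
                return net['name']
        return None


cloud = CloudConfigSketch({
    "networks": [
        {"name": "private"},
        {"name": "ext-net", "nat_source": True},
    ]
})
print(cloud.get_nat_source())
```

A consumer such as shade could then prefer the configured network and fall back to its detection algorithm when the method returns None.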

shade: Add support for configured NAT source variable

Zuul 10-21 13:57
Patch Set 5: Verified-1
Build failed.
    openstack-tox-pep8 SUCCESS in 2m 29s
    openstack-tox-py27 FAILURE in 2m 34s
    build-openstack-releasenotes SUCCESS in 2m 47s
    openstack-tox-py35 FAILURE in 2m 41s
    openstack-tox-cover POST_FAILURE in 3m 52s (non-voting)
    build-openstack-sphinx-docs SUCCESS in 2m 57s
    shade-tox-py27-tips SUCCESS in 3m 18s
    shade-tox-py35-tips SUCCESS in 2m 28s

OpenStack GitHub Support for Cross Community Testing

  • OpenStack does not use GitHub, but other people do
  • GitHub App "OpenStack Zuul"
  • App added to GitHub project by project admin
  • Project added to OpenStack's main.yaml
  • Test interactions between OpenStack and important adjacent communities
  • https://github.com/ansible/ansible/pull/20974

Cross Source Dependencies

Nodesets for Multi-node Jobs

- nodeset:
    name: ceph-cluster
    nodes:
      - name: controller
        label: centos-7
      - name: compute1
        label: fedora-26
      - name: compute2
        label: fedora-26
    groups:
      - name: ceph-osd
        nodes:
          - controller
      - name: ceph-monitor
        nodes:
          - controller
          - compute1
          - compute2

Multi-node Job

  • nodesets are provided to Ansible for jobs in inventory
- job:
    name: ceph-multinode
    nodeset: ceph-cluster
    run: playbooks/install-ceph.yaml
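
One way to picture "provided to Ansible in inventory" is the sketch below, which turns the ceph-cluster nodeset into an Ansible-style inventory dictionary. It is an illustration, not Zuul's actual inventory writer: `nodeset_to_inventory` and the `nodepool_label` host variable are hypothetical names.

```python
from typing import Any, Dict


def nodeset_to_inventory(nodeset: Dict[str, Any]) -> Dict[str, Any]:
    """Build an Ansible-style inventory dict from a nodeset definition."""
    hosts = {
        node["name"]: {"nodepool_label": node["label"]}
        for node in nodeset["nodes"]
    }
    inventory: Dict[str, Any] = {"all": {"hosts": hosts, "children": {}}}
    for group in nodeset.get("groups", []):
        unknown = [n for n in group["nodes"] if n not in hosts]
        if unknown:
            raise ValueError(f"group {group['name']} references unknown nodes {unknown}")
        # Each nodeset group becomes an inventory group that plays can target.
        inventory["all"]["children"][group["name"]] = {
            "hosts": {n: {} for n in group["nodes"]}
        }
    return inventory


CEPH_CLUSTER = {
    "name": "ceph-cluster",
    "nodes": [
        {"name": "controller", "label": "centos-7"},
        {"name": "compute1", "label": "fedora-26"},
        {"name": "compute2", "label": "fedora-26"},
    ],
    "groups": [
        {"name": "ceph-osd", "nodes": ["controller"]},
        {"name": "ceph-monitor", "nodes": ["controller", "compute1", "compute2"]},
    ],
}

inventory = nodeset_to_inventory(CEPH_CLUSTER)
print(sorted(inventory["all"]["children"]))
```

A play with `hosts: ceph-osd` would then run on controller only, while `hosts: all` reaches all three nodes.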

Multi-node Ceph Job Content

- hosts: all
  roles:
    - install-ceph
- hosts: ceph-osd
  roles:
    - start-ceph-osd
- hosts: ceph-monitor
  roles:
    - start-ceph-monitor
- hosts: all
  roles:
    - do-something-interesting

Projects

  • Projects are git repositories
  • Specify a set of jobs for each pipeline
  • golang git repo naming has been adopted:
zuul@ubuntu-xenial:~$ find /home/zuul/src -mindepth 3 -maxdepth 3 -type d
/home/zuul/src/git.openstack.org/openstack-infra/shade
/home/zuul/src/git.openstack.org/openstack/keystoneauth
/home/zuul/src/git.openstack.org/openstack/os-client-config
/home/zuul/src/github.com/ansible/ansible
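
The layout in that listing follows a simple rule: source root, then the canonical hostname, then the project name. A small sketch, assuming the `/home/zuul/src` root shown above (the function name `source_dir` is hypothetical):

```python
import posixpath


def source_dir(canonical_hostname: str, project: str,
               root: str = "/home/zuul/src") -> str:
    """Return the golang-style checkout path for a project on a job node."""
    return posixpath.join(root, canonical_hostname, project)


print(source_dir("git.openstack.org", "openstack-infra/shade"))
print(source_dir("github.com", "ansible/ansible"))
```

Because the hostname is part of the path, projects with the same name from different sources do not collide.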

Project with Job Dependencies

# In git.openstack.org/openstack-infra/project-config:
- project:
    name: openstack/nova
    release:
      jobs:
      - build-artifacts
      - upload-tarball:
          dependencies: build-artifacts
      - upload-pypi:
          dependencies: build-artifacts
      - notify-mirror:
          dependencies:
            - upload-tarball
            - upload-pypi
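
The dependency graph above implies an execution order. The sketch below derives it with the standard library's `graphlib.TopologicalSorter` (Python 3.9+). It only illustrates the ordering and is not how Zuul schedules jobs. `run_waves` is a hypothetical helper.

```python
from graphlib import TopologicalSorter
from typing import Dict, List

# Each job maps to the jobs it depends on, as in the release pipeline above.
RELEASE_JOBS: Dict[str, List[str]] = {
    "build-artifacts": [],
    "upload-tarball": ["build-artifacts"],
    "upload-pypi": ["build-artifacts"],
    "notify-mirror": ["upload-tarball", "upload-pypi"],
}


def run_waves(dependencies: Dict[str, List[str]]) -> List[List[str]]:
    """Group jobs into waves; jobs in the same wave could run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        # Mark the whole wave finished so the jobs depending on it become ready.
        sorter.done(*ready)
    return waves


waves = run_waves(RELEASE_JOBS)
print(waves)
```

Both upload jobs land in the same wave, and notify-mirror waits for both of them.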

Secrets

  • Inspired by Kubernetes Secrets API
  • Projects can add named encrypted secrets to their .zuul.yaml file
  • Jobs can request to use secrets by name
  • Jobs using secrets are not reconfigured speculatively
  • Secrets can only be used by the same project they are defined in
  • Public key per project: {{ zuul_url }}/tenant/{{ tenant }}/key/{{ project }}.pub

Secret Example (note, no admins had to enable this)

# In git.openstack.org/openstack/loci/.zuul.yaml:
- secret:
    name: loci_docker_login
    data:
      user: !encrypted/pkcs1-oaep
        - r8Nbpq5olmfLF035BZ/CUoFLIdhvBi/49KuochOAHbvns+xMiho3C7MEFzYDqJX3IhHde
          BICYOgK7qnyINOIZL2e7pl75rEdHQwJjSFUMkpdY6wEP7f9hpolj9xVp0ifHUVQqPHMRn
          zoPFd8MEAHxH5GLmc2SWJ98E/QUqGltxBi1YRSZoCcNtq3tHFK5Y+xQlLhIseJ2HkpDs6
          YXOGP9Qt4Va6sdyBcA90H+apSAcYA3Duu962ySZQAsYNui/3NQq3gLA+OZeyTJtcrh4hj
          Rb5dBnDWfSrMpxdNkbPXXgbQaxO3T0L4jbaOF8VKEsiI9olBrOeV2M9ddYJjSsHGj4XR8
          4vwS0+doB7np93fujiDuHVgdG8R40NW2GznyKRlRtzAORla7Mzw1Y1MokcUyY6p1LlLLl
          wUuWYCCEuRciOPhZXQ2u42qju/zrK2/dPnO8HfUINSrN0WbNq14ZwPpbj0ro02oGPbtwu
          OTw1z+N0Nc+GuLWlwYJGYM/z0UnvDR3WEBc2kXbVev9w4n0cB3RyphML2PDZZWbw8tjnX
          h1VsAOJ0Qo4qq1K/ft95ypd+vtjkfepEgHEBmJNwutJa9IHAkGfrkO9VkpUTPpfffnPwz
          d0/zaaadNl6MLQUSutRwY23YIIbv+fmukxw2vnJmvn6abkBlMya7KgtifwNA8c=
      password: !encrypted/pkcs1-oaep
        - gUEX4eY3JAk/Xt7Evmf/hF7xr6HpNRXTibZjrKTbmI4QYHlzEBrBbHey27Pt/eYvKKeKw
          hk8MDQ4rNX7ZK1v+CKTilUfOf4AkKYbe6JFDd4z+zIZ2PAA7ZedO5FY/OnqrG7nhLvQHE
          5nQrYwmxRp4O8eU5qG1dSrM9X+bzri8UnsI7URjqmEsIvlUqtybQKB9qQXT4d6mOeaKGE
          5h6Ydkb9Zdi4Qh+GpCGDYwHZKu1mBgVK5M1G6NFMy1DYz+4NJNkTRe9J+0TmWhQ/KZSqo
          4ck0x7Tb0Nr7hQzV8SxlwkaCTLDzvbiqmsJPLmzXY2jry6QsaRCpthS01vnj47itoZ/7p
          taH9CoJ0Gl7AkaxsrDSVjWSjatTQpsy1ub2fuzWHH4ASJFCiu83Lb2xwYts++r8ZSn+mA
          hbEs0GzPI6dIWg0u7aUsRWMOB4A+6t2IOJibVYwmwkG8TjHRXxVCLH5sY+i3MR+NicR9T
          IZFdY/AyH6vt5uHLQDU35+5n91pUG3F2lyiY5aeMOvBL05p27GTMuixR5ZoHcvSoHHtCq
          7Wnk21iHqmv/UnEzqUfXZOque9YP386RBWkshrHd0x3OHUfBK/WrpivxvIGBzGwMr2qAj
          /AhJsfDXKBBbhGOGk1u5oBLjeC4SRnAcIVh1+RWzR4/cAhOuy2EcbzxaGb6VTM=

Secret Example

# In git.openstack.org/openstack/loci/.zuul.yaml:
- job:
    name: publish-loci-cinder
    parent: loci-cinder
    post-run: playbooks/push
    secrets:
      - loci_docker_login

# In git.openstack.org/openstack/loci/playbooks/push.yaml:
- hosts: all
  tasks:
    - include_vars: vars.yaml

    - name: Push project to DockerHub
      block:
        - command: docker login -u {{ loci_docker_login.user }} -p {{ loci_docker_login.password }}
          no_log: True
        - command: docker push openstackloci/{{ project }}:{{ branch }}-{{ item.name }}
          with_items: "{{ distros }}"

What's Next?

  • MQTT publisher
  • node providers
    • kubernetes
    • OCI/docker
    • Mac Stadium (for our Ansible friends)
    • ec2/gce/azure
  • native container/kubernetes job execution

Important Links

Questions

images/questions.ans

Presentty

pan

Presentty