<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Zuul</title>
</head>
<body>
<section id="what-is-zuul" class="slide level2">
<h1>What is Zuul?</h1>
<ul>
<li>Multi-cloud, scalable, elastic CI/CD engine</li>
<li>Validation of speculative future states</li>
<li>Test things like you deploy them</li>
<li>Single-use VM build nodes - safely run tests that need root</li>
<li>Full support for bare metal, VMs and containers</li>
<li>Multi-node builds</li>
<li>Multi-repo projects</li>
<li>Native support for gating configuration</li>
</ul>
</section>
<section id="terminoloy" class="slide level2">
<h1>Terminology</h1>
<ul>
<li>Periodic: jobs run in response to a timer</li>
<li>Post: jobs run after a change</li>
<li>Check: jobs run when someone proposes a change</li>
<li>Gate: jobs run between change approval and landing</li>
</ul>
</section>
<section id="why" class="slide level2">
<h1>Why - the original OpenStack use case</h1>
<ul>
<li>Fully automated gated commits</li>
<li>Full end-to-end integration tests from scratch for every commit</li>
<li>Massive scale</li>
</ul>
</section>
<section id="openstack-scale" class="slide level2">
<h1>OpenStack Development Scale by the Numbers</h1>
<p>When we say "massive scale"</p>
<ul>
<li>2 KJPH (kilo-jobs / hour)</li>
<li>2500 arbitrary developers</li>
<li>1474 git repositories</li>
<li>11727 Jobs</li>
<li>450k lifetime changes</li>
<li>Merge 10k Changes / 42 days</li>
</ul>
<p class='fragment'>For comparison, Ansible has <em>received</em> 13171 PRs (changes),
has merged 8190 of them and has 37788 commits in its entire lifetime</p>
</section>
<section id="speculative-execution" class="slide level2">
<h1>Multi Repository Speculative Execution</h1>
<ul>
<li>Zuul constructs speculative states as-if a change were merged</li>
<li>Tests future states without landing those changes first</li>
<li>The as-if spans multiple repos</li>
<li>In the Gate pipeline, changes are placed in a virtual serial
queue, then tested in parallel, each as-if all the changes
ahead of it had already landed</li>
</ul>
</section>
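<section id="speculative-execution-sketch" class="slide level2" data-markdown>
<textarea data-template>
The queueing rule above can be sketched in a few lines of Python. This is an illustration only, not Zuul's code: `speculative_states`, `run_gate` and the `passes` callback are hypothetical names, runs are evaluated one after another rather than in parallel, and ejecting a failed change and re-testing the changes behind it is an assumption that goes beyond what the slide states.

```python
def speculative_states(queue):
    """For each queued change, list the changes its test run treats as merged."""
    return [queue[: i + 1] for i in range(len(queue))]


def run_gate(queue, passes):
    """Simulate a serial gate queue and return the changes that land.

    queue  -- change names in approval order
    passes -- hypothetical callback: passes(state) is True when the jobs
              succeed for that speculative state (a list of change names)
    """
    merged = []
    pending = list(queue)
    while pending:
        # A real gate would start all of these runs at once.
        results = [passes(merged + state) for state in speculative_states(pending)]
        if all(results):
            merged.extend(pending)
            break
        first_failure = results.index(False)
        # Changes ahead of the failure were tested on a valid state, so they land.
        merged.extend(pending[:first_failure])
        # Assumption: the failing change is ejected and everything behind it
        # is tested again without it on the next pass.
        pending = pending[first_failure + 1:]
    return merged
```

For example, `run_gate(["A", "B", "C"], lambda state: "B" not in state)` lands A and C: B fails on top of A, so C is tested again as-if only A had landed.
</textarea>
</section>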
<section id="animation" class="slide level2">
<h1>Zuul Animation</h1>
<a href='http://docs.openstack.org/infra/publications/zuul/#(18)'>http://docs.openstack.org/infra/publications/zuul/#(18)</a>
</section>
<section id="not-specific" class="slide level2">
<h1>Not Specific to OpenStack</h1>
<ul>
<li>"Gate" and "Check" are merely configurations</li>
<li>50+ OpenStack Vendors use Zuul for "3rd Party CI" of drivers</li>
<li>HP uses Zuul for both OpenStack and non-OpenStack products</li>
<li>Wikimedia uses Zuul</li>
</ul>
</section>
<section id="status-pages" class="slide level2">
<h1>Status Pages</h1>
<p><a href="https://integration.wikimedia.org/zuul/">
https://integration.wikimedia.org/zuul/</a></p>
<p><a href="http://status.openstack.org/zuul/">
http://status.openstack.org/zuul/</a></p>
</section>
<section id="pluggable" class="slide level2">
<h1>Pluggable</h1>
<ul>
<li>Triggers</li>
<li>Reporters</li>
<li>Node Providers</li>
<li>Execution content (Ansible)</li>
</ul>
</section>
<section id="zuul-v2" class="slide level2">
<h1>Zuul v2</h1>
<ul>
<li>In production for OpenStack for 4 years</li>
<li>What most people run</li>
<li>Triggers: Gerrit, Periodic</li>
<li>Reporters: Gerrit, Email, MySQL</li>
<li>Node Providers: OpenStack, Long-lived non-managed servers</li>
<li>Jobs executed by Jenkins</li>
</ul>
</section>
<section id="zuul-v2.5" class="slide level2">
<h1>Zuul v2.5</h1>
<ul>
<li>In use only by OpenStack (on purpose)</li>
<li>Replaced Jenkins with Ansible</li>
<li>Jobs still written using JJB - playbooks generated on the fly</li>
</ul>
<p><a href='http://logs.openstack.org/09/352209/7/check/gate-networking-ovn-python35/0fae86c/_zuul_ansible/'>
http://logs.openstack.org/09/352209/7/check/gate-networking-ovn-python35/0fae86c/_zuul_ansible/
</a></p>
</section>
<section id="replace-jenkins" class="slide level2">
<h1>Why Replace Jenkins?</h1>
</section>
<section id="no-lack-of-trying" class="slide level2">
<h1>Not for lack of trying</h1>
<ul>
<li>Started on Jenkins (actually, on Hudson, remember that?)</li>
<li>Funded the Jenkins JClouds Plugin</li>
<li>Did deep dev in the Gerrit Trigger Plugin</li>
<li>Maintain the SCP artifact plugin (added console log support)</li>
<li>Added 0mq notification plugin</li>
<li>Added Gearman Worker plugin (allowed us to grow to 8 Masters/1000 concurrent slaves)</li>
<li>Wrote Jenkins Job Builder</li>
</ul>
</section>
<section id="problems" class="slide level2">
<h1>Jenkins Problems</h1>
<p>Wasn't written originally to be Internet-facing</p>
<p>Security</p>
<ul>
<li>Don't run the web UI on the internet</li>
<li>SSH slave plugin - it's possible for a slave to run arbitrary
code on the master</li>
</ul>
<p>Stability</p>
<ul><li>Almost every Jenkins upgrade has broken us</li></ul>
<p>Scalability</p>
<ul>
<li>Jenkins has global mutexes, especially in plugins</li>
<li>An extra-large cloud server could handle ~100 concurrent jobs</li>
<li>We ran 8 Jenkins Masters with slaves sharded across them</li>
</ul>
<p>Overkill</p>
<ul><li>We only used it as a remote shell execution engine</li></ul>
</section>
<section id="better-engine" class="slide level2">
<h1>We know a better engine for remote execution</h1>
</section>
<section id="zuul-v3" class="slide level2">
<h1>Zuul v3</h1>
<ul>
<li>Jobs written in and executed with Ansible</li>
<li>Intended for broad use</li>
<li>Triggers: Gerrit, periodic, GitHub
(possibly: Bitbucket, GitLab, Stash, fedmsg, email)</li>
<li>Reporters: Gerrit, email, GitHub (possibly: Bitbucket, GitLab, Stash, ResultsDB)</li>
<li>Node providers: pre-existing servers, dynamic cloud slaves (OpenStack, AWS, GCE), k8s clusters</li>
<li>In-repo config</li>
<li>Multi-node build clusters as first class resource</li>
<li>Multi-Tenant</li>
</ul>
</section>
<section id="focus" class="slide level2">
<h1>Focus</h1>
<p>So far</p>
<ul>
<li>OpenStack, and the hard problems that brings</li>
<li>The extra-hard cases are handled. So are the simple ones, but Zuul
is complex to run if you only have simple use cases</li>
</ul>
<p>Zuul v3</p>
<ul>
<li>Getting it ready for the Ansible project</li>
<li>Making it truly suitable for teams other than OpenStack Infra to run</li>
<li>Making the easy tasks simple</li>
<li>Making Zuul the thing everyone WANTS to use</li>
</ul>
</section>
<section id="focus" class="slide level2">
<h1>For More Information</h1>
<ul>
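<li><a href="http://docs.openstack.org/infra/zuul/">http://docs.openstack.org/infra/zuul/</a></li>
<li><a href="http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html">http://specs.openstack.org/openstack-infra/infra-specs/specs/zuulv3.html</a></li>
<li>freenode:#zuul</li>
<li><a href="http://docs.openstack.org/infra/publications/zuul/#(1)">http://docs.openstack.org/infra/publications/zuul/#(1)</a></li>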
</ul>
</section>
</body>
</html>