:title: Components

.. _components:

Components
==========

Zuul is a distributed system consisting of several components, each of
which is described below.

.. graphviz::
   :align: center

   graph {
      node [shape=box]
      Database [fontcolor=grey]
      Gearman [shape=ellipse]
      Gerrit [fontcolor=grey]
      Statsd [shape=ellipse fontcolor=grey]
      Zookeeper [shape=ellipse]
      Nodepool
      GitHub [fontcolor=grey]

      Merger -- Gearman
      Executor -- Gearman
      Executor -- Statsd
      Web -- Database
      Web -- Gearman
      Web -- Zookeeper
      Web -- Executor
      Finger -- Gearman
      Finger -- Executor

      Gearman -- Scheduler;
      Scheduler -- Database;
      Scheduler -- Gerrit;
      Scheduler -- Zookeeper;
      Zookeeper -- Nodepool;
      Scheduler -- GitHub;
      Scheduler -- Statsd;
   }
Each of the Zuul processes may run on the same host, or different
hosts.  Within Zuul, the components communicate with the scheduler via
the Gearman protocol, so each Zuul component needs to be able to
connect to the host running the Gearman server (the scheduler has a
built-in Gearman server which is recommended) on the Gearman port --
TCP port 4730 by default.

The Zuul scheduler communicates with Nodepool via the ZooKeeper
protocol.  Nodepool requires an external ZooKeeper cluster, and the
Zuul scheduler needs to be able to connect to the hosts in that
cluster on TCP port 2181.

Both the Nodepool launchers and Zuul executors need to be able to
communicate with the hosts which Nodepool provides.  If these are on
private networks, the executors will need to be able to route traffic
to them.

Only Zuul fingergw and Zuul web need to be publicly accessible;
executors never do.  Executors should be accessible on TCP port 7900
by fingergw and web.

A database is only required if you create an SQL driver in your Zuul
connections configuration.  Both the Zuul scheduler and Zuul web will
need access to it.

If statsd is enabled, the executors and scheduler need to be able to
emit data to statsd.  Statsd can be configured to run on each host
and forward data, or services may emit to a centralized statsd
collector.  Statsd listens on UDP port 8125 by default.

All Zuul processes read the ``/etc/zuul/zuul.conf`` file (an alternate
location may be supplied on the command line) which uses an INI file
syntax.  Each component may have its own configuration file, though
you may find it simpler to use the same file for all components.

An example ``zuul.conf``:
.. code-block:: ini

   [gearman]
   server=localhost

   [gearman_server]
   start=true
   log_config=/etc/zuul/gearman-logging.yaml

   [zookeeper]

   [web]
   status_url=

   [scheduler]
   log_config=/etc/zuul/scheduler-logging.yaml
A minimal Zuul system may consist of a :ref:`scheduler` and
:ref:`executor` both running on the same host.  Larger installations
should consider running multiple executors, each on a dedicated host,
and running mergers on dedicated hosts as well.

Common
------

The following applies to all Zuul components.

Configuration
~~~~~~~~~~~~~
The following sections of ``zuul.conf`` are used by all Zuul components:

.. attr:: gearman

   Client connection information for Gearman.

   .. attr:: server
      :required:

      Hostname or IP address of the Gearman server.

   .. attr:: port
      :default: 4730

      Port on which the Gearman server is listening.

   .. attr:: ssl_ca

      An openssl file containing a set of concatenated "certification
      authority" certificates in PEM format.

   .. attr:: ssl_cert

      An openssl file containing the client public certificate in PEM
      format.

   .. attr:: ssl_key

      An openssl file containing the client private key in PEM format.

.. attr:: statsd

   Information about the optional statsd server.  If the ``statsd``
   python module is installed and this section is configured,
   statistics will be reported to statsd.  See :ref:`statsd` for more
   information.

   .. attr:: server

      Hostname or IP address of the statsd server.

   .. attr:: port
      :default: 8125

      The UDP port on which the statsd server is listening.

   .. attr:: prefix

      If present, this will be prefixed to all of the keys before
      transmitting to the statsd server.

.. attr:: zookeeper

   Client connection information for ZooKeeper.

   .. attr:: hosts
      :required:

      A list of zookeeper hosts for Zuul to use when communicating
      with Nodepool.

   .. attr:: session_timeout
      :default: 10.0

      The ZooKeeper session timeout, in seconds.
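As a sketch, the common sections above might be filled in as follows;
the hostnames and the ``prefix`` value are placeholders for your own
environment, not defaults:

.. code-block:: ini

   [gearman]
   server=zuul-scheduler.example.com
   port=4730

   [statsd]
   server=statsd.example.com
   port=8125
   prefix=zuul

   [zookeeper]
   hosts=zk01.example.com:2181,zk02.example.com:2181,zk03.example.com:2181
   session_timeout=10.0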
.. _scheduler:

Scheduler
---------

The scheduler is the primary component of Zuul.  The scheduler is not
a scalable component; one, and only one, scheduler must be running at
all times for Zuul to be operational.  It receives events from any
connections to remote systems which have been configured, enqueues
items into pipelines, distributes jobs to executors, and reports
results.

The scheduler includes a Gearman server which is used to communicate
with other components of Zuul.  It is possible to use an external
Gearman server, but the built-in server is well-tested and
recommended.  If the built-in server is used, other Zuul hosts will
need to be able to connect to the scheduler on the Gearman port, TCP
port 4730.  It is also strongly recommended to use SSL certs with
Gearman, as secrets are transferred from the scheduler to executors
over this link.

The scheduler must be able to connect to the ZooKeeper cluster used by
Nodepool in order to request nodes.  It does not need to connect
directly to the nodes themselves, however -- that function is handled
by the executors.

It must also be able to connect to any services for which connections
are configured (Gerrit, GitHub, etc).
Configuration
~~~~~~~~~~~~~

The following sections of ``zuul.conf`` are used by the scheduler:

.. attr:: gearman_server

   The builtin gearman server.  Zuul can fork a gearman process from
   itself rather than connecting to an external one.

   .. attr:: start
      :default: false

      Whether to start the internal Gearman server.

   .. attr:: listen_address
      :default: all addresses

      IP address or domain name on which to listen.

   .. attr:: port
      :default: 4730

      TCP port on which to listen.

   .. attr:: log_config

      Path to log config file for internal Gearman server.

   .. attr:: ssl_ca

      An openssl file containing a set of concatenated "certification
      authority" certificates in PEM format.

   .. attr:: ssl_cert

      An openssl file containing the server public certificate in PEM
      format.

   .. attr:: ssl_key

      An openssl file containing the server private key in PEM format.
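   Since secrets transit the Gearman link, enabling SSL is strongly
   recommended.  A minimal sketch (all file paths here are
   illustrative, not defaults): the scheduler's ``[gearman_server]``
   section names the server certificate and key, while the
   ``[gearman]`` section on every component names the matching client
   credentials signed by the same CA:

   .. code-block:: ini

      [gearman_server]
      start=true
      ssl_ca=/etc/zuul/ssl/ca.pem
      ssl_cert=/etc/zuul/ssl/server.pem
      ssl_key=/etc/zuul/ssl/server.key

      [gearman]
      server=zuul-scheduler.example.com
      ssl_ca=/etc/zuul/ssl/ca.pem
      ssl_cert=/etc/zuul/ssl/client.pem
      ssl_key=/etc/zuul/ssl/client.key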
.. attr:: web

   .. attr:: root
      :required:

      The root URL of the web service (e.g.,
      ````).

      See :attr:`tenant.web-root` for additional options for
      whitelabeled tenant configuration.

   .. attr:: status_url

      URL that will be posted in Zuul comments made to changes when
      starting jobs for a change.

      .. TODO: is this effectively required?

.. attr:: scheduler

   .. attr:: command_socket
      :default: /var/lib/zuul/scheduler.socket

      Path to command socket file for the scheduler process.

   .. attr:: tenant_config

      Path to :ref:`tenant-config` file.  This attribute
      is exclusive with :attr:`scheduler.tenant_config_script`.

   .. attr:: tenant_config_script

      Path to a script to execute and load the tenant
      config from.  This attribute is exclusive with
      :attr:`scheduler.tenant_config`.

   .. attr:: default_ansible_version

      Default Ansible version to use for jobs that do not specify a
      version.  See :attr:`job.ansible-version` for details.

   .. attr:: log_config

      Path to log config file.

   .. attr:: pidfile
      :default: /var/run/zuul/

      Path to PID lock file.

   .. attr:: state_dir
      :default: /var/lib/zuul

      Path to directory in which Zuul should save its state.

   .. attr:: relative_priority
      :default: False

      A boolean which indicates whether the scheduler should supply
      relative priority information for node requests.

      In all cases, each pipeline may specify a precedence value which
      is used by Nodepool to satisfy requests from higher-precedence
      pipelines first.  If ``relative_priority`` is set to ``True``,
      then Zuul will additionally group items in the same pipeline by
      pipeline queue and weight each request by its position in that
      project's group.  A request for the first change in a given
      queue will have the highest relative priority, and the second
      change a lower relative priority.  The first change of each
      queue in a pipeline has the same relative priority, regardless
      of the order of submission or how many other changes are in the
      pipeline.  This can be used to make node allocations complete
      faster for projects with fewer changes in a system dominated by
      projects with more changes.

      If this value is ``False`` (the default), then node requests are
      sorted by pipeline precedence followed by the order in which
      they were submitted.  If this is ``True``, they are sorted by
      pipeline precedence, followed by relative priority, and finally
      the order in which they were submitted.

   .. attr:: default_hold_expiration
      :default: max_hold_expiration

      The default value for held node expiration if not supplied.
      This will default to the value of ``max_hold_expiration`` if not
      changed, or if it is set to a higher value than the max.

   .. attr:: max_hold_expiration
      :default: 0

      Maximum number of seconds any nodes held for an autohold request
      will remain available.  A value of 0 disables this, and the
      nodes will remain held until the autohold request is manually
      deleted.  If a value higher than ``max_hold_expiration`` is
      supplied during hold request creation, it will be lowered to
      this value.
Operation
~~~~~~~~~

To start the scheduler, run ``zuul-scheduler``.  To stop it, kill the
PID which was saved in the pidfile specified in the configuration.

Most of Zuul's configuration is automatically updated as changes to
the repositories which contain it are merged.  However, Zuul must be
explicitly notified of changes to the tenant config file, since it is
not read from a git repository.  To do so, run
``zuul-scheduler full-reconfigure``.  The signal-based method of
sending a `SIGHUP` signal to the scheduler PID is deprecated.
Merger
------

Mergers are an optional Zuul service; they are not required for Zuul
to operate, but some high volume sites may benefit from running them.
Zuul performs quite a lot of git operations in the course of its work.
Each change that is to be tested must be speculatively merged with the
current state of its target branch to ensure that it can merge, and to
ensure that the tests that Zuul performs accurately represent the
outcome of merging the change.  Because Zuul's configuration is stored
in the git repos it interacts with, and is dynamically evaluated, Zuul
often needs to perform a speculative merge in order to determine
whether it needs to perform any further actions.

All of these git operations add up, and while Zuul executors can also
perform them, large numbers may impact their ability to run jobs.
Therefore, administrators may wish to run standalone mergers in order
to reduce the load on executors.

Mergers need to be able to connect to the Gearman server (usually the
scheduler host) as well as any services for which connections are
configured (Gerrit, GitHub, etc).
Configuration
~~~~~~~~~~~~~

The following section of ``zuul.conf`` is used by the merger:

.. attr:: merger

   .. attr:: command_socket
      :default: /var/lib/zuul/merger.socket

      Path to command socket file for the merger process.

   .. attr:: git_dir
      :default: /var/lib/zuul/merger-git

      Directory in which Zuul should clone git repositories.

   .. attr:: git_http_low_speed_limit
      :default: 1000

      If the HTTP transfer speed is less than
      ``git_http_low_speed_limit`` for longer than
      ``git_http_low_speed_time``, the transfer is aborted.
      Value in bytes; setting it to 0 disables the check.

   .. attr:: git_http_low_speed_time
      :default: 30

      If the HTTP transfer speed is less than
      ``git_http_low_speed_limit`` for longer than
      ``git_http_low_speed_time``, the transfer is aborted.
      Value in seconds; setting it to 0 disables the check.

   .. attr:: git_timeout
      :default: 300

      Timeout for git clone and fetch operations.  This can be useful
      when dealing with large repos.  Note that large timeouts can
      increase startup and reconfiguration times if repos are not
      cached, so be cautious when increasing this value.

      Value in seconds.

   .. attr:: git_user_email

      Value to pass to `git config
      <>`_.

   .. attr:: git_user_name

      Value to pass to `git config
      <>`_.

   .. attr:: log_config

      Path to log config file for the merger process.

   .. attr:: pidfile
      :default: /var/run/zuul/

      Path to PID lock file for the merger process.
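Putting the attributes above together, a ``[merger]`` section might
look like the following sketch; the non-default values (the longer
timeout and the git identity) are purely illustrative:

.. code-block:: ini

   [merger]
   git_dir=/var/lib/zuul/merger-git
   git_timeout=600
   git_http_low_speed_limit=1000
   git_http_low_speed_time=30
   git_user_email=zuul@example.com
   git_user_name=Example Zuul
   log_config=/etc/zuul/merger-logging.yaml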
Operation
~~~~~~~~~

To start the merger, run ``zuul-merger``.  To stop it, kill the
PID which was saved in the pidfile specified in the configuration.
.. _executor:

Executor
--------

Executors are responsible for running jobs.  At the start of each job,
an executor prepares an environment in which to run Ansible which
contains all of the git repositories specified by the job with all
dependent changes merged into their appropriate branches.  The branch
corresponding to the proposed change will be checked out (in all
projects, if it exists).  Any roles specified by the job will also be
present (also with dependent changes merged, if appropriate) and added
to the Ansible role path.  The executor also prepares an Ansible
inventory file with all of the nodes requested by the job.

The executor also contains a merger.  This is used by the executor to
prepare the git repositories used by jobs, but is also available to
perform any tasks normally performed by standalone mergers.  Because
the executor performs both roles, small Zuul installations may not
need to run standalone mergers.

Executors need to be able to connect to the Gearman server (usually
the scheduler host), any services for which connections are configured
(Gerrit, GitHub, etc), as well as directly to the hosts which Nodepool
provides.
Trusted and Untrusted Playbooks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The executor runs playbooks in one of two execution contexts depending
on whether the project containing the playbook is a
:term:`config-project` or an :term:`untrusted-project`.  If the
playbook is in a config project, the executor runs the playbook in the
*trusted* execution context, otherwise, it is run in the *untrusted*
execution context.

Both execution contexts use `bubblewrap`_ [#nullwrap]_ to create a
namespace to ensure that playbook executions are isolated and are
unable to access files outside of a restricted environment.  The
administrator may configure additional local directories on the
executor to be made available to the restricted environment.

The trusted execution context has access to all Ansible features,
including the ability to load custom Ansible modules.  Needless to
say, extra scrutiny should be given to code that runs in a trusted
context as it could be used to compromise other jobs running on the
executor, or the executor itself, especially if the administrator has
granted additional access through bubblewrap, or a method of escaping
the restricted environment created by bubblewrap is found.

Playbooks run in the untrusted execution context are not permitted to
load additional Ansible modules or access files outside of the
restricted environment prepared for them by the executor.  In addition
to the bubblewrap environment applied to both execution contexts, in
the untrusted context some standard Ansible modules are replaced with
versions which prohibit some actions, including attempts to access
files outside of the restricted execution context.  These redundant
protections are made as part of a defense-in-depth strategy.

.. _bubblewrap:
.. _zuul-discuss:

.. [#nullwrap] `bubblewrap` is integral to securely operating Zuul.
   If it is difficult for you to use it in your environment, we
   encourage you to let us know via the `zuul-discuss`_ mailing
   list.
Configuration
~~~~~~~~~~~~~

The following sections of ``zuul.conf`` are used by the executor:

.. attr:: executor

   .. attr:: command_socket
      :default: /var/lib/zuul/executor.socket

      Path to command socket file for the executor process.

   .. attr:: finger_port
      :default: 7900

      Port to use for finger log streamer.

   .. attr:: state_dir
      :default: /var/lib/zuul

      Path to directory in which Zuul should save its state.

   .. attr:: git_dir
      :default: /var/lib/zuul/executor-git

      Directory that Zuul should clone local git repositories to.  The
      executor keeps a local copy of every git repository it works
      with to speed operations and perform speculative merging.

      This should be on the same filesystem as
      :attr:`executor.job_dir` so that when git repos are cloned into
      the job workspaces, they can be hard-linked to the local git
      cache.

   .. attr:: job_dir
      :default: /var/lib/zuul/builds

      Directory that Zuul should use to hold temporary job
      directories.  When each job is run, a new entry will be created
      under this directory to hold the configuration and scratch
      workspace for that job.  It will be deleted at the end of the
      job (unless the `--keep-jobdir` command line option is
      specified).

      This should be on the same filesystem as
      :attr:`executor.git_dir` so that when git repos are cloned into
      the job workspaces, they can be hard-linked to the local git
      cache.

   .. attr:: log_config

      Path to log config file for the executor process.

   .. attr:: pidfile
      :default: /var/run/zuul/

      Path to PID lock file for the executor process.

   .. attr:: private_key_file
      :default: ~/.ssh/id_rsa

      SSH private key file to be used when logging into worker nodes.

      .. note:: If you use an RSA key, ensure it is encoded in the PEM
                format (use the ``-t rsa -m PEM`` arguments to
                `ssh-keygen`).
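      For example, a PEM-encoded RSA key suitable for this setting can
      be generated as follows (the file path is illustrative, not a
      default):

      .. code-block:: shell

         # Generate an RSA keypair in the traditional PEM format.
         # No passphrase is set here for brevity; use one together
         # with an agent where appropriate.
         ssh-keygen -t rsa -m PEM -N '' -f /var/lib/zuul/ssh/id_rsa

         # A PEM-encoded RSA key begins with this header:
         head -n1 /var/lib/zuul/ssh/id_rsa
         # -----BEGIN RSA PRIVATE KEY-----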
  407. .. attr:: default_username
  408. :default: zuul
  409. Username to use when logging into worker nodes, if none is
  410. supplied by Nodepool.
  411. .. attr:: winrm_cert_key_file
  412. :default: ~/.winrm/winrm_client_cert.key
  413. The private key file of the client certificate to use for winrm
  414. connections to Windows nodes.
  415. .. attr:: winrm_cert_pem_file
  416. :default: ~/.winrm/winrm_client_cert.pem
  417. The certificate file of the client certificate to use for winrm
  418. connections to Windows nodes.
  419. .. note:: Currently certificate verification is disabled when
  420. connecting to Windows nodes via winrm.
  421. .. attr:: winrm_operation_timeout_sec
  422. :default: None. The Ansible default of 20 is used in this case.
  423. The timeout for WinRM operations.
  424. .. attr:: winrm_read_timeout_sec
  425. :default: None. The Ansible default of 30 is used in this case.
  426. The timeout for WinRM read. Increase this if there are intermittent
  427. network issues and read timeout errors keep occurring.
   .. _admin_sitewide_variables:

   .. attr:: variables

      Path to an Ansible variables file to supply site-wide variables.
      This should be a YAML-formatted file consisting of a single
      dictionary. The contents will be made available to all jobs as
      Ansible variables. These variables take precedence over all
      other forms (job variables and secrets). Care should be taken
      when naming these variables to avoid potential collisions with
      those used by jobs. Prefixing variable names with a
      site-specific identifier is recommended. The default is not to
      add any site-wide variables. See the :ref:`User's Guide
      <user_jobs_sitewide_variables>` for more information.
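      For example, a site-wide variables file might look like this
      (the variable names and values are purely illustrative; a
      site-specific prefix such as ``acme_`` avoids collisions with
      job variables):

      .. code-block:: yaml

         # Supplied via the 'variables' option in the [executor]
         # section of zuul.conf.
         acme_mirror_url: https://mirror.acme.example.com
         acme_artifact_bucket: acme-zuul-artifacts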
   .. attr:: manage_ansible
      :default: True

      Specifies whether the zuul-executor should install the supported
      Ansible versions during startup. If this is ``True``, the
      zuul-executor will install the Ansible versions into
      :attr:`executor.ansible_root`.

      It is recommended to set this to ``False`` and manually install
      Ansible after the Zuul installation by running
      ``zuul-manage-ansible``. This has the advantage that possible
      errors during the Ansible installation can be spotted earlier.
      Furthermore, containerized deployments of Zuul in particular
      benefit from predictable versions.
   .. attr:: ansible_root
      :default: <state_dir>/ansible-bin

      Specifies where the zuul-executor should look for its supported
      Ansible installations. By default it looks in the following
      directories and uses the first one it can find:

      * ``<zuul_install_dir>/lib/zuul/ansible``
      * ``<ansible_root>``

      The ``ansible_root`` setting allows you to override the second
      location, which is also used for installation if
      ``manage_ansible`` is ``True``.

   .. attr:: ansible_setup_timeout
      :default: 60

      Timeout in seconds for the Ansible setup playbook that runs
      before the first playbook of the job.
   .. attr:: disk_limit_per_job
      :default: 250

      This integer is the maximum number of megabytes that any one job
      is allowed to consume on disk while it is running. If a job's
      scratch space has more than this much space consumed, it will be
      aborted. Set to -1 to disable the limit.

   .. attr:: trusted_ro_paths

      List of paths, separated by ``:``, to read-only bind mount into
      trusted bubblewrap contexts.

   .. attr:: trusted_rw_paths

      List of paths, separated by ``:``, to read-write bind mount into
      trusted bubblewrap contexts.

   .. attr:: untrusted_ro_paths

      List of paths, separated by ``:``, to read-only bind mount into
      untrusted bubblewrap contexts.

   .. attr:: untrusted_rw_paths

      List of paths, separated by ``:``, to read-write bind mount into
      untrusted bubblewrap contexts.
   .. attr:: load_multiplier
      :default: 2.5

      When an executor host gets too busy, the system may suffer
      timeouts and other ill effects. The executor will stop accepting
      more than one job at a time until the load has dropped below a
      safe level. This level is determined by multiplying the number
      of CPUs by ``load_multiplier``.

      For example, if the system has 2 CPUs and ``load_multiplier``
      is 2.5, the safe load for the system is 5.00. Any time the
      system load average is over 5.00, the executor will stop
      accepting multiple jobs at one time.

      The executor observes the system load and determines whether to
      accept more jobs every 30 seconds.
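      The governor's decision can be sketched roughly as follows (a
      simplified illustration of the rule above, not Zuul's actual
      implementation):

      .. code-block:: python

         import os

         def safe_load(cpu_count, load_multiplier=2.5):
             # The safe level is the number of CPUs times load_multiplier.
             return cpu_count * load_multiplier

         def accepting_multiple_jobs(load_multiplier=2.5):
             # True while the 1-minute load average is below the safe level.
             return os.getloadavg()[0] < safe_load(os.cpu_count(),
                                                   load_multiplier)

         # With 2 CPUs and the default multiplier, the safe level is 5.00.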
   .. attr:: max_starting_builds
      :default: None

      By default, an executor accepts up to
      :attr:`executor.load_multiplier` times the CPU count starting
      builds on systems with more than four CPU cores, and up to twice
      that many on systems with four or fewer CPU cores. For example,
      on a system with two CPUs: 2 * 2.5 * 2, so up to ten starting
      builds may run on such an executor; on a system with eight CPUs:
      2.5 * 8, so up to twenty starting builds.

      On systems with a high CPU/vCPU count an executor may accept too
      many starting builds. This can be overridden by using this
      option to set a fixed maximum number of starting builds on an
      executor.
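      The default ceiling described above can be expressed as follows
      (an illustrative sketch of the documented formula, not Zuul's
      actual code):

      .. code-block:: python

         def default_max_starting_builds(cpu_count, load_multiplier=2.5):
             # The limit is doubled on systems with four or fewer CPU cores.
             factor = 2 if cpu_count <= 4 else 1
             return int(cpu_count * load_multiplier * factor)

         # default_max_starting_builds(2) -> 10
         # default_max_starting_builds(8) -> 20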
   .. attr:: min_avail_hdd
      :default: 5.0

      This is the minimum percentage of HDD storage available for the
      :attr:`executor.state_dir` directory. The executor will stop
      accepting more than one job at a time until more HDD storage is
      available. The available HDD percentage is calculated from the
      total available disk space divided by the total real storage
      capacity multiplied by 100.

   .. attr:: min_avail_mem
      :default: 5.0

      This is the minimum percentage of system RAM available. The
      executor will stop accepting more than one job at a time until
      more memory is available. The available memory percentage is
      calculated from the total available memory divided by the total
      real memory multiplied by 100. Buffers and cache are considered
      available in the calculation.
   .. attr:: hostname
      :default: hostname of the server

      The executor needs to know the hostname under which it is
      reachable by zuul-web; otherwise live console log streaming
      doesn't work. In most cases this is detected automatically, but
      in environments where the executor cannot determine its hostname
      correctly it can be overridden here.
   .. attr:: zone
      :default: None

      Name of the nodepool executor-zone; the executor will
      exclusively execute jobs whose nodes have the matching
      executor-zone attribute. As an example, it is possible for
      nodepool nodes to exist in a cloud without a publicly accessible
      IP address. By adding an executor to a zone, nodepool nodes
      could be configured to use private IP addresses.

      To enable this in nodepool, use the node-attributes setting in a
      provider pool. For example:

      .. code-block:: yaml

         pools:
           - name: main
             node-attributes:
               executor-zone: vpn
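      On the executor side, the matching zone name is then set in the
      ``[executor]`` section of ``zuul.conf``:

      .. code-block:: ini

         [executor]
         zone=vpn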
.. attr:: merger

   .. attr:: git_user_email

      Value to pass to `git config
      <>`_.

   .. attr:: git_user_name

      Value to pass to `git config
      <>`_.
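   For example, a merger section setting the identity used for commits
   Zuul creates (the values are illustrative):

   .. code-block:: ini

      [merger]
      git_user_email=zuul@example.com
      git_user_name=Example Zuul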
Operation
~~~~~~~~~

To start the executor, run ``zuul-executor``.

There are several commands which can be run to control the executor's
behavior once it is running.

Under normal circumstances it is best to pause the executor and wait
for all currently running jobs to finish before stopping it. To do
so, run ``zuul-executor pause``.

To stop the executor immediately, run ``zuul-executor stop``. Jobs
that were running on the stopped executor will be rescheduled on other
executors.

To enable or disable running Ansible in verbose mode (with the
``-vvv`` argument to ansible-playbook), run ``zuul-executor verbose``
and ``zuul-executor unverbose``.
Ansible and Python 3
~~~~~~~~~~~~~~~~~~~~

As noted above, the executor runs Ansible playbooks against the remote
node(s) allocated for the job. Since part of executing playbooks on
remote hosts is running Python scripts on them, Ansible needs to know
which Python interpreter to use on the remote host. With older
distributions, ``/usr/bin/python2`` was a generally sensible choice.
However, over time a heterogeneous Python ecosystem has evolved, where
older distributions may only provide Python 2, most provide a mixed
2/3 environment and newer distributions may only provide Python 3 (and
then others like RHEL8 may even have separate "system" Python versions
to add to the confusion!).

Ansible's ``ansible_python_interpreter`` variable configures the path
to the remote Python interpreter to use during playbook execution.
This value is set by Zuul from the ``python-path`` specified for the
node by Nodepool; see the `nodepool configuration documentation
<>`__.

This defaults to ``auto``, where Ansible will automatically discover
the interpreter available on the remote host. However, this setting
only became available in Ansible >=2.8, so Zuul will translate
``auto`` into the old default of ``/usr/bin/python2`` when configured
to use older Ansible versions.

Thus for modern Python 3-only hosts (e.g. Fedora, Bionic onwards) no
further configuration is needed when using Ansible >=2.8. If using
earlier Ansible versions you may need to explicitly set
``python-path`` if ``/usr/bin/python2`` is not available on the node.

Ansible roles/modules which include Python code are generally Python 3
safe now, but there is still a small possibility of incompatibility.
See also the Ansible `Python 3 support page
<>`__.
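As a sketch, explicitly pinning an interpreter for a Python 3-only
image in the Nodepool configuration might look like this (the image
name is illustrative, and this is only needed with Ansible releases
older than 2.8; consult the Nodepool configuration documentation for
the exact placement of ``python-path`` for your driver):

.. code-block:: yaml

   diskimages:
     - name: fedora-minimal
       python-path: /usr/bin/python3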
.. _web-server:

Web Server
----------

.. TODO: Turn REST API into a link to swagger docs when we grow them

The Zuul web server serves as the single process handling all HTTP
interactions with Zuul. This includes the websocket interface for live
log streaming, the REST API and the html/javascript dashboard. All
three are served as a holistic web application. For information on
additional supported deployment schemes, see
:ref:`web-deployment-options`.

Web servers need to be able to connect to the Gearman server (usually
the scheduler host). If the SQL reporter is used, they need to be able
to connect to the database it reports to in order to support the
dashboard. If a GitHub connection is configured, they need to be
reachable by GitHub so they may receive notifications.

Configuration
~~~~~~~~~~~~~

In addition to the common configuration sections, the following
sections of ``zuul.conf`` are used by the web server:
.. attr:: web

   .. attr:: listen_address
      :default:

      IP address or domain name on which to listen.

   .. attr:: log_config

      Path to log config file for the web server process.

   .. attr:: pidfile
      :default: /var/run/zuul/

      Path to PID lock file for the web server process.

   .. attr:: port
      :default: 9000

      Port to use for web server process.

   .. attr:: websocket_url

      Base URL on which the websocket service is exposed, if different
      than the base URL of the web app.

   .. attr:: stats_url

      Base URL from which statistics emitted via statsd can be
      queried.

   .. attr:: stats_type
      :default: graphite

      Type of server hosting the statistics information. Currently
      only 'graphite' is supported by the dashboard.

   .. attr:: static_path
      :default: zuul/web/static

      Path containing the static web assets.

   .. attr:: static_cache_expiry
      :default: 3600

      The Cache-Control max-age response header value for static files
      served by the zuul-web. Set to 0 during development to disable
      Cache-Control.
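Putting these options together, a minimal ``[web]`` section of
``zuul.conf`` might look like this (the values are illustrative):

.. code-block:: ini

   [web]
   listen_address=127.0.0.1
   port=9000
   static_cache_expiry=3600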
.. _web-server-tenant-scoped-api:

Enabling tenant-scoped access to privileged actions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

A user can be granted access to protected REST API endpoints by
providing a valid JWT (JSON Web Token) as a bearer token when querying
the API endpoints.

JWTs are signed and therefore Zuul must be configured so that
signatures can be verified. More information about the JWT standard
can be found on the `IETF's RFC page <>`_.

This optional section of ``zuul.conf``, if present, will activate the
protected endpoints and configure JWT validation:

.. attr:: auth <authenticator name>

   .. attr:: driver

      The signing algorithm to use. Accepted values are ``HS256``,
      ``RS256`` or ``RS256withJWKS``. See below for driver-specific
      configuration options.

   .. attr:: allow_authz_override
      :default: false

      Allow a JWT to override predefined access rules. See the section
      on :ref:`JWT contents <jwt-format>` for more details on how to
      grant access to tenants with a JWT.

   .. attr:: realm

      The authentication realm.

   .. attr:: default
      :default: false

      If set to ``true``, use this realm as the default authentication
      realm when handling HTTP authentication errors.

   .. attr:: client_id

      The expected value of the "aud" claim in the JWT. This is
      required for validation.

   .. attr:: issuer_id

      The expected value of the "iss" claim in the JWT. This is
      required for validation.

   .. attr:: uid_claim
      :default: sub

      The JWT claim that Zuul will use as a unique identifier for the
      bearer of a token. This is "sub" by default, as it is usually
      the purpose of this claim in a JWT. This identifier is used in
      audit logs.

   .. attr:: max_validity_time

      Optional value to ensure a JWT cannot be valid for more than
      this amount of time in seconds. This is useful if the Zuul
      operator has no control over the service issuing JWTs, and the
      tokens are too long-lived.

This section can be repeated as needed with different authenticators,
allowing access to privileged API actions from several JWT issuers.
Driver-specific attributes
..........................

HS256
,,,,,

This is a symmetric signing algorithm that only requires a shared
secret between the JWT issuer and the JWT consumer (i.e. Zuul). This
driver should be used in test deployments only, or in deployments
where JWTs will be issued manually.

.. attr:: secret
   :noindex:

   The shared secret used to sign JWTs and validate signatures.
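A hypothetical HS256 authenticator section might look like this (the
authenticator name, realm and secret are all illustrative):

.. code-block:: ini

   [auth my_authenticator]
   driver=HS256
   secret=exampleSharedSecret
   realm=zuul.example.com
   client_id=zuul.example.com
   issuer_id=zuul_operator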
RS256
,,,,,

This is an asymmetric signing algorithm that requires an RSA key pair.
Only the public key is needed by Zuul for signature validation.

.. attr:: public_key

   The path to the public key of the RSA key pair. It must be readable
   by Zuul.

.. attr:: private_key

   Optional. The path to the private key of the RSA key pair. It must
   be readable by Zuul.

RS256withJWKS
,,,,,,,,,,,,,

Some identity providers use key sets (also known as **JWKS**);
therefore the key to use when verifying the Authentication Token's
signature cannot be known in advance. The key's id is stored in the
JWT's header, and the key must then be found in the remote key set.

The key set is usually available at a specific URL that can be found
in the "well-known" configuration of an OpenID Connect identity
provider.

.. attr:: keys_url

   The URL where the identity provider's key set can be found. For
   example, for Google's OAuth service:
Operation
~~~~~~~~~

To start the web server, run ``zuul-web``. To stop it, kill the PID
which was saved in the pidfile specified in the configuration.

Web Client
----------

Zuul's command line client may be configured to make calls to Zuul's
web server. The client will then look for a ``zuul.conf`` file with a
``webclient`` section to set up the connection over HTTP.

Configuration
~~~~~~~~~~~~~

.. attr:: webclient

   .. attr:: url

      The root URL of Zuul's web server.

   .. attr:: verify_ssl
      :default: true

      Enforce SSL verification when sending requests to Zuul's web
      server. This should only be disabled when working with test
      servers.
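For example (the URL is illustrative):

.. code-block:: ini

   [webclient]
   url=https://zuul.example.com
   verify_ssl=true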
Finger Gateway
--------------

The Zuul finger gateway listens on the standard finger port (79) for
finger requests specifying a build UUID for which it should stream log
results. The gateway will determine which executor is currently
running that build and query that executor for the log stream.

This is intended to be used with the standard finger command line
client. For example::

    finger

The above would stream the logs for the build identified by `UUID`.

Finger gateway servers need to be able to connect to the Gearman
server (usually the scheduler host), as well as the console streaming
port on the executors (usually 7900).
Configuration
~~~~~~~~~~~~~

In addition to the common configuration sections, the following
sections of ``zuul.conf`` are used by the finger gateway:

.. attr:: fingergw

   .. attr:: command_socket
      :default: /var/lib/zuul/fingergw.socket

      Path to command socket file for the finger gateway process.

   .. attr:: listen_address
      :default: all addresses

      IP address or domain name on which to listen.

   .. attr:: log_config

      Path to log config file for the finger gateway process.

   .. attr:: pidfile
      :default: /var/run/zuul/

      Path to PID lock file for the finger gateway process.

   .. attr:: port
      :default: 79

      Port to use for the finger gateway. Note that since command line
      finger clients cannot usually specify the port, leaving this set
      to the default value is highly recommended.

   .. attr:: user

      User ID for the zuul-fingergw process. In normal operation as a
      daemon, the finger gateway should be started as the ``root``
      user, but if this option is set, it will drop privileges to this
      user during startup. It is recommended to set this option to an
      unprivileged user.
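A typical ``[fingergw]`` section combining these options might look
like this (the values are illustrative):

.. code-block:: ini

   [fingergw]
   user=zuul
   port=79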
Operation
~~~~~~~~~

To start the finger gateway, run ``zuul-fingergw``. To stop it, kill
the PID which was saved in the pidfile specified in the configuration.