Merge "Remove nodepool documentation"

This commit is contained in:
Zuul
2025-07-18 16:44:20 +00:00
committed by Gerrit Code Review
6 changed files with 45 additions and 143 deletions

View File

@@ -112,7 +112,7 @@ We require two projects to be provisioned

 * A ``zuul`` project for infrastructure testing nodes
 * A ``ci`` project for control-plane services

-The ``zuul`` project will be used by nodepool for running the testing
+The ``zuul`` project will be used by Zuul for running the testing
 nodes. Note there may be references in configuration to projects
 with ``jenkins``; although this is not used any more, some original
 clouds named their projects for the CI system in use at the time.
@@ -124,7 +124,7 @@ for the cloud's region(s). This will be named

 (this might influence choices you make in network setup, etc.).
 Depending on the resources available and with prior co-ordination with
 the provider, the infrastructure team may also run other services in
-this project such as webservers, file servers or nodepool builders.
+this project such as webservers, file servers or Zuul components.

 The exact project and user names are not particularly important;
 usually something like ``openstack[ci|zuul]`` is chosen. Per below,
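The credentials the provider hands over for these projects typically end up as entries in a ``clouds.yaml`` file on the bastion host. A minimal sketch of the shape such an entry takes; the cloud name, account names, endpoint and region below are all hypothetical:

```yaml
# Hypothetical clouds.yaml entry; real names, regions and
# authentication details vary per provider.
clouds:
  examplecloud-zuul:
    auth:
      auth_url: https://identity.examplecloud.example:5000/v3
      project_name: openstackzuul
      username: openstackzuul
      password: XXXXXXXX
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne
```

The same structure would be repeated for the ``ci`` project's account.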
@@ -176,26 +176,26 @@ Once active, ``bridge.openstack.org`` will begin regularly running

 against the new cloud to configure keys, upload base images, set up
 security groups and such.

-Activate in nodepool
---------------------
+Activate in Zuul
+----------------

 After the cloud is configured, it can be added as a resource for
-nodepool to use for testing nodes.
+Zuul to use for testing nodes.

 Firstly, an ``infra-root`` member will need to bring up the region-local
 mirror server, configure any required storage for it and set up DNS.
 With this active, the cloud is ready to start running testing nodes.

-At this point, the cloud needs to be added to nodepool configuration
-in `project-config
-<https://opendev.org/openstack/project-config/src/branch/master/nodepool>`__.
+At this point, the cloud needs to be added to Zuul configuration
+in `zuul-providers
+<https://opendev.org/opendev/zuul-providers/src/branch/master>`__.

 Again, existing entries provide useful templates for the initial review
 proposal, which can be done by anyone. Some clouds provision
 particular flavors for CI nodes; these need to be present at this
-point and will be conveyed via the nodepool configuration. Again, CI
+point and will be conveyed via the Zuul configuration. Again, CI
 checks and reviewers will help with any fine details.
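Entries in zuul-providers are ordinary Zuul configuration objects. A rough, hypothetical sketch of the shape such an entry takes; the provider name, section, label and flavor below are invented, so copy a current entry from the repository for the authoritative schema:

```yaml
# Illustrative only -- mirror an existing zuul-providers entry.
- provider:
    name: examplecloud-regionone
    section: examplecloud
    labels:
      - name: ubuntu-noble
        flavor: zuul-ci-8gb   # provider-specific CI flavor, if any
    resource-limits:
      instances: 100
```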
-Once this is committed, nodepool will upload images into the new
+Once this is committed, Zuul will upload images into the new
 region and start running nodes automatically. Don't forget to add the
 region to the `grafana
 <https://opendev.org/openstack/project-config/src/branch/master/grafana>`__
 configuration.
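Dashboards in that directory are written for grafyaml. A minimal, hypothetical sketch of a dashboard definition; the titles and the Graphite target below are invented, so adapt an existing cloud's file:

```yaml
# Hypothetical grafyaml dashboard; adapt an existing cloud's file.
dashboard:
  title: 'Nodes: Example Cloud'
  rows:
    - title: Node Activity
      height: 250px
      panels:
        - title: In Use
          type: graph
          span: 6
          targets:
            - target: stats.gauges.zuul.provider.examplecloud-regionone.nodes.in-use
```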
@@ -205,8 +205,9 @@ Ongoing operation

 -----------------

 If at any point the cloud needs to be disabled for maintenance, a
-review can be proposed to set the ``max-servers`` to zero in the
-nodepool configuration. We usually propose a revert of this at the
-same time with a negative workflow to remember to turn it back on when
-appropriate. In an emergency, an ``infra-root`` member can bypass the
-normal review process and apply such a change by hand.
+review can be proposed to set the ``instances`` key under
+``resource-limits`` to zero in the Zuul configuration. We usually
+propose a revert of this at the same time with a negative workflow to
+remember to turn it back on when appropriate. In an emergency, an
+``infra-root`` member can bypass the normal review process and apply
+such a change by hand.
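The disable change itself is tiny; a sketch, with a hypothetical provider name:

```yaml
# Set instances to zero to stop launching new nodes in this provider.
- provider:
    name: examplecloud-regionone
    resource-limits:
      instances: 0
```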

View File

@@ -48,7 +48,7 @@ How It Works
 ============

 The devstack test starts with an essentially bare virtual machine
-made available by :ref:`nodepool` and prepares the testing
+made available by :ref:`zuul` and prepares the testing
 environment. This is driven by the devstack-gate repository which
 holds several scripts that are run by Zuul.

View File

@@ -7,7 +7,7 @@ Grafana

 Grafana is an open source, feature-rich metrics dashboard and graph editor for
 Graphite, InfluxDB & OpenTSDB. OpenStack runs Graphite which stores all the
-metrics related to Nodepool and Zuul (to name a few).
+metrics related to Zuul and other services.

 At a Glance
 ===========

View File

@@ -1,116 +0,0 @@
:title: Nodepool
.. _nodepool:
Nodepool
########
Nodepool is a service used by the OpenStack CI team to deploy and manage a pool
of devstack images on a cloud server for use in OpenStack project testing.
At a Glance
===========
:Hosts:
* nl05.opendev.org
* nl06.opendev.org
* nl07.opendev.org
* nl08.opendev.org
* nb05.opendev.org
* nb06.opendev.org
* nb07.opendev.org
* zk01.opendev.org
* zk02.opendev.org
* zk03.opendev.org
:Puppet:
* https://opendev.org/opendev/puppet-openstackci/src/branch/master/manifests/nodepool_builder.pp
:Configuration:
* :config:`nodepool/nodepool.yaml`
* :config:`nodepool/scripts/`
* :config:`nodepool/elements/`
:Projects:
* https://opendev.org/zuul/nodepool
:Bugs:
* https://storyboard.openstack.org/#!/project/668
:Resources:
* `Nodepool Reference Manual <http://docs.openstack.org/infra/nodepool>`_
* `ZooKeeper Programmer's Guide <https://zookeeper.apache.org/doc/trunk/zookeeperProgrammers.html>`_
* `ZooKeeper Administrator's Guide <https://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html>`_
* `zk_shell <https://pypi.org/project/zk_shell/>`_
Overview
========
Once per day, for every image type (and provider) configured by
nodepool, a new image with cached data is built for use by devstack.
Nodepool spins up new instances and tears down old ones as tests are
queued up and completed, always maintaining a consistent number of
available instances for tests up to the set limits of the CI
infrastructure.
Zookeeper
=========
Nodepool stores image metadata in ZooKeeper. We have a three-node
ZooKeeper cluster running on zk01.opendev.org - zk03.opendev.org.
The Nodepool CLI should be sufficient to examine and alter any of the
information stored in ZooKeeper. However, in case advanced debugging
is needed, zk-shell (``pip install zk_shell`` into a virtualenv and
run ``zk-shell``) is recommended as an easy way to inspect and/or
change data in ZooKeeper.
Bad Images
==========
Since nodepool takes a while to build images, and generally only does
it once per day, occasionally the images it produces may have
significant behavior changes from the previous versions. For
instance, a provider's base image or operating system package may
update, or some of the scripts or system configuration that we apply
to the images may change. If this occurs, it is easy to revert to the
last good image.
Nodepool periodically deletes old images; however, it never deletes
the current or next most recent image in the ``ready`` state for any
image-provider combination. So if you find that the
``ubuntu-precise`` image is problematic, you can run::
$ sudo nodepool dib-image-list
+---------------------------+----------------+---------+-----------+----------+-------------+
| ID | Image | Builder | Formats | State | Age |
+---------------------------+----------------+---------+-----------+----------+-------------+
| ubuntu-precise-0000000001 | ubuntu-precise | nb01 | qcow2,vhd | ready | 02:00:57:33 |
| ubuntu-precise-0000000002 | ubuntu-precise | nb01 | qcow2,vhd | ready | 01:00:57:33 |
+---------------------------+----------------+---------+-----------+----------+-------------+
Image ubuntu-precise-0000000001 is the previous image and
ubuntu-precise-0000000002 is the current image (they are both marked
as ``ready``; the current image is simply the one with the shortest
age).
Nodepool aggressively attempts to build and upload missing images, so
if the problem with the image will not be solved with an immediate
rebuild, image builds must first be disabled for that image. To do
so, add ``pause: True`` to the ``diskimage`` section for
``ubuntu-precise`` in nodepool.yaml.
Then delete the problematic image with::
$ sudo nodepool dib-image-delete ubuntu-precise-0000000002
All uploads corresponding to that image build will be deleted from providers
before the image DIB files are deleted. The previous image will become the
current image and nodepool will use it when creating new nodes. When nodepool
next creates an image, it will still retain build #1 since it will still be
considered the next-most-recent image.
vhd-util
========
Creating images for Rackspace requires a patched version of vhd-util
to convert the images into the appropriate VHD format. See the
`opendev/infra-vhd-util-deb
<https://opendev.org/opendev/infra-vhd-util-deb>`__ repository for
details of this custom package. This is installed on a production
host via a PPA built and published by jobs in that repository.

View File

@@ -17,7 +17,6 @@ Major Systems

 keycloak
 zuul
 devstack-gate
-nodepool
 jeepyb
 irc
 etherpad

View File

@@ -98,19 +98,14 @@ Zuul has three main subsystems:

 * Zuul Executors
 * Zuul Web

-that in OpenDev's deployment depend on four 'external' systems:
+that in OpenDev's deployment depend on two 'external' systems:

-* Nodepool
 * Zookeeper
-* gear
 * MySQL

 Scheduler
 ---------

-The Zuul Scheduler and gear are co-located on a single host,
-referred to by the ``zuul.opendev.org`` CNAME in DNS.

 Zuul is stateless, so the server does not need backing up. However,
 Zuul talks through git and ssh, so you will need to manually check ssh
 host keys as the zuul user.

@@ -120,9 +115,6 @@ e.g.::

     sudo su - zuul
     ssh -p 29418 review.opendev.org

-The Zuul Scheduler talks to Nodepool using Zookeeper and distributes work to
-the executors using gear.

 OpenDev's Zuul installation is also configured to write job results into
 a MySQL database via the SQL Reporter plugin. The database for that is a
 Rackspace Cloud DB and is configured in the ``mysql`` entry of the
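On the Zuul side, the SQL reporter's database settings boil down to a connection URI in ``zuul.conf``. A hypothetical sketch of the relevant fragment; the host and credentials here are invented, as the real values live in private configuration:

```ini
# Hypothetical zuul.conf fragment; real credentials are kept private.
[database]
dburi=mysql+pymysql://zuul:XXXXXXXX@dbhost.example.org/zuul
```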
@@ -301,6 +293,32 @@ enough to collect a representative set of profiler data. In most cases a minute

 or two should be sufficient. Slow memory leaks may require hours, but running
 Zuul under yappi for hours isn't practical.

+Bad Image Builds
+================
+
+Since it takes a while to build images, and we generally only do it
+once per day, occasionally the images produced may have significant
+behavior changes from the previous versions. For instance, a
+provider's base image or operating system package may update, or some
+of the scripts or system configuration that we apply to the images may
+change. If this occurs, it is easy to revert to the last good image.
+
+Zuul periodically deletes old images; however, it never deletes the
+current or next most recent image in the ``ready`` state for any
+image-provider combination. So if you find an image is problematic,
+you can identify the UUID of the image using the web interface and
+delete it through the web or using ``zuul-client``.
+
+vhd-util
+========
+
+Creating images for Rackspace requires a patched version of vhd-util
+to convert the images into the appropriate VHD format. See the
+`opendev/infra-vhd-util-deb
+<https://opendev.org/opendev/infra-vhd-util-deb>`__ repository for
+details of this custom package. This is installed on a production
+host via a PPA built and published by jobs in that repository.
.. _zuul_github_projects:
GitHub Projects