diff --git a/doc/source/contribute-cloud.rst b/doc/source/contribute-cloud.rst
index d7eb6f29c6..a7b5697e13 100644
--- a/doc/source/contribute-cloud.rst
+++ b/doc/source/contribute-cloud.rst
@@ -112,7 +112,7 @@ We require two projects to be provisioned
 * A ``zuul`` project for infrastructure testing nodes
 * A ``ci`` project for control-plane services
 
-The ``zuul`` project will be used by nodepool for running the testing
+The ``zuul`` project will be used by Zuul for running the testing
 nodes. Note there may be references in configuration to projects
 with ``jenkins``; although this is not used any more, some original
 clouds named their projects for the CI system in use at the time.
@@ -124,7 +124,7 @@ for the cloud's region(s). This will be named
 (this might influence choices you make in network setup, etc.).
 Depending on the resources available and with prior co-ordination with
 the provider, the infrastructure team may also run other services in
-this project such as webservers, file servers or nodepool builders.
+this project such as webservers, file servers or Zuul components.
 
 The exact project and user names are not particularly important,
 usually something like ``openstack[ci|zuul]`` is chosen. Per below,
@@ -176,26 +176,26 @@ Once active, ``bridge.openstack.org`` will begin regularly running
 against the new cloud to configure keys, upload base images, set up
 security groups and such.
 
-Activate in nodepool
--------------------- 
+Activate in Zuul
+----------------
 
 After the cloud is configured, it can be added as a resource for
-nodepool to use for testing nodes.
+Zuul to use for testing nodes.
 
 Firstly, an ``infra-root`` member will need to make the region-local
 mirror server, configure any required storage for it and set up DNS.
 With this active, the cloud is ready to start running testing nodes.
 
-At this point, the cloud needs to be added to nodepool configuration
-in `project-config
-`__.
+At this point, the cloud needs to be added to the Zuul configuration
+in `zuul-providers
+`__.
 
 Again existing entries provide useful templates for the initial
 review proposal, which can be done by anyone. Some clouds provision
 particular flavors for CI nodes; these need to be present at this
-point and will be conveyed via the nodepool configuration. Again CI
+point and will be conveyed via the Zuul configuration. As before, CI
 checks and reviewers will help with any fine details.
 
-Once this is committed, nodepool will upload images into the new
+Once this is committed, Zuul will upload images into the new
 region and start running nodes automatically. Don't forget to add the
 region to the `grafana
 `__.
@@ -205,8 +205,9 @@ Ongoing operation
 -----------------
 
 If at any point the cloud needs to be disabled for maintenance a
-review can be proposed to set the ``max-servers`` to zero in the
-nodepool configuration. We usually propose a revert of this at the
-same time with a negative workflow to remember to turn it back on when
-appropriate. In an emergency, an ``infra-root`` member can bypass the
-normal review process and apply such a change by hand.
+review can be proposed to set the ``instances`` key under
+``resource-limits`` to zero in the Zuul configuration. We usually
+propose a revert of this at the same time with a negative workflow to
+remember to turn it back on when appropriate. In an emergency, an
+``infra-root`` member can bypass the normal review process and apply
+such a change by hand.
diff --git a/doc/source/devstack-gate.rst b/doc/source/devstack-gate.rst
index e42593bb9b..7efe39acb9 100644
--- a/doc/source/devstack-gate.rst
+++ b/doc/source/devstack-gate.rst
@@ -48,7 +48,7 @@ How It Works
 ============
 
 The devstack test starts with an essentially bare virtual machine
-made available by :ref:`nodepool` and prepares the testing
+made available by :ref:`zuul` and prepares the testing
 environment.
This is driven by the devstack-gate repository which holds several scripts that are run by Zuul. diff --git a/doc/source/grafana.rst b/doc/source/grafana.rst index 925ed22019..c82b856943 100644 --- a/doc/source/grafana.rst +++ b/doc/source/grafana.rst @@ -7,7 +7,7 @@ Grafana Grafana is an open source, feature rich metrics dashboard and graph editor for Graphite, InfluxDB & OpenTSDB. OpenStack runs Graphite which stores all the -metrics related to Nodepool and Zuul (to name a few). +metrics related to Zuul and other services. At a Glance =========== diff --git a/doc/source/nodepool.rst b/doc/source/nodepool.rst deleted file mode 100644 index 8fd3189978..0000000000 --- a/doc/source/nodepool.rst +++ /dev/null @@ -1,116 +0,0 @@ -:title: Nodepool - -.. _nodepool: - -Nodepool -######## - -Nodepool is a service used by the OpenStack CI team to deploy and manage a pool -of devstack images on a cloud server for use in OpenStack project testing. - -At a Glance -=========== - -:Hosts: - * nl05.opendev.org - * nl06.opendev.org - * nl07.opendev.org - * nl08.opendev.org - * nb05.opendev.org - * nb06.opendev.org - * nb07.opendev.org - * zk01.opendev.org - * zk02.opendev.org - * zk03.opendev.org -:Puppet: - * https://opendev.org/opendev/puppet-openstackci/src/branch/master/manifests/nodepool_builder.pp -:Configuration: - * :config:`nodepool/nodepool.yaml` - * :config:`nodepool/scripts/` - * :config:`nodepool/elements/` -:Projects: - * https://opendev.org/zuul/nodepool -:Bugs: - * https://storyboard.openstack.org/#!/project/668 -:Resources: - * `Nodepool Reference Manual `_ - * `ZooKeeper Programmer's Guide `_ - * `ZooKeeper Administrator's Guide `_ - * `zk_shell `_ - -Overview -======== - -Once per day, for every image type (and provider) configured by -nodepool, a new image with cached data is built for use by devstack. 
-Nodepool spins up new instances and tears down old as tests are queued -up and completed, always maintaining a consistent number of available -instances for tests up to the set limits of the CI infrastructure. - -Zookeeper -========= - -Nodepool stores image metadata in ZooKeeper. We have a three-node -ZooKeeper cluster running on zk01.opendev.org - zk03.opendev.org. - -The Nodepool CLI should be sufficient to examine and alter any of the -information stored in ZooKeeper. However, in case advanced debugging -is needed, use of zk-shell ("pip install zk_shell" into a virtualenv -and run "zk-shell") is recommended as an easy way to inspect and/or -change data in ZooKeeper. - -Bad Images -========== - -Since nodepool takes a while to build images, and generally only does -it once per day, occasionally the images it produces may have -significant behavior changes from the previous versions. For -instance, a provider's base image or operating system package may -update, or some of the scripts or system configuration that we apply -to the images may change. If this occurs, it is easy to revert to the -last good image. - -Nodepool periodically deletes old images, however, it never deletes -the current or next most recent image in the ``ready`` state for any -image-provider combination. 
So if you find that the -``ubuntu-precise`` image is problematic, you can run:: - - $ sudo nodepool dib-image-list - - +---------------------------+----------------+---------+-----------+----------+-------------+ - | ID | Image | Builder | Formats | State | Age | - +---------------------------+----------------+---------+-----------+----------+-------------+ - | ubuntu-precise-0000000001 | ubuntu-precise | nb01 | qcow2,vhd | ready | 02:00:57:33 | - | ubuntu-precise-0000000002 | ubuntu-precise | nb01 | qcow2,vhd | ready | 01:00:57:33 | - +---------------------------+----------------+---------+-----------+----------+-------------+ - -Image ubuntu-precise-0000000001 is the previous image and -ubuntu-precise-0000000002 is the current image (they are both marked -as ``ready`` and the current image is simply the image with the -shortest age. - -Nodepool aggressively attempts to build and upload missing images, so -if the problem with the image will not be solved with an immediate -rebuild, image builds must first be disabled for that image. To do -so, add ``pause: True`` to the ``diskimage`` section for -``ubuntu-precise`` in nodepool.yaml. - -Then delete the problematic image with:: - - $ sudo nodepool dib-image-delete ubuntu-precise-0000000002 - -All uploads corresponding to that image build will be deleted from providers -before the image DIB files are deleted. The previous image will become the -current image and nodepool will use it when creating new nodes. When nodepool -next creates an image, it will still retain build #1 since it will still be -considered the next-most-recent image. - -vhd-util -======== - -Creating images for Rackspace requires a patched version of vhd-util -to convert the images into the appropriate VHD format. See the -`opendev/infra-vhd-util-deb -`__ for details of -this custom package. This is installed on a production host via a PPA -built and published by jobs in this repository. 
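The ``resource-limits`` maintenance change described in the contribute-cloud update above can be sketched as a configuration fragment. This is a hypothetical illustration: only the ``resource-limits``/``instances`` pair comes from the documentation text, and the surrounding provider name and layout are placeholders, not the real zuul-providers schema.

```yaml
# Hypothetical provider entry; everything except resource-limits and
# instances is a placeholder name for illustration.
- provider:
    name: example-cloud-region1
    # Setting instances to zero stops Zuul from launching new test
    # nodes in this cloud. Propose the revert at the same time with a
    # negative workflow vote so turning it back on is not forgotten.
    resource-limits:
      instances: 0
```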
diff --git a/doc/source/systems.rst b/doc/source/systems.rst index 3ae07bc6f7..279ecd3c02 100644 --- a/doc/source/systems.rst +++ b/doc/source/systems.rst @@ -17,7 +17,6 @@ Major Systems keycloak zuul devstack-gate - nodepool jeepyb irc etherpad diff --git a/doc/source/zuul.rst b/doc/source/zuul.rst index b16eedfb27..c4f547e61b 100644 --- a/doc/source/zuul.rst +++ b/doc/source/zuul.rst @@ -98,19 +98,14 @@ Zuul has three main subsystems: * Zuul Executors * Zuul Web -that in OpenDev's deployment depend on four 'external' systems: +that in OpenDev's deployment depend on two 'external' systems: -* Nodepool * Zookeeper -* gear * MySQL Scheduler --------- -The Zuul Scheduler and gear are all co-located on a single host, -referred to by the ``zuul.opendev.org`` CNAME in DNS. - Zuul is stateless, so the server does not need backing up. However Zuul talks through git and ssh so you will need to manually check ssh host keys as the zuul user. @@ -120,9 +115,6 @@ e.g.:: sudo su - zuul ssh -p 29418 review.opendev.org -The Zuul Scheduler talks to Nodepool using Zookeeper and distributes work to -the executors using gear. - OpenDev's Zuul installation is also configured to write job results into a MySQL database via the SQL Reporter plugin. The database for that is a Rackspace Cloud DB and is configured in the ``mysql`` entry of the @@ -301,6 +293,32 @@ enough to collect a representative set of profiler data. In most cases a minute or two should be sufficient. Slow memory leaks may require hours, but running Zuul under yappi for hours isn't practical. +Bad Image Builds +================ + +Since it takes a while to build images, and we generally only do it +once per day, occasionally the images produced may have significant +behavior changes from the previous versions. For instance, a +provider's base image or operating system package may update, or some +of the scripts or system configuration that we apply to the images may +change. 
If this occurs, it is easy to revert to the last good image.
+
+Zuul periodically deletes old images; however, it never deletes the
+current or next most recent image in the ``ready`` state for any
+image-provider combination. So if you find an image is problematic,
+you can identify the UUID of the image using the web interface and
+delete it through the web or using ``zuul-client``.
+
+vhd-util
+========
+
+Creating images for Rackspace requires a patched version of vhd-util
+to convert the images into the appropriate VHD format. See the
+`opendev/infra-vhd-util-deb
+`__ repository for details of
+this custom package. This is installed on a production host via a PPA
+built and published by jobs in this repository.
+
 .. _zuul_github_projects:
 
 GitHub Projects
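The retention rule added to zuul.rst above (never delete the current or next-most-recent ``ready`` build for an image-provider combination) can be sketched in Python. This is an illustration of the rule only, not Zuul's actual implementation; the function and variable names are invented for the example.

```python
def deletable_builds(builds):
    """Return the build ids that are safe to delete under the
    retention rule: for a single image-provider combination, always
    keep the two most recent builds in the 'ready' state (the current
    image and the next-most-recent fallback).

    ``builds`` is a list of (id, state) tuples ordered oldest to
    newest.  Illustrative sketch only.
    """
    ready_ids = [bid for bid, state in builds if state == "ready"]
    keep = set(ready_ids[-2:])  # current and next-most-recent
    return [bid for bid, state in builds if bid not in keep]
```

With three ready builds, only the oldest is eligible for deletion; with two or fewer, nothing is deleted, which is why reverting to the previous image is always possible.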