Start adding operational docs to zuulv3

This tries to capture common operation tasks in the documentation. It
also clears up some related items about what is necessary to have a
functioning Nodepool installation and what the dib-image-delete command
does.

Story: 2000790
Change-Id: I397fc4879fa84ffc667ddda0aff9c107eee0d694
Clark Boylan 2017-04-05 10:17:21 -07:00
parent e63f67c2f7
commit f26f502fbf
2 changed files with 63 additions and 3 deletions

@ -12,6 +12,10 @@ Both daemons frequently re-read their configuration file after
starting to support adding or removing new images and providers, or
otherwise altering the configuration.
These daemons communicate with each other via a Zookeeper database.
You must run Zookeeper and at least one of each of these daemons to
have a functioning Nodepool installation.
Nodepool-builder
----------------
@ -35,8 +39,8 @@ Nodepool-launcher
-----------------
The main nodepool daemon is named ``nodepool-launcher`` and is
responsible for launching instances from the images created and
uploaded by ``nodepool-builder``.
responsible for managing cloud instances launched from the images
created and uploaded by ``nodepool-builder``.
When a new image is created and uploaded, ``nodepool-launcher`` will
immediately start using it when launching nodes (Nodepool always uses
@ -90,6 +94,61 @@ metadata:
nodepool_node_id
The nodepool id of the node as an integer.
Common Management Tasks
-----------------------
In the course of running a Nodepool service you will find that there are
some common operations you will need to perform. Like the services
themselves, these are split into two groups: image management and
instance management.
Image Management
~~~~~~~~~~~~~~~~
Before Nodepool can launch any cloud instances it must have images to boot
from. ``nodepool dib-image-list`` will show you which images are available
locally on disk. These images on disk are then uploaded to clouds;
``nodepool image-list`` will show you which images are bootable in your
various clouds.
If you need to force a new image to be built to pick up a new feature more
quickly than the normal rebuild cycle (which defaults to 24 hours) you can
manually trigger a rebuild. Using ``nodepool image-build`` you can tell
Nodepool to begin a new image build now. Note that depending on work that
the nodepool-builder is already performing this may queue the build. Check
``nodepool dib-image-list`` to see the current state of the builds. Once
the image is built it is automatically uploaded to all of the clouds
configured to use that image.
At times you may need to stop using an existing image because it is broken.
Your two major options here are to build a new image to replace the existing
image or to delete the existing image and have Nodepool fall back on using
the previous image. Rebuilding and uploading can be slow, so typically the
best option is to simply ``nodepool image-delete`` the most recent image,
which will cause Nodepool to fall back on using the previous image. However,
if you do this without "pausing" the image it will be immediately reuploaded.
You will want to pause the image if you need to further investigate why
the image is not being built correctly. If you know the image will be built
correctly, you can simply use ``nodepool dib-image-delete`` to delete the
built image and remove it from all clouds, which will cause it to be rebuilt.
Instance Management
~~~~~~~~~~~~~~~~~~~
With working images in providers you should see Nodepool launching instances
in these providers using the images it built. You may find that you need to
debug a particular job failure manually. An easy way to do this is to
``nodepool hold`` an instance, then log in to the instance and perform any
necessary debugging steps. Note that this doesn't stop the job running there;
it only prevents Nodepool from automatically deleting the instance once the
job is complete.
In some circumstances, such as after manually holding an instance as described
above or when you want to force a job restart, you may want to delete a
running instance. You can issue a ``nodepool delete`` to force Nodepool to do
this.
Complete command help info is below.
Command Line Tools
------------------

@ -111,7 +111,8 @@ class NodePoolCmd(NodepoolApp):
cmd_dib_image_delete = subparsers.add_parser(
'dib-image-delete',
help='delete image built with diskimage-builder')
help='Delete a dib built image from disk along with all cloud '
'uploads of this image')
cmd_dib_image_delete.set_defaults(func=self.dib_image_delete)
cmd_dib_image_delete.add_argument('id', help='dib image id')
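
To illustrate how the reworded help string is attached to the subcommand, here is a minimal stand-alone sketch using only the standard ``argparse`` module. It is not the actual ``NodePoolCmd`` class: the ``build_parser`` function, the ``dest='command'`` setting, and the example image id are assumptions made for illustration, and the real code additionally binds the subcommand to a handler with ``set_defaults(func=...)``.

```python
import argparse


def build_parser():
    # Hypothetical stand-alone parser for illustration only; the real
    # NodePoolCmd class registers many more subcommands and handlers.
    parser = argparse.ArgumentParser(prog='nodepool')
    subparsers = parser.add_subparsers(title='commands', dest='command')

    # Same subcommand name, help text, and positional argument as in
    # the hunk above.
    cmd_dib_image_delete = subparsers.add_parser(
        'dib-image-delete',
        help='Delete a dib built image from disk along with all cloud '
             'uploads of this image')
    cmd_dib_image_delete.add_argument('id', help='dib image id')
    return parser


parser = build_parser()
# 'example-image-0000000001' is a made-up id, not a real build.
args = parser.parse_args(['dib-image-delete', 'example-image-0000000001'])
print(args.command, args.id)
```

Running this sketch with ``--help`` as the only argument would list ``dib-image-delete`` together with the new description, which is where an operator would see that the command also removes cloud uploads.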