Operation
Nodepool has two components which run as daemons. The nodepool-builder daemon is responsible for building diskimages and uploading them to providers, and the nodepool-launcher daemon is responsible for launching and deleting nodes.
Both daemons frequently re-read their configuration file after starting to support adding or removing new images and providers, or otherwise altering the configuration.
These daemons communicate with each other via a Zookeeper database. You must run Zookeeper and at least one of each of these daemons to have a functioning Nodepool installation.
Nodepool-builder
The nodepool-builder daemon builds and uploads images to providers. It may be run on the same host as the main nodepool daemon, or on a separate one. Multiple instances of nodepool-builder may be run on the same or separate hosts in order to speed up image builds across many machines, or to supply high availability or redundancy. However, since nodepool-builder allows specification of the number of both build and upload threads, it is usually not advantageous to run more than a single instance on one machine. Note that while diskimage-builder (which is responsible for building the underlying images) generally supports executing multiple builds on a single machine simultaneously, some of the elements it uses may not. To be safe, it is recommended to run a single instance of nodepool-builder on a machine, and configure that instance to run only a single build thread (the default).
Nodepool-launcher
The main nodepool daemon is named nodepool-launcher and is responsible for managing cloud instances launched from the images created and uploaded by nodepool-builder.
When a new image is created and uploaded, nodepool-launcher will immediately start using it when launching nodes (Nodepool always uses the most recent image for a given provider in the ready state). Nodepool will delete images if they are not the most recent or second most recent ready images. In other words, Nodepool will always make sure that in addition to the current image, it keeps the previous image around. This way if you find that a newly created image is problematic, you may simply delete it and Nodepool will revert to using the previous image.
Daemon usage
To start the main Nodepool daemon, run nodepool-launcher:
nodepool-launcher --help
To start the nodepool-builder daemon, run nodepool-builder:
nodepool-builder --help
To stop a daemon, send SIGINT to the process.
When yappi (Yet Another Python Profiler) is available, function and thread statistics can be collected as well: the first SIGUSR2 sent to a daemon enables yappi, and the second SIGUSR2 dumps the collected information, resets all yappi state, and stops profiling. This minimizes the impact of yappi on a running system.
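The signal protocol above can be sketched as a pair of helper functions. The function names are our own; only the signals themselves (SIGUSR2, SIGINT) come from the documentation.

```shell
# Helpers for the signal protocol described above; names are our own.
profile_toggle() {
  # First call enables yappi; second call dumps stats, resets state,
  # and stops profiling.
  kill -USR2 "$1"
}

stop_daemon() {
  # SIGINT asks the daemon to shut down.
  kill -INT "$1"
}
```

For example, calling profile_toggle twice with the launcher's process id (found with pgrep or similar) brackets a profiling window.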
Metadata
When Nodepool creates instances, it will assign the following nova metadata:
- groups
A comma separated list containing the name of the image and the name of the provider. This may be used by the Ansible OpenStack inventory plugin.
- nodepool_image_name
The name of the image as a string.
- nodepool_provider_name
The name of the provider as a string.
- nodepool_node_id
The nodepool id of the node as an integer.
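As a sketch, this metadata can be inspected as the server's properties with the OpenStack CLI; the helper name and server name are illustrative, and using the CLI for this is our own assumption.

```shell
# Print the nova metadata ("properties") of a Nodepool-launched server.
# -c/-f restrict the output to just the properties column.
show_nodepool_metadata() {
  openstack server show "$1" -c properties -f value
}
```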
Common Management Tasks
In the course of running a Nodepool service you will find that there are some common operations that will be performed. Like the services themselves, these are split into two groups: image management and instance management.
Image Management
Before Nodepool can launch any cloud instances it must have images to boot from. nodepool dib-image-list will show you which images are available locally on disk. These images on disk are then uploaded to clouds; nodepool image-list will show you which images are bootable in your various clouds.
If you need to force a new image to be built to pick up a new feature more quickly than the normal rebuild cycle (which defaults to 24 hours), you can manually trigger a rebuild. Using nodepool image-build you can tell Nodepool to begin a new image build now. Note that depending on work the nodepool-builder is already performing, this may queue the build. Check nodepool dib-image-list to see the current state of the builds. Once the image is built, it is automatically uploaded to all of the clouds configured to use that image.
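The manual rebuild flow can be sketched as follows; the helper name is our own, and "centos" matches the example image name used in the configurations later in this document.

```shell
# Queue a new image build now, then show the current build states.
trigger_rebuild() {
  nodepool image-build "$1" && nodepool dib-image-list
}
```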
At times you may need to stop using an existing image because it is broken. Your two major options here are to build a new image to replace the existing one, or to delete the existing image and have Nodepool fall back on using the previous image. Rebuilding and uploading can be slow, so typically the best option is to simply nodepool image-delete the most recent image, which will cause Nodepool to fall back on using the previous image. However, if you do this without "pausing" the image, it will be immediately re-uploaded. You will want to pause the image if you need to further investigate why the image is not being built correctly. If you know the image will be built correctly, you can simply delete the built image and remove it from all clouds, which will cause it to be rebuilt, using nodepool dib-image-delete.
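The fall-back flow could be scripted like this sketch; the flag names follow the image-delete CLI help, while the helper name and argument order are our own, and the real build/upload ids should be taken from nodepool image-list.

```shell
# Delete a specific image upload so Nodepool reverts to the previous image.
# Arguments: provider, image name, build id, upload id (all from image-list).
revert_image() {
  nodepool image-delete --provider "$1" --image "$2" \
      --build-id "$3" --upload-id "$4"
}
```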
Command Line Tools
Usage
The general options that apply to all subcommands are:
nodepool --help
The following subcommands deal with nodepool images:
dib-image-list
nodepool dib-image-list --help
image-list
nodepool image-list --help
image-build
nodepool image-build --help
dib-image-delete
nodepool dib-image-delete --help
image-delete
nodepool image-delete --help
The following subcommands deal with nodepool nodes:
list
nodepool list --help
delete
nodepool delete --help
The following subcommands deal with ZooKeeper data management:
info
nodepool info --help
erase
nodepool erase --help
If Nodepool's database gets out of sync with reality, the following commands can help identify compute instances or images that are unknown to Nodepool:
alien-image-list
nodepool alien-image-list --help
Removing a Provider
Removing a provider from nodepool involves two separate steps: removing from the builder process, and removing from the launcher process.
Warning
Since the launcher process depends on images being present in the provider, you should follow the process for removing a provider from the launcher before doing the steps to remove it from the builder.
Removing from the Launcher
To remove a provider from the launcher, set that provider's max-servers value to 0 (or any value less than 0). This disables the provider and instructs the launcher to stop booting new nodes on it. You can then let the nodes go through their normal lifecycle. Once all nodes have been deleted, you may remove the provider from the launcher configuration file entirely, although leaving it in this state is effectively the same and makes it easy to turn the provider back on.
Note
There is currently no way to force the launcher to immediately begin deleting any unused instances from a disabled provider. If urgency is required, you can delete the nodes directly instead of waiting for them to go through their normal lifecycle, but the effect is the same.
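If urgency is required, the direct deletion could be scripted with the list and delete subcommands shown earlier. This is our own sketch; in particular, the assumption that the node id is the second column of nodepool list output may need adjusting.

```shell
# Delete every node whose "nodepool list" row mentions the given provider.
# The node-id column position is an assumption about the table layout.
drain_provider() {
  nodepool list | awk -v p="$1" '$0 ~ p { print $2 }' | while read -r id; do
    nodepool delete "$id"
  done
}
```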
For example, if you want to remove ProviderA from a launcher with a configuration file defined as:
providers:
  - name: ProviderA
    region-name: region1
    cloud: ProviderA
    boot-timeout: 120
    diskimages:
      - name: centos
      - name: fedora
    pools:
      - name: main
        max-servers: 100
        labels:
          - name: centos
            min-ram: 8192
            flavor-name: Performance
            diskimage: centos
            key-name: root-key
Then you would need to alter the configuration to:
providers:
  - name: ProviderA
    region-name: region1
    cloud: ProviderA
    boot-timeout: 120
    diskimages:
      - name: centos
      - name: fedora
    pools:
      - name: main
        max-servers: 0
        labels:
          - name: centos
            min-ram: 8192
            flavor-name: Performance
            diskimage: centos
            key-name: root-key
Note
The launcher process will automatically notice any changes in its configuration file, so there is no need to restart the service to pick up the change.
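Because changes take effect as soon as the file is saved, it can be worth validating an edited configuration first. Recent Nodepool releases provide a config-validate subcommand; the flag placement below follows the general CLI options, and the path is a placeholder.

```shell
# Parse a config file and report errors without touching running daemons.
validate_config() {
  nodepool -c "$1" config-validate
}
```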
Removing from the Builder
The builder controls image building, uploading, and on-disk cleanup. The builder needs a chance to properly manage these resources for a removed provider. To do this, you need to first set the diskimages configuration section for the provider you want to remove to an empty list.
Warning
Make sure the provider is disabled in the launcher before disabling in the builder.
For example, if you want to remove ProviderA from a builder with a configuration file defined as:
providers:
  - name: ProviderA
    region-name: region1
    diskimages:
      - name: centos
      - name: fedora

diskimages:
  - name: centos
    pause: false
    elements:
      - centos-minimal
      ...
    env-vars:
      ...
Then you would need to alter the configuration to:
providers:
  - name: ProviderA
    region-name: region1
    diskimages: []

diskimages:
  - name: centos
    pause: false
    elements:
      - centos-minimal
      ...
    env-vars:
      ...
By keeping the provider defined in the configuration file, but changing diskimages to an empty list, you signal the builder to clean up resources for that provider, including any images already uploaded, any on-disk images, and any image data stored in ZooKeeper. After those resources have been cleaned up, it is safe to remove the provider from the configuration file entirely, if you wish to do so.
Note
The builder process will automatically notice any changes in its configuration file, so there is no need to restart the service to pick up the change.
Web interface
If configured (see webapp-conf), a nodepool-launcher instance can provide a range of endpoints that provide information in text and JSON format. Note that if there are multiple launchers, all will provide the same information.
The status of uploaded images
- query fields: comma-separated list of fields to display
- request header Accept: application/json or text/*
- response header Content-Type: application/json or text/plain, depending on the Accept header
The status of images built by diskimage-builder
- query fields: comma-separated list of fields to display
- request header Accept: application/json or text/*
- response header Content-Type: application/json or text/plain, depending on the Accept header
The status of currently active nodes
- query node_id: restrict to a specific node
- query fields: comma-separated list of fields to display
- request header Accept: application/json or text/*
- response header Content-Type: application/json or text/plain, depending on the Accept header
Outstanding requests
- query fields: comma-separated list of fields to display
- request header Accept: application/json or text/*
- response header Content-Type: application/json or text/plain, depending on the Accept header
All available labels as reported by all launchers
- query fields: comma-separated list of fields to display
- request header Accept: application/json or text/*
- response header Content-Type: application/json or text/plain, depending on the Accept header
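The content negotiation above can be exercised with curl, as in this sketch; the host, the webapp's default port (8005), and the endpoint path in the usage comment are assumptions.

```shell
# Request an endpoint in JSON by sending an explicit Accept header.
fetch_json() {
  curl -s -H 'Accept: application/json' "http://$1:8005/$2"
}
# e.g. fetch_json launcher.example.org node-list  (path is an assumption)
```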
Responds with status code 200 as soon as all configured providers are fully started. During startup it returns 500. This can be used as a readiness probe in a Kubernetes-based deployment.
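A readiness check against this endpoint might look like the following sketch; the host and the default webapp port (8005) are placeholders.

```shell
# Succeed only when /ready returns HTTP 200.
check_ready() {
  code=$(curl -s -o /dev/null -w '%{http_code}' "http://$1:8005/ready")
  [ "$code" = "200" ]
}
```

In Kubernetes, an httpGet readiness probe against the same path achieves the equivalent effect for rolling restarts.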
Monitoring
Nodepool provides monitoring information to statsd. See statsd_configuration to learn how to enable statsd support. Currently, these metrics are supported:
Nodepool builder
Nodepool launcher
OpenStack API stats
Low level details on the timing of OpenStack API calls will be logged by openstacksdk. These calls are logged under nodepool.task.<provider>.<api-call>. The API call name is of the generic format <service-type>.<method>.<operation>. For example, the GET /servers call to the compute service becomes compute.GET.servers.
Since these calls reflect the internal operations of openstacksdk, the exact keys logged may vary across providers and releases.