This augments the dib-request list (which shows what images have
manual build requests) with information about whether the image
is paused. The resulting command is renamed to "image-status".
Image build requests can now be retrieved through the /dib-request-list
endpoint or via the dib-request-list sub-command. The list will show the
age of the request and if it is still pending or if there is already a
build in progress.
Users of the "nodepool" command don't need to see the component
registry logs at info level (which output at least a line for each
connected component). Set the minimum level to warning to avoid
this noise. The component registry may still be useful for
command-line use in the future, so we leave it in place rather than
disabling it entirely.
We have made many improvements to connection handling in Zuul.
Bring those back to Nodepool by copying over the zuul/zk directory
which has our base ZK connection classes.
This will enable us to bring other Zuul classes over in the future.
The existing connection-related code is removed and the remaining
model-style code is moved to nodepool.zk.zookeeper. Almost every
file imported the model as nodepool.zk, so import adjustments are
made to compensate while keeping the code more or less as-is.
The default ZooKeeper session timeout is 10 seconds, which is not
enough on a highly loaded Nodepool. As in Zuul, make this
configurable so we can avoid session losses.
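A sketch of what this could look like in nodepool.yaml, assuming the
new option is exposed as a top-level zookeeper-timeout value in
seconds (hostnames and values here are illustrative):

  zookeeper-servers:
    - host: zk01.example.com
      port: 2181
  zookeeper-timeout: 40.0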
This change adds the option to put quota on resources on a per-tenant
basis (i.e. Zuul tenants).
It adds a new top-level config structure ``tenant-resource-limits``
under which one can specify a number of tenants, each with
``max-servers``, ``max-cores``, and ``max-ram`` limits. These limits
are valid globally, i.e., for all providers. This is in contrast to
the existing provider and pool quotas, which are only considered for
nodes of the same provider.
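A sketch of such a configuration (tenant names and values are
illustrative, and the tenant-name key is an assumption; the limit
keys come from the description above):

  tenant-resource-limits:
    - tenant-name: example-tenant
      max-servers: 20
      max-cores: 40
      max-ram: 100000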
The launcher implements deletes using threads and, unlike with
launches, does not give drivers an opportunity to override that and
handle them without threads (as we want to do in the state machine
driver).
To correct this, we move the NodeDeleter class from the launcher
to driver utils, and add a new driver Provider method that returns
the NodeDeleter thread. This is added in the base Provider class
so all drivers get this behavior by default.
In the state machine driver, we override the method so that instead
of returning a thread, we start a state machine and add it to a list
of state machines that our internal state machine runner thread
processes.
This adds a CLI command to set a flag in ZK for images indicating
that the image should be paused. This can be used to quickly pause
the building and uploading of one or more images globally. This
will effectively be boolean OR'd with the pause value for diskimage
builds in the config file.
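For reference, a minimal sketch of the per-diskimage pause in the
config file that this flag is OR'd with (the image name is
illustrative):

  diskimages:
    - name: fedora-rawhide
      pause: true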
In particular, this can be used to pause images for short durations,
either because a fix is imminent, or to allow the system to remain
stable while a configuration change goes through the CI/CD workflow.
This change enables setting configuration values through
environment variables. This is useful for managing user-defined
configuration, such as user passwords, in Kubernetes deployments.
While YAML does have inbuilt support for anchors to greatly reduce
duplicated sections, anchors have no support for merging values. For
diskimages, this can result in a lot of duplicated values for each
image that you cannot otherwise avoid.
This provides two new values for diskimages; a "parent" and
Specifying a parent means you inherit all the configuration values
from that image. Anything specified within the child image overwrites
the parent values as you would expect; caveats, as described in the
documentation, are that the elements field appends and the env-vars
field has update() semantics.
An "abstract" diskimage is not instantiated into a real image, it is
only used for configuration inheritance. This way you can make an
abstract "base" image with common values and inherit from it
everywhere without having to worry about bringing in values you
don't want.
You can also chain parents together and the inheritance flows through.
Documentation is updated, and several tests are added to ensure the
correct parenting, merging and override behaviour of the new values.
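A minimal sketch of the intended usage (image names, elements and
values are illustrative):

  diskimages:
    - name: base
      abstract: true
      elements:
        - ubuntu-minimal
        - nodepool-base
      env-vars:
        TMPDIR: /opt/dib_tmp
    - name: ubuntu-jammy
      parent: base
      elements:
        - growroot           # appended to the parent's elements
      env-vars:
        DIB_RELEASE: jammy   # merged into the parent's env-vars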
Ensure 'name' is a primary key for diskimages.
Change the constructor to take the name as an argument. Update the
config validator to ensure there is a name, and that it is unique.
Add tests for both these cases.
Having Python files with the exec bit and a shebang defined in
/usr/lib/python-*/site-packages/ is not acceptable in an RPM
package. Instead of carrying a patch in Nodepool RPM packaging, it
is better to fix this directly upstream.
We broke nodepool configuration with
I3795fee1530045363e3f629f0793cbe6e95c23ca by not having the labels
defined in the OpenStack provider in the top-level label list.
The added check here would have found such a case.
The validate() function is reworked slightly; previously it would
return various exceptions from the tools it was calling (YAML,
voluptuous, etc.). Now that we have more testing (and I'd imagine we
could do even more, similar validations too) we'd have to keep
adding exception types. Just make the function return a value; this
also makes sure the regular exit paths are taken from the caller in
nodepoolcmd.py, rather than dying with an exception at an arbitrary
point.
A unit test is added.
Co-Authored-By: Mohammed Naser <email@example.com>
This change allows you to specify a dib-cmd parameter for disk images,
which overrides the default call to "disk-image-create". This allows
you to essentially decide the disk-image-create binary to be called
for each disk image configured.
It is inspired by a couple of things:
The "--fake" argument to nodepool-builder has always been a bit of a
wart; a case of testing-only functionality leaking across into the
production code. It would be clearer if the tests used exposed
methods to configure themselves to use the fake builder.
Because disk-image-create is called from the $PATH, it makes it more
difficult to use Nodepool from a virtualenv. You can not just run
"nodepool-builder"; you have to ". activate" the virtualenv before
running the daemon so that the path is set to find the virtualenv's
disk-image-create.
In addressing activation issues by automatically choosing the
in-virtualenv binary in Ie0e24fa67b948a294aa46f8164b077c8670b4025, it
was pointed out that others are already using wrappers in various ways
where preferring the co-installed virtualenv version would break.
With this, such users can ensure they call the "disk-image-create"
binary they want. We can then make a change to prefer the
co-installed version without fear of breaking.
In theory, there's no reason why a totally separate
"/custom/venv/bin/disk-image-create" would not be valid if you
required a customised dib for some reason for just one image. This
is not currently possible even with PATH hacks, etc.; all images
will use the same binary to build. It is for this flexibility that I
think this is best at the diskimage level, rather than as, say, a
global setting for the whole builder instance.
Thus add a dib-cmd option for diskimages. In the testing case, this
points to the fake-image-create script, and the --fake command-line
option and related bits are removed.
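A sketch of the resulting configuration (the image name and path are
illustrative):

  diskimages:
    - name: custom-image
      dib-cmd: /custom/venv/bin/disk-image-create
      elements:
        - ubuntu-minimal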
It should have no backwards compatibility effects; documentation and
a release note are added.
Now that we publish dev releases to pypi, we should also allow those
versions to be displayed with --version flag.
Signed-off-by: Paul Belanger <firstname.lastname@example.org>
Now that there is no more TaskManager class, nor anything using
one, the use_taskmanager flag is vestigial. Clean it up so that we
don't have to pass it around to things anymore.
This patch makes the nodepool process avoid starting up as a daemon in
the Docker images, as it's not meant to become a background process
within a container. In order to have consistent logging like in the
daemonized mode, we add a new foreground option that runs in the
foreground but without debug logging.
Co-Authored-By: Tobias Henkel <email@example.com>
This reverts commit ccf40a462a.
The previous version would not work properly when daemonized
because there was no stdout. This version maintains stdout and
uses select/poll with non-blocking stdout to capture the output
to a log file.
A builder thread can wedge if the build process wedges. Add a timeout
to the subprocess. Since it was the call to readline() that would block,
we change the process to have DIB write directly to the log. This allows
us to set a timeout in the Popen.wait() call, and we kill the dib
subprocess as well.
The timeout value can be controlled in the diskimage configuration and
defaults to 8 hours.
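A sketch of how this could be set per diskimage (assuming the option
is named build-timeout and takes seconds; check the documentation
for the exact name and units):

  diskimages:
    - name: slow-image
      build-timeout: 28800   # assumption: seconds (8 hours, the default)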
Adds a ProviderConfig class method that can be called to get
the config schema for the common config options in a Provider.
Drivers are modified to call this method.
In order to support static node pre-registration, we need to give
the provider manager the opportunity to register/deregister any
nodes in its configuration file when it starts (on startup or when
the config changes). It will need a ZooKeeper connection to do this.
The OpenStack driver will ignore this parameter.
node_list takes an argument "detail" which adds a rather arbitrary
list of results to the output. This comes from the command-line,
where we're trying to keep the width under a certain length; but
that doesn't make as much sense here (especially for JSON).
For dashboard type applications, replace this with a simple "fields"
parameter which, if set, will only return those fields it sees in the
common text output function.
Note, this purposely doesn't apply to the JSON output, as it is
expected that client-side filtering is more appropriate there. We
could also add generic field support to the command-line tools, if
that is considered useful.
Add some documentation on all the end-points, and add info about
these new arguments.
Introduce a new configuration setting, "max_hold_age", that specifies
the maximum uptime of held instances. If set to 0, held instances
are kept until manually deleted. A custom value can be provided
at the rpcclient level.
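A sketch of how this could look in the top-level configuration
(assuming the YAML key is spelled max-hold-age and takes seconds):

  max-hold-age: 86400   # 0 keeps held nodes until manually deleted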
This updates the builder to store individual build logs in dedicated
files, one per build, named for the image and build id. Old logs are
automatically pruned. By default, they are stored in
/var/log/nodepool/builds, but this can be changed.
This removes the need to specially configure a logging handler for
the image build logs.
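A sketch of the relevant builder settings (the option names
build-log-dir and build-log-retention are assumptions; the directory
default comes from the text above):

  build-log-dir: /var/log/nodepool/builds
  build-log-retention: 7   # assumption: number of logs kept per image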