This augments the dib-request list (which shows what images have
manual build requests) with information about whether the image
is paused. The resulting command is renamed to "image-status".
Change-Id: If75a8757b4ec93563e47bfdf0a239a9c21660c45
Image build requests can now be retrieved through the /dib-request-list
endpoint or via the dib-request-list sub-command. The list will show the
age of the request and if it is still pending or if there is already a
build in progress.
Change-Id: If73d6c9fcd5bd94318f389771248604a7f51c449
Users of the "nodepool" command don't need to see the component
registry logs at info level (which output at least a line for each
connected component). Set the minimum level to warning to avoid
that.
The component registry may still be useful for command-line use
in the future, so we leave it in place rather than disabling it
entirely.
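A minimal sketch of the idea, using only the standard logging module.
The logger name below is hypothetical and chosen for illustration; the
real component registry logger may be named differently.

```python
import logging

# Hypothetical logger name, used only for illustration.
REGISTRY_LOGGER_NAME = "example.ComponentRegistry"


def quiet_registry_logging(name=REGISTRY_LOGGER_NAME):
    """Raise the logger's minimum level to WARNING without disabling it."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.WARNING)
    return logger
```

Info-level records from that logger are then dropped, while warnings
and errors still reach the configured handlers.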
Change-Id: I8c0937d7304ddc536773cf74fc40bbf6e79918d4
We have made many improvements to connection handling in Zuul.
Bring those back to Nodepool by copying over the zuul/zk directory
which has our base ZK connection classes.
This will enable us to bring other Zuul classes over, such as the
component registry.
The existing connection-related code is removed and the remaining
model-style code is moved to nodepool.zk.zookeeper. Almost every
file imported the model as nodepool.zk, so import adjustments are
made to compensate while keeping the code more or less as-is.
Change-Id: I9f793d7bbad573cb881dfcfdf11e3013e0f8e4a3
The default ZooKeeper session timeout is 10 seconds, which is not enough
on a heavily loaded Nodepool. As in Zuul, make this configurable so we
can avoid session losses.
Change-Id: Id7087141174c84c6cdcbb3933c233f5fa0e7d569
This change adds the option to put quota on resources on a per-tenant
basis (i.e. Zuul tenants).
It adds a new top-level config structure ``tenant-resource-limits``
under which one can specify a number of tenants, each with
``max-servers``, ``max-cores``, and ``max-ram`` limits. These limits
are valid globally, i.e., for all providers. This is contrary to the
currently existing provider and pool quotas, which are only considered
for nodes of the same provider.
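As a rough sketch of what a global per-tenant check could look like:
the helper name, the dict layout and the resource keys ("instances",
"cores", "ram") are assumptions made for illustration, not the actual
implementation.

```python
# Hypothetical limits, mirroring the shape of ``tenant-resource-limits``.
TENANT_LIMITS = {
    "tenant-a": {"max-servers": 10, "max-cores": 64, "max-ram": 131072},
}

# Maps each limit name to the resource key it constrains (assumed naming).
LIMIT_TO_RESOURCE = {
    "max-servers": "instances",
    "max-cores": "cores",
    "max-ram": "ram",
}


def fits_tenant_quota(tenant, used, requested, limits=TENANT_LIMITS):
    """Return True if ``requested`` fits on top of ``used`` for ``tenant``.

    ``used`` must be summed across all providers, since the limits are
    global rather than per provider.
    """
    tenant_limits = limits.get(tenant)
    if tenant_limits is None:
        return True  # no limits configured for this tenant
    for limit_name, resource in LIMIT_TO_RESOURCE.items():
        limit = tenant_limits.get(limit_name)
        if limit is None:
            continue  # this resource is unlimited for the tenant
        if used.get(resource, 0) + requested.get(resource, 0) > limit:
            return False
    return True
```

The key difference from provider and pool quotas is the input: usage is
aggregated over every provider before it is compared with the limit.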
Change-Id: I0c0154db7d5edaa91a9fe21ebf6936e14cef4db7
The launcher implements deletes using threads, and unlike with
launches, does not give drivers an opportunity to override that
and handle them without threads (as we want to do in the state
machine driver).
To correct this, we move the NodeDeleter class from the launcher
to driver utils, and add a new driver Provider method that returns
the NodeDeleter thread. This is added in the base Provider class
so all drivers get this behavior by default.
In the state machine driver, we override the method so that instead
of returning a thread, we start a state machine and add it to a list
of state machines that our internal state machine runner thread
should drive.
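The shape of this change can be sketched roughly as follows. All method
and attribute names here (start_node_cleanup, delete_node,
pending_deletes, run_once) are hypothetical stand-ins, and the state
machine is reduced to a simple queue for brevity.

```python
import threading


class NodeDeleter(threading.Thread):
    """Deletes one node in its own thread (the default behavior)."""

    def __init__(self, provider, node_id):
        super().__init__(name="NodeDeleter-%s" % node_id)
        self.provider = provider
        self.node_id = node_id

    def run(self):
        self.provider.delete_node(self.node_id)


class Provider:
    """Base class: every driver gets threaded deletes unless it overrides."""

    def __init__(self):
        self.deleted = []

    def delete_node(self, node_id):
        self.deleted.append(node_id)  # stand-in for a real cloud API call

    def start_node_cleanup(self, node_id):
        thread = NodeDeleter(self, node_id)
        thread.start()
        return thread


class StateMachineProvider(Provider):
    """Queues deletes for a runner instead of spawning a thread per node."""

    def __init__(self):
        super().__init__()
        self.pending_deletes = []

    def start_node_cleanup(self, node_id):
        self.pending_deletes.append(node_id)
        return None  # no thread for the caller to track

    def run_once(self):
        # In a real driver a dedicated runner thread would call this in a loop.
        while self.pending_deletes:
            self.delete_node(self.pending_deletes.pop(0))
```

The launcher only ever calls the provider method, so it does not need to
know whether a thread or a queued state machine performs the delete.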
Change-Id: Iddb7ed23c741824b5727fe2d89c9ddbfc01cd7d7
This adds a CLI command to set a flag in ZK for images indicating
that the image should be paused. This can be used to quickly pause
the building and uploading of one or more images globally. This
will effectively be boolean OR'd with the pause value for diskimage
builds in the config file.
In particular, this can be used to pause images for short durations,
either because a fix is imminent, or to allow the system to remain
stable while a configuration change goes through the CI/CD workflow.
Change-Id: I21a573dfc337c51f319afe3695d5446b2c91d70b
This change enables setting configuration values through
environment variables. This is useful for managing user-defined
configuration, such as a user password, in a Kubernetes deployment.
Change-Id: Iafbb63ebbb388ef3038f45fd3a929c3e7e2dc343
While YAML does have inbuilt support for anchors to greatly reduce
duplicated sections, anchors have no support for merging values. For
diskimages, this can result in a lot of duplicated values for each
image which you can not otherwise avoid.
This provides two new values for diskimages; a "parent" and
"abstract".
Specifying a parent means you inherit all the configuration values
from that image. Anything specified within the child image overwrites
the parent values as you would expect; caveats, as described in the
documentation, are that the elements field appends and the env-vars
field has update() semantics.
An "abstract" diskimage is not instantiated into a real image; it is
only used for configuration inheritance. This way you can make an
abstract "base" image with common values and inherit that everywhere
without having to worry about bringing in values you don't want.
You can also chain parents together and the inheritance flows through.
Documentation is updated, and several tests are added to ensure the
correct parenting, merging and override behaviour of the new values.
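The merge rules described above can be sketched as follows. This is an
illustration under stated assumptions, not the actual implementation:
the function names are hypothetical, diskimages are modelled as plain
dicts keyed by name, the "abstract" flag is assumed not to be inherited,
and cycle detection is omitted.

```python
import copy


def resolve_diskimage(name, diskimages):
    """Return the effective config for ``name`` after following parents."""
    image = diskimages[name]
    parent_name = image.get("parent")
    if parent_name is None:
        resolved = {}
    else:
        # Resolve the parent first so chained parents flow through.
        resolved = resolve_diskimage(parent_name, diskimages)
    for key, value in image.items():
        if key in ("parent", "abstract"):
            continue  # inheritance metadata is not copied to the result
        if key == "elements":
            # elements append to the parent's list
            resolved["elements"] = resolved.get("elements", []) + list(value)
        elif key == "env-vars":
            # env-vars follow dict.update() semantics
            merged = dict(resolved.get("env-vars", {}))
            merged.update(value)
            resolved["env-vars"] = merged
        else:
            # everything else is overwritten by the child
            resolved[key] = copy.deepcopy(value)
    return resolved


def concrete_diskimages(diskimages):
    """Resolve every non-abstract image; abstract ones only act as parents."""
    return {
        name: resolve_diskimage(name, diskimages)
        for name, image in diskimages.items()
        if not image.get("abstract", False)
    }
```

For example, a child that lists one extra element ends up with the
parent's elements followed by its own, while a scalar value set in the
child replaces the parent's value.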
Change-Id: I170016ef7d8443b9830912b9b0667370e6afcde7
Ensure 'name' is a primary key for diskimages.
Change the constructor to take the name as an argument. Update the
config validator to ensure there is a name, and that it is unique.
Add tests for both these cases.
Change-Id: I3931dc1457c023154cde0df2bb7b0a41cc6f20d3
Change the 'info' command output to include image upload data.
For each image, we'll now output each build and the uploads for the build.
Change-Id: Ib25ce30d30ed718b2b6083c2127fdb214c3691f4
Having Python files with the exec bit set and a shebang defined in
/usr/lib/python-*/site-packages/ is not acceptable in an RPM package.
Instead of carrying a patch in the nodepool RPM packaging, it is
better to fix this directly upstream.
Change-Id: I5a01e21243f175d28c67376941149e357cdacd26
We broke nodepool configuration with
I3795fee1530045363e3f629f0793cbe6e95c23ca by not having the labels
defined in the OpenStack provider in the top-level label list.
The added check here would have found such a case.
The validate() function is reworked slightly; previously it would
raise various exceptions from the tools it was calling (YAML,
voluptuous, etc.). Now that we have more testing (and I'd imagine we
could do even more similar validations too), we'd have to keep adding
exception types. Just make the function return a value; this also
makes sure the regular exit paths are taken from the caller in
nodepoolcmd.py, rather than dying with an exception at whatever point.
A unit test is added.
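A sketch of the kind of check and return-value style described here.
The function name and the 0/1 return convention are assumptions for
illustration; the config is modelled as an already parsed dict.

```python
def validate_labels(config):
    """Return 0 if all pool labels are declared at the top level, else 1."""
    declared = {label["name"] for label in config.get("labels", [])}
    errors = 0
    for provider in config.get("providers", []):
        for pool in provider.get("pools", []):
            for label in pool.get("labels", []):
                if label["name"] not in declared:
                    print("Provider %s uses undeclared label %s"
                          % (provider.get("name"), label["name"]))
                    errors += 1
    # Report failure through the return value instead of raising, so the
    # caller can follow its normal exit path.
    return 1 if errors else 0
```

A caller can then pass the result straight to its exit handling rather
than catching a growing list of exception types.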
Co-Authored-By: Mohammed Naser <mnaser@vexxhost.com>
Change-Id: I5455f5d7eb07abea34c11a3026d630dee62f2185
This change allows you to specify a dib-cmd parameter for disk images,
which overrides the default call to "disk-image-create". This allows
you to essentially decide the disk-image-create binary to be called
for each disk image configured.
It is inspired by a couple of things:
The "--fake" argument to nodepool-builder has always been a bit of a
wart; a case of testing-only functionality leaking across into the
production code. It would be clearer if the tests used exposed
methods to configure themselves to use the fake builder.
Because disk-image-create is called from the $PATH, it makes it more
difficult to use nodepool from a virtualenv. You can not just run
"nodepool-builder"; you have to ". activate" the virtualenv before
running the daemon so that the path is set to find the virtualenv
disk-image-create.
In addressing activation issues by automatically choosing the
in-virtualenv binary in Ie0e24fa67b948a294aa46f8164b077c8670b4025, it
was pointed out that others are already using wrappers in various ways
where preferring the co-installed virtualenv version would break.
With this, such users can ensure they call the "disk-image-create"
binary they want. We can then make a change to prefer the
co-installed version without fear of breaking.
In theory, there's no reason why a totally separate
"/custom/venv/bin/disk-image-create" would not be valid if you
required a customised dib for some reason for just one image. This is
not currently possible; even modulo PATH hacks, etc., all images will
use the same binary to build. It is for this flexibility that I think
this is best at the diskimage level, rather than as, say, a global
setting for the whole builder instance.
Thus add a dib-cmd option for diskimages. In the testing case, this
points to the fake-image-create script, and the --fake command-line
option and related bits are removed.
It should have no backwards compatibility effects; documentation and a
release note are added.
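In outline, the per-image selection amounts to something like the
following. The helper name and the dict-based diskimage are hypothetical
and only illustrate the fallback to the default command.

```python
DEFAULT_DIB_CMD = "disk-image-create"


def dib_command(diskimage, args=()):
    """Build the argv for one diskimage, honoring a per-image ``dib-cmd``."""
    return [diskimage.get("dib-cmd", DEFAULT_DIB_CMD)] + list(args)
```

An image without the option keeps calling "disk-image-create" from the
PATH, while an image that sets it can point at any binary, such as the
fake builder used in tests.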
Change-Id: I6677e11823df72f8c69973c83039a987b67eb2af
Now that we publish dev releases to PyPI, we should also allow those
versions to be displayed with the --version flag.
Change-Id: I045c9d5382a1035cd7678f9882e32d371f108555
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
This change adds a new python_path Node attribute so that zuul executor
can remove the default hard-coded ansible_python_interpreter.
Change-Id: Iddf2cc6b2df579636ec39b091edcfe85a4a4ed10
This duplicates the logic in zuul, and makes us consistent with
current nodepool documentation that says we already support this.
Change-Id: Ib92272b302a5225726a830ee50571fb7ad96e457
Now that there is no more TaskManager class, nor anything using
one, the use_taskmanager flag is vestigial. Clean it up so that we
don't have to pass it around to things anymore.
Change-Id: I7c1f766f948ad965ee5f07321743fbaebb54288a
Change Ie14935f604f23b0928eed0dd8e28dff49699a2d1 altered one use of
this method, but this one was missed.
Change-Id: I299a12d73a6524f5097712f97342aed640786eea
This patch makes the nodepool process avoid starting up as a daemon in
the Docker images, as it's not meant to become a background process
within a container. In order to have logging consistent with the
daemonized mode, we need to add a new foreground option that runs in
the foreground but without debug logging.
Change-Id: I77e9e6e4f94cf726336419a2b22916cc1e974e62
Co-Authored-By: Tobias Henkel <tobias.henkel@bmw.de>
This reverts commit ccf40a462a.
The previous version would not work properly when daemonized
because there was no stdout. This version maintains stdout and
uses select/poll with non-blocking stdout to capture the output
to a log file.
Depends-On: https://review.openstack.org/634266
Change-Id: I7f0617b91e071294fe6051d14475ead1d7df56b7
A builder thread can wedge if the build process wedges. Add a timeout
to the subprocess. Since it was the call to readline() that would block,
we change the process to have DIB write directly to the log. This allows
us to set a timeout in the Popen.wait() call, and we kill the dib
subprocess as well.
The timeout value can be controlled in the diskimage configuration and
defaults to 8 hours.
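The pattern can be sketched with the standard subprocess module as
follows. The function name and its return convention (None on timeout)
are assumptions for illustration.

```python
import subprocess


def run_build(cmd, log_path, timeout):
    """Run ``cmd`` with its output sent straight to ``log_path``.

    Returns the exit code, or None if the process was killed on timeout.
    """
    with open(log_path, "ab") as log:
        # The child writes to the log file itself, so the parent never
        # blocks in readline() waiting for output.
        proc = subprocess.Popen(cmd, stdout=log, stderr=subprocess.STDOUT)
        try:
            return proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()  # reap the killed child
            return None
```

With an 8 hour default, the timeout argument would be 8 * 60 * 60
seconds unless the diskimage configuration overrides it.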
Change-Id: I188e8a74dc39b55a4b50ade5c1a96832fea76a7d
Argparse can output the default values of the runtime
arguments. Enable it so we have them documented.
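For reference, argparse provides this through its
ArgumentDefaultsHelpFormatter class. The program name and options below
are hypothetical and only serve to show the effect.

```python
import argparse

parser = argparse.ArgumentParser(
    prog="example-tool",  # hypothetical program name
    formatter_class=argparse.ArgumentDefaultsHelpFormatter,
)
# Hypothetical options; defaults are appended to arguments with help text.
parser.add_argument("--timeout", type=int, default=10,
                    help="session timeout in seconds")
parser.add_argument("--workers", type=int, default=4,
                    help="number of worker threads")

help_text = parser.format_help()
```

Each option's help line then ends with its default value in
parentheses.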
Change-Id: I4fd7f21546615bd249485707521d8222fab10962
Adds a ProviderConfig class method that can be called to get
the config schema for the common config options in a Provider.
Drivers are modified to call this method.
Change-Id: Ib67256dddc06d13eb7683226edaa8c8c10a73326
We currently only need to set up the zNode caches in the
launcher. Within the command-line client and the builders this is just
unnecessary work.
Change-Id: I03aa2a11b75cab3932e4b45c5e964811a7e0b3d4
In order to support static node pre-registration, we need to give
the provider manager the opportunity to register/deregister any
nodes in its configuration file when it starts (on startup or when
the config changes). It will need a ZooKeeper connection to do this.
The OpenStack driver will ignore this parameter.
Change-Id: Idd00286b2577921b3fe5b55e8f13a27f2fbde5d6
node_list takes an argument "detail" which adds a rather arbitrary
list of results to the output. This comes from the command-line,
where we're trying to keep width under a certain length; but doesn't
make as much sense here (especially for json).
For dashboard type applications, replace this with a simple "fields"
parameter which, if set, will only return those fields it sees in the
common text output function.
Note, this purposely doesn't apply to the JSON output, as it is expected
that client-side filtering is more appropriate there. We could also add
generic field support to the command-line tools, if considered
worthwhile.
Add some documentation on all the end-points, and add info about these
parameters.
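The filtering itself can be sketched in a few lines. The helper name and
the row layout (a list of dicts) are assumptions for illustration.

```python
def filter_fields(rows, fields=None):
    """Keep only ``fields`` in each row; return rows unchanged if unset."""
    if not fields:
        return rows
    return [
        {key: row[key] for key in fields if key in row}
        for row in rows
    ]
```

Unknown field names are silently skipped in this sketch, and the order
of the requested fields determines the column order of the output.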
Change-Id: Ifbf1019b77368124961e7aa28dae403cabe50de1
All the _list functions return the same thing; save the results and
use a common output function to generate the json or text output.
Change-Id: I9cb44b09de2cb948e7381ef10302b21040433a2c
Introduce a new configuration setting, "max_hold_age", that specifies
the maximum uptime of held instances. If set to 0, held instances
are kept until manually deleted. A custom value can be provided
at the rpcclient level.
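The expiry rule can be sketched as below. The helper name is
hypothetical, and the sketch assumes that both the age limit and the
timestamps are expressed in seconds.

```python
import time


def hold_expired(hold_started, max_hold_age, now=None):
    """Return True if a held node has exceeded ``max_hold_age``.

    A ``max_hold_age`` of 0 means held nodes never expire on their own.
    """
    if max_hold_age <= 0:
        return False
    if now is None:
        now = time.time()
    return (now - hold_started) > max_hold_age
```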
Change-Id: I9a09728e5728c537ee44721f5d5e774dc0dcefa7
This updates the builder to store individual build logs in dedicated
files, one per build, named for the image and build id. Old logs are
automatically pruned. By default, they are stored in
/var/log/nodepool/builds, but this can be changed.
This removes the need to specially configure a logging handler for the
image build logs.
Change-Id: Ia7415d2fbbb320f8eddc4e46c3a055414df5f997