Having python files with the exec bit set and a shebang defined in
/usr/lib/python-*/site-packages/ is not acceptable in an RPM package.
Instead of carrying a patch in the nodepool RPM packaging it is better
to fix this directly upstream.
Change-Id: I5a01e21243f175d28c67376941149e357cdacd26
The quota calculations in nodepool can never be perfectly accurate
because there can still be races with other launchers in the tenant or
with other workloads sharing the tenant. So we must be able to
gracefully handle the case where the cloud refuses a launch because of
quota. Currently we just invalidate the quota cache and immediately
try to launch the node again. Of course that will fail again in most
cases. This makes the node request, and thus the zuul job, fail with
NODE_FAILURE.
Instead we need to pause the handler, just as we do when our own quota
calculation says we are over quota. We mark the node as aborted, which
is not a failure indicator, and pause the handler so that it will
automatically reschedule a new node launch as soon as the quota
calculation allows it.
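As a rough sketch of the intended flow (the exception and attribute
names here are illustrative, not nodepool's actual API):

    class QuotaException(Exception):
        """The cloud refused a launch because of quota."""

    def launch(handler, node, provider):
        try:
            provider.createServer(node)
        except QuotaException:
            # Not a failure: abort the node and pause the handler so
            # the launch is rescheduled once quota frees up again.
            node.state = 'aborted'
            provider.invalidateQuotaCache()
            handler.paused = True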
Change-Id: I122268dc7723901c04a58fa6f5c1688a3fdab227
For an image with the connection type winrm we cannot scan the ssh
host keys. So when the connection type is not ssh we need to skip
gathering the host keys.
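A minimal sketch of the check, assuming illustrative attribute and
helper names rather than nodepool's actual ones:

    def gather_host_keys(node):
        if node.connection_type != 'ssh':
            # e.g. winrm images: there are no ssh host keys to scan.
            return []
        return scan_ssh_host_keys(node.interface_ip)  # hypothetical helper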
Change-Id: I56f308baa10d40461cf4a919bbcdc4467e85a551
The pep8 rules used in nodepool are somewhat broken. In preparation
for using the pep8 ruleset from zuul we need to fix the findings up
front.
Change-Id: I9fb2a80db7671c590cdb8effbd1a1102aaa3aff8
This change moves OpenStack-related code to a driver. To avoid a
circular import, this change also moves the StatsReporter to the stats
module so that the handlers don't have to import the launcher.
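Illustratively (module and class names are assumed for the sketch),
the handler now imports from the stats module instead of from the
launcher, which breaks the cycle:

    from nodepool.stats import StatsReporter

    class OpenStackNodeRequestHandler(StatsReporter):
        """Driver-specific handler; no import of the launcher needed."""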
Change-Id: I319ce8780aa7e81b079c3f31d546b89eca6cf5f4
Story: 2001044
Task: 4614
This implements the API necessary to perform the ZooKeeper functionality
outlined in the "Nodepool: Use ZooKeeper for Workers" spec:
http://specs.openstack.org/openstack-infra/infra-specs/specs/nodepool-zookeeper-workers.html
This API is not used yet, but will be used and modified where necessary
in upcoming reviews based on this work.
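For illustration only, the kind of ZooKeeper interaction the spec
describes, using the kazoo client (the paths and payload here are made
up, not the actual nodepool layout):

    import json
    from kazoo.client import KazooClient

    client = KazooClient(hosts='zk.example.com:2181')
    client.start()

    # A builder registers itself under an ephemeral, sequential node
    # so that other components can discover the active workers.
    client.create('/nodepool/builders/worker-',
                  value=json.dumps({'hostname': 'builder01'}).encode('utf8'),
                  ephemeral=True, sequence=True, makepath=True)

    print(client.get_children('/nodepool/builders'))
    client.stop()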
Change-Id: I681722a1f2dc3fe13efa2baa3a1a7acd1cbe50ee
We wrote shade as an extraction of the logic we had in nodepool, and
have since expanded it to support more clouds. It's time to start
using it in nodepool, since that will allow us to add more clouds
and also to handle a wider variety of them.
Making a patch series was too tricky because of the way fakes and
threading work, so this is everything in one stab.
Depends-On: I557694b3931d81a3524c781ab5dabfb5995557f5
Change-Id: I423716d619aafb2eca5c1748bc65b38603a97b6a
Co-Authored-By: James E. Blair <jeblair@linux.vnet.ibm.com>
Co-Authored-By: David Shrewsbury <shrewsbury.dave@gmail.com>
Co-Authored-By: Yolanda Robla <yolanda.robla-mota@hpe.com>
At the moment, grepping through logs to determine what's happening with
timeouts on a provider is difficult because for some errors the cause of
the timeout is on a different line from the provider in question.
Give each timeout a specific named exception, and then when we catch the
exceptions, log them specifically with the node id, the provider and the
additional descriptive text from the timeout exception. This should
allow for easy grepping through logs to find specific instances of
types of timeouts - or of all timeouts. Also add a corresponding success
debug log so that comparative greps/counts are also easy.
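As a hedged sketch of the pattern (the exception name and log wording
are illustrative, not the exact ones added here):

    import logging

    log = logging.getLogger("nodepool.NodeLauncher")

    class ServerCreateTimeoutException(Exception):
        """Timeout waiting for the server to come up."""

    def wait_for_server(node, provider, waiter):
        try:
            waiter()
        except ServerCreateTimeoutException as e:
            # One grep-friendly line with node id, provider and the
            # descriptive text carried by the timeout exception.
            log.error("Timeout launching node id %s in provider %s: %s",
                      node.id, provider.name, e)
            raise
        else:
            log.debug("Node id %s in provider %s launched without timeout",
                      node.id, provider.name)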
Change-Id: I889bd9b5d92f77ce9ff86415c775fe1cd9545bbc
It would be great if builders distinguished between a job failure
(invalid args, config, etc.) and an exception (our code is broken). To
do this, we need to make our own exceptions and use them.
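A small sketch of the distinction (class and helper names are
assumptions for illustration):

    import logging

    log = logging.getLogger("nodepool.builder")

    class BuilderError(Exception):
        """Base class for expected build failures (bad args, bad config)."""

    def run_build(job):
        try:
            build(job)              # hypothetical build entry point
        except BuilderError as e:
            record_failure(job, e)  # a job failure, not a bug in our code
        except Exception:
            log.exception("Unexpected error while building %s", job)
            raise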
Change-Id: I31abb6fc2379ccac73b2045673eba453ac4a67a0