This is a follow-on to I3279c3b5cb8cf26d390835fd0a7049bc43ec40b5
As discussed in the referenced issue, the blank return and
AttributeError here are the correct way to determine a recently deleted
image. We don't need to catch the ClientError as that won't be raised
any more. Getting a value for the state would indicate it wasn't
deleted.
This bumps the moto base requirement to ensure we have this behaviour.
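For illustration only, the check described above might look roughly
like the following. This is a minimal sketch, not the driver code:
image_is_deleted, LiveImage and DeletedImage are hypothetical names,
and treating an empty state as "deleted" is an assumption about what
the blank return looks like.

```python
class LiveImage:
    """Hypothetical stand-in for an image that still exists."""
    state = "available"


class DeletedImage:
    """Hypothetical stand-in for a recently deleted image."""

    @property
    def state(self):
        # Assumption: reading the attribute fails once the image is gone.
        raise AttributeError("image data is no longer available")


def image_is_deleted(image):
    """Return True when no state can be read for the image."""
    try:
        state = image.state
    except AttributeError:
        # No state attribute available: treat the image as deleted.
        return True
    # Assumption: a blank (empty or None) state also means deleted;
    # any real value means the image still exists.
    return not state


print(image_is_deleted(LiveImage()))
print(image_is_deleted(DeletedImage()))
```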
Change-Id: I2d5b0ccb9802aa0d4c81555a17f40fe8b8595ebd
Per the notes inline, the recent 3.1.6 moto release has changed the
way deleted images are returned. Catching the exception we now see
works, and we can revisit if we find better solutions.
Change-Id: I3279c3b5cb8cf26d390835fd0a7049bc43ec40b5
The python-path value should default to "auto" per documentation
and to match other drivers. Correct that.
Change-Id: Ie8254e10d9c4d8ff8f8f298fac32140a18248293
These changes are all in service of being able to better understand
AWS driver log messages:
* Use annotated loggers in the statemachine provider framework
so that we see the request, node, and provider information
* Have the statemachine framework pass annotated loggers to the
state machines themselves so that the above information is available
for log messages on individual API calls
* Add optional performance information to the rate limit handler
(delay and API call duration)
* Add some additional log entries to the AWS adapter
Also:
* Suppress boto logging by default in unit tests (it is verbose and
usually not helpful)
* Add coverage of node deletion in the AWS driver tests
Change-Id: I0e6b4ad72d1af7f776da73c5dd2a50b40f60e4a2
The optimization introduced in Id04f57ff1e4c28357370729b6383f5119cd616dc
can lead to a starvation of certain requests under the following
conditions:
* node request with a requested provider that doesn't support the
required node types
* another provider that could technically serve the request yields to
the requested provider
* requested provider is at quota or has a high inflow of node requests
* requested provider postpones rejecting the yielded request as it
doesn't support the required labels
To avoid starvation of those requests, we only yield to the requested
provider if it is capable of serving the request. This is done by
checking the supported labels of the requested provider.
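The yield decision can be sketched as below. This is a simplified
illustration under assumed names (should_yield and the provider_labels
mapping are hypothetical and not part of nodepool); the real check
works on the launcher's own request and config objects.

```python
def should_yield(node_types, requested_provider, this_provider,
                 provider_labels):
    """Return True if this provider should leave the request alone.

    provider_labels is a hypothetical mapping of provider name to the
    set of labels that provider supports.
    """
    if requested_provider is None or requested_provider == this_provider:
        # Nobody to yield to, or we are the requested provider.
        return False
    supported = provider_labels.get(requested_provider, set())
    # Only yield when the requested provider can serve every label.
    return set(node_types) <= supported


provider_labels = {
    "cloud-a": {"small"},
    "cloud-b": {"small", "large"},
}
# cloud-b yields a "small" request to cloud-a, but not a "large" one.
print(should_yield(["small"], "cloud-a", "cloud-b", provider_labels))
print(should_yield(["large"], "cloud-a", "cloud-b", provider_labels))
```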
Change-Id: I0ded785a13d1f955a71d519dc40e5e5c0ec35043
Propagate the NodeRequest.requestor to a new property Node.requestor
which usually holds the zuul_system_id, or "NodePool:min_ready".
This is set initially when first creating a node, and is updated when a
ready node is assigned to a request. This ensures we always know for
which requestor a node was allocated.
Change-Id: Ifd54a94bae39f31a70bdedce384a64c2a04495c1
We noticed that a provider sometimes takes very long to serve a request
because it is busy declining requests with a higher priority. This is
made worse by the fact that we stop processing requests to remove
completed handlers every 15s. Afterwards we will start over and also
include new requests.
This means that as long as there are requests with a higher priority we
will process them first even if they need to be declined because the
pool doesn't provide the requested label(s).
To improve this we will first sort the open node requests based on
whether they can be fulfilled by the current provider and only then by
priority. This has the added benefit that a provider might not even need
to decline a request if it could be fulfilled by another provider in the
meantime.
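As a rough sketch of the ordering described above (the Request class
and sort_requests are hypothetical, and it is assumed here that a
lower priority value means a more urgent request):

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for a node request."""
    id: str
    priority: int   # assumption: lower value is served first
    labels: tuple


def sort_requests(requests, supported_labels):
    """Order fulfillable requests first, then by priority."""
    supported = set(supported_labels)

    def key(req):
        fulfillable = set(req.labels) <= supported
        # False sorts before True, so fulfillable requests come first.
        return (not fulfillable, req.priority)

    return sorted(requests, key=key)


requests = [
    Request("a", 100, ("big",)),
    Request("b", 200, ("small",)),
    Request("c", 300, ("small", "big")),
    Request("d", 50, ("gpu",)),
]
ordered = sort_requests(requests, supported_labels=["small"])
print([r.id for r in ordered])
```

Requests this pool cannot serve still keep their relative priority
order; they are simply considered after everything the pool can
actually fulfill.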
Change-Id: Id04f57ff1e4c28357370729b6383f5119cd616dc
This adds several new tests:
This exercises the new ipv6 functionality. The boto3 mocks do not
support assigning an ipv6 address on creation, so we adopt a
unit-test style approach and verify the calling side and our
processing of the results separately. This uses the patching
scheme that we just removed in the previous commit, but alters it
to simply record the call arguments so we can validate them outside
the patched method.
This adds a diskimage upload test to the AWS driver. Because moto
doesn't support the create_image method, we need to add a fake for
that. Boto's practice of creating proxy objects makes that hard to
do, so some helper methods are added on the adapter class to make
it easier for the tests to override.
The responses in the fake are based on a recorded session.
This adds a test of leaked resource cleanup, including the automatic
tagging of untagged resources from image import tasks.
Change-Id: I0626061b246e9c52b08c49394d4d22b46beeff7a
The default zookeeper session timeout is 10 seconds, which is not enough
on a highly loaded nodepool. As in zuul, make this configurable so we
can avoid session losses.
Change-Id: Id7087141174c84c6cdcbb3933c233f5fa0e7d569
This makes the AWS driver tests more modular, with minimal changes.
This will allow us to add more tests without extending the
current single-function-which-performs-all-tests.
It splits the config into multiple files, one for each distinct pool
configuration. This means that tests only run one pool thread (compared
to the current situation where every test runs every pool).
Whatever defect in moto which appeared to require patching the network
methods to verify the call signature appears to have been rectified, so
that is removed.
The two tags tests have been combined into one.
Private IP validation is improved so that we verify the values set on
the nodepool node rather than the mock call args.
There was no test that verified the EBS-optimised setting, so that has
been added.
Change-Id: I023ec41f015c2fcb20835fc0e38714149944de84
This updates the aws driver to use the statemachine framework which
should be able to scale to a much higher number of parallel operations
than the standard thread-per-node model. It is also simpler and
easier to maintain. Several new features are added to bring it to
parity with other drivers.
The unit tests are changed minimally so that they continue to serve
as regression tests for the new framework. Following changes will
revise the tests and add new tests for the additional functionality.
Change-Id: I8968667f927c82641460debeccd04e0511eb86a9
This adds QuotaSupport to all the drivers that don't have it, and
also updates their tests so there is at least one test which exercises
the new tenant quota feature.
Since this is expected to work across all drivers/providers/etc, we
should start including at least rudimentary quota support in every
driver.
Change-Id: I891ade226ba588ecdda835b143b7897bb4425bd8
If a provider (or its configuration) is sufficiently broken that
the provider manager is unable to start, then the launcher will
go into a loop where it attempts to restart all providers in the
system until it succeeds. During this time, no pool managers are
running, which means all requests are ignored by this launcher.
Nodepool continuously reloads its configuration file, and in case
of an error, the expected behavior is to continue running and allow
the user to correct the configuration and retry after a short delay.
We also expect providers on a launcher to be independent of each
other so that if one fails, the others continue working.
However since we neither exit, nor process node requests if a
provider manager fails to start, an error with one provider can
cause all providers to stop handling requests with very little
feedback to the operator.
To address this, if a provider manager fails to start, the launcher
will now behave as if the provider were absent from the config file.
It will still emit the error to the log, and it will continuously
attempt to start the provider so that if the error condition abates,
the provider will start.
If there are no providers on-line for a label, then as long as any
provider in the system is running, node requests will be handled
and declined and possibly failed while the broken provider is offline.
If the system contains only a single provider and it is broken, then
no requests will be handled (failed), which is the current behavior,
and still likely to be the most desirable in that case.
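The new start-up behavior can be sketched as follows. This is an
illustration only: start_providers, fake_start and the config shape
are hypothetical, and the real launcher retries failed providers on
its next pass rather than returning a list.

```python
import logging

log = logging.getLogger("launcher-sketch")


def start_providers(configs, start_manager):
    """Start each provider independently of the others.

    A provider that fails to start is logged and skipped, as if it
    were absent from the config; the caller may retry it later.
    """
    running = {}
    failed = []
    for name, config in configs.items():
        try:
            running[name] = start_manager(name, config)
        except Exception:
            log.exception("Unable to start provider %s", name)
            failed.append(name)
    return running, failed


def fake_start(name, config):
    """Hypothetical manager factory that fails for broken configs."""
    if config.get("broken"):
        raise RuntimeError("bad credentials")
    return f"manager-for-{name}"


running, failed = start_providers(
    {"good": {}, "bad": {"broken": True}}, fake_start)
print(running, failed)
```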
Change-Id: If652e8911993946cee67c4dba5e6f88e55ac7099
A DibImageFile represents one dib image-file on disk, so the extension
is required (see prior change
I214581ad80b7740e7ca749b574672d2c33b92474 where we modified callers
who were using this interface).
This fixes a bug by removing code; the pathlib with_suffix replacement
is not safe for image names with a period in them; consider
>>> pathlib.Path('image-v1.2-foo').with_suffix('.vhd')
PosixPath('image-v1.vhd')
We can now simply unconditionally append the extension in
DibImageFile.to_path().
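To make the difference concrete, the sketch below contrasts replacing
the suffix with appending the extension. The variable names are
illustrative and this is not the DibImageFile code itself.

```python
import pathlib

name = "image-v1.2-foo"

# with_suffix() treats ".2-foo" as the existing suffix and replaces it.
replaced = pathlib.Path(name).with_suffix(".vhd")

# Appending keeps the full image name intact.
appended = pathlib.Path(name + ".vhd")

print(replaced.name)
print(appended.name)
```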
Change-Id: I1bc812ddffacbcc414b8f7f372d9fca78bd87292
The static method from_path is only used from one place and is simply
joining a path; we can inline this and remove it for clarity.
Change-Id: Iade6e024516bf9ce212491d6461e00affb5971a0
This driver supplies "static" nodes that are actually backed by
another nodepool node. The use case is to be able to request a single
large node (a "backing node") from a cloud provider, and then divide
that node up into smaller nodes that are actually used ("requested
nodes"). A backing node can support one or more requested nodes, and
backing nodes should scale up or down as necessary.
Change-Id: I29d78705a87a53ee07dce6022b81a1ce97c54f1d
The assertEquals method has been deprecated since it was renamed
to assertEqual in Python 3.2.
https://docs.python.org/3/library/unittest.html#deprecated-aliases
Change-Id: I306d43862eb6c7a36dad1d3a50822c2758fae5fe
Under Azure, an admin password is required in order to launch a
VM from a Windows image. Add support for that.
Also, shorten the node name to less than 15 characters in order
to accommodate Windows restrictions.
Change-Id: I899f3e02046ffdb5f9fd19fe90c4bc9afdb01a7c
Providers have pools have a provider have pools...
This fix mirrors the approach from the AWS driver.
Change-Id: I33be1ba7c604754139566642ca6a863304a74e73
(cherry picked from commit 559e3098d1)
This is the relatively common feature that allows users to include
cloud_init or similar data. Azure has two versions of it, user-data
and custom-data. For Nodepool's purpose they are similar, but a user
may prefer one or the other.
Also adds missing docs for the existing tags attribute.
Change-Id: Ia2f78a827c1909cc527733013167b2f3b5db18a3