nodepool

Author	SHA1	Message	Date
Jenkins	f43330c6c0	Merge "Fix potential floating ip leakage issue"	2014-08-14 16:24:30 +00:00
Jenkins	aa29c6729f	Merge "Add timestamps to nodepool logging"	2014-08-14 16:20:23 +00:00
Yolanda Robla	beab513bc9	Build images using diskimage-builder Create images locally using diskimage-builder, and then upload to glance. Co-Authored-By: Monty Taylor <mordred@inaugust.com> Change-Id: I8e96e9ea5b74ca640f483c9e1ad04a584b5660ed	2014-08-14 11:31:57 +02:00
Jonathan Harker	b8da6b7bff	Add timestamps to nodepool logging Change-Id: If3a8abcc3c35616dacacf7d9532cd1862b3c32a5	2014-08-13 11:40:04 -07:00
Aaron Rosen	f0068be114	Fix potential floating ip leakage issue Previously, when cleanupServer() was called it would first query nova to get the server and then release the floatingip associated with it (if there was one) and then delete the floatingip. The potential floating ip leak would occur if we called removeFloatingIP and the call to deleteFloatingIP failed. The leak would occur the next time we tried to clean up the server as we no longer find the floatingip to delete as it had been disassociated from the server. To fix this issue we just need to drop the call to removeFloatingIP and call deleteFloatingIP as we don't really need to disassociate the floatingip first. Unfortunately, there is a potential bug in nova-api when using neutron that this call can disassociate the floatingip and then fail to delete it. In this case we'll end up leaking the floatingip in the same way. The fix for this nova issue is here: I53b0c9d949404288e8687b304361a74b53d69ef3 closes-bug: 1356161 Change-Id: I0c78823198fac0d31235d93505a4251edbf9e612	2014-08-12 19:26:09 -07:00
Mike Heald	bc40b65b47	Add stdout and stderr to exception when ready script fails When a command run through ssh fails, nodepool raises an exception saying that it failed, but doesn't include any other information. This patch adds in the output from stderr and stdout if output=True, as well as the exact command run to help debug problems. Also, added output parameter to the fake ssh client for testing. Change-Id: If55643aef91c90b7c27fe4f532d51d9ef72b1ab4	2014-08-12 12:18:30 +01:00
Jenkins	cdb1e87c23	Merge "Record provider AZ info in graphite."	2014-08-05 19:26:14 +00:00
Jenkins	f69d62bb62	Merge "Add support for network labels"	2014-08-01 23:53:40 +00:00
Clark Boylan	35192322c6	Record provider AZ info in graphite. When constructing launchStats() subkeys for graphite send the existing key data but also add a key for provider.az info if we have an az. This rolls up the az data into non az prefixed key but also provides az specific information. This will track launch stats for individual AZs. Change-Id: Ie67238950e9bd927f942f21fadb7f3e894de118d	2014-08-01 16:51:45 -07:00
Jenkins	8e7c5a1620	Merge "Drop voluptuous from requirements"	2014-08-01 23:51:20 +00:00
Jenkins	6f300e739c	Merge "Use correct provider in test-case"	2014-07-29 12:08:27 +00:00
Jenkins	0e0273fc60	Merge "Cleaning up index.rst file"	2014-07-28 13:09:50 +00:00
Ian Wienand	9c26406d95	Use correct provider in test-case A typo here had the second provider using the available value from the first provider. Change-Id: Iba85aeba6beaa80f8a02eff3f3fe6ebd394d36b0	2014-07-28 11:15:39 +10:00
James E. Blair	fb27c84bbf	Revert "Track last allocations to ensure forward-progress" This reverts commit 9f553c9a9752071129f6e8a31535829c5e9a0d91. We have observed a problem with this patch where nodepool may attempt to allocate more than the configured quota. Steps to reproduce: * Nodepool has a very high (>2x quota) request load * Restart nodepool * First pass through allocator appears normal and allocates up to quota * Second pass allocates an additional $quota worth of nodes, all from the last-defined provider * Repeats until request load is satisfied This caused us to request thousands of nodes from our providers at once. Change-Id: I08e5fd2de668cc2fc2d68bb1bf09f2d725f82c7f	2014-07-23 14:57:09 -07:00
Christian Berendt	45163917ed	Cleaning up index.rst file Removed notes about the generation of the file. Change-Id: I5d46f4b52dba3b07c360bcaf872ed2c6b0579555	2014-07-21 08:26:39 +02:00
Jenkins	dc518014a0	Merge "Enable debugging output"	2014-07-14 12:51:15 +00:00
Jenkins	46f8c4537a	Merge "Fix Configuration link in docs"	2014-07-14 12:49:45 +00:00
Antoine Musso	84d2c7a156	Drop voluptuous from requirements voluptuous is a library that can be used to validate a YAML schema. It is unused, and if it is ever used, it should be bumped to 0.7+ anyway. Change-Id: If26d7326714206f3736aaea0e2d6ecc57f995692	2014-07-10 12:02:37 +02:00
K Jonathan Harker	4b6b74a17f	Enable debugging output Nodepool currently writes to the DEBUG logging level with no way to view the logging at that level. This creates a --debug argument to show those logging events. Change-Id: I6d4541a587b5ecd654287fda6da0e138998c200a	2014-07-03 15:29:25 -07:00
Dan Prince	e471cea178	Add support for network labels This patch adds the ability to specify a net-label instead of using a net-id (network UUID). Rather than use network UUIDs in our nodepool.yaml config files it would be nice to use the more meaningful network labels instead. This should make the config file more readable across the various cloud providers and matches how we use image names (instead of image UUIDs) as well. The current implementation relies on the os-tenant-networks extension in Nova to provide the network label lookup. Given that nodepool is currently focused around using novaclient this made the most sense. We may at some point in the future want to use the Neutron API directly for this information or perhaps use a combination of both approaches to accommodate a variety of provider API deployment choices. Tested locally on my TripleO overcloud using two Neutron networks. Change-Id: I9bdd35adf2d85659cf1b992ccd2fcf98fb124528	2014-07-03 15:32:57 -04:00
Jenkins	fc335e41be	Merge "Track last allocations to ensure forward-progress"	2014-07-02 22:41:18 +00:00
Jenkins	01a7270ad9	Merge "Update pbr version"	2014-07-02 07:39:47 +00:00
Ian Wienand	9f553c9a97	Track last allocations to ensure forward-progress Track the last round of allocations to ensure that we don't starve a particular label. If a label made a request but didn't get any nodes the last time round, it is given nodes preferentially on the next calculation. This results in a round-robin allocation during heavy contention. Note that in the much more usual no-contention case, when everyone is getting some of their allocations, this makes no change over the status quo. AllocationHistory() is added as a new object to track the request and grant of allocations. It is instantiated and passed along with an AllocationRequest(). When all requests are finished, grantsDone() is called on the object to store the history for that round. By keeping the AllocationHistory object, one could imagine much fancier algorithms where preference is given proportionally based on how many prior allocations have failed, etc. This is intended to provide infrastructure for such a change, but mostly to be a simple first-pass at the problem with minimal changes to the status quo. Change-Id: I0ff9aa74fef807bd84bf51e7ba1ed176c22f5365 Closes-Bug: #1308407	2014-07-01 19:14:56 +00:00
Elizabeth K. Joseph	9c4f91a9f3	Fix Configuration link in docs The configuration link on the installation page was trying to link to a local #configuration, we instead want it to reference the configuration.html page. Change-Id: Ie1d3bb7300902185e07aef91e58e403ca67981ba	2014-07-01 11:44:20 -07:00
Joshua Hesketh	f41385d146	Make template and node hostnames configurable Instead of assuming nodes are for openstack.org make the hostnames of all nodes and templates configurable on a provider level. Change-Id: I5d5650fe6b22ecb25b994767e48e7742d7238a18	2014-06-30 18:03:46 +10:00
Longgeek	7a2721e50c	Update pbr version Change-Id: I1aa8a8ceb39e88aeb5989b33484237fcb778480f	2014-06-29 00:20:36 +08:00
Jenkins	93618a2f70	Merge "Handle task manager shutdown more correctly"	2014-06-26 19:14:32 +00:00
Jenkins	653955efda	Merge "Show expected output in test-case error"	2014-06-22 16:02:27 +00:00
Jenkins	dc12f678ad	Merge "Add @localhost to openstack_citest user example"	2014-06-22 16:00:43 +00:00
Jenkins	2430fe0b1a	Merge "Pass in hostname as a script parameter"	2014-06-22 15:59:34 +00:00
Ian Wienand	1d5397ee8a	Show expected output in test-case error It can get a bit confusing as to what test is looking for what result. Add a message to the failure case to clear things up. Looks like: --- raise mismatch_error MismatchError: 1 != 2: Error at pos 1, expected [1, 2, 3, 1] and got [1, 1, 1, 1] ====================================================================== --- TrivialFix Change-Id: I40b57394b270f419b032301f490d4ba791c66396	2014-06-19 11:23:27 +10:00
Ian Wienand	b87d94df9d	Add @localhost to openstack_citest user example Without this, mysql can match the anonymous user first and then rejects the openstack_citest password [1] leading to some confusing tox output. [1] http://bugs.mysql.com/bug.php?id=36576 TrivialFix Change-Id: Ic9a753307960634f0e5c40abf06ec5bac92d9897	2014-06-18 08:41:24 +10:00
James E. Blair	8ccc227b2d	Handle task manager shutdown more correctly If a task manager was stopped with tasks in queue, they would not be processed. This could result in deadlocked threads if one of the long-running threads was waiting on a task while the main thread shut down a manager. Change-Id: I3f58c599d472d134984e63b41e9d493be9e9d70b	2014-06-17 08:26:33 -07:00
Jenkins	0dc4b59b7d	Merge "Check for stale PID lock when starting"	2014-06-14 18:23:55 +00:00
Christian Berendt	8bbfde2199	Use import from six.moves to import the queue module The name of the synchronized queue class is queue instead of Queue in Python3. Change-Id: I508268561f95c9fed2d39fb45731aab5d9d74111	2014-06-07 21:07:02 +02:00
K Jonathan Harker	8e254b4994	Pass in hostname as a script parameter Pass the hostname in as the first parameter to both the setup script and the ready script. Change-Id: I0de51156b56ae750dd519da0da68b85ac5d41267	2014-06-06 16:23:19 -07:00
Bob Ball	02a45bf2d9	Check for stale PID lock when starting The PID lock file does not actually test whether the lock is valid during acquisition. This leads to a failure to acquire the lock (essentially just a file-based lock) when starting the process after a failure / kill. Change-Id: Iebf0e077377278eb28b3280a8abfc605ac68a759	2014-06-06 09:44:33 +01:00
Jenkins	174a5d7f6d	Merge "Add warnings about the installation of libzmq1"	2014-06-06 07:30:47 +00:00
Jenkins	13bcfa5664	Merge "Remove libzmq-dev from dependency list"	2014-06-06 07:22:37 +00:00
Xinyu Zhao	bd0a5cce05	Remove libzmq-dev from dependency list The system-installed libzmq-dev package pulls in libzmq1, which doesn't support RCVTIMEO on some Linux distros (e.g. Ubuntu 12.04). pip will install libzmq and compile a supporting version anyway, so remove libzmq-dev from the apt-get dependency installation in README.rst. Change-Id: Ifd67c7ded5db7dbb82624d2aa08843278f2de72c	2014-06-05 19:04:56 +00:00
Clark Boylan	37034efc30	Increase watermark sleep in tests for reliability. The nodepool tests depend on watermark sleep being long enough that all of the fake nodes are marked ready before the next deficit check. If it is not long enough then the tests may boot extra servers causing asserts in tests to fail. Increase watermark sleep to one second from half a second and let each test sleep for three seconds before checking node states. Change-Id: Ia94527b46bad26b184af8fa02b3a1e2a1f7a3430	2014-06-05 10:25:03 -07:00
Bob Ball	8b0ba7bcf2	Add warnings about the installation of libzmq1 With the incorrect compile-time options for libzmq1, pyzmq will give an error such as the following: AttributeError: Socket has no such option: RCVTIMEO Change-Id: I719a8de89b26dba974d7af8d631b7cdd729a074b	2014-06-04 15:50:47 +00:00
James E. Blair	b6539f9cdd	Log task durations Change-Id: I87c6f870ccb806d3484b38ac123ac89201a854cd	2014-06-03 14:31:15 -07:00
James E. Blair	734435b772	Log task manager queue length A long queue isn't necessarily bad, but more information could be helpful. Change-Id: I34d6bb5c1627af1fbc700458d8950add296bc1bc	2014-06-03 14:31:15 -07:00
James E. Blair	bbd0c0ba7d	Prevent listserver tasks from piling up If a bunch of threads waiting for servers all decided that the server list was expired at around the same time, they could all end up submitting server list tasks which defeats the caching. If listing all servers was slow, that would only exacerbate the situation. Instead, wrap the actual list server API call inside of a non-blocking lock to make sure that it only happens once per period. Change-Id: I2a09ab3a226001d9de4268d366f65ef3e69cdd0d	2014-06-03 14:31:15 -07:00
James E. Blair	1a0b20ef2c	Check the returned image status We did not check the status of a completed image build; if it went into the ERROR state, we assumed it worked. Check the return value. Change-Id: I2acf8ac4c5641aa69932230d3414e92620f6e735	2014-06-03 14:31:15 -07:00
Jenkins	1b3b85e76d	Merge "Use except x as y instead of except x, y"	2014-06-03 09:20:45 +00:00
Clark Boylan	7a90ee061d	Display node AZ in `nodepool list` output Add the availability zone data for a node to the `nodepool list` output. This should aid in debugging of AZs. Change-Id: If861e666c5d9eec4f4f1ddf1bb431fb06436b6e6	2014-06-02 14:40:23 -07:00
Clark Boylan	a297ee63ec	Support provider AZ lists Apparently not all clouds will schedule AZs as expected by nova. It may be the case that an AZ is hard set on the provider side when no specific AZ is requested. Add AZ support to nodepool so that it can request AZs and better load balance across AZs provided. Do this by randomly selecting an AZ from the list of AZs provided in the config. This should give us a good distribution across all AZs. If a differently weighted distribution is required a nodepool provider object can be created per AZ with a single item AZ list. Note this requires an update to the nodepool database. Change-Id: I428336ad817a8eb7d311a68767849aab0bcf015f	2014-06-02 12:06:42 -07:00
Christian Berendt	e3dd94d65c	Use except x as y instead of except x, y According to https://docs.python.org/3/howto/pyporting.html the syntax changed in Python 3.x. The new syntax is usable with Python >= 2.6 and should be preferred to be compatible with Python3. Enabled hacking check H231. Change-Id: Ide60f971493440311f1dcc594e33d536beb925e5	2014-05-29 23:57:48 +02:00


2497 Commits