OpenStack Compute (Nova)
Julia Kreger 67be896e0f [ironic] Minimize window for a resource provider to be lost
This patch is based upon a downstream patch which came up in discussion
amongst the ironic community when some operators began discussing a case
where resource providers had disappeared from a running deployment with
several thousand baremetal nodes.

Discussion amongst operators and developers ensued and we were able
to determine that this was still an issue in the current upstream code
and that the time difference between collecting data and then reconciling
the records was a source of the issue. Per Arun, they have been running
this change downstream and have not seen any recurrences of the issue
since the patch was applied.

This patch was originally authored by Arun S A G, and below is his
original commit message.

An instance could be launched and scheduled to a compute node between
the get_uuids_by_host() call and the _get_node_list() call. If that
happens, the ironic node.instance_uuid may not be None but the
instance_uuid will be missing from the instance list returned by the
get_uuids_by_host() method. This is possible because _get_node_list()
takes several minutes to return in large baremetal clusters and a lot
can happen in that time.

This causes the compute node to be orphaned and the associated resource
provider to be deleted from placement. Once the resource provider is
deleted, it is never created again until the service restarts. Since
the resource provider is deleted, subsequent boots/rebuilds to the same
host will fail.

This behaviour is visible in VMbooter nodes because it constantly
launches and deletes instances, thereby increasing the likelihood
of this race condition happening in large ironic clusters.

To reduce the chance of this race condition, we call _get_node_list()
first, followed by the get_uuids_by_host() method.
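
The effect of the call order described above can be illustrated with a
minimal, self-contained sketch. It is not the Nova ironic driver code:
FakeDeployment, keep_nodes, refresh_old_order and refresh_new_order are
hypothetical names, and the filtering rule is a simplified assumption of
how a node whose instance is unknown to this host ends up dropped. Only
get_uuids_by_host() and _get_node_list() are names taken from the commit
message.

```python
class FakeDeployment:
    """Hypothetical in-memory stand-in for the instance list and ironic."""

    def __init__(self):
        # ironic node uuid -> instance_uuid (None means unclaimed)
        self.node_instance = {"node-1": None}
        # instance uuids recorded for this compute host
        self.host_instances = set()

    def boot(self, node_uuid, instance_uuid):
        # Scheduling records the instance on the host and claims the node.
        self.host_instances.add(instance_uuid)
        self.node_instance[node_uuid] = instance_uuid

    def get_uuids_by_host(self):
        # Fast call: snapshot of the instances known on this host.
        return set(self.host_instances)

    def get_node_list(self, concurrent_event=None):
        # Stand-in for _get_node_list(). The real call may take minutes;
        # concurrent_event simulates something landing while it runs.
        if concurrent_event is not None:
            concurrent_event()
        return [
            {"uuid": uuid, "instance_uuid": inst}
            for uuid, inst in self.node_instance.items()
        ]


def keep_nodes(nodes, local_instances):
    # Simplified assumption: keep a node if it is unclaimed or if its
    # instance is known to live on this host; otherwise drop it.
    return {
        n["uuid"]
        for n in nodes
        if n["instance_uuid"] is None or n["instance_uuid"] in local_instances
    }


def refresh_old_order(dep, concurrent_event=None):
    # Old order: the instance snapshot is taken before the slow node listing.
    instances = dep.get_uuids_by_host()
    nodes = dep.get_node_list(concurrent_event)
    return keep_nodes(nodes, instances)


def refresh_new_order(dep, concurrent_event=None):
    # New order: slow node listing first, then a fresh instance snapshot.
    nodes = dep.get_node_list(concurrent_event)
    instances = dep.get_uuids_by_host()
    return keep_nodes(nodes, instances)
```

In this sketch, a boot that lands while the node listing is in flight
makes refresh_old_order() drop node-1, because the instance snapshot is
already stale, while refresh_new_order() keeps it. The reordering narrows
the window rather than closing it.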

Change-Id: I55bde8dd33154e17bbdb3c4b0e7a83a20e8487e8
Co-Authored-By: Arun S A G <saga@yahoo-inc.com>
Related-Bug: #1841481
(cherry picked from commit f84d5917c6)
(cherry picked from commit 0c36bd28eb)
2022-08-17 13:30:20 -07:00
api-guide/source Fallback to same-cell resize with qos ports 2021-03-02 14:39:09 +01:00
api-ref/source compute: Validate a BDMs disk_bus when provided 2020-08-03 21:31:25 +01:00
devstack [stable-only] fix lower-constraints and disable qos resize 2020-12-15 20:56:14 +01:00
doc Ignore PCI devices with 32bit domain 2021-06-17 10:51:21 +01:00
etc/nova Allow versioned discovery unauthenticated 2020-04-03 21:24:28 +00:00
gate [CI] Fix gate by using zuulv3 live migration and grenade jobs 2021-06-25 11:16:33 +02:00
nova [ironic] Minimize window for a resource provider to be lost 2022-08-17 13:30:20 -07:00
playbooks [CI] Fix gate by using zuulv3 live migration and grenade jobs 2021-06-25 11:16:33 +02:00
releasenotes [ironic] Minimize window for a resource provider to be lost 2022-08-17 13:30:20 -07:00
roles [CI] Fix gate by using zuulv3 live migration and grenade jobs 2021-06-25 11:16:33 +02:00
tools Move 'check-cherry-picks' test to gate, n-v check 2021-06-18 11:25:39 +01:00
.coveragerc Remove nova/openstack/* from .coveragerc 2016-10-12 16:20:49 -04:00
.gitignore Delete the placement code 2019-04-28 20:06:15 +00:00
.gitreview Update .gitreview for stable/ussuri 2020-04-23 21:37:31 +00:00
.mailmap Add mailmap entry 2014-05-07 12:14:26 -07:00
.pre-commit-config.yaml Switch to hacking 2.x 2020-01-17 11:30:40 +00:00
.stestr.conf Finish stestr migration 2017-11-24 16:51:12 -05:00
.zuul.yaml [stable-only] Make sdk broken job non voting until it is fixed 2022-06-01 16:35:36 +00:00
CONTRIBUTING.rst [Community goal] Update contributor documentation 2020-03-25 12:01:37 +00:00
HACKING.rst Merge "Make it easier to run a selection of tests relevant to ongoing work" 2019-11-22 20:58:18 +00:00
LICENSE initial commit 2010-05-27 23:05:26 -07:00
MAINTAINERS Fix broken URLs 2017-09-07 15:42:31 +02:00
README.rst Start README.rst with a better title 2019-11-19 17:29:28 +01:00
babel.cfg Get rid of distutils.extra. 2012-02-08 19:30:39 -08:00
bindep.txt Added openssh-client into bindep 2019-10-23 07:21:23 +00:00
requirements.txt [stable-only] fix lower-constraints and disable qos resize 2020-12-15 20:56:14 +01:00
setup.cfg api: Add support for new cyborg extra specs 2020-04-08 13:19:39 +00:00
setup.py Updated from global requirements 2017-03-02 11:50:48 +00:00
test-requirements.txt [stable-only] fix lower-constraints and disable qos resize 2020-12-15 20:56:14 +01:00
tox.ini Merge "[CI] Install dependencies for docs target" into stable/ussuri 2022-05-31 13:45:34 +00:00

README.rst

OpenStack Nova

OpenStack Nova provides a cloud computing fabric controller, supporting a wide variety of compute technologies, including: libvirt (KVM, Xen, LXC and more), Hyper-V, VMware, XenServer, OpenStack Ironic and PowerVM.

Use the following resources to learn more.

API

To learn how to use Nova's API, consult the documentation available online at:

For more information on OpenStack APIs, SDKs and CLIs in general, refer to:

Operators

To learn how to deploy and configure OpenStack Nova, consult the documentation available online at:

If you discover a bug, report it to the appropriate bug tracker. If you obtained the software from a third-party operating system vendor, it is often wise to use their own bug tracker for reporting problems. In all other cases use the master OpenStack bug tracker, available at:

Developers

For information on how to contribute to Nova, please see the contents of CONTRIBUTING.rst.

Any new code must follow the development guidelines detailed in the HACKING.rst file, and pass all unit tests.

Further developer-focused documentation is available at:

Other Information

During each Summit and Project Team Gathering, we agree on what the whole community wants to focus on for the upcoming release. The plans for nova can be found at: