Matt Riedemann 11cb42f396 Restore RT.old_resources if ComputeNode.save() fails

When starting nova-compute for the first time with a new node,
the ResourceTracker will create a new ComputeNode record in
_init_compute_node but without all of the fields set on the
ComputeNode, for example "free_disk_gb".

Later _update_usage_from_instances will set some fields on the
ComputeNode record (even if there are no instances on the node,
why - I don't know) like free_disk_gb.

This will make the eventual call from _update() to _resource_change()
update the value in the old_resources dict and return True, and then
_update() will try to update those ComputeNode changes to the database.
If that update fails, for example due to a DBConnectionError, the
value in old_resources will still be for the current version of the node
in memory but not what is actually in the database.

Note that this failure does not result in the compute service failing
to start because ComputeManager._update_available_resource_for_node
traps the Exception and just logs it.

A subsequent trip through the RT._update() method - because of the
update_available_resource periodic task - will call _resource_change
but because old_resources matches the current state of the node, it
returns False and the RT does not attempt to persist the changes to
the DB. _update() will then go on to call _update_to_placement
which will create the resource provider in placement along with its
inventory, making it potentially a candidate for scheduling.

This can be a problem later in the scheduler because the
HostState._update_from_compute_node method may skip setting fields
on the HostState object if free_disk_gb is not set in the
ComputeNode record - which can then break filters and weighers
later in the scheduling process (see bug 1834691 and bug 1834694).

The fix proposed here is simple: if the ComputeNode.save() in
RT._update() fails, restore the previous value in old_resources
so that the subsequent run through _resource_change will compare the
correct state of the object and retry the update.
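
The failure mode and the fix described above can be sketched in a few lines. This is a simplified illustration, not the actual nova code: FakeComputeNode, SketchTracker, run_scenario and the restore_on_failure flag are hypothetical names, old_resources is reduced to a plain dict of field values, and a builtin ConnectionError stands in for DBConnectionError.

```python
import copy


class FakeComputeNode:
    """Hypothetical stand-in for a ComputeNode record (not nova's class)."""

    def __init__(self, db):
        self.db = db                  # dict playing the role of the DB row
        self.fields = {}              # in-memory field values
        self.fail_next_save = False   # simulate one transient DB error

    def save(self):
        if self.fail_next_save:
            self.fail_next_save = False
            raise ConnectionError("simulated DB connection error")
        self.db.update(self.fields)


class SketchTracker:
    """Hypothetical, heavily reduced model of the tracker's update path."""

    def __init__(self, restore_on_failure):
        self.old_resources = {}
        self.restore_on_failure = restore_on_failure

    def _resource_change(self, node):
        # Records the new state as a side effect and reports whether
        # it differed from the previously recorded state.
        if self.old_resources != node.fields:
            self.old_resources = copy.deepcopy(node.fields)
            return True
        return False

    def _update(self, node):
        previous = copy.deepcopy(self.old_resources)
        if self._resource_change(node):
            try:
                node.save()
            except Exception:
                if self.restore_on_failure:
                    # The fix: forget the state that never reached the DB
                    # so the next pass sees a change and retries the save.
                    self.old_resources = previous
                raise


def run_scenario(restore_on_failure):
    """Run two update passes; the first save fails. Return the 'DB' row."""
    db = {}
    node = FakeComputeNode(db)
    node.fields["free_disk_gb"] = 100
    node.fail_next_save = True
    tracker = SketchTracker(restore_on_failure)
    for _ in range(2):
        try:
            tracker._update(node)
        except ConnectionError:
            pass  # the caller traps the error and just logs it
    return db
```

Under these assumptions, run_scenario(False) leaves the simulated database row empty after both passes, while run_scenario(True) persists free_disk_gb on the second pass.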

An alternative to this would be killing the compute service on startup
if there is a DB error but that could have unintended side effects,
especially if the DB error is transient and can be fixed on the next
try.

Obviously the scheduler code needs to be more robust also, but those
improvements are left for separate changes related to the other bugs
mentioned above.

Also, ComputeNode.update_from_virt_driver could be updated to set
free_disk_gb if possible to work around the tight coupling in the
HostState._update_from_compute_node code, but that's also sort of
a whack-a-mole type change best made separately.

Change-Id: Id3c847be32d8a1037722d08bf52e4b88dc5adc97
Closes-Bug: #1834712

2019-07-17 10:29:10 +01:00

api-guide/source

Merge "docs: Rework all things metadata'y"

2019-06-23 09:13:08 +00:00

api-ref/source

Merge "Remove needs:* todo from deprecated APIs api-ref"

2019-07-13 00:34:10 +00:00

devstack

Merge "Add nova-multi-cell job"

2019-04-30 21:18:42 +00:00

doc

Merge "Defaults missing group_policy to 'none'"

2019-07-16 17:37:05 +00:00

etc/nova

Merge "Summarize output of sample configuration generator"

2019-06-16 07:30:03 +00:00

gate

Add integration testing for heal_allocations

2019-06-29 11:03:55 +00:00

nova

Restore RT.old_resources if ComputeNode.save() fails

2019-07-17 10:29:10 +01:00

playbooks/legacy

nova-lvm: Disable [validation]/run_validation in tempest.conf

2019-07-09 19:16:16 +00:00

releasenotes

Merge "Defaults missing group_policy to 'none'"

2019-07-16 17:37:05 +00:00

tools

Replace git.openstack.org URLs with opendev.org URLs

2019-04-24 13:59:57 +08:00

.coveragerc

Remove nova/openstack/* from .coveragerc

2016-10-12 16:20:49 -04:00

.gitignore

Delete the placement code

2019-04-28 20:06:15 +00:00

.gitreview

OpenDev Migration Patch

2019-04-19 19:45:52 +00:00

.mailmap

Add mailmap entry

2014-05-07 12:14:26 -07:00

.stestr.conf

Finish stestr migration

2017-11-24 16:51:12 -05:00

.zuul.yaml

Add neutron-tempest-iptables_hybrid job to experimental queue

2019-07-02 10:38:31 +03:00

babel.cfg

Get rid of distutils.extra.

2012-02-08 19:30:39 -08:00

bindep.txt

Merge "Bindep does not catch missing libpcre3-dev on Ubuntu"

2018-02-14 07:31:09 +00:00

CONTRIBUTING.rst

Update links in documents

2018-01-12 17:05:11 +08:00

HACKING.rst

Hacking N362: Don't abbrev/alias privsep import

2019-04-04 20:42:43 +00:00

LICENSE

initial commit

2010-05-27 23:05:26 -07:00

lower-constraints.txt

Use Adapter global_request_id kwarg

2019-07-15 14:30:35 -05:00

MAINTAINERS

Fix broken URLs

2017-09-07 15:42:31 +02:00

README.rst

Docs: modernise links

2018-03-24 20:27:11 +08:00

requirements.txt

Use Adapter global_request_id kwarg

2019-07-15 14:30:35 -05:00

setup.cfg

Update Python 3 test runtimes for Train

2019-05-09 17:35:43 +08:00

setup.py

Updated from global requirements

2017-03-02 11:50:48 +00:00

test-requirements.txt

Merge "Exclude broken ironicclient versions 2.7.1"

2019-06-03 19:44:11 +00:00

tox.ini

Add Python 3 Train unit tests

2019-07-05 13:58:37 -04:00

README.rst

Team and repository tags

OpenStack Nova

OpenStack Nova provides a cloud computing fabric controller, supporting a wide variety of compute technologies, including: libvirt (KVM, Xen, LXC and more), Hyper-V, VMware, XenServer, OpenStack Ironic and PowerVM.

Use the following resources to learn more.

API

To learn how to use Nova's API, consult the documentation available online at:

For more information on OpenStack APIs, SDKs and CLIs in general, refer to:

Operators

To learn how to deploy and configure OpenStack Nova, consult the documentation available online at:

OpenStack Nova

In the unfortunate event that bugs are discovered, they should be reported to the appropriate bug tracker. If you obtained the software from a third-party operating system vendor, it is often wise to use their own bug tracker for reporting problems. In all other cases, use the master OpenStack bug tracker, available at:

Bug Tracker

Developers

For information on how to contribute to Nova, see CONTRIBUTING.rst.

Any new code must follow the development guidelines detailed in the HACKING.rst file, and pass all unit tests.

Further developer-focused documentation is available at:

Other Information

During each Summit and Project Team Gathering, we agree on what the whole community wants to focus on for the upcoming release. The plans for nova can be found at:

Nova Specs