nova/nova
Stephen Finucane ce95af2caf Don't unset Instance.old_flavor, new_flavor until necessary
Since change Ia6d8a7909081b0b856bd7e290e234af7e42a2b38, the resource
tracker's 'drop_move_claim' method has been capable of freeing up
resource usage. However, this relies on accurate resource reporting.
It transpires that there's a race whereby the resource tracker's
'update_available_resource' periodic task can end up not accounting for
usage from migrations that are in the process of being completed. The
root cause is the resource tracker's reliance on the stashed flavor in a
given migration record [1]. Previously, this information was deleted by
the compute manager at the start of the confirm migration operation [2].
The compute manager would then call the virt driver [3], which could
take a significant amount of time to return, before finally
dropping the move claim. If the periodic task ran between the clearing
of the stashed flavor and the return of the virt driver, it would find a
migration record with no stashed flavor and would therefore ignore this
record for accounting purposes [4], resulting in an incorrect record for
the compute node, and an exception when 'drop_move_claim' attempts
to free up resources that aren't being tracked.

The solution to this issue is pretty simple. Instead of unsetting the
old flavor record from the migration at the start of the various move
operations, do it afterwards.
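
The ordering problem described above can be modelled with a small,
deterministic sketch. This is not nova's code: 'Instance', 'Tracker',
'confirm' and 'run' are hypothetical stand-ins, the periodic task is
simulated by invoking it in place of the slow virt driver call, and a
KeyError stands in for whatever exception the real resource tracker
raises.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Instance:
    # Hypothetical stand-in for the flavor stashed on a migrating instance
    # (reduced here to a vCPU count).
    old_flavor: Optional[int] = 4


@dataclass
class Tracker:
    # Maps a migration id to the usage currently being tracked for it.
    tracked: dict = field(default_factory=dict)

    def update_available_resource(self, migrations):
        # Periodic task: rebuild usage from scratch, ignoring any
        # migration whose instance has no stashed flavor.
        self.tracked = {
            mig_id: inst.old_flavor
            for mig_id, inst in migrations.items()
            if inst.old_flavor is not None
        }

    def drop_move_claim(self, mig_id):
        # Raises KeyError if the usage is no longer being tracked.
        return self.tracked.pop(mig_id)


def confirm(instance, tracker, mig_id, driver_call, clear_first):
    if clear_first:
        instance.old_flavor = None  # old ordering: clear before the driver call
    driver_call()  # slow virt driver call; the periodic task may run here
    freed = tracker.drop_move_claim(mig_id)
    if not clear_first:
        instance.old_flavor = None  # new ordering: clear once the claim is dropped
    return freed


def run(clear_first):
    inst = Instance()
    migrations = {"mig-1": inst}
    tracker = Tracker()
    tracker.update_available_resource(migrations)

    def periodic():
        # Simulate the periodic task firing during the driver call.
        tracker.update_available_resource(migrations)

    try:
        return confirm(inst, tracker, "mig-1", periodic, clear_first)
    except KeyError:
        return "untracked"
```

In this model, 'run(True)' reproduces the failure because the periodic
task drops the migration from its accounting before the claim is
released, while 'run(False)' releases the claim normally.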

[1] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1288
[2] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4310-L4315
[3] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4330-L4331
[4] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1300

Change-Id: I4760b01b695c94fa371b72216d398388cf981d28
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Partial-Bug: #1879878
Related-Bug: #1834349
Related-Bug: #1818914
(cherry picked from commit 44376d2e21)
2020-09-11 17:23:07 +01:00
accelerator Delete ARQs for an instance when the instance is deleted. 2020-03-24 22:44:18 -07:00
api Merge "Add checks for volume status when rebuilding" into stable/ussuri 2020-08-28 01:10:02 +00:00
cmd Add nova-status upgrade check and reno for policy new defaults 2020-05-04 18:33:37 +00:00
compute Don't unset Instance.old_flavor, new_flavor until necessary 2020-09-11 17:23:07 +01:00
conductor Support live migration with vpmem 2020-04-07 13:13:13 +00:00
conf Reserve DISK_GB resource for the image cache 2020-05-20 07:25:32 +00:00
console Merge "Allow TLS ciphers/protocols to be configurable for console proxies" 2020-02-24 17:27:02 +00:00
db Merge "remove DISTINCT ON SQL instruction that does nothing on MySQL" 2020-03-25 23:18:58 +00:00
hacking Switch to hacking 2.x 2020-01-17 11:30:40 +00:00
image Remove 'nova.image.api' module 2020-02-18 11:45:39 +00:00
keymgr
locale Imported Translations from Zanata 2020-04-28 08:35:34 +00:00
network Merge "nova-net: Remove unused parameters" 2020-03-26 21:46:16 +00:00
notifications Remove 'nova.image.api' module 2020-02-18 11:45:39 +00:00
objects compute: Validate a BDMs disk_bus when provided 2020-08-03 21:31:25 +01:00
pci support pci numa affinity policies in flavor and image 2019-12-11 14:39:12 +00:00
policies Merge "Add new default roles in quota class policies" 2020-04-21 08:39:41 +00:00
privsep images: Make JSON the default output format of calls to qemu-img info 2020-04-16 16:38:24 +01:00
scheduler Enable and use COMPUTE_ACCELERATORS trait. 2020-03-27 22:42:37 -07:00
servicegroup Handle ServiceNotFound in DbDriver._report_state 2019-12-04 09:50:17 -05:00
tests Don't unset Instance.old_flavor, new_flavor until necessary 2020-09-11 17:23:07 +01:00
virt Merge "Set different VirtualDevice.key" into stable/ussuri 2020-09-07 20:35:45 +00:00
volume Merge "Add retry to cinder API calls related to volume detach" 2020-04-20 17:36:33 +00:00
__init__.py
availability_zones.py trivial: Fetch 'Service' objects once when building AZs 2020-02-05 21:26:23 +00:00
baserpc.py
block_device.py hacking: Resolve W605 (invalid escape sequence) 2019-06-24 14:24:06 -05:00
cache_utils.py trivial: Remove unused 'cache_utils' APIs 2020-02-05 17:20:28 +00:00
config.py remove support of oslo.messaging 9.8.0 warning message 2020-05-22 12:51:46 +00:00
context.py Reset the cell cache for database access in Service 2020-04-08 17:48:18 +00:00
crypto.py
debugger.py
exception.py Removed the host FQDN from the exception message 2020-09-03 00:23:19 +00:00
exception_wrapper.py
filters.py
hooks.py
i18n.py
loadables.py trivial: Remove dead code 2019-12-12 10:55:02 +00:00
manager.py
middleware.py Rename 'nova.common.config' module to 'nova.middleware' 2019-08-16 00:53:03 +01:00
monkey_patch.py Monkey patch original current_thread _active 2020-02-12 16:34:56 -05:00
policy.py Use oslo policy flag to disable default change warning instead of all 2020-04-15 02:23:32 +00:00
profiler.py
quota.py Make quotas respect instance_list_per_project_cells 2020-05-19 02:20:28 +00:00
rpc.py
safe_utils.py
service.py Reset the cell cache for database access in Service 2020-04-08 17:48:18 +00:00
service_auth.py
test.py func tests: move _run_periodics() into base class 2020-03-24 10:10:53 -04:00
utils.py compute: Extract _get_bdm_image_metadata into nova.utils 2020-04-09 08:39:36 +01:00
version.py
weights.py
wsgi.py