nova/nova
Stephen Finucane 44376d2e21 Don't unset Instance.old_flavor, new_flavor until necessary
Since change Ia6d8a7909081b0b856bd7e290e234af7e42a2b38, the resource
tracker's 'drop_move_claim' method has been capable of freeing up
resource usage. However, this relies on accurate resource reporting.
It transpires that there's a race whereby the resource tracker's
'update_available_resource' periodic task can end up not accounting for
usage from migrations that are in the process of being completed. The
root cause is the resource tracker's reliance on the stashed flavor in a
given migration record [1]. Previously, this information was deleted by
the compute manager at the start of the confirm migration operation [2].
The compute manager would then call the virt driver [3], which could
take a significant amount of time to return, before finally
dropping the move claim. If the periodic task ran between the clearing
of the stashed flavor and the return of the virt driver, it would find a
migration record with no stashed flavor and would therefore ignore this
record for accounting purposes [4], resulting in an incorrect record for
the compute node, and an exception when 'drop_move_claim' attempts
to free up resources that aren't being tracked.

The solution to this issue is simple. Instead of unsetting the
old flavor record from the migration at the start of the various move
operations, unset it once they have completed.
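
The ordering problem described above can be illustrated with a small,
self-contained simulation. This is only a sketch under stated assumptions:
the Instance, Tracker, confirm_resize and simulate names below are
hypothetical stand-ins, not nova's real classes or signatures, and the
periodic task is modelled as a callback that fires during the virt driver
call rather than as a real concurrent thread.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Instance:
    # Stashed flavor used for accounting, e.g. {"vcpus": 2}.
    old_flavor: Optional[dict]


class Tracker:
    """Toy stand-in for the resource tracker (not nova's real class)."""

    def __init__(self) -> None:
        self.tracked_vcpus = 0

    def update_available_resource(self, instance: Instance) -> None:
        # Periodic task: rebuild usage from scratch. A migration whose
        # stashed flavor is missing is ignored for accounting purposes.
        self.tracked_vcpus = 0
        if instance.old_flavor is not None:
            self.tracked_vcpus += instance.old_flavor["vcpus"]

    def drop_move_claim(self, flavor: dict) -> None:
        # Freeing usage that is not tracked is treated as an error.
        if self.tracked_vcpus < flavor["vcpus"]:
            raise RuntimeError("freeing resources that are not tracked")
        self.tracked_vcpus -= flavor["vcpus"]


def confirm_resize(instance: Instance, tracker: Tracker,
                   driver_call: Callable[[], None],
                   clear_first: bool) -> None:
    flavor = instance.old_flavor
    if clear_first:
        instance.old_flavor = None  # old ordering: unset before the driver
    driver_call()                   # slow step; the periodic task may run
    tracker.drop_move_claim(flavor)
    if not clear_first:
        instance.old_flavor = None  # new ordering: unset afterwards


def simulate(clear_first: bool) -> str:
    instance = Instance(old_flavor={"vcpus": 2})
    tracker = Tracker()
    tracker.update_available_resource(instance)  # usage is tracked
    try:
        # Model the race: the periodic task fires during the driver call.
        confirm_resize(
            instance, tracker,
            lambda: tracker.update_available_resource(instance),
            clear_first)
    except RuntimeError:
        return "error"
    return "ok"
```

In this toy model, simulate(True) reproduces the failure (the periodic
task sees no stashed flavor, so the later drop of the move claim fails),
while simulate(False) completes cleanly because the flavor is still
present when the periodic task runs.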

[1] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1288
[2] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4310-L4315
[3] https://github.com/openstack/nova/blob/6557d67/nova/compute/manager.py#L4330-L4331
[4] https://github.com/openstack/nova/blob/6557d67/nova/compute/resource_tracker.py#L1300

Change-Id: I4760b01b695c94fa371b72216d398388cf981d28
Signed-off-by: Stephen Finucane <stephenfin@redhat.com>
Partial-Bug: #1879878
Related-Bug: #1834349
Related-Bug: #1818914
2020-09-01 16:19:27 +01:00
accelerator Delete ARQs by UUID if Cyborg ARQ bind fails. 2020-07-23 15:26:07 +08:00
api Ensure source compute is up when confirming a resize 2020-08-26 14:50:07 +01:00
cmd Remove six.PY2 and six.PY3 2020-08-15 07:45:23 +00:00
compute Don't unset Instance.old_flavor, new_flavor until necessary 2020-09-01 16:19:27 +01:00
conductor Remove six.add_metaclass 2020-08-15 07:45:39 +00:00
conf Merge "Change default num_retries for glance to 3" 2020-08-28 11:12:23 +00:00
console Remove six.add_metaclass 2020-08-15 07:45:39 +00:00
db Merge "db: fix database migrations when name includes dash" 2020-08-25 23:29:49 +00:00
hacking Remove hacking rules for python 2/3 compatibility 2020-06-17 08:19:13 +00:00
image Remove six.reraise 2020-08-15 07:45:49 +00:00
keymgr
locale Imported Translations from Zanata 2020-04-26 07:51:21 +00:00
network Remove six.reraise 2020-08-15 07:45:49 +00:00
notifications scheduler: Request vTPM trait based on flavor or image 2020-07-16 17:54:44 +01:00
objects compute: Validate a BDMs disk_bus when provided 2020-07-29 16:05:48 +00:00
pci Remove six.add_metaclass 2020-08-15 07:45:39 +00:00
policies Few todo fixes for API new policies 2020-08-22 09:35:29 -05:00
privsep trivial: Remove log translations 2020-05-27 09:40:47 +00:00
scheduler Merge "Remove deprecated scheduler filters" 2020-08-26 12:14:55 +00:00
servicegroup trivial: Remove remaining '_LI' instances 2020-05-18 17:00:57 +01:00
storage rbd: Move rbd_utils out of libvirt driver under nova.storage 2020-08-26 13:12:05 +01:00
tests Don't unset Instance.old_flavor, new_flavor until necessary 2020-09-01 16:19:27 +01:00
virt Merge "QEMU/KVM: accept vmxnet3 NIC" 2020-08-30 12:55:42 +00:00
volume Remove six.reraise 2020-08-15 07:45:49 +00:00
__init__.py Eventlet monkey patching should be as early as possible 2019-03-22 09:27:16 +00:00
availability_zones.py Remove six.PY2 and six.PY3 2020-08-15 07:45:23 +00:00
baserpc.py
block_device.py utils: Move 'get_bdm_image_metadata' to nova.block_device 2020-07-08 11:56:01 +01:00
cache_utils.py trivial: Remove unused 'cache_utils' APIs 2020-02-05 17:20:28 +00:00
config.py remove support of oslo.messaging 9.8.0 warning message 2020-05-07 19:54:23 +01:00
context.py Reset the cell cache for database access in Service 2020-04-08 17:48:18 +00:00
crypto.py crypto: Add support for creating, destroying vTPM secrets 2020-07-16 17:58:36 +01:00
debugger.py trivial: Remove remaining '_LW' instances 2020-05-18 17:00:41 +01:00
exception.py api: Reject non-spawn operations for vTPM 2020-08-24 19:37:01 +01:00
exception_wrapper.py Use 'Exception.__traceback__' for versioned notifications 2020-06-08 14:38:33 +01:00
filters.py trivial: Remove remaining '_LI' instances 2020-05-18 17:00:57 +01:00
i18n.py trivial: Remove remaining '_LI' instances 2020-05-18 17:00:57 +01:00
loadables.py trivial: Remove dead code 2019-12-12 10:55:02 +00:00
manager.py Remove six.add_metaclass 2020-08-15 07:45:39 +00:00
middleware.py Rename 'nova.common.config' module to 'nova.middleware' 2019-08-16 00:53:03 +01:00
monkey_patch.py Remove eventlet hub workaround for monotonic clock 2020-05-22 16:46:37 +01:00
policy.py trivial: Remove remaining '_LW' instances 2020-05-18 17:00:41 +01:00
profiler.py
quota.py Make quotas respect instance_list_per_project_cells 2020-05-15 17:21:29 -04:00
rpc.py Remove unnecessary wrapper 2019-05-29 17:14:13 +01:00
safe_utils.py
service.py Merge "Remove monotonic usage" 2020-08-17 15:29:46 +00:00
service_auth.py
test.py Remove six.reraise 2020-08-15 07:45:49 +00:00
utils.py api: Reject non-spawn operations for vTPM 2020-08-24 19:37:01 +01:00
version.py trivial: Remove remaining '_LE' instances 2020-05-18 16:52:20 +01:00
weights.py Remove six.add_metaclass 2020-08-15 07:45:39 +00:00
wsgi.py trivial: Remove remaining '_LI' instances 2020-05-18 17:00:57 +01:00