[stable-only] Delete allocations even if _confirm_resize raises (part 2)

The backport https://review.opendev.org/#/c/652153/ to fix
bug 1821594 did not account for how the _delete_allocation_after_move
method before Stein is tightly coupled to the migration status
being set to "confirmed" which is what the _confirm_resize method
does after self.driver.confirm_migration returns.

However, if self.driver.confirm_migration raises an exception
we still want to cleanup the allocations held on the source node
and for that we call _delete_allocation_after_move. But because
of that tight coupling before Stein, we need to temporarily
mutate the migration status to "confirmed" to get the cleanup
method to do what we want.

This isn't a problem starting in Stein because change
I0851e2d54a1fdc82fe3291fb7e286e790f121e92 removed that
tight coupling on the migration status, so this is a stable
branch only change.

Note that we don't call self.reportclient.delete_allocation_for_instance
directly since before Stein we still need to account for a
migration that does not move the source node allocations to the
migration record, and that logic is in _delete_allocation_after_move.

A simple unit test assertion is added here but the functional
test added in change I9d6478f492351b58aa87b8f56e907ee633d6d1c6
will assert the bug is fixed properly before Stein.

Change-Id: I933687891abef4878de09481937d576ce5899511
Closes-Bug: #1821594
(cherry picked from commit dac3239e92)
This commit is contained in:
Matt Riedemann 2019-05-15 12:27:17 -04:00 committed by Tony Breeds
parent b8435d6001
commit 5600309a1f
2 changed files with 18 additions and 4 deletions

View File

@ -3768,9 +3768,16 @@ class ComputeManager(manager.Manager):
# Whether an error occurred or not, at this point the # Whether an error occurred or not, at this point the
# instance is on the dest host so to avoid leaking # instance is on the dest host so to avoid leaking
# allocations in placement, delete them here. # allocations in placement, delete them here.
self._delete_allocation_after_move( # NOTE(mriedem): _delete_allocation_after_move is tightly
context, instance, migration, old_instance_type, # coupled to the migration status on the confirm step so
migration.source_node) # we unfortunately have to mutate the migration status to
# have _delete_allocation_after_move cleanup the allocation
# held by the migration consumer.
with utils.temporary_mutation(
migration, status='confirmed'):
self._delete_allocation_after_move(
context, instance, migration, old_instance_type,
migration.source_node)
do_confirm_resize(context, instance, migration.id) do_confirm_resize(context, instance, migration.id)

View File

@ -6689,13 +6689,20 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase):
migration_get_by_id.return_value = self.migration migration_get_by_id.return_value = self.migration
instance_get_by_uuid.return_value = self.instance instance_get_by_uuid.return_value = self.instance
def fake_delete_allocation_after_move(_context, instance, migration,
flavor, nodename):
# The migration.status must be 'confirmed' for the method to
# properly cleanup the allocation held by the migration.
self.assertEqual('confirmed', migration.status)
error = exception.HypervisorUnavailable( error = exception.HypervisorUnavailable(
host=self.migration.source_compute) host=self.migration.source_compute)
with test.nested( with test.nested(
mock.patch.object(self.compute, 'network_api'), mock.patch.object(self.compute, 'network_api'),
mock.patch.object(self.compute.driver, 'confirm_migration', mock.patch.object(self.compute.driver, 'confirm_migration',
side_effect=error), side_effect=error),
mock.patch.object(self.compute, '_delete_allocation_after_move') mock.patch.object(self.compute, '_delete_allocation_after_move',
side_effect=fake_delete_allocation_after_move)
) as ( ) as (
network_api, confirm_migration, delete_allocation network_api, confirm_migration, delete_allocation
): ):