
[stable-only] Delete allocations even if _confirm_resize raises (part 2)

The backport https://review.opendev.org/#/c/652153/ to fix
bug 1821594 did not account for the fact that, before Stein, the
_delete_allocation_after_move method is tightly coupled to the
migration status being set to "confirmed", which is what the
_confirm_resize method does after self.driver.confirm_migration returns.

However, if self.driver.confirm_migration raises an exception,
we still want to clean up the allocations held on the source node,
and for that we call _delete_allocation_after_move. But because
of that tight coupling before Stein, we need to temporarily
mutate the migration status to "confirmed" to get the cleanup
method to do what we want.
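
The temporary mutation described above can be illustrated with a minimal,
self-contained sketch. It is not nova's code: temporary_mutation_sketch,
FakeMigration, delete_allocation_after_move_sketch and the 'confirming'
starting status are hypothetical stand-ins, and the real
utils.temporary_mutation helper may differ in its details.

```python
import contextlib


@contextlib.contextmanager
def temporary_mutation_sketch(obj, **overrides):
    """Hypothetical stand-in: set attributes on obj, restore them on exit."""
    missing = object()
    saved = {name: getattr(obj, name, missing) for name in overrides}
    for name, value in overrides.items():
        setattr(obj, name, value)
    try:
        yield obj
    finally:
        # Restore the original values even if the body raised.
        for name, old in saved.items():
            if old is missing:
                delattr(obj, name)
            else:
                setattr(obj, name, old)


class FakeMigration:
    """Hypothetical stand-in for a migration record."""

    def __init__(self, status):
        self.status = status


def delete_allocation_after_move_sketch(migration, deleted):
    # Mimics the pre-Stein coupling: cleanup only happens when the
    # migration status is 'confirmed'.
    if migration.status == 'confirmed':
        deleted.append('source-node-allocation')


migration = FakeMigration(status='confirming')  # assumed starting status
deleted = []
try:
    try:
        # Stand-in for self.driver.confirm_migration raising.
        raise RuntimeError('driver failed')
    finally:
        with temporary_mutation_sketch(migration, status='confirmed'):
            delete_allocation_after_move_sketch(migration, deleted)
except RuntimeError:
    pass
```

In this sketch the cleanup runs even though the driver call failed, and the
migration status is back to its original value once the with block exits.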

This isn't a problem starting in Stein because change
I0851e2d54a1fdc82fe3291fb7e286e790f121e92 removed that
tight coupling on the migration status, so this is a stable
branch only change.

Note that we don't call self.reportclient.delete_allocation_for_instance
directly since before Stein we still need to account for a
migration that does not move the source node allocations to the
migration record, and that logic is in _delete_allocation_after_move.

A simple unit test assertion is added here but the functional
test added in change I9d6478f492351b58aa87b8f56e907ee633d6d1c6
will assert the bug is fixed properly before Stein.
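
As a rough illustration of why the unit test checks the status from inside
a side_effect, consider the sketch below. FakeCompute and
check_status_during_call are hypothetical names, not nova code. Because the
status is restored after the call, an assertion made after the fact would
only see the restored value; the side_effect observes the value in effect
during the call.

```python
from unittest import mock


class FakeCompute:
    """Hypothetical stand-in for the compute manager."""

    def _delete_allocation_after_move(self, migration):
        raise NotImplementedError  # always patched in the check below

    def confirm(self, migration):
        # Temporarily flip the status around the cleanup call.
        original = migration.status
        migration.status = 'confirmed'
        try:
            self._delete_allocation_after_move(migration)
        finally:
            migration.status = original


def check_status_during_call():
    compute = FakeCompute()
    migration = mock.Mock(status='confirming')  # assumed starting status
    seen = []

    def fake_delete(migration):
        # Record the status that is visible while the method runs.
        seen.append(migration.status)

    with mock.patch.object(compute, '_delete_allocation_after_move',
                           side_effect=fake_delete) as delete_allocation:
        compute.confirm(migration)
    delete_allocation.assert_called_once_with(migration)
    return seen, migration.status
```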

Change-Id: I933687891abef4878de09481937d576ce5899511
Closes-Bug: #1821594
(cherry picked from commit dac3239e92)
tags/17.0.11
Matt Riedemann 4 months ago
commit 5600309a1f
2 changed files with 18 additions and 4 deletions
  nova/compute/manager.py: 10 additions, 3 deletions
  nova/tests/unit/compute/test_compute_mgr.py: 8 additions, 1 deletion

nova/compute/manager.py (+10, -3)

@@ -3768,9 +3768,16 @@ class ComputeManager(manager.Manager):
                     # Whether an error occurred or not, at this point the
                     # instance is on the dest host so to avoid leaking
                     # allocations in placement, delete them here.
-                    self._delete_allocation_after_move(
-                        context, instance, migration, old_instance_type,
-                        migration.source_node)
+                    # NOTE(mriedem): _delete_allocation_after_move is tightly
+                    # coupled to the migration status on the confirm step so
+                    # we unfortunately have to mutate the migration status to
+                    # have _delete_allocation_after_move cleanup the allocation
+                    # held by the migration consumer.
+                    with utils.temporary_mutation(
+                            migration, status='confirmed'):
+                        self._delete_allocation_after_move(
+                            context, instance, migration, old_instance_type,
+                            migration.source_node)
 
         do_confirm_resize(context, instance, migration.id)
 

nova/tests/unit/compute/test_compute_mgr.py (+8, -1)

@@ -6689,13 +6689,20 @@ class ComputeManagerMigrationTestCase(test.NoDBTestCase):
         migration_get_by_id.return_value = self.migration
         instance_get_by_uuid.return_value = self.instance
 
+        def fake_delete_allocation_after_move(_context, instance, migration,
+                                              flavor, nodename):
+            # The migration.status must be 'confirmed' for the method to
+            # properly cleanup the allocation held by the migration.
+            self.assertEqual('confirmed', migration.status)
+
         error = exception.HypervisorUnavailable(
             host=self.migration.source_compute)
         with test.nested(
             mock.patch.object(self.compute, 'network_api'),
             mock.patch.object(self.compute.driver, 'confirm_migration',
                               side_effect=error),
-            mock.patch.object(self.compute, '_delete_allocation_after_move')
+            mock.patch.object(self.compute, '_delete_allocation_after_move',
+                              side_effect=fake_delete_allocation_after_move)
         ) as (
             network_api, confirm_migration, delete_allocation
         ):
