When driver.cleanup fails, live migrations get stuck in the
live-migrating state. A patch has already fixed the cause listed in the
bug this change closes, but other failures (such as a token timeout
when talking to cinder or neutron) could trigger the same failure mode.

An error this late in live-migration should be very rare, so it's best
to just put the instance and migration into an error state, which helps
alert both the operator and the API user to the failure that has
occurred.
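
As a rough illustration of the intended behaviour (a toy sketch, not the
Nova code; FakeRecord, post_live_migration and the state strings are
hypothetical names chosen for this example), a late cleanup failure marks
both records as errored and then re-raises:

```python
class FakeRecord:
    """Hypothetical stand-in for an instance or migration record."""

    def __init__(self, state):
        self.state = state


def post_live_migration(cleanup, instance, migration):
    """Run late-stage cleanup; on failure, flag both records as errored."""
    try:
        cleanup()
    except Exception:
        # Assumed to be too late to roll back, so surface the failure
        # on both records instead of leaving them in a migrating state.
        instance.state = "error"
        migration.state = "error"
        raise
    # Cleanup succeeded: record the normal end states.
    instance.state = "active"
    migration.state = "completed"
```

Re-raising keeps the original exception visible to the caller, while the
error states make the failure visible to the operator and the API user.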

For the backport to Newton, 'migrate_instance_start' had to be patched
in the unit test (nova/tests/unit/compute/test_compute.py).

Closes-Bug: #1662626
Change-Id: Idfdce9e7dd8106af01db0358ada15737cb846395
(cherry picked from commit b56f8fc2d1)
(cherry picked from commit 012fa9353f)