Trigger instance recovery audit when host goes offline
When the VIM handles a force-lock operation, it tells nova to fail the instances and reschedules the instance recovery audit to the 30s interval (reschedule_audit_instances), because the audit will evacuate the instances once the host goes offline. However, if the previous audit happened more than 30s earlier (the normal interval is 330s), the audit triggers right away. At this point nova has not yet failed the instances, so the recovery audit (recover_instances) sees no failed instances, does nothing, and schedules the next audit for 330s later. By then the host could have come back online, and the evacuations can no longer be done (the host must be offline to do an evacuate).

The solution is to call recover_instances once the host goes offline. This has the effect of setting the audit interval to 30s. The next time the audit runs, it will see that the instances are failed and evacuate them.

Story: 2002860
Task: 22809

Change-Id: I80473d6f41850f9cfc7be8125fe8fda4fdc5a56c
Signed-off-by: Don Penney <don.penney@windriver.com>
parent 1348e0ad59
commit d1215497a4
@@ -396,6 +396,9 @@ class HostDirector(object):
                       % host.name)
         instance_director = directors.get_instance_director()
         instance_director.host_offline(host)
+        # Now that the host is offline, we may be able to recover instances
+        # on that host (i.e. evacuate them).
+        instance_director.recover_instances()
 
     @staticmethod
     def host_audit(host):
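
The timing problem described in the commit message can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyInstanceDirector, run_due_audit, simulate, the instance names, and the timestamps are all hypothetical and are not part of the VIM code. Only the 30s and 330s intervals and the names reschedule_audit_instances and recover_instances come from the commit message. The toy also evacuates as soon as it sees a failed instance on an offline host, which is likely simpler than what the real audit does.

```python
NORMAL_INTERVAL = 330  # seconds between audits, per the commit message
FAST_INTERVAL = 30     # shortened interval, per the commit message


class ToyInstanceDirector:
    """Hypothetical stand-in for the VIM instance director (not the real class)."""

    def __init__(self, instances):
        self.instances = list(instances)
        self.failed = set()
        self.evacuated = []
        self.host_online = True
        self.last_audit = 0
        self.next_audit = NORMAL_INTERVAL

    def reschedule_audit_instances(self, now):
        # Switch to the fast interval. If the last audit is already more
        # than FAST_INTERVAL old, the audit fires immediately.
        self.next_audit = self.last_audit + FAST_INTERVAL
        self.run_due_audit(now)

    def run_due_audit(self, now):
        if now >= self.next_audit:
            self.recover_instances(now)

    def recover_instances(self, now):
        self.last_audit = now
        failed = [i for i in self.instances if i in self.failed]
        if not failed:
            # Nothing to recover: fall back to the normal interval.
            self.next_audit = now + NORMAL_INTERVAL
            return
        if not self.host_online:
            # Assumption: an evacuate is only possible while the host is offline.
            for inst in failed:
                self.failed.discard(inst)
                self.evacuated.append(inst)
        self.next_audit = now + FAST_INTERVAL


def simulate(with_fix):
    d = ToyInstanceDirector(["vm-1", "vm-2"])
    # t=100: force-lock. The last audit ran at t=0, so the audit fires at
    # once, finds no failed instances, and reschedules itself for t=430.
    d.reschedule_audit_instances(now=100)
    # t=105: nova marks the instances failed, too late for that audit.
    d.failed.update(d.instances)
    # t=110: the host goes offline.
    d.host_online = False
    if with_fix:
        d.recover_instances(now=110)  # the call added by this commit
    # t=300: the host comes back online.
    d.host_online = True
    # t=430: the audit that was scheduled at t=100 finally runs.
    d.run_due_audit(now=430)
    return d.evacuated
```

In this model, simulate(with_fix=False) evacuates nothing because the only audit after the instances fail runs once the host is back online, while simulate(with_fix=True) evacuates both instances during the offline window.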