Add note about reschedules and num_attempts in filter_properties

The "retry" entry in filter_properties is not set if reschedules
are disabled, which happens in these cases:

1. [scheduler]/max_attempts=1
2. The server is forced to a specific host and/or node.

More times than I'd like to admit, I've had to re-learn that
filter_properties['retry']['num_attempts'] will always be >1 in
conductor build_instances during a reschedule. If reschedules are
disabled, the compute service aborts the build on failure and we
never get back to conductor.

This change adds a note since it's hard to keep in your head how
the retry logic is all tied together from the API, superconductor,
compute and cell conductor during a reschedule scenario.

Change-Id: I83536b179000f41f9618a4b6f2a16b4440fd61ba
Related-Bug: #1781286
Matt Riedemann 2018-07-12 18:25:07 -04:00
parent 536e5fa57f
commit 276130c6d1
1 changed file with 7 additions and 0 deletions

@@ -582,6 +582,13 @@ class ComputeTaskManager(base.Base):
             host_lists = self._schedule_instances(context, spec_obj,
                     instance_uuids, return_alternates=True)
         except Exception as exc:
+            # NOTE(mriedem): If we're rescheduling from a failed build on a
+            # compute, "retry" will be set and num_attempts will be >1 because
+            # populate_retry above will increment it. If the server build was
+            # forced onto a host/node or [scheduler]/max_attempts=1, "retry"
+            # won't be in filter_properties and we won't get here because
+            # nova-compute will just abort the build since reschedules are
+            # disabled in those cases.
             num_attempts = filter_properties.get(
                 'retry', {}).get('num_attempts', 1)
             updates = {'vm_state': vm_states.ERROR, 'task_state': None}
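
The retry bookkeeping described in the commit message and the NOTE can be sketched roughly as follows. This is a simplified stand-in, not nova's actual populate_retry: the function name populate_retry_sketch, the max_attempts parameter, and the 'force_hosts' / 'force_nodes' keys are assumptions made for illustration. Only the num_attempts lookup is taken from the diff.

```python
# Simplified sketch of the retry bookkeeping; NOT nova's real implementation.
# populate_retry_sketch, max_attempts (as a parameter) and the 'force_hosts' /
# 'force_nodes' keys are hypothetical names used only for illustration.


def populate_retry_sketch(filter_properties, max_attempts):
    """Increment the retry counter unless reschedules are disabled."""
    forced = (filter_properties.get('force_hosts')
              or filter_properties.get('force_nodes'))
    if max_attempts == 1 or forced:
        # Reschedules are disabled, so "retry" is never added.
        return
    retry = filter_properties.setdefault(
        'retry', {'num_attempts': 0, 'hosts': []})
    retry['num_attempts'] += 1


def get_num_attempts(filter_properties):
    # Same lookup as in the diff: falls back to 1 when "retry" is absent.
    return filter_properties.get('retry', {}).get('num_attempts', 1)


# Initial build with reschedules enabled: the counter starts at 1.
props = {}
populate_retry_sketch(props, max_attempts=3)
first_attempt = get_num_attempts(props)

# A reschedule passes the same filter_properties back through conductor,
# so the counter is incremented again and is now >1.
populate_retry_sketch(props, max_attempts=3)
second_attempt = get_num_attempts(props)

# Forced host: "retry" is never set, so the lookup falls back to 1.
forced_props = {'force_hosts': ['host1']}
populate_retry_sketch(forced_props, max_attempts=3)
forced_attempt = get_num_attempts(forced_props)
```

Under these assumptions, the default of 1 in the lookup only matters when "retry" is absent, which is exactly the case where the compute service aborts the build instead of handing it back to conductor.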