Update git submodules
* Update cyborg from branch 'master'
- Merge "Check during ARQ bind that the target instance has no other ARQs."
- Check during ARQ bind that the target instance has no other ARQs.
In the ARQ create/bind flow, new ARQs can be created for a device profile
at any time. However, when an attempt is made to bind those ARQs to an
instance, Cyborg must ensure that the instance does not already have other
ARQs bound to it. In the future, we may allow that for hot adds,
but not now.
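The check described above can be sketched roughly as follows. This is an
illustration only, not the code in this change: the ARQ class, bind_arqs()
and DuplicateBindError are hypothetical names, and an in-memory list stands
in for the Cyborg database.

```python
from dataclasses import dataclass
from typing import List, Optional


class DuplicateBindError(Exception):
    """Hypothetical error: the target instance already has ARQs."""


@dataclass
class ARQ:
    """Hypothetical, stripped-down stand-in for an accelerator request."""
    uuid: str
    instance_uuid: Optional[str] = None  # None until the ARQ is bound


def bind_arqs(all_arqs: List[ARQ], to_bind: List[ARQ],
              instance_uuid: str) -> None:
    """Bind to_bind to instance_uuid unless the instance has other ARQs."""
    binding = {arq.uuid for arq in to_bind}
    # Any ARQ already tied to this instance, other than the ones being
    # bound right now, makes this a duplicate bind.
    existing = [arq for arq in all_arqs
                if arq.instance_uuid == instance_uuid
                and arq.uuid not in binding]
    if existing:
        raise DuplicateBindError(
            "instance %s already has %d ARQ(s)"
            % (instance_uuid, len(existing)))
    for arq in to_bind:
        arq.instance_uuid = instance_uuid
```

With this sketch, the first bind for an instance goes through, and a second
bind of different ARQs to the same instance fails before any ARQ is modified.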
Duplicate binds may be requested if the APIs are invoked directly by an
admin or a script. That is, the admin/script calls POST to create ARQs,
then calls PATCH to bind them to some instance UUID, which happens to
belong to an existing instance with ARQs. In other words, this is user error.
This can also happen in principle when an instance is rescheduled to
another host by Nova - but only in the future when ARQ deletions are
made asynchronous. During rescheduling of an instance, Nova is expected
to delete the bound ARQs for that instance before creating/binding ARQs
to a different host. That is fine today because ARQ deletions are
synchronous -- they complete before returning to the caller.
But, in the future, Cyborg could/should delete ARQs asynchronously, i.e.,
the API call returns before the deletion has completed. That is because
the unbind part of the deletion may require device cleanup of some kind,
which may take time. When ARQ deletions become asynchronous, there could
be a race condition where the deletion is still in progress when the
next bind call for rescheduling comes in. That is, the race
scenario is like this:
* Nova calls Cyborg to delete ARQs for an instance prior to rescheduling.
The call returns while the deletion is still in progress.
* Nova calls Cyborg to create ARQs for a device profile. Since there is
no reference to any instance or host here, Cyborg has to satisfy this
request.
* Nova calls Cyborg to bind the new ARQs to the same instance but a
different host.
To handle this race, one option is to queue the bind till the deletion
succeeds, but that is complex. Another option is to let the bind go
through if the instance's ARQs are all in deleting state. But that still
allows duplicate binds for some window of time till the deletions
complete. This scenario may be rare enough that the simple solution of
failing the duplicate bind is adequate.
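To make the trade-off concrete, here is a minimal sketch of the two
non-queuing policies. The names (ArqState, strict_allows_bind,
lenient_allows_bind) and the two-state model are assumptions made for
illustration; they are not taken from the Cyborg code.

```python
from enum import Enum
from typing import Sequence


class ArqState(Enum):
    """Hypothetical, simplified ARQ states for this illustration."""
    BOUND = "bound"
    DELETING = "deleting"


def strict_allows_bind(existing: Sequence[ArqState]) -> bool:
    """Simple policy: any existing ARQ on the instance fails the bind."""
    return len(existing) == 0


def lenient_allows_bind(existing: Sequence[ArqState]) -> bool:
    """Alternative policy: allow the bind if every existing ARQ is deleting.

    This leaves a window in which old and new ARQs coexist.
    """
    return all(state is ArqState.DELETING for state in existing)
```

For an instance whose old ARQs are all still in the deleting state, the
lenient policy would accept the bind while the strict one rejects it. The
two agree when the instance has no ARQs or has at least one bound ARQ.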
Note that the Nova code deletes the created ARQs if bind fails during
rescheduling:
https://review.opendev.org/#/c/673735/44/nova/conductor/manager.py@598
Change-Id: I85bf8fa70f776520f3b1f5eef09ca1e054a210f3