Merge "Add Placement error-handling to user guide"
@@ -114,6 +114,7 @@ Error Handling
   etsi_vnf_error_handling
   db_sync_error_handling
+  placement_error_handling

 .. TODO(h-asahina): add `Action Driver`
 * https://etherpad.opendev.org/p/tacker-wallaby-revise-docs
247
doc/source/user/placement_error_handling.rst
Normal file
@@ -0,0 +1,247 @@
============================
VDU Placement error-handling
============================

This document describes how to handle VDU placement errors caused by
insufficient resources in an availability zone.

VDU Placement error due to insufficient resources
-------------------------------------------------

Placement constraints are defined in `ETSI NFV-SOL 003 v3.3.1`_. The VNFM
sends them to the NFVO so that the NFVO can make the resource placement
decision.
In VNF Lifecycle Management (LCM), VDU deployment can fail because an
availability zone has insufficient resources.

When stack create/update fails, the `Show stack details`_ response of the
Heat API shows whether the failure was caused by insufficient resources.
The error message that indicates insufficient resources is taken from the
"stack_status_reason" parameter in the response.

.. note::

  In the case of insufficient resources, the error occurs after stack
  create/update has returned an acceptance response, so the "Show stack
  details" response is used to detect the cause.

The following are examples of error messages stored in
"stack_status_reason" when resources are insufficient.

+ ex1) The flavor defined in "OS::Nova::Server" is too large to be
  deployed (not enough storage, vCPUs, or memory).

  + Resource CREATE failed: ResourceInError: resources.<VDU-name>: Went
    to status ERROR due to “Message: No valid host was found. , Code:
    500”

+ ex2) The flavor defined in "OS::Nova::Server" specifies an extra-spec
  that cannot be satisfied.

  + Resource CREATE failed: ResourceInError: resources.<VDU-name>: Went
    to status ERROR due to “Message: Exceeded maximum number of retries.
    Exhausted all hosts available for retrying build failures for
    instance <server-UUID>., Code: 500”
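
As a minimal sketch of the check described above, the following Python
function reads "stack_status_reason" from an already decoded "Show stack
details" response body. The function name, the sample body, and the rule
that a status ending in ``_FAILED`` counts as a failure are assumptions made
for illustration; this is not Tacker code.

```python
def failure_reason(stack_details):
    """Return stack_status_reason if the stack failed, else None.

    `stack_details` is the decoded JSON body of "Show stack details".
    """
    stack = stack_details.get("stack", {})
    status = stack.get("stack_status", "")
    # Assumption: a failed create/update is reported as CREATE_FAILED or
    # UPDATE_FAILED, so checking the "_FAILED" suffix is enough here.
    if not status.endswith("_FAILED"):
        return None
    return stack.get("stack_status_reason")


# Illustrative response body; the VDU name "VDU1" is made up.
sample = {
    "stack": {
        "stack_status": "CREATE_FAILED",
        "stack_status_reason": (
            "Resource CREATE failed: ResourceInError: resources.VDU1: "
            'Went to status ERROR due to "Message: No valid host was '
            'found. , Code: 500"'
        ),
    }
}

reason = failure_reason(sample)
print(reason)
```

The returned string is what would then be compared against the detection
condition for insufficient resources.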

Availability zone reselection
-----------------------------

If a VDU placement error occurs because an availability zone has
insufficient resources, the availability zone can be reselected as
error handling.

The VNF LCM v2 API operations (instantiate/heal/scale) can, if
necessary, use an availability zone other than the one notified by the
NFVO.
If the availability zone notified by the NFVO has insufficient
resources, the VNF is re-created/updated in a different availability
zone.
Reselection and re-creation/update are repeated until no candidate
availability zones remain.

Settings
~~~~~~~~

The following settings are required to perform availability zone
reselection.

+ Use StandardUserData as the UserData class.

+ Set ``placement_fallback_best_effort = True`` in the ``tacker.conf`` file.

.. note::

  By default, the number of retries for availability zone reselection is
  unlimited.
  To limit the number of retries, set ``placement_az_select_retry`` in
  ``tacker.conf``.
  (The default value ``0`` means an unlimited number of retries.)
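
The interaction between the retry limit and the candidate list can be
sketched as follows. This is an illustration only: ``attempts``,
``deploy_with_reselection`` and the ``try_deploy`` callback are hypothetical
names, and the sketch assumes that the first attempt in the zone notified by
the NFVO has already failed, so only reselections are counted.

```python
import itertools


def attempts(max_retry):
    """Return an iterator of retry numbers; 0 means no upper limit."""
    if max_retry == 0:
        return itertools.count(1)
    return iter(range(1, max_retry + 1))


def deploy_with_reselection(candidates, try_deploy, max_retry=0):
    """Try candidate zones until one works, none remain, or the limit is hit.

    `try_deploy` is a hypothetical callable that returns True on success.
    """
    remaining = list(candidates)
    for _ in attempts(max_retry):
        if not remaining:
            return None  # no more candidates
        zone = remaining.pop(0)
        if try_deploy(zone):
            return zone
    return None  # retry limit reached


# Only AZ-5 has room in this made-up scenario.
def has_room(zone):
    return zone == "AZ-5"


unlimited = deploy_with_reselection(["AZ-4", "AZ-5"], has_room)
limited = deploy_with_reselection(["AZ-4", "AZ-5"], has_room, max_retry=1)
print(unlimited, limited)
```

With the default of ``0`` the loop ends only when a zone succeeds or the
candidates run out; with a limit of ``1`` it stops after the first failed
reselection.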

.. note::

  By default, the following regular expression is used to detect an
  insufficient resource error.

  ``Resource CREATE failed: ResourceInError: resources\.(.*)\.(.*): (.*)
  | Resource UPDATE failed: resources\.(.*): Resource CREATE failed:
  ResourceInError: resources\.(.*): (.*)``

  To change the detection condition for insufficient resource errors,
  set a regular expression in ``placement_az_resource_error`` in
  ``tacker.conf``.

  Changing ``placement_az_resource_error`` from the default is outside
  community support, so do it at your own risk.
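
To show how the two alternatives of the detection expression behave, here is
a small Python sketch. It assumes that the whitespace around ``|`` in the
wrapped text above comes only from line wrapping and is not part of the
pattern. The function name and the sample messages (including the resource
names) are made up for illustration.

```python
import re

# Pattern assembled from the note above (wrapping whitespace removed).
DEFAULT_PATTERN = (
    r"Resource CREATE failed: ResourceInError: "
    r"resources\.(.*)\.(.*): (.*)|"
    r"Resource UPDATE failed: resources\.(.*): "
    r"Resource CREATE failed: ResourceInError: "
    r"resources\.(.*): (.*)"
)


def is_resource_error(reason, pattern=DEFAULT_PATTERN):
    """Return True if `reason` matches the detection pattern from its start."""
    return re.match(pattern, reason) is not None


detail = (
    'Went to status ERROR due to "Message: No valid host was found. , '
    'Code: 500"'
)
# Hypothetical stack_status_reason values for create and update.
create_msg = (
    "Resource CREATE failed: ResourceInError: resources.VDU1-0.VDU1: " + detail
)
update_msg = (
    "Resource UPDATE failed: resources.VDU1-1: "
    "Resource CREATE failed: ResourceInError: resources.VDU1: " + detail
)
other_msg = "Resource CREATE failed: NotFound: resources.VDU1-0.VDU1: no image"

for msg in (create_msg, update_msg, other_msg):
    print(is_resource_error(msg))
```

A custom expression can be tried the same way by passing it as ``pattern``
before writing it to ``tacker.conf``.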

Policy
~~~~~~

Availability zones in error are excluded from the reselection
candidates, and the new availability zone is selected preferentially
from unselected availability zones.

.. note::

  Affinity/Anti-Affinity of PlacementConstraint and the resource states
  of availability zones are not considered during reselection.

The availability zone in error is identified as follows.

1. Call the Heat API "Show stack details" after an error occurs in
   "stack create/update".
2. Identify the VDU in which the insufficient resource error occurred
   from the "stack_status_reason" in the response of step 1.
3. Identify the availability zone from the VDU identified in step 2.
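
Steps 2 and 3 can be sketched in Python as follows. The helper pattern, the
function name, the nested resource name ``VDU2-0.VDU2`` and the VDU-to-zone
mapping are all hypothetical, and the sketch handles only the
"Resource CREATE failed" form of the message. It does not reproduce how
Tacker itself performs the lookup.

```python
import re

# Hypothetical helper pattern (not the Tacker default): capture the
# resource path that follows "ResourceInError: resources.".
_RESOURCE_PATH = re.compile(r"ResourceInError: resources\.([^:]+): ")


def failed_zone(reason, zone_of_vdu):
    """Map the VDU named in `reason` to its availability zone.

    `zone_of_vdu` is an assumed dict of VDU name -> zone used at deploy time.
    """
    match = _RESOURCE_PATH.search(reason)
    if match is None:
        return None
    # Step 2: the first path segment is taken as the VDU name.
    vdu = match.group(1).split(".")[0]
    # Step 3: look up the zone that VDU was placed in.
    return zone_of_vdu.get(vdu)


zones = {"VDU1-0": "AZ-1", "VDU1-1": "AZ-1", "VDU2-0": "AZ-2", "VDU2-1": "AZ-3"}
reason = (
    "Resource CREATE failed: ResourceInError: resources.VDU2-0.VDU2: "
    'Went to status ERROR due to "Message: No valid host was found. , '
    'Code: 500"'
)
zone_in_error = failed_zone(reason, zones)
print(zone_in_error)
```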

.. note::

  Even if the resource shortage in an availability zone that failed
  during a reselection attempt is later resolved, that availability zone
  is not reselected.
  In Scale/Heal operations, VDUs that have already been deployed are
  not re-created.

Availability zone reselection for each VNF LCM v2 API operation
(instantiate/heal/scale) works as follows.

Precondition: availability zones AZ-1/AZ-2/AZ-3/AZ-4/AZ-5 exist and VNFs
VDU1-0/VDU1-1/VDU2-0/VDU2-1 are deployed.

.. note::

  VNFs in VDU1 are in the same availability zone (Affinity). VNFs in
  VDU2 are in different availability zones from each other and from
  those in VDU1 (Anti-Affinity).

+ Instantiate

  + Before reselection, the following deployment attempt failed (AZ-1
    and AZ-2 have insufficient resources).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-1
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3

  + VDU1-0/1: Reselected as follows (AZ-1/AZ-2/AZ-3 are excluded; AZ-4
    or AZ-5 is selected).

    + VDU1-0: AZ-4
    + VDU1-1: AZ-4
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3

  + VDU2-0: Reselected as follows (AZ-2/AZ-3/AZ-4 are excluded; AZ-1 or
    AZ-5 is selected).

    + VDU1-0: AZ-4
    + VDU1-1: AZ-4
    + VDU2-0: AZ-5
    + VDU2-1: AZ-3

  .. note::

    The above is an example; the reselection target is chosen randomly
    from the unselected availability zones.

+ Heal (VDU1-1/VDU2-0)

  + Before reselection, the following deployment attempt failed (AZ-1
    and AZ-2 have insufficient resources).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-1
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3

  + VDU1-1: Reselected as follows (AZ-1/AZ-2/AZ-3 are excluded; AZ-4
    or AZ-5 is selected).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-4
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3

    .. note::

      Only the VNFs targeted by Heal are subject to availability zone
      reselection.
      Therefore, Affinity may no longer be satisfied after reselection.

  + VDU2-0: Reselected as follows (AZ-1/AZ-2/AZ-3/AZ-4 are excluded;
    AZ-5 is selected).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-4
    + VDU2-0: AZ-5
    + VDU2-1: AZ-3

+ Scale out (add VDU1-2/VDU1-3)

  + Before reselection, deployment of VDU1-3 failed (AZ-1 has
    insufficient resources).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-1
    + VDU1-2: AZ-1
    + VDU1-3: AZ-1
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3

  + VDU1-2/3: Reselected as follows (AZ-1/AZ-2/AZ-3 are excluded; AZ-4
    or AZ-5 is selected).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-1
    + VDU1-2: AZ-4
    + VDU1-3: AZ-4
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3

    .. note::

      With Affinity, the availability zones of both VDU1-2 and VDU1-3
      are reselected even if VDU1-2 was deployed successfully.
      The existing VDU1-0/VDU1-1 are not reselected, so the VDUs may not
      all be in the same availability zone even with Affinity.

+ Scale out (add VDU2-2/VDU2-3)

  + Before reselection, deployment of VDU2-3 failed (AZ-5 has
    insufficient resources).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-1
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3
    + VDU2-2: AZ-4
    + VDU2-3: AZ-5

  + VDU2-3: Reselected as follows (AZ-5 is excluded; AZ-1, AZ-2, AZ-3,
    or AZ-4 is selected).

    + VDU1-0: AZ-1
    + VDU1-1: AZ-1
    + VDU2-0: AZ-2
    + VDU2-1: AZ-3
    + VDU2-2: AZ-4
    + VDU2-3: AZ-1

    .. note::

      If no unselected availability zones are left, the reselection
      target is chosen randomly from the already selected availability
      zones.
      In this case, Anti-Affinity cannot be satisfied.
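
One possible reading of the selection policy, including the fallback in the
last note, is sketched below in Python. The function name and its arguments
are hypothetical, and the sketch is a simplification for illustration rather
than the actual Tacker implementation.

```python
import random


def reselect(all_zones, in_use, in_error, rng=random):
    """Pick a zone for a failed VDU, or None if every zone is in error.

    Zones in error are never candidates. Unselected zones are preferred;
    if none are left, an already selected zone is chosen instead.
    """
    unselected = [
        z for z in all_zones if z not in in_use and z not in in_error
    ]
    if unselected:
        return rng.choice(unselected)
    # Fallback: no unselected zone remains, so Anti-Affinity cannot hold.
    fallback = [z for z in all_zones if z not in in_error]
    return rng.choice(fallback) if fallback else None


zones = ["AZ-1", "AZ-2", "AZ-3", "AZ-4", "AZ-5"]

# Unselected zones exist: AZ-1 is in error, AZ-2/AZ-3 are in use.
preferred = reselect(zones, in_use={"AZ-2", "AZ-3"}, in_error={"AZ-1"})

# Scale out of VDU2-3: AZ-5 is in error and AZ-1..AZ-4 are all in use.
fallback_choice = reselect(
    zones, in_use={"AZ-1", "AZ-2", "AZ-3", "AZ-4"}, in_error={"AZ-5"}
)
print(preferred, fallback_choice)
```

Because the choice is random, the printed zones vary between runs; only the
set of possible results is fixed.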

.. _ETSI NFV-SOL 003 v3.3.1: https://www.etsi.org/deliver/etsi_gs/NFV-SOL/001_099/003/03.03.01_60/gs_nfv-sol003v030301p.pdf
.. _Show stack details: https://docs.openstack.org/api-ref/orchestration/v1/index.html?expanded=show-stack-details-detail#show-stack-details