.. _advanced_workflow:

=======================
Fenix Advanced Workflow
=======================

An example advanced workflow is implemented in 'fenix/workflow/workflows/vnf.py'.
This workflow utilizes the ETSI-defined instance and instance group
constraints. Later there will also need to be a workflow that supports NUMA
and CPU pinning. It will be very similar, but it will need more specific
placement decisions, which means scaling has to target exact instances and
move operations have to be calculated so that the same pinning is obeyed.

Workflow states are similar to the 'default' workflow, but there are some
differences, addressed below.

The major difference is that the VNFM is expected to dynamically update the
VNF instance and instance group constraints so they always match the current
state of the VNF. The constraints are described in the API documentation, as
the APIs are used to update the constraints to the Fenix DB. The constraints
help the Fenix workflow to run as fast as possible: the workflow knows how
many instances can be affected at a time, while the other constraints make
sure there is zero impact on the VNF service.

States
======

MAINTENANCE
-----------

The difference from the default workflow is that, by the time maintenance is
called and we enter this first state, all affected VNFs need to have their
instance and instance group constraints updated to Fenix. A perfect VNFM-side
implementation should always make sure that changes in the VNF are reflected
in these constraints.
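
Below is a minimal sketch of such a VNFM-side constraint update, assuming a
plain HTTP client. The endpoint path, field names and all values are
illustrative assumptions; see the Fenix API documentation for the
authoritative schema.

.. code-block:: python

    import requests

    FENIX_URL = "http://fenix-host:12347/v1"  # hypothetical endpoint
    HEADERS = {"X-Auth-Token": "<token>", "Content-Type": "application/json"}

    # Instance group constraints: how many members may be impacted at once
    # and how long the VNF needs to recover after each impact. Field names
    # are assumptions based on the API documentation.
    group = {
        "group_id": "e156a1d2-0000-0000-0000-000000000000",  # hypothetical
        "project_id": "2e3870fa-0000-0000-0000-000000000000",
        "group_name": "vnf-active-group",
        "anti_affinity_group": True,
        "max_instances_per_host": 1,
        "max_impacted_members": 1,
        "recovery_time": 10,
        "resource_mitigation": True,
    }
    requests.put("%s/instance_group/%s" % (FENIX_URL, group["group_id"]),
                 json=group, headers=HEADERS)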

SCALE_IN
--------

As Fenix is now aware of all the constraints, it can optimize many things. One
is to scale exact instances: since we know max_impacted_members for each
instance group, we can optimize how much we scale down so that we get an
optimal amount of empty compute nodes while still keeping an optimal amount
of instances relative to max_impacted_members. Another case is NUMA and CPU
pinning, where we definitely need to dictate which instances are scaled down,
as we need the exact NUMA nodes and CPUs free to be able to empty a compute
host. This also means that when making the move operations for pinned
instances, we know they will always succeed. A special need might also exist
in an edge cloud system, where there are very few compute hosts available.

After the Fenix workflow has done its math, it may suggest the instances to
be scaled down. If the VNFM rejects this, a retry can let the VNFM decide how
it scales down, although the result might not be optimal.

The VNFM needs to update the instance and instance group constraints after
scaling.
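
The following is a minimal sketch of the per-group scale-in math described
above. All names and structures are hypothetical; the actual logic lives in
'fenix/workflow/workflows/vnf.py' and also accounts for host capacity and
placement.

.. code-block:: python

    def scale_in_candidates(groups):
        """Pick instances that could be removed per anti-affinity group.

        groups: list of dicts with 'members' (instance ids), 'min_members'
        (members the VNF needs to stay in service) and
        'max_impacted_members' (members that may be down at once).
        All keys are hypothetical.
        """
        candidates = []
        for group in groups:
            # Never scale below what the VNF needs to stay in service,
            # and never impact more members at once than allowed.
            spare = len(group["members"]) - group["min_members"]
            removable = max(0, min(spare, group["max_impacted_members"]))
            candidates.extend(group["members"][:removable])
        return candidates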

PREPARE_MAINTENANCE
-------------------

After the 'SCALE_IN' state the empty compute capacity can be scattered across
hosts. Now the workflow needs to do the math of how to get empty compute
nodes in the best possible way. As we have all the constraints, we can run
operations in parallel for different compute nodes, VNFs and their instances
in different instance groups. Compared to the default workflow, the
'maintenance.planned' notification is always for a single instance only.
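
For illustration, a 'maintenance.planned' notification payload in this state
might look like the sketch below. The field values are hypothetical and the
exact schema should be checked from the Fenix notification documentation; the
point is that 'instance_ids' carries exactly one instance.

.. code-block:: python

    payload = {
        "service": "fenix",
        "state": "PREPARE_MAINTENANCE",
        "session_id": "76e55df8-0000-0000-0000-000000000000",  # hypothetical
        "instance_ids": ["28d226f3-0000-0000-0000-000000000000"],  # one only
        "allowed_actions": ["MIGRATE", "LIVE_MIGRATE", "OWN_ACTION"],
        "reply_url": "http://fenix-host:12347/v1/maintenance/<session>/<project>",
        "reply_at": "2020-01-13T10:00:00",
        "metadata": {},
    }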

START_MAINTENANCE
-----------------

The biggest enhancement here is that hosts can be handled in parallel when
feasible.
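
A minimal sketch of such parallel host handling is below. 'maintain_host' is
a hypothetical stand-in for the per-host maintenance steps; the real workflow
additionally honours the VNF constraints when deciding what can run in
parallel.

.. code-block:: python

    from concurrent.futures import ThreadPoolExecutor

    def maintain_hosts_parallel(hosts, maintain_host, max_parallel=2):
        # Run the per-host maintenance steps for several hosts at once.
        with ThreadPoolExecutor(max_workers=max_parallel) as executor:
            return list(executor.map(maintain_host, hosts))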

PLANNED_MAINTENANCE
-------------------

As we have all the constraints, we can run operations in parallel for
different compute nodes, VNFs and their instances in different instance
groups. Compared to the default workflow, the 'maintenance.planned'
notification is always for a single instance only, as in the example above.

MAINTENANCE_COMPLETE
--------------------

This is the same as in the default workflow, but the VNFM needs to update the
instance and instance group constraints after the scaling.

MAINTENANCE_DONE
----------------

This will now make the maintenance session idle until the infrastructure
admin deletes it.

MAINTENANCE_FAILED
------------------

This will now make the maintenance session idle until the infrastructure
admin fixes and continues the session, or deletes it.