Detailed session information and enhancements
- Add GET /v1/maintenance/{session_id}/detail
- Add 'maintenance.session' event. This can be used to track the workflow.
  It gives you the percentage of hosts maintained.

Other enhancements:

- Add sample VNFM for OpenStack: vnfm.py (the Kubernetes sample is renamed to vnfm_k8s.py)
- Add sample VNF for OpenStack: maintenance_hot_tpl.yaml
- Update testing instructions (tools)
- Update documentation
- Add more tools for testing:
  - fenix_db_reset (flushes the database)
  - set_config.py (sets the AODH / Ceilometer configuration)
- Add admin tool: infra_admin.py. This tool can run the maintenance workflow
  and track its progress.
- Make sure everything is written to the database. If Fenix is restarted, it
  initialises existing 'ongoing' workflows from the database. More functions
  were added to the database API and are utilized in the example workflows.

Story: 2004336
Task: #27922

Change-Id: I794b11a8684f5fc513cb8f5affcd370ec70f3dbc
Signed-off-by: Tomi Juvonen <tomi.juvonen@nokia.com>
Commit: 244fb3ced0 (parent: ef8bbb388b)

Changed paths:
- README.rst
- doc/source
- fenix/api/v1
- fenix/db
- fenix/tests/db/sqlalchemy
- fenix/tools: README.md, fenix_db_reset, infra_admin.py, maintenance_hot_tpl.yaml,
  session.json, set_config.py, vnfm.py, vnfm_k8s.py
- fenix/utils
- fenix/workflow
@ -27,6 +27,7 @@ would also be telling about adding or removing a host.
|
|||||||
* Documentation: https://fenix.readthedocs.io/en/latest/index.html
|
* Documentation: https://fenix.readthedocs.io/en/latest/index.html
|
||||||
* Developer Documentation: https://wiki.openstack.org/wiki/Fenix
|
* Developer Documentation: https://wiki.openstack.org/wiki/Fenix
|
||||||
* Source: https://opendev.org/x/fenix
|
* Source: https://opendev.org/x/fenix
|
||||||
|
* Running sample workflows: https://opendev.org/x/fenix/src/branch/master/fenix/tools/README.md
|
||||||
* Bug tracking and Blueprints: https://storyboard.openstack.org/#!/project/x/fenix
|
* Bug tracking and Blueprints: https://storyboard.openstack.org/#!/project/x/fenix
|
||||||
* How to contribute: https://docs.openstack.org/infra/manual/developers.html
|
* How to contribute: https://docs.openstack.org/infra/manual/developers.html
|
||||||
* `Fenix Specifications <specifications/index.html>`_
|
* `Fenix Specifications <specifications/index.html>`_
|
||||||
|
@ -1,6 +1,6 @@
|
|||||||
####################
|
###
|
||||||
Host Maintenance API
|
API
|
||||||
####################
|
###
|
||||||
|
|
||||||
.. toctree::
|
.. toctree::
|
||||||
:maxdepth: 2
|
:maxdepth: 2
|
||||||
|
@ -1,28 +1,29 @@
|
|||||||
:tocdepth: 2
|
:tocdepth: 2
|
||||||
|
|
||||||
#######################
|
######
|
||||||
Host Maintenance API v1
|
API v1
|
||||||
#######################
|
######
|
||||||
|
|
||||||
.. rest_expand_all::
|
.. rest_expand_all::
|
||||||
|
|
||||||
#####
|
#########
|
||||||
Admin
|
Admin API
|
||||||
#####
|
#########
|
||||||
|
|
||||||
These APIs are meant for infrastructure admin who is in charge of triggering
|
These APIs are meant for the infrastructure admin who is in charge of triggering
|
||||||
the rolling maintenance and upgrade workflows.
|
the rolling maintenance and upgrade workflow sessions.
|
||||||
|
|
||||||
.. include:: maintenance.inc
|
.. include:: maintenance.inc
|
||||||
|
|
||||||
#######
|
###########
|
||||||
Project
|
Project API
|
||||||
#######
|
###########
|
||||||
|
|
||||||
These APIs are meant for projects having instances on top of the infrastructure
|
These APIs are meant for projects (tenant/VNF) having instances on top of the
|
||||||
under corresponding rolling maintenance or upgrade session. Usage of these APIs
|
infrastructure under corresponding rolling maintenance or upgrade session.
|
||||||
expects there is an application manager (VNFM) that can interact with Fenix
|
Usage of these APIs expects there is an application manager (VNFM) that can
|
||||||
workflow via these APIs. If this is not the case, workflow should have a default
|
interact with the Fenix workflow via these APIs. If this is not the case, the workflow
|
||||||
behavior for instances owned by projects, that are not interacting with Fenix.
|
should have a default behavior for instances owned by projects that are not
|
||||||
|
interacting with Fenix.
|
||||||
|
|
||||||
.. include:: project.inc
|
.. include:: project.inc
|
||||||
|
@ -1,13 +1,13 @@
|
|||||||
.. -*- rst -*-
|
.. -*- rst -*-
|
||||||
|
|
||||||
===========
|
==========================
|
||||||
Maintenance
|
Admin workflow session API
|
||||||
===========
|
==========================
|
||||||
|
|
||||||
Create maintenance session
|
Create maintenance session
|
||||||
==========================
|
==========================
|
||||||
|
|
||||||
.. rest_method:: POST /v1/maintenance/
|
.. rest_method:: POST /v1/maintenance
|
||||||
|
|
||||||
Create a new maintenance session. You can specify a list of 'hosts' to be
|
Create a new maintenance session. You can specify a list of 'hosts' to be
|
||||||
maintained or have an empty list to indicate those should be self-discovered.
|
maintained or have an empty list to indicate those should be self-discovered.
|
||||||
@ -49,7 +49,7 @@ Response codes
|
|||||||
Update maintenance session (planned future functionality)
|
Update maintenance session (planned future functionality)
|
||||||
=========================================================
|
=========================================================
|
||||||
|
|
||||||
.. rest_method:: PUT /v1/maintenance/{session_id}/
|
.. rest_method:: PUT /v1/maintenance/{session_id}
|
||||||
|
|
||||||
Update existing maintenance session. This can be used to continue a failed
|
Update existing maintenance session. This can be used to continue a failed
|
||||||
session after manually fixing what failed. Workflow should then run
|
session after manually fixing what failed. Workflow should then run
|
||||||
@ -79,7 +79,7 @@ Response codes
|
|||||||
Get maintenance sessions
|
Get maintenance sessions
|
||||||
========================
|
========================
|
||||||
|
|
||||||
.. rest_method:: GET /v1/maintenance/
|
.. rest_method:: GET /v1/maintenance
|
||||||
|
|
||||||
Get all ongoing maintenance sessions.
|
Get all ongoing maintenance sessions.
|
||||||
|
|
||||||
@ -88,7 +88,7 @@ Response codes
|
|||||||
|
|
||||||
.. rest_status_code:: success status.yaml
|
.. rest_status_code:: success status.yaml
|
||||||
|
|
||||||
- 200: get-maintenance-sessions-get
|
- 200: maintenance-sessions-get
|
||||||
|
|
||||||
.. rest_status_code:: error status.yaml
|
.. rest_status_code:: error status.yaml
|
||||||
|
|
||||||
@ -98,7 +98,7 @@ Response codes
|
|||||||
Get maintenance session
|
Get maintenance session
|
||||||
=======================
|
=======================
|
||||||
|
|
||||||
.. rest_method:: GET /v1/maintenance/{session_id}/
|
.. rest_method:: GET /v1/maintenance/{session_id}
|
||||||
|
|
||||||
Get a maintenance session state.
|
Get a maintenance session state.
|
||||||
|
|
||||||
@ -114,7 +114,38 @@ Response codes
|
|||||||
|
|
||||||
.. rest_status_code:: success status.yaml
|
.. rest_status_code:: success status.yaml
|
||||||
|
|
||||||
- 200: get-maintenance-session-get
|
- 200: maintenance-session-get
|
||||||
|
|
||||||
|
.. rest_status_code:: error status.yaml
|
||||||
|
|
||||||
|
- 400
|
||||||
|
- 404
|
||||||
|
- 422
|
||||||
|
- 500
|
||||||
|
|
||||||
|
Get maintenance session details
|
||||||
|
===============================
|
||||||
|
|
||||||
|
.. rest_method:: GET /v1/maintenance/{session_id}/detail
|
||||||
|
|
||||||
|
Get maintenance session details. This information can be useful to see the
|
||||||
|
detailed status of a maintenance session or to troubleshoot a failed session.
|
||||||
|
Usually a session should fail on a simple problem that can be quickly fixed
|
||||||
|
manually. Then one can update the maintenance session state to continue from 'prev_state'.
|
||||||
|
|
||||||
|
Request
|
||||||
|
-------
|
||||||
|
|
||||||
|
.. rest_parameters:: parameters.yaml
|
||||||
|
|
||||||
|
- session_id: session_id
|
||||||
|
|
||||||
|
Response codes
|
||||||
|
--------------
|
||||||
|
|
||||||
|
.. rest_status_code:: success status.yaml
|
||||||
|
|
||||||
|
- 200: maintenance-session-detail-get
|
||||||
|
|
||||||
.. rest_status_code:: error status.yaml
|
.. rest_status_code:: error status.yaml
|
||||||
|
|
||||||
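For illustration, a minimal client call to this endpoint could look like the
sketch below (the endpoint URL and token are placeholders, not values defined
by Fenix; field names follow samples/maintenance-session-detail-get-200.json).

.. code-block:: python

    import requests

    FENIX_URL = "http://<fenix-api-host>:<port>"   # placeholder endpoint
    TOKEN = "<keystone token>"                     # placeholder auth token
    SESSION_ID = "47479bca-7f0e-11ea-99c9-2c600c9893ee"

    resp = requests.get(
        "%s/v1/maintenance/%s/detail" % (FENIX_URL, SESSION_ID),
        headers={"X-Auth-Token": TOKEN, "Accept": "application/json"})
    resp.raise_for_status()
    detail = resp.json()

    # Overall progress and per-host status of the session.
    print(detail["state"], detail["percent_done"])
    for host in detail["hosts"]:
        print(host["hostname"], "maintained:", host["maintained"])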
@ -126,7 +157,7 @@ Response codes
|
|||||||
Delete maintenance session
|
Delete maintenance session
|
||||||
==========================
|
==========================
|
||||||
|
|
||||||
.. rest_method:: DELETE /v1/maintenance/{session_id}/
|
.. rest_method:: DELETE /v1/maintenance/{session_id}
|
||||||
|
|
||||||
Delete a maintenance session. Usually called after the session is successfully
|
Delete a maintenance session. Usually called after the session is successfully
|
||||||
finished.
|
finished.
|
||||||
@ -141,12 +172,3 @@ finished.
|
|||||||
- 400
|
- 400
|
||||||
- 422
|
- 422
|
||||||
- 500
|
- 500
|
||||||
|
|
||||||
Future
|
|
||||||
======
|
|
||||||
|
|
||||||
On top of some expected changes mentioned above, it will also be handy to get
|
|
||||||
detailed information about the steps run already in the maintenance session.
|
|
||||||
This will be helpful when need to figure out any correcting actions to
|
|
||||||
successfully finish a failed session. For now admin can update failed session
|
|
||||||
state to previous or his wanted state to try continue a failed session.
|
|
||||||
|
@ -36,7 +36,7 @@ uuid-path:
|
|||||||
#############################################################################
|
#############################################################################
|
||||||
action-metadata:
|
action-metadata:
|
||||||
description: |
|
description: |
|
||||||
Metadata; hints to plug-ins
|
Metadata; hints to plug-ins.
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: dictionary
|
type: dictionary
|
||||||
@ -44,7 +44,17 @@ action-metadata:
|
|||||||
action-plugin-name:
|
action-plugin-name:
|
||||||
description: |
|
description: |
|
||||||
plug-in name. Default workflow executes same type of plug-ins in an
|
Plug-in name. The default workflow executes plug-ins of the same type in
|
||||||
alphabetical order
|
alphabetical order.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: string
|
||||||
|
|
||||||
|
action-plugin-state:
|
||||||
|
description: |
|
||||||
|
Action plug-in state. This is workflow and action plug-in specific
|
||||||
|
information to be passed from the action plug-in to the workflow. Helps
|
||||||
|
to understand how the action plug-in was executed and to troubleshoot
|
||||||
|
accordingly.
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: string
|
type: string
|
||||||
@ -77,6 +87,20 @@ boolean:
|
|||||||
required: true
|
required: true
|
||||||
type: boolean
|
type: boolean
|
||||||
|
|
||||||
|
datetime-string:
|
||||||
|
description: |
|
||||||
|
Date and time string according to ISO 8601.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: string
|
||||||
|
|
||||||
|
details:
|
||||||
|
description: |
|
||||||
|
Workflow-internal detail for special usage; for example, the nova-compute service id.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: string
|
||||||
|
|
||||||
group-uuid:
|
group-uuid:
|
||||||
description: |
|
description: |
|
||||||
Instance group uuid. Should match with OpenStack server group if one exists.
|
Instance group uuid. Should match with OpenStack server group if one exists.
|
||||||
@ -84,6 +108,21 @@ group-uuid:
|
|||||||
required: true
|
required: true
|
||||||
type: string
|
type: string
|
||||||
|
|
||||||
|
host-type:
|
||||||
|
description: |
|
||||||
|
Host type as it is to be used in the workflow implementation.
|
||||||
|
Example workflows use the values 'compute' and 'controller'.
|
||||||
|
in: body
|
||||||
|
required: false
|
||||||
|
type: list of strings
|
||||||
|
|
||||||
|
hostname:
|
||||||
|
description: |
|
||||||
|
Name of the host.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: string
|
||||||
|
|
||||||
hosts:
|
hosts:
|
||||||
description: |
|
description: |
|
||||||
Hosts to be maintained. An empty list can indicate hosts are to be
|
Hosts to be maintained. An empty list can indicate hosts are to be
|
||||||
@ -102,7 +141,7 @@ instance-action:
|
|||||||
instance-actions:
|
instance-actions:
|
||||||
description: |
|
description: |
|
||||||
instance ID : action string. This variable is not needed in reply to state
|
instance ID : action string. This variable is not needed in reply to state
|
||||||
MAINTENANCE, SCALE_IN or MAINTENANCE_COMPLETE
|
MAINTENANCE, SCALE_IN or MAINTENANCE_COMPLETE.
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: dictionary
|
type: dictionary
|
||||||
@ -128,6 +167,14 @@ instance-name:
|
|||||||
required: true
|
required: true
|
||||||
type: string
|
type: string
|
||||||
|
|
||||||
|
instance-state:
|
||||||
|
description: |
|
||||||
|
State of the instance as in the underlying cloud. Can be different in
|
||||||
|
different clouds like OpenStack or Kubernetes.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: string
|
||||||
|
|
||||||
lead-time:
|
lead-time:
|
||||||
description: |
|
description: |
|
||||||
How long lead time VNF needs for 'migration_type' operation. VNF needs to
|
How long lead time VNF needs for 'migration_type' operation. VNF needs to
|
||||||
@ -177,30 +224,50 @@ max-interruption-time:
|
|||||||
|
|
||||||
metadata:
|
metadata:
|
||||||
description: |
|
description: |
|
||||||
Metadata; like hints to projects
|
Hint to the project/tenant/VNF about what capability the infrastructure
|
||||||
|
is offering to the instance when it moves to an already maintained host in the
|
||||||
|
'PLANNED_MAINTENANCE' state action. This may have an impact on how
|
||||||
|
the instance is to be moved or whether the instance is to be upgraded and the
|
||||||
|
VNF needs to re-instantiate it as its 'OWN_ACTION'. This could be the
|
||||||
|
case with new hardware, or the instance may be wanted to be upgraded
|
||||||
|
anyway at the same time as the infrastructure maintenance.
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: dictionary
|
type: dictionary
|
||||||
|
|
||||||
migration-type:
|
migration-type:
|
||||||
description: |
|
description: |
|
||||||
LIVE_MIGRATION, MIGRATION or OWN_ACTION
|
'LIVE_MIGRATE', 'MIGRATE' or 'OWN_ACTION'
|
||||||
Own action is create new and delete old instance.
|
Own action is create new and delete old instance.
|
||||||
Note! VNF need to obey resource_mitigation with own action
|
Note! The VNF needs to obey resource_mitigation with its own action.
|
||||||
This affects to order of delete old and create new to not over
|
This affects the order of deleting the old and creating the new instance to not over-
|
||||||
commit the resources. In Kubernetes also EVICTION supported. There admin
|
commit the resources. In Kubernetes 'EVICTION' is also supported: there the admin
|
||||||
will delete instance and VNF automation like ReplicaSet will make a new
|
will delete the instance and VNF automation like a ReplicaSet will make a new
|
||||||
instance
|
instance.
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: string
|
type: string
|
||||||
|
|
||||||
|
percent_done:
|
||||||
|
description: |
|
||||||
|
How many percent of hosts are maintained.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: dictionary
|
||||||
|
|
||||||
|
plugin:
|
||||||
|
description: |
|
||||||
|
Action plugin name.
|
||||||
|
in: body
|
||||||
|
required: true
|
||||||
|
type: dictionary
|
||||||
|
|
||||||
recovery-time:
|
recovery-time:
|
||||||
description: |
|
description: |
|
||||||
VNF recovery time after operation to instance. Workflow needs to take
|
VNF recovery time after operation to instance. Workflow needs to take
|
||||||
into account recovery_time for previous instance moved and only then
|
into account recovery_time for the previous instance moved and only then
|
||||||
start moving next obyeing max_impacted_members
|
start moving the next one, obeying max_impacted_members.
|
||||||
Note! regardless anti_affinity group or not
|
Note! This applies regardless of whether there is an anti_affinity group or not.
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: integer
|
type: integer
|
||||||
@ -255,7 +322,7 @@ workflow-name:
|
|||||||
|
|
||||||
workflow-state:
|
workflow-state:
|
||||||
description: |
|
description: |
|
||||||
Maintenance workflow state.
|
Maintenance workflow state (States explained in the user guide)
|
||||||
in: body
|
in: body
|
||||||
required: true
|
required: true
|
||||||
type: string
|
type: string
|
||||||
|
@ -1,8 +1,8 @@
|
|||||||
.. -*- rst -*-
|
.. -*- rst -*-
|
||||||
|
|
||||||
=======
|
============================
|
||||||
Project
|
Project workflow session API
|
||||||
=======
|
============================
|
||||||
|
|
||||||
These APIs are generic for any cloud as instance ID should be something that can
|
These APIs are generic for any cloud as instance ID should be something that can
|
||||||
be matched to virtual machines or containers regardless of the cloud underneath.
|
be matched to virtual machines or containers regardless of the cloud underneath.
|
||||||
@ -10,7 +10,7 @@ be matched to virtual machines or containers regardless of the cloud underneath.
|
|||||||
Get project maintenance session
|
Get project maintenance session
|
||||||
===============================
|
===============================
|
||||||
|
|
||||||
.. rest_method:: GET /v1/maintenance/{session_id}/{project_id}/
|
.. rest_method:: GET /v1/maintenance/{session_id}/{project_id}
|
||||||
|
|
||||||
Get project instances belonging to the current state of maintenance session.
|
Get project instances belonging to the current state of maintenance session.
|
||||||
the Project-manager receives an AODH event alarm telling about different
|
The project manager receives an AODH event alarm telling about different
|
||||||
@ -31,7 +31,7 @@ Response codes
|
|||||||
|
|
||||||
.. rest_status_code:: success status.yaml
|
.. rest_status_code:: success status.yaml
|
||||||
|
|
||||||
- 200: get-project-maintenance-session-post
|
- 200: project-maintenance-session-post
|
||||||
|
|
||||||
.. rest_status_code:: error status.yaml
|
.. rest_status_code:: error status.yaml
|
||||||
|
|
||||||
@ -42,7 +42,7 @@ Response codes
|
|||||||
Input from project to maintenance session
|
Input from project to maintenance session
|
||||||
=========================================
|
=========================================
|
||||||
|
|
||||||
.. rest_method:: PUT /v1/maintenance/{session_id}/{project_id}/
|
.. rest_method:: PUT /v1/maintenance/{session_id}/{project_id}
|
||||||
|
|
||||||
Project having instances on top of the infrastructure handled by a maintenance
|
A project having instances on top of the infrastructure handled by a maintenance
|
||||||
session might need to make own action for its instances on top of a host going
|
session might need to make its own action for its instances on top of a host going
|
||||||
@ -78,9 +78,9 @@ Response codes
|
|||||||
- 422
|
- 422
|
||||||
- 500
|
- 500
|
||||||
|
|
||||||
============================
|
===========================
|
||||||
Project with NFV constraints
|
Project NFV constraints API
|
||||||
============================
|
===========================
|
||||||
|
|
||||||
These APIs are for VNFs, VNMF and EM that are made to support ETSI defined
|
These APIs are for VNFs, VNFM and EM that are made to support the ETSI defined
|
||||||
standard VIM interface for sophisticated interaction to optimize rolling
|
standard VIM interface for sophisticated interaction to optimize rolling
|
||||||
|
@ -0,0 +1,212 @@
|
|||||||
|
{
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instances": [
|
||||||
|
{
|
||||||
|
"instance_id": "da8f96ae-a1fe-4e6b-a852-6951d513a440",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-2",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "INSTANCE_ACTION_DONE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_nonha_app_2",
|
||||||
|
"state": "active",
|
||||||
|
"details": null,
|
||||||
|
"action": null,
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "73190018-eab0-4074-bed0-4b0c274a1c8b"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"instance_id": "22d869d7-2a67-4d70-bb3c-dcc14a014d78",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-4",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "ACK_PLANNED_MAINTENANCE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_nonha_app_3",
|
||||||
|
"state": "active",
|
||||||
|
"details": null,
|
||||||
|
"action": "MIGRATE",
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "c0930990-65ac-4bca-88cb-7cb0e7d5c420"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"instance_id": "89467f5c-d5f8-461f-8b5c-236ce54138be",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-2",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "INSTANCE_ACTION_DONE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_nonha_app_1",
|
||||||
|
"state": "active",
|
||||||
|
"details": null,
|
||||||
|
"action": null,
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "c6eba3ae-cb9e-4a1f-af10-13c66f61e4d9"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"instance_id": "5243f1a4-9f7b-4c91-abd5-533933bb9c90",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-3",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "INSTANCE_ACTION_DONE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_ha_app_0",
|
||||||
|
"state": "active",
|
||||||
|
"details": "floating_ip",
|
||||||
|
"action": null,
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "d67176ff-e2e4-45e3-9a52-c069a3a66c5e"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"instance_id": "4e2e24d7-0e5d-4a92-8edc-e343b33b9f10",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-3",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "INSTANCE_ACTION_DONE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_nonha_app_0",
|
||||||
|
"state": "active",
|
||||||
|
"details": null,
|
||||||
|
"action": null,
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "f2f7fd7f-8900-4b24-91dc-098f797790e1"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"instance_id": "92aa44f9-7ce4-4ba4-a29c-e03096ad1047",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-4",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "ACK_PLANNED_MAINTENANCE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_ha_app_1",
|
||||||
|
"state": "active",
|
||||||
|
"details": null,
|
||||||
|
"action": "MIGRATE",
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "f35c9ba5-e5f7-4843-bae5-7df9bac2a33c"
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"instance_id": "afa2cf43-6a1f-4508-ba59-12b773f8b926",
|
||||||
|
"action_done": false,
|
||||||
|
"host": "overcloud-novacompute-0",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"project_state": "ACK_PLANNED_MAINTENANCE",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"instance_name": "demo_nonha_app_4",
|
||||||
|
"state": "active",
|
||||||
|
"details": null,
|
||||||
|
"action": "MIGRATE",
|
||||||
|
"project_id": "444b05e6f4764189944f00a7288cd281",
|
||||||
|
"id": "fea38e9b-3d7c-4358-ba2e-06e9c340342d"
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"state": "PLANNED_MAINTENANCE",
|
||||||
|
"session": {
|
||||||
|
"workflow": "vnf",
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"updated_at": "2020-04-15T11:44:04.000000",
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"maintenance_at": "2020-04-15T11:43:28.000000",
|
||||||
|
"state": "PLANNED_MAINTENANCE",
|
||||||
|
"prev_state": "START_MAINTENANCE",
|
||||||
|
"meta": "{'openstack': 'upgrade'}"
|
||||||
|
},
|
||||||
|
"hosts": [
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-novacompute-3",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": false,
|
||||||
|
"maintained": true,
|
||||||
|
"details": "3de22382-5500-4d13-b9a2-470cc21002ee",
|
||||||
|
"type": "compute",
|
||||||
|
"id": "426ea4b9-4438-44ee-9849-1b3ffcc42ad6",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-novacompute-2",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": false,
|
||||||
|
"maintained": true,
|
||||||
|
"details": "91457572-dabf-4aff-aab9-e12a5c6656cd",
|
||||||
|
"type": "compute",
|
||||||
|
"id": "74f0f6d1-520a-4e5b-b69c-c3265d874b14",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-novacompute-5",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": false,
|
||||||
|
"maintained": true,
|
||||||
|
"details": "87921762-0c70-4d3e-873a-240cb2e5c0bf",
|
||||||
|
"type": "compute",
|
||||||
|
"id": "8d0f764e-11e8-4b96-8f6a-9c8fc0eebca2",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-novacompute-1",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": false,
|
||||||
|
"maintained": true,
|
||||||
|
"details": "52c7270a-cfc2-41dd-a574-f4c4c54aa78d",
|
||||||
|
"type": "compute",
|
||||||
|
"id": "be7fd08c-0c5f-4bf4-a95b-bc3b3c01d918",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-novacompute-0",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": true,
|
||||||
|
"maintained": false,
|
||||||
|
"details": "ea68bd0d-a5b6-4f06-9bff-c6eb0b248530",
|
||||||
|
"type": "compute",
|
||||||
|
"id": "ce46f423-e485-4494-8bb7-e1a2b038bb8e",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-novacompute-4",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": true,
|
||||||
|
"maintained": false,
|
||||||
|
"details": "d5271d60-db14-4011-9497-b1529486f62b",
|
||||||
|
"type": "compute",
|
||||||
|
"id": "efdf668c-b1cc-4539-bdb6-aea9afbcc897",
|
||||||
|
},
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15T11:43:09.000000",
|
||||||
|
"hostname": "overcloud-controller-0",
|
||||||
|
"updated_at": null,
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"disabled": false,
|
||||||
|
"maintained": true,
|
||||||
|
"details": "9a68c85e-42f7-4e40-b64a-2e7a9e2ccd03",
|
||||||
|
"type": "controller",
|
||||||
|
"id": "f4631941-8a51-44ee-b814-11a898729f3c",
|
||||||
|
}
|
||||||
|
],
|
||||||
|
"percent_done": 71,
|
||||||
|
"action_plugin_instances": [
|
||||||
|
{
|
||||||
|
"created_at": "2020-04-15 11:12:16",
|
||||||
|
"updated_at": null,
|
||||||
|
"id": "4e864972-b692-487b-9204-b4d6470db266",
|
||||||
|
"session_id": "47479bca-7f0e-11ea-99c9-2c600c9893ee",
|
||||||
|
"hostname": "overcloud-novacompute-4",
|
||||||
|
"plugin": "dummy",
|
||||||
|
"state": null
|
||||||
|
}
|
||||||
|
]
|
||||||
|
}
|
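The fields in this sample tie together: five of the seven hosts have
"maintained": true, which matches "percent_done": 71 (5/7 is about 71 %), and
the two still-disabled hosts are the ones whose instances are shown in
'ACK_PLANNED_MAINTENANCE' above. A quick sanity check over the sample file:

.. code-block:: python

    import json

    with open("maintenance-session-detail-get-200.json") as f:
        detail = json.load(f)

    maintained = sum(1 for host in detail["hosts"] if host["maintained"])
    # 5 of 7 hosts maintained -> 71 percent
    assert round(100.0 * maintained / len(detail["hosts"])) == detail["percent_done"]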
@ -19,28 +19,60 @@
|
|||||||
.. literalinclude:: samples/maintenance-session-put-200.json
|
.. literalinclude:: samples/maintenance-session-put-200.json
|
||||||
:language: javascript
|
:language: javascript
|
||||||
|
|
||||||
get-maintenance-sessions-get: |
|
maintenance-sessions-get: |
|
||||||
.. rest_parameters:: parameters.yaml
|
.. rest_parameters:: parameters.yaml
|
||||||
|
|
||||||
- session_id: uuid-list
|
- session_id: uuid-list
|
||||||
|
|
||||||
.. literalinclude:: samples/get-maintenance-sessions-get-200.json
|
.. literalinclude:: samples/maintenance-sessions-get-200.json
|
||||||
:language: javascript
|
:language: javascript
|
||||||
|
|
||||||
get-maintenance-session-get: |
|
maintenance-session-get: |
|
||||||
.. rest_parameters:: parameters.yaml
|
.. rest_parameters:: parameters.yaml
|
||||||
|
|
||||||
- state: workflow-state
|
- state: workflow-state
|
||||||
|
|
||||||
.. literalinclude:: samples/get-maintenance-session-get-200.json
|
.. literalinclude:: samples/maintenance-session-get-200.json
|
||||||
:language: javascript
|
:language: javascript
|
||||||
|
|
||||||
get-project-maintenance-session-post: |
|
maintenance-session-detail-get: |
|
||||||
|
.. rest_parameters:: parameters.yaml
|
||||||
|
|
||||||
|
- action: migration-type
|
||||||
|
- action_done: boolean
|
||||||
|
- created_at: datetime-string
|
||||||
|
- details: details
|
||||||
|
- disabled: boolean
|
||||||
|
- host: hostname
|
||||||
|
- hostname: hostname
|
||||||
|
- id: uuid
|
||||||
|
- instance_id: uuid
|
||||||
|
- instance_name: instance-name
|
||||||
|
- maintained: boolean
|
||||||
|
- maintenance_at: datetime-string
|
||||||
|
- meta: metadata
|
||||||
|
- percent_done: percent_done
|
||||||
|
- plugin: plugin
|
||||||
|
- prev_state: workflow-state
|
||||||
|
- project_id: uuid
|
||||||
|
- project_state: workflow-state-reply
|
||||||
|
- session_id: uuid
|
||||||
|
- state(action_plugin_instances): action-plugin-state
|
||||||
|
- state(instances): instance-state
|
||||||
|
- state: workflow-state
|
||||||
|
- type: host-type
|
||||||
|
- updated_at: datetime-string
|
||||||
|
- workflow: workflow-name
|
||||||
|
|
||||||
|
.. literalinclude:: samples/maintenance-session-detail-get-200.json
|
||||||
|
:language: javascript
|
||||||
|
|
||||||
|
project-maintenance-session-post: |
|
||||||
.. rest_parameters:: parameters.yaml
|
.. rest_parameters:: parameters.yaml
|
||||||
|
|
||||||
- instance_ids: instance-ids
|
- instance_ids: instance-ids
|
||||||
|
|
||||||
.. literalinclude:: samples/get-project-maintenance-session-post-200.json
|
.. literalinclude:: samples/project-maintenance-session-post-200.json
|
||||||
:language: javascript
|
:language: javascript
|
||||||
|
|
||||||
201:
|
201:
|
||||||
|
@ -77,12 +77,38 @@ Example:
|
|||||||
Event type 'maintenance.session'
|
Event type 'maintenance.session'
|
||||||
--------------------------------
|
--------------------------------
|
||||||
|
|
||||||
--Not yet implemented--
|
|
||||||
|
|
||||||
This event type is meant for infrastructure admin to know the changes in the
|
This event type is meant for infrastructure admin to know the changes in the
|
||||||
ongoing maintenance workflow session. When implemented, there will not be a need
|
ongoing maintenance workflow session. This can be used instead of polling the API.
|
||||||
for polling the state through an API.
|
Via the API you will get more detailed information if you need to troubleshoot.
|
||||||
|
|
||||||
|
payload
|
||||||
|
~~~~~~~~
|
||||||
|
|
||||||
|
+--------------+--------+------------------------------------------------------------------------------+
|
||||||
|
| Name | Type | Description |
|
||||||
|
+==============+========+==============================================================================+
|
||||||
|
| service | string | Origin service name: Fenix |
|
||||||
|
+--------------+--------+------------------------------------------------------------------------------+
|
||||||
|
| state | string | Maintenance workflow state (States explained in the user guide) |
|
||||||
|
+--------------+--------+------------------------------------------------------------------------------+
|
||||||
|
| session_id | string | UUID of the related maintenance session |
|
||||||
|
+--------------+--------+------------------------------------------------------------------------------+
|
||||||
|
| percent_done | string | How many percent of hosts are maintained |
|
||||||
|
+--------------+--------+------------------------------------------------------------------------------+
|
||||||
|
| project_id | string | workflow admin project ID |
|
||||||
|
+--------------+--------+------------------------------------------------------------------------------+
|
||||||
|
|
||||||
|
Example:
|
||||||
|
|
||||||
|
.. code-block:: json
|
||||||
|
|
||||||
|
{
|
||||||
|
"service": "fenix",
|
||||||
|
"state": "IN_MAINTENANCE",
|
||||||
|
"session_id": "76e55df8-1c51-11e8-9928-0242ac110002",
|
||||||
|
"percent_done": 34,
|
||||||
|
"project_id": "ead0dbcaf3564cbbb04842e3e54960e3"
|
||||||
|
}
|
||||||
|
|
||||||
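A consumer of this event (for example the handler behind an AODH event alarm)
can follow the workflow from these fields alone; the sketch below assumes the
payload has already been parsed into a dict as in the example above.

.. code-block:: python

    # Minimal sketch of tracking progress from 'maintenance.session' events.
    def on_maintenance_session(payload):
        percent = int(payload["percent_done"])
        print("session %s: %s, %d%% of hosts maintained"
              % (payload["session_id"], payload["state"], percent))
        # The workflow has ended once a terminal state is reached.
        return payload["state"] in ("MAINTENANCE_DONE", "MAINTENANCE_FAILED")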
Project
|
Project
|
||||||
=======
|
=======
|
||||||
|
@ -66,7 +66,11 @@ class V1Controller(rest.RestController):
|
|||||||
else:
|
else:
|
||||||
args[0] = 'http404-nonexistingcontroller'
|
args[0] = 'http404-nonexistingcontroller'
|
||||||
elif depth == 3 and route == "maintenance":
|
elif depth == 3 and route == "maintenance":
|
||||||
args[0] = "project"
|
last = self._routes.get(args[2], args[2])
|
||||||
|
if last == "detail":
|
||||||
|
args[0] = "session"
|
||||||
|
else:
|
||||||
|
args[0] = "project"
|
||||||
elif depth == 4 and route == "maintenance":
|
elif depth == 4 and route == "maintenance":
|
||||||
args[0] = "project_instance"
|
args[0] = "project_instance"
|
||||||
else:
|
else:
|
||||||
|
@ -160,9 +160,10 @@ class SessionController(BaseController):
|
|||||||
self.engine_rpcapi = maintenance.EngineRPCAPI()
|
self.engine_rpcapi = maintenance.EngineRPCAPI()
|
||||||
|
|
||||||
# GET /v1/maintenance/<session_id>
|
# GET /v1/maintenance/<session_id>
|
||||||
|
# GET /v1/maintenance/<session_id>/detail
|
||||||
@policy.authorize('maintenance:session', 'get')
|
@policy.authorize('maintenance:session', 'get')
|
||||||
@expose(content_type='application/json')
|
@expose(content_type='application/json')
|
||||||
def get(self, session_id):
|
def get(self, session_id, detail=None):
|
||||||
try:
|
try:
|
||||||
jsonschema.validate(session_id, schema.uid)
|
jsonschema.validate(session_id, schema.uid)
|
||||||
except jsonschema.exceptions.ValidationError as e:
|
except jsonschema.exceptions.ValidationError as e:
|
||||||
@ -173,7 +174,15 @@ class SessionController(BaseController):
|
|||||||
LOG.error("Unexpected data")
|
LOG.error("Unexpected data")
|
||||||
abort(400)
|
abort(400)
|
||||||
try:
|
try:
|
||||||
session = self.engine_rpcapi.admin_get_session(session_id)
|
if detail:
|
||||||
|
if detail != "detail":
|
||||||
|
description = "Invalid path %s" % detail
|
||||||
|
LOG.error(description)
|
||||||
|
abort(400, six.text_type(description))
|
||||||
|
session = (
|
||||||
|
self.engine_rpcapi.admin_get_session_detail(session_id))
|
||||||
|
else:
|
||||||
|
session = self.engine_rpcapi.admin_get_session(session_id)
|
||||||
except RemoteError as e:
|
except RemoteError as e:
|
||||||
self.handle_remote_error(e)
|
self.handle_remote_error(e)
|
||||||
if session is None:
|
if session is None:
|
||||||
|
@ -37,9 +37,13 @@ class EngineRPCAPI(service.RPCClient):
|
|||||||
return self.call('admin_create_session', data=data)
|
return self.call('admin_create_session', data=data)
|
||||||
|
|
||||||
def admin_get_session(self, session_id):
|
def admin_get_session(self, session_id):
|
||||||
"""Get maintenance workflow session details"""
|
"""Get maintenance workflow session state"""
|
||||||
return self.call('admin_get_session', session_id=session_id)
|
return self.call('admin_get_session', session_id=session_id)
|
||||||
|
|
||||||
|
def admin_get_session_detail(self, session_id):
|
||||||
|
"""Get maintenance workflow session details"""
|
||||||
|
return self.call('admin_get_session_detail', session_id=session_id)
|
||||||
|
|
||||||
def admin_delete_session(self, session_id):
|
def admin_delete_session(self, session_id):
|
||||||
"""Delete maintenance workflow session thread"""
|
"""Delete maintenance workflow session thread"""
|
||||||
return self.call('admin_delete_session', session_id=session_id)
|
return self.call('admin_delete_session', session_id=session_id)
|
||||||
|
@ -115,11 +115,23 @@ def create_session(values):
|
|||||||
return IMPL.create_session(values)
|
return IMPL.create_session(values)
|
||||||
|
|
||||||
|
|
||||||
|
def update_session(values):
|
||||||
|
return IMPL.update_session(values)
|
||||||
|
|
||||||
|
|
||||||
def remove_session(session_id):
|
def remove_session(session_id):
|
||||||
"""Remove a session from the tables."""
|
"""Remove a session from the tables."""
|
||||||
return IMPL.remove_session(session_id)
|
return IMPL.remove_session(session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def get_session(session_id):
|
||||||
|
return IMPL.maintenance_session_get(session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def get_sessions():
|
||||||
|
return IMPL.maintenance_session_get_all()
|
||||||
|
|
||||||
|
|
||||||
def create_action_plugin(values):
|
def create_action_plugin(values):
|
||||||
"""Create a action from the values."""
|
"""Create a action from the values."""
|
||||||
return IMPL.create_action_plugin(values)
|
return IMPL.create_action_plugin(values)
|
||||||
@ -129,10 +141,22 @@ def create_action_plugins(session_id, action_dict_list):
|
|||||||
return IMPL.create_action_plugins(action_dict_list)
|
return IMPL.create_action_plugins(action_dict_list)
|
||||||
|
|
||||||
|
|
||||||
|
def get_action_plugins(session_id):
|
||||||
|
return IMPL.action_plugins_get_all(session_id)
|
||||||
|
|
||||||
|
|
||||||
def create_action_plugin_instance(values):
|
def create_action_plugin_instance(values):
|
||||||
return IMPL.create_action_plugin_instance(values)
|
return IMPL.create_action_plugin_instance(values)
|
||||||
|
|
||||||
|
|
||||||
|
def get_action_plugin_instances(session_id):
|
||||||
|
return IMPL.action_plugin_instances_get_all(session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def update_action_plugin_instance(values):
|
||||||
|
return IMPL.update_action_plugin_instance(values)
|
||||||
|
|
||||||
|
|
||||||
def remove_action_plugin_instance(ap_instance):
|
def remove_action_plugin_instance(ap_instance):
|
||||||
return IMPL.remove_action_plugin_instance(ap_instance)
|
return IMPL.remove_action_plugin_instance(ap_instance)
|
||||||
|
|
||||||
@ -141,11 +165,19 @@ def create_downloads(download_dict_list):
|
|||||||
return IMPL.create_downloads(download_dict_list)
|
return IMPL.create_downloads(download_dict_list)
|
||||||
|
|
||||||
|
|
||||||
|
def get_downloads(session_id):
|
||||||
|
return IMPL.download_get_all(session_id)
|
||||||
|
|
||||||
|
|
||||||
def create_host(values):
|
def create_host(values):
|
||||||
"""Create a host from the values."""
|
"""Create a host from the values."""
|
||||||
return IMPL.create_host(values)
|
return IMPL.create_host(values)
|
||||||
|
|
||||||
|
|
||||||
|
def update_host(values):
|
||||||
|
return IMPL.update_host(values)
|
||||||
|
|
||||||
|
|
||||||
def create_hosts(session_id, hostnames):
|
def create_hosts(session_id, hostnames):
|
||||||
hosts = []
|
hosts = []
|
||||||
for hostname in hostnames:
|
for hostname in hostnames:
|
||||||
@ -174,6 +206,10 @@ def create_hosts_by_details(session_id, hosts_dict_list):
|
|||||||
return IMPL.create_hosts(hosts)
|
return IMPL.create_hosts(hosts)
|
||||||
|
|
||||||
|
|
||||||
|
def get_hosts(session_id):
|
||||||
|
return IMPL.hosts_get(session_id)
|
||||||
|
|
||||||
|
|
||||||
def create_projects(session_id, project_ids):
|
def create_projects(session_id, project_ids):
|
||||||
projects = []
|
projects = []
|
||||||
for project_id in project_ids:
|
for project_id in project_ids:
|
||||||
@ -185,6 +221,18 @@ def create_projects(session_id, project_ids):
|
|||||||
return IMPL.create_projects(projects)
|
return IMPL.create_projects(projects)
|
||||||
|
|
||||||
|
|
||||||
|
def update_project(values):
|
||||||
|
return IMPL.update_project(values)
|
||||||
|
|
||||||
|
|
||||||
|
def get_projects(session_id):
|
||||||
|
return IMPL.projects_get(session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def update_instance(values):
|
||||||
|
return IMPL.update_instance(values)
|
||||||
|
|
||||||
|
|
||||||
def create_instance(values):
|
def create_instance(values):
|
||||||
"""Create a instance from the values."""
|
"""Create a instance from the values."""
|
||||||
return IMPL.create_instance(values)
|
return IMPL.create_instance(values)
|
||||||
@ -199,6 +247,10 @@ def remove_instance(session_id, instance_id):
|
|||||||
return IMPL.remove_instance(session_id, instance_id)
|
return IMPL.remove_instance(session_id, instance_id)
|
||||||
|
|
||||||
|
|
||||||
|
def get_instances(session_id):
|
||||||
|
return IMPL.instances_get(session_id)
|
||||||
|
|
||||||
|
|
||||||
def update_project_instance(values):
|
def update_project_instance(values):
|
||||||
return IMPL.update_project_instance(values)
|
return IMPL.update_project_instance(values)
|
||||||
|
|
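Taken together these additions let a workflow persist everything it needs and
pick an 'ongoing' session up again after a Fenix restart. A rough sketch of
that usage (only the function names come from the API above; the import path
and surrounding logic are illustrative assumptions):

```python
from fenix.db import api as db_api   # assumed import path

def resume_session(session_id):
    """Reload the persisted state of an ongoing maintenance session."""
    session = db_api.get_session(session_id)        # session row incl. state/prev_state
    hosts = db_api.get_hosts(session_id)            # per-host maintained/disabled flags
    instances = db_api.get_instances(session_id)    # per-instance state
    projects = db_api.get_projects(session_id)
    action_plugins = db_api.get_action_plugins(session_id)
    # The workflow can now continue from the persisted state instead of starting over.
    return session, hosts, instances, projects, action_plugins
```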
||||||
|
@ -58,8 +58,6 @@ def upgrade():
|
|||||||
sa.Column('maintained', sa.Boolean, default=False),
|
sa.Column('maintained', sa.Boolean, default=False),
|
||||||
sa.Column('disabled', sa.Boolean, default=False),
|
sa.Column('disabled', sa.Boolean, default=False),
|
||||||
sa.Column('details', sa.String(length=255), nullable=True),
|
sa.Column('details', sa.String(length=255), nullable=True),
|
||||||
sa.Column('plugin', sa.String(length=255), nullable=True),
|
|
||||||
sa.Column('plugin_state', sa.String(length=32), nullable=True),
|
|
||||||
sa.UniqueConstraint('session_id', 'hostname', name='_session_host_uc'),
|
sa.UniqueConstraint('session_id', 'hostname', name='_session_host_uc'),
|
||||||
sa.PrimaryKeyConstraint('id'))
|
sa.PrimaryKeyConstraint('id'))
|
||||||
|
|
||||||
|
@ -135,6 +135,15 @@ def maintenance_session_get(session_id):
|
|||||||
return _maintenance_session_get(get_session(), session_id)
|
return _maintenance_session_get(get_session(), session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def _maintenance_session_get_all(session):
|
||||||
|
query = model_query(models.MaintenanceSession, session)
|
||||||
|
return query
|
||||||
|
|
||||||
|
|
||||||
|
def maintenance_session_get_all():
|
||||||
|
return _maintenance_session_get_all(get_session())
|
||||||
|
|
||||||
|
|
||||||
def create_session(values):
|
def create_session(values):
|
||||||
values = values.copy()
|
values = values.copy()
|
||||||
msession = models.MaintenanceSession()
|
msession = models.MaintenanceSession()
|
||||||
@ -152,6 +161,18 @@ def create_session(values):
|
|||||||
return maintenance_session_get(msession.session_id)
|
return maintenance_session_get(msession.session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def update_session(values):
|
||||||
|
session = get_session()
|
||||||
|
session_id = values.session_id
|
||||||
|
with session.begin():
|
||||||
|
msession = _maintenance_session_get(session,
|
||||||
|
session_id)
|
||||||
|
msession.update(values)
|
||||||
|
msession.save(session=session)
|
||||||
|
|
||||||
|
return maintenance_session_get(session_id)
|
||||||
|
|
||||||
|
|
||||||
def remove_session(session_id):
|
def remove_session(session_id):
|
||||||
session = get_session()
|
session = get_session()
|
||||||
with session.begin():
|
with session.begin():
|
||||||
@ -276,6 +297,22 @@ def action_plugin_instances_get_all(session_id):
|
|||||||
return _action_plugin_instances_get_all(get_session(), session_id)
|
return _action_plugin_instances_get_all(get_session(), session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def update_action_plugin_instance(values):
|
||||||
|
session = get_session()
|
||||||
|
session_id = values.session_id
|
||||||
|
plugin = values.plugin
|
||||||
|
hostname = values.hostname
|
||||||
|
with session.begin():
|
||||||
|
ap_instance = _action_plugin_instance_get(session,
|
||||||
|
session_id,
|
||||||
|
plugin,
|
||||||
|
hostname)
|
||||||
|
ap_instance.update(values)
|
||||||
|
ap_instance.save(session=session)
|
||||||
|
|
||||||
|
return action_plugin_instance_get(session_id, plugin, hostname)
|
||||||
|
|
||||||
|
|
||||||
def create_action_plugin_instance(values):
|
def create_action_plugin_instance(values):
|
||||||
values = values.copy()
|
values = values.copy()
|
||||||
ap_instance = models.MaintenanceActionPluginInstance()
|
ap_instance = models.MaintenanceActionPluginInstance()
|
||||||
@ -402,6 +439,18 @@ def create_host(values):
|
|||||||
return host_get(mhost.session_id, mhost.hostname)
|
return host_get(mhost.session_id, mhost.hostname)
|
||||||
|
|
||||||
|
|
||||||
|
def update_host(values):
|
||||||
|
session = get_session()
|
||||||
|
session_id = values.session_id
|
||||||
|
hostname = values.hostname
|
||||||
|
with session.begin():
|
||||||
|
mhost = _host_get(session, session_id, hostname)
|
||||||
|
mhost.update(values)
|
||||||
|
mhost.save(session=session)
|
||||||
|
|
||||||
|
return host_get(session_id, hostname)
|
||||||
|
|
||||||
|
|
||||||
def create_hosts(values_list):
|
def create_hosts(values_list):
|
||||||
for values in values_list:
|
for values in values_list:
|
||||||
vals = values.copy()
|
vals = values.copy()
|
||||||
@ -468,6 +517,18 @@ def create_project(values):
|
|||||||
return project_get(mproject.session_id, mproject.project_id)
|
return project_get(mproject.session_id, mproject.project_id)
|
||||||
|
|
||||||
|
|
||||||
|
def update_project(values):
|
||||||
|
session = get_session()
|
||||||
|
session_id = values.session_id
|
||||||
|
project_id = values.project_id
|
||||||
|
with session.begin():
|
||||||
|
mproject = _project_get(session, session_id, project_id)
|
||||||
|
mproject.update(values)
|
||||||
|
mproject.save(session=session)
|
||||||
|
|
||||||
|
return project_get(session_id, project_id)
|
||||||
|
|
||||||
|
|
||||||
def create_projects(values_list):
|
def create_projects(values_list):
|
||||||
for values in values_list:
|
for values in values_list:
|
||||||
vals = values.copy()
|
vals = values.copy()
|
||||||
@ -476,7 +537,7 @@ def create_projects(values_list):
|
|||||||
mproject = models.MaintenanceProject()
|
mproject = models.MaintenanceProject()
|
||||||
mproject.update(vals)
|
mproject.update(vals)
|
||||||
if _project_get(session, mproject.session_id,
|
if _project_get(session, mproject.session_id,
|
||||||
mproject.project_id):
|
mproject.project_id):
|
||||||
selected = ['project_id']
|
selected = ['project_id']
|
||||||
raise db_exc.FenixDBDuplicateEntry(
|
raise db_exc.FenixDBDuplicateEntry(
|
||||||
model=mproject.__class__.__name__,
|
model=mproject.__class__.__name__,
|
||||||
@ -512,6 +573,18 @@ def instances_get(session_id):
|
|||||||
return _instances_get(get_session(), session_id)
|
return _instances_get(get_session(), session_id)
|
||||||
|
|
||||||
|
|
||||||
|
def update_instance(values):
|
||||||
|
session = get_session()
|
||||||
|
session_id = values.session_id
|
||||||
|
instance_id = values.instance_id
|
||||||
|
with session.begin():
|
||||||
|
minstance = _instance_get(session, session_id, instance_id)
|
||||||
|
minstance.update(values)
|
||||||
|
minstance.save(session=session)
|
||||||
|
|
||||||
|
return instance_get(session_id, instance_id)
|
||||||
|
|
||||||
|
|
||||||
def create_instance(values):
|
def create_instance(values):
|
||||||
values = values.copy()
|
values = values.copy()
|
||||||
minstance = models.MaintenanceInstance()
|
minstance = models.MaintenanceInstance()
|
||||||
|
@ -99,8 +99,6 @@ class MaintenanceHost(mb.FenixBase):
|
|||||||
maintained = sa.Column(sa.Boolean, default=False)
|
maintained = sa.Column(sa.Boolean, default=False)
|
||||||
disabled = sa.Column(sa.Boolean, default=False)
|
disabled = sa.Column(sa.Boolean, default=False)
|
||||||
details = sa.Column(sa.String(length=255), nullable=True)
|
details = sa.Column(sa.String(length=255), nullable=True)
|
||||||
plugin = sa.Column(sa.String(length=255), nullable=True)
|
|
||||||
plugin_state = sa.Column(sa.String(length=32), nullable=True)
|
|
||||||
|
|
||||||
def to_dict(self):
|
def to_dict(self):
|
||||||
return super(MaintenanceHost, self).to_dict()
|
return super(MaintenanceHost, self).to_dict()
|
||||||
|
@ -117,9 +117,7 @@ def _get_fake_host_values(uuid=_get_fake_uuid(),
|
|||||||
'type': 'compute',
|
'type': 'compute',
|
||||||
'maintained': False,
|
'maintained': False,
|
||||||
'disabled': False,
|
'disabled': False,
|
||||||
'details': None,
|
'details': None}
|
||||||
'plugin': None,
|
|
||||||
'plugin_state': None}
|
|
||||||
return hdict
|
return hdict
|
||||||
|
|
||||||
|
|
||||||
|
@ -10,7 +10,18 @@ Files:
|
|||||||
|
|
||||||
- 'demo-ha.yaml': demo-ha ReplicaSet to make 2 anti-affinity PODS.
|
- 'demo-ha.yaml': demo-ha ReplicaSet to make 2 anti-affinity PODS.
|
||||||
- 'demo-nonha.yaml': demo-nonha ReplicaSet to make n nonha PODS.
|
- 'demo-nonha.yaml': demo-nonha ReplicaSet to make n nonha PODS.
|
||||||
- 'vnfm.py': VNFM to test k8s.py workflow.
|
- 'vnfm_k8s.py': VNFM to test k8s.py (Kubernetes example) workflow.
|
||||||
|
- 'vnfm.py': VNFM to test nfv.py (OpenStack example) workflow.
|
||||||
|
- 'infra_admin.py': Tool to act as the infrastructure admin. The tool also catches
|
||||||
|
the 'maintenance.session' and 'maintenance.host' events to keep track of
|
||||||
|
where the maintenance is going. You will see when a certain host is maintained
|
||||||
|
and what percentage of hosts has been maintained.
|
||||||
|
- 'session.json': Example to define maintenance session parameters as JSON
|
||||||
|
file to be given as input to 'infra_admin.py'. The example is for the nfv.py workflow.
|
||||||
|
This could be used for any advanced workflow testing, providing software downloads
|
||||||
|
and real action plugins.
|
||||||
|
- 'set_config.py': You can use this to set Fenix AODH/Ceilometer configuration.
|
||||||
|
- 'fenix_db_reset': Flush the Fenix database.
|
||||||
|
|
||||||
## Kubernetes workflow (k8s.py)
|
## Kubernetes workflow (k8s.py)
|
||||||
|
|
||||||
@ -92,7 +103,7 @@ kluster. Under here is what you can run in different terminals. Terminals
|
|||||||
should be running in master node. Here is short description:
|
should be running in master node. Here is short description:
|
||||||
|
|
||||||
- Term1: Used for logging Fenix
|
- Term1: Used for logging Fenix
|
||||||
- Term2: Infrastructure admin commands
|
- Term2: Infrastructure admin
|
||||||
- Term3: VNFM logging for testing and setting up the VNF
|
- Term3: VNFM logging for testing and setting up the VNF
|
||||||
|
|
||||||
#### Term1: Fenix-engine logging
|
#### Term1: Fenix-engine logging
|
||||||
@ -114,6 +125,8 @@ Debugging and other configuration changes to '.conf' files under '/etc/fenix'
|
|||||||
|
|
||||||
#### Term2: Infrastructure admin window
|
#### Term2: Infrastructure admin window
|
||||||
|
|
||||||
|
##### Admin commands as command line and curl
|
||||||
|
|
||||||
Use DevStack admin as user. Set your variables needed accordingly
|
Use DevStack admin as user. Set your variables needed accordingly
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
@ -148,12 +161,42 @@ If maintenance run till the end with 'MAINTENANCE_DONE', you are ready to run it
|
|||||||
again if you wish. 'MAINTENANCE_FAILED' or in case of exceptions, you should
|
again if you wish. 'MAINTENANCE_FAILED' or in case of exceptions, you should
|
||||||
recover system before trying to test again. This is covered in Term3 below.
|
recover system before trying to test again. This is covered in Term3 below.
|
||||||
|
|
||||||
#### Term3: VNFM (fenix/tools/vnfm.py)
|
##### Admin commands using admin tool
|
||||||
|
|
||||||
Use DevStack admin as user.
|
Go to Fenix tools directory
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
. ~/devstack/operc admin admin
|
cd /opt/stack/fenix/fenix/tools
|
||||||
|
```
|
||||||
|
Call the admin tool and it will run the maintenance workflow. The admin tool defaults
|
||||||
|
to 'OpenStack' and the 'nfv' workflow, so you can override those by exporting
|
||||||
|
environment variables
|
||||||
|
|
||||||
|
```sh
|
||||||
|
. ~/devstack/openrc admin admin
|
||||||
|
export WORKFLOW=k8s
|
||||||
|
export CLOUD_TYPE=k8s
|
||||||
|
python infra_admin.py
|
||||||
|
```
|
||||||
|
|
||||||
|
If you want to choose the parameters for the maintenance workflow session freely,
|
||||||
|
you can give a session.json file as input. With this option infra_admin.py
|
||||||
|
will only override 'maintenance_at' to be 20 seconds in the future when
|
||||||
|
Fenix is called.
|
||||||
|
|
||||||
|
```sh
|
||||||
|
python infra_admin.py --file session.json
|
||||||
|
```
|
||||||
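For reference, the 'maintenance_at' override described above amounts to roughly
the following (a sketch, not the actual infra_admin.py code; the exact timestamp
format Fenix expects is an assumption here):

```python
import datetime
import json

# Load the user-supplied session parameters and push 'maintenance_at'
# 20 seconds into the future before the request is sent to Fenix.
with open("session.json") as f:
    session = json.load(f)

session["maintenance_at"] = (
    datetime.datetime.utcnow() + datetime.timedelta(seconds=20)
).strftime("%Y-%m-%d %H:%M:%S")

print(json.dumps(session, indent=2))  # this body is then POSTed to /v1/maintenance
```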
|
|
||||||
|
Maintenance will start when you press enter; just follow the instructions on the
|
||||||
|
console.
|
||||||
|
|
||||||
|
#### Term3: VNFM (fenix/tools/vnfm_k8s.py)
|
||||||
|
|
||||||
|
Use DevStack as demo user for testing demo application
|
||||||
|
|
||||||
|
```sh
|
||||||
|
. ~/devstack/openrc demo demo
|
||||||
```
|
```
|
||||||
|
|
||||||
Go to Fenix Kubernetes tool directory for testing
|
Go to Fenix Kubernetes tool directory for testing
|
||||||
@ -181,7 +224,7 @@ is 32 cpus, so value is "15" in both yaml files. Replicas can be changed in
|
|||||||
demo-nonha.yaml. Minimum 2 (if minimum of 3 worker nodes) to maximum
|
demo-nonha.yaml. Minimum 2 (if minimum of 3 worker nodes) to maximum
|
||||||
'(amount_of_worker_nodes-1)*2'. Greater amount means more scaling needed and
|
'(amount_of_worker_nodes-1)*2'. Greater amount means more scaling needed and
|
||||||
longer maintenance window as less parallel actions possible. Surely constraints
|
longer maintenance window as less parallel actions possible. Surely constraints
|
||||||
in vnfm.py also can be changed for different behavior.
|
in vnfm_k8s.py also can be changed for different behavior.
|
||||||
|
|
||||||
You can delete pods used like this
|
You can delete pods used like this
|
||||||
|
|
||||||
@ -192,11 +235,11 @@ kubectl delete replicaset.apps demo-ha demo-nonha --namespace=demo
|
|||||||
Start Kubernetes VNFM that we need for testing
|
Start Kubernetes VNFM that we need for testing
|
||||||
|
|
||||||
```sh
|
```sh
|
||||||
python vnfm.py
|
python vnfm_k8s.py
|
||||||
```
|
```
|
||||||
|
|
||||||
Now you can start maintenance session in Term2. When workflow failed or
|
Now you can start maintenance session in Term2. When workflow failed or
|
||||||
completed; you first kill vnfm.py with "ctrl+c" and delete maintenance session
|
completed; you first kill vnfm_k8s.py with "ctrl+c" and delete maintenance session
|
||||||
in Term2.
|
in Term2.
|
||||||
|
|
||||||
If workflow failed something might need to be manually fixed. Here you
|
If workflow failed something might need to be manually fixed. Here you
|
||||||
@ -221,7 +264,8 @@ kubectl delete replicaset.apps demo-ha demo-nonha --namespace=demo;sleep 15;kube
|
|||||||
|
|
||||||
## OpenStack workflows (default.py and nvf.py)
|
## OpenStack workflows (default.py and nvf.py)
|
||||||
|
|
||||||
OpenStack workflows can be tested by using OPNFV Doctor project for testing.
|
OpenStack workflows can be tested by using the OPNFV Doctor project
|
||||||
|
or by using Fenix's own tools.
|
||||||
Workflows:
|
Workflows:
|
||||||
|
|
||||||
- default.py is the first example workflow with VNFM interaction.
|
- default.py is the first example workflow with VNFM interaction.
|
||||||
@ -290,7 +334,7 @@ cpu_allocation_ratio = 1.0
|
|||||||
allow_resize_to_same_host = False
|
allow_resize_to_same_host = False
|
||||||
```
|
```
|
||||||
|
|
||||||
### Workflow default.py
|
### Workflow default.py testing with Doctor
|
||||||
|
|
||||||
On controller node clone Doctor to be able to test. Doctor currently requires
|
On controller node clone Doctor to be able to test. Doctor currently requires
|
||||||
Python 3.6:
|
Python 3.6:
|
||||||
@ -331,13 +375,13 @@ sudo systemctl restart devstack@fenix*

You can also make changes to Doctor before running the Doctor test

### Workflow vnf.py
### Workflow vnf.py testing with Doctor

This workflow differs from the above as it expects ETSI FEAT03 constraints.
In Doctor testing it means we also need to use a different application manager (VNFM)

Where the default.py workflow used the sample.py application manager, the vnf.py
workflow uses the vnfm.py application manager (doctor/doctor_tests/app_manager/vnfm.py)
workflow uses the vnfm_k8s.py application manager (doctor/doctor_tests/app_manager/vnfm_k8s.py)

Only change to testing is that you should export a variable to use the different
application manager.
@ -354,3 +398,115 @@ export APP_MANAGER_TYPE=sample

```
Doctor modifies the message where it calls maintenance accordingly to use
either 'default' or 'nfv' as the workflow on the Fenix side

### Workflow vnf.py testing with Fenix

Where Doctor is made to automate everything as a test case, Fenix provides
different tools for the admin and the VNFM:

- 'vnfm.py': VNFM to test nfv.py.
- 'infra_admin.py': Tool to act as infrastructure admin.

Use 3 terminal windows (Term1, Term2 and Term3) for testing. Below is what you
can run in the different terminals; they should be running on the controller
node. Here is a short description:

- Term1: Used for logging Fenix
- Term2: Infrastructure admin
- Term3: VNFM logging for testing and setting up the VNF

#### Term1: Fenix-engine logging

If you make any changes to Fenix, make them under '/opt/stack/fenix'; restart
Fenix and see the logs

```sh
sudo systemctl restart devstack@fenix*;sudo journalctl -f --unit devstack@fenix-engine
```

API logs can also be seen

```sh
sudo journalctl -f --unit devstack@fenix-api
```

Debugging and other configuration changes are done in the '.conf' files under '/etc/fenix'
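To get more verbose logs you can turn on debug logging in '/etc/fenix/fenix.conf'
and restart Fenix. A minimal sketch, assuming the standard oslo.config 'debug'
option under '[DEFAULT]' is honored and that 'crudini' is available; edit the
file by hand if it is not:

```sh
# assumption: [DEFAULT]/debug is honored and crudini is installed
sudo crudini --set /etc/fenix/fenix.conf DEFAULT debug True
sudo systemctl restart devstack@fenix*
```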
#### Term2: Infrastructure admin window

Go to Fenix tools directory for testing

```sh
cd /opt/stack/fenix/fenix/tools
```

Make a flavor for testing that takes half of the amount of VCPUs on a single
compute node (here we have 48 VCPUs on each compute). This is required by
the current example 'vnfm.py' and the VNF 'maintenance_hot_tpl.yaml' that
is used in testing. The 'vnf.py' workflow is not bound to these in any way, but
can be used with different VNFs and VNFMs.

```sh
openstack flavor create --ram 512 --vcpus 24 --disk 1 --public demo_maint_flavor
```

Call the admin tool and it will run the nfv.py workflow.

```sh
. ~/devstack/openrc admin admin
python infra_admin.py
```

If you want to choose the parameters of the maintenance workflow session freely,
you can give a 'session.json' file as input. With this option 'infra_admin.py'
will only override the 'maintenance_at' to be 20 seconds in the future when
Fenix is called.

```sh
python infra_admin.py --file session.json
```

Maintenance will start by pressing enter, just follow the instructions on the
console.

In case you failed to remove the maintenance workflow session, you can do it
manually as instructed above in 'Admin commands as command line and curl'.
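As a rough sketch of that manual cleanup (the endpoint and port are placeholders;
the real maintenance endpoint comes from the Keystone service catalog):

```sh
# hypothetical endpoint and session id; adjust to your deployment
TOKEN=$(openstack token issue -f value -c id)
curl -X DELETE -H "X-Auth-Token: $TOKEN" \
     http://127.0.0.1:12347/v1/maintenance/<session_id>
```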
#### Term3: VNFM (fenix/tools/vnfm.py)

Use DevStack as demo user for testing the demo application

```sh
. ~/devstack/openrc demo demo
```

Go to Fenix tools directory for testing

```sh
cd /opt/stack/fenix/fenix/tools
```

Start the VNFM that we need for testing

```sh
python vnfm.py
```

Now you can start the maintenance session in Term2. When the workflow has failed or
completed, first kill vnfm.py with "ctrl+c" and then delete the maintenance
session in Term2.

If the workflow failed, something might need to be manually fixed.
Here you can remove the heat stack if vnfm.py failed to do that:

```sh
openstack stack delete -y --wait demo_stack
```

It may also be that the workflow failed somewhere in the middle and some
'nova-compute' services are disabled. You can enable those. Here you can see the
states:

```sh
openstack compute service list
```
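If a compute service was left disabled by a failed workflow, it can be re-enabled
with the standard Nova CLI (the host name below is just an example):

```sh
# re-enable a compute service that a failed workflow left disabled
openstack compute service set --enable <compute-host> nova-compute
```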
9 fenix/tools/fenix_db_reset Normal file
@ -0,0 +1,9 @@

MYSQLPW=admin
# Fenix DB
[ `mysql -uroot -p$MYSQLPW -e "SELECT host, user FROM mysql.user;" | grep fenix | wc -l` -eq 0 ] && {
    mysql -uroot -p$MYSQLPW -hlocalhost -e "CREATE USER 'fenix'@'localhost' IDENTIFIED BY 'fenix';"
    mysql -uroot -p$MYSQLPW -hlocalhost -e "GRANT ALL PRIVILEGES ON fenix.* TO 'fenix'@'' identified by 'fenix';FLUSH PRIVILEGES;"
}
mysql -ufenix -pfenix -hlocalhost -e "DROP DATABASE IF EXISTS fenix;"
mysql -ufenix -pfenix -hlocalhost -e "CREATE DATABASE fenix CHARACTER SET utf8;"
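A minimal usage sketch; restarting the Fenix services afterwards is an assumption,
so that the tables are recreated against the freshly created database:

```sh
# flush the Fenix database and restart the services
bash fenix_db_reset
sudo systemctl restart devstack@fenix*
```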
320 fenix/tools/infra_admin.py Normal file
@ -0,0 +1,320 @@

# Copyright (c) 2020 Nokia Corporation.
|
||||||
|
# All Rights Reserved.
|
||||||
|
#
|
||||||
|
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||||
|
# not use this file except in compliance with the License. You may obtain
|
||||||
|
# a copy of the License at
|
||||||
|
#
|
||||||
|
# http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
#
|
||||||
|
# Unless required by applicable law or agreed to in writing, software
|
||||||
|
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||||
|
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||||
|
# License for the specific language governing permissions and limitations
|
||||||
|
# under the License.
|
||||||
|
import aodhclient.client as aodhclient
|
||||||
|
import argparse
|
||||||
|
import datetime
|
||||||
|
from flask import Flask
|
||||||
|
from flask import request
|
||||||
|
import json
|
||||||
|
from keystoneauth1 import loading
|
||||||
|
from keystoneclient import client as ks_client
|
||||||
|
import logging as lging
|
||||||
|
import os
|
||||||
|
from oslo_config import cfg
|
||||||
|
from oslo_log import log as logging
|
||||||
|
import requests
|
||||||
|
import sys
|
||||||
|
from threading import Thread
|
||||||
|
import time
|
||||||
|
import yaml
|
||||||
|
|
||||||
|
try:
|
||||||
|
import fenix.utils.identity_auth as identity_auth
|
||||||
|
except ValueError:
|
||||||
|
sys.path.append('../utils')
|
||||||
|
import identity_auth
|
||||||
|
|
||||||
|
try:
|
||||||
|
input = raw_input
|
||||||
|
except NameError:
|
||||||
|
pass
|
||||||
|
|
||||||
|
LOG = logging.getLogger(__name__)
|
||||||
|
streamlog = lging.StreamHandler(sys.stdout)
|
||||||
|
formatter = lging.Formatter("%(asctime)s: %(message)s")
|
||||||
|
streamlog.setFormatter(formatter)
|
||||||
|
LOG.logger.addHandler(streamlog)
|
||||||
|
LOG.logger.setLevel(logging.INFO)
|
||||||
|
|
||||||
|
|
||||||
|
def get_identity_auth(conf, project=None, username=None, password=None):
|
||||||
|
loader = loading.get_plugin_loader('password')
|
||||||
|
return loader.load_from_options(
|
||||||
|
auth_url=conf.service_user.os_auth_url,
|
||||||
|
username=(username or conf.service_user.os_username),
|
||||||
|
password=(password or conf.service_user.os_password),
|
||||||
|
user_domain_name=conf.service_user.os_user_domain_name,
|
||||||
|
project_name=(project or conf.service_user.os_project_name),
|
||||||
|
tenant_name=(project or conf.service_user.os_project_name),
|
||||||
|
project_domain_name=conf.service_user.os_project_domain_name)
|
||||||
|
|
||||||
|
|
||||||
|
class InfraAdmin(object):
|
||||||
|
|
||||||
|
def __init__(self, conf, log):
|
||||||
|
self.conf = conf
|
||||||
|
self.log = log
|
||||||
|
self.app = None
|
||||||
|
|
||||||
|
def start(self):
|
||||||
|
self.log.info('InfraAdmin start...')
|
||||||
|
self.app = InfraAdminManager(self.conf, self.log)
|
||||||
|
self.app.start()
|
||||||
|
|
||||||
|
def stop(self):
|
||||||
|
self.log.info('InfraAdmin stop...')
|
||||||
|
if not self.app:
|
||||||
|
return
|
||||||
|
headers = {
|
||||||
|
'Content-Type': 'application/json',
|
||||||
|
'Accept': 'application/json',
|
||||||
|
}
|
||||||
|
url = 'http://%s:%d/shutdown'\
|
||||||
|
% (self.conf.host,
|
||||||
|
self.conf.port)
|
||||||
|
requests.post(url, data='', headers=headers)
|
||||||
|
|
||||||
|
|
||||||
|
class InfraAdminManager(Thread):
|
||||||
|
|
||||||
|
def __init__(self, conf, log, project='service'):
|
||||||
|
Thread.__init__(self)
|
||||||
|
self.conf = conf
|
||||||
|
self.log = log
|
||||||
|
self.project = project
|
||||||
|
# Now we are as admin:admin:admin by default. This means we listen
|
||||||
|
# notifications/events as admin
|
||||||
|
# This means Fenix service user needs to be admin:admin:admin
|
||||||
|
# self.auth = identity_auth.get_identity_auth(conf,
|
||||||
|
# project=self.project)
|
||||||
|
self.auth = get_identity_auth(conf,
|
||||||
|
project='service',
|
||||||
|
username='fenix',
|
||||||
|
password='admin')
|
||||||
|
self.session = identity_auth.get_session(auth=self.auth)
|
||||||
|
self.keystone = ks_client.Client(version='v3', session=self.session)
|
||||||
|
self.aodh = aodhclient.Client(2, self.session)
|
||||||
|
self.headers = {
|
||||||
|
'Content-Type': 'application/json',
|
||||||
|
'Accept': 'application/json'}
|
||||||
|
self.project_id = self.keystone.projects.list(name=self.project)[0].id
|
||||||
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
|
self.create_alarm()
|
||||||
|
services = self.keystone.services.list()
|
||||||
|
for service in services:
|
||||||
|
if service.type == 'maintenance':
|
||||||
|
LOG.info('maintenance service: %s:%s type %s'
|
||||||
|
% (service.name, service.id, service.type))
|
||||||
|
maint_id = service.id
|
||||||
|
self.endpoint = [ep.url for ep in self.keystone.endpoints.list()
|
||||||
|
if ep.service_id == maint_id and
|
||||||
|
ep.interface == 'public'][0]
|
||||||
|
self.log.info('maintenance endpoint: %s' % self.endpoint)
|
||||||
|
|
||||||
|
if self.conf.workflow_file:
|
||||||
|
with open(self.conf.workflow_file) as json_file:
|
||||||
|
self.session_request = yaml.safe_load(json_file)
|
||||||
|
else:
|
||||||
|
if self.conf.cloud_type == 'openstack':
|
||||||
|
metadata = {'openstack': 'upgrade'}
|
||||||
|
elif self.conf.cloud_type in ['k8s', 'kubernetes']:
|
||||||
|
metadata = {'kubernetes': 'upgrade'}
|
||||||
|
else:
|
||||||
|
metadata = {}
|
||||||
|
self.session_request = {'state': 'MAINTENANCE',
|
||||||
|
'workflow': self.conf.workflow,
|
||||||
|
'metadata': metadata,
|
||||||
|
'actions': [
|
||||||
|
{"plugin": "dummy",
|
||||||
|
"type": "host",
|
||||||
|
"metadata": {"foo": "bar"}}]}
|
||||||
|
|
||||||
|
self.start_maintenance()
|
||||||
|
|
||||||
|
def create_alarm(self):
|
||||||
|
alarms = {alarm['name']: alarm for alarm in self.aodh.alarm.list()}
|
||||||
|
alarm_name = "%s_MAINTENANCE_SESSION" % self.project
|
||||||
|
if alarm_name not in alarms:
|
||||||
|
alarm_request = dict(
|
||||||
|
name=alarm_name,
|
||||||
|
description=alarm_name,
|
||||||
|
enabled=True,
|
||||||
|
alarm_actions=[u'http://%s:%d/maintenance_session'
|
||||||
|
% (self.conf.host,
|
||||||
|
self.conf.port)],
|
||||||
|
repeat_actions=True,
|
||||||
|
severity=u'moderate',
|
||||||
|
type=u'event',
|
||||||
|
event_rule=dict(event_type=u'maintenance.session'))
|
||||||
|
self.aodh.alarm.create(alarm_request)
|
||||||
|
alarm_name = "%s_MAINTENANCE_HOST" % self.project
|
||||||
|
if alarm_name not in alarms:
|
||||||
|
alarm_request = dict(
|
||||||
|
name=alarm_name,
|
||||||
|
description=alarm_name,
|
||||||
|
enabled=True,
|
||||||
|
alarm_actions=[u'http://%s:%d/maintenance_host'
|
||||||
|
% (self.conf.host,
|
||||||
|
self.conf.port)],
|
||||||
|
repeat_actions=True,
|
||||||
|
severity=u'moderate',
|
||||||
|
type=u'event',
|
||||||
|
event_rule=dict(event_type=u'maintenance.host'))
|
||||||
|
self.aodh.alarm.create(alarm_request)
|
||||||
|
|
||||||
|
def start_maintenance(self):
|
||||||
|
self.log.info('Waiting AODH to initialize...')
|
||||||
|
time.sleep(5)
|
||||||
|
input('--Press ENTER to start maintenance session--')
|
||||||
|
|
||||||
|
maintenance_at = (datetime.datetime.utcnow() +
|
||||||
|
datetime.timedelta(seconds=20)
|
||||||
|
).strftime('%Y-%m-%d %H:%M:%S')
|
||||||
|
|
||||||
|
self.session_request['maintenance_at'] = maintenance_at
|
||||||
|
|
||||||
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
|
url = self.endpoint + "/maintenance"
|
||||||
|
self.log.info('Start maintenance session: %s\n%s\n%s' %
|
||||||
|
(url, self.headers, self.session_request))
|
||||||
|
ret = requests.post(url, data=json.dumps(self.session_request),
|
||||||
|
headers=self.headers)
|
||||||
|
session_id = ret.json()['session_id']
|
||||||
|
self.log.info('--== Maintenance session %s instantiated ==--'
|
||||||
|
% session_id)
|
||||||
|
|
||||||
|
def _alarm_data_decoder(self, data):
|
||||||
|
if "[" in data or "{" in data:
|
||||||
|
# string to list or dict removing unicode
|
||||||
|
data = yaml.safe_load(data.replace("u'", "'"))
|
||||||
|
return data
|
||||||
|
|
||||||
|
def _alarm_traits_decoder(self, data):
|
||||||
|
return ({str(t[0]): self._alarm_data_decoder(str(t[2]))
|
||||||
|
for t in data['reason_data']['event']['traits']})
|
||||||
|
|
||||||
|
def run(self):
|
||||||
|
app = Flask('InfraAdmin')
|
||||||
|
|
||||||
|
@app.route('/maintenance_host', methods=['POST'])
|
||||||
|
def maintenance_host():
|
||||||
|
data = json.loads(request.data.decode('utf8'))
|
||||||
|
try:
|
||||||
|
payload = self._alarm_traits_decoder(data)
|
||||||
|
except Exception:
|
||||||
|
payload = ({t[0]: t[2] for t in
|
||||||
|
data['reason_data']['event']['traits']})
|
||||||
|
self.log.error('cannot parse alarm data: %s' % payload)
|
||||||
|
raise Exception('VNFM cannot parse alarm.'
|
||||||
|
'Possibly trait data over 256 char')
|
||||||
|
|
||||||
|
state = payload['state']
|
||||||
|
host = payload['host']
|
||||||
|
session_id = payload['session_id']
|
||||||
|
self.log.info("%s: Host: %s %s" % (session_id, host, state))
|
||||||
|
return 'OK'
|
||||||
|
|
||||||
|
@app.route('/maintenance_session', methods=['POST'])
|
||||||
|
def maintenance_session():
|
||||||
|
data = json.loads(request.data.decode('utf8'))
|
||||||
|
try:
|
||||||
|
payload = self._alarm_traits_decoder(data)
|
||||||
|
except Exception:
|
||||||
|
payload = ({t[0]: t[2] for t in
|
||||||
|
data['reason_data']['event']['traits']})
|
||||||
|
self.log.error('cannot parse alarm data: %s' % payload)
|
||||||
|
raise Exception('VNFM cannot parse alarm.'
|
||||||
|
'Possibly trait data over 256 char')
|
||||||
|
state = payload['state']
|
||||||
|
percent_done = payload['percent_done']
|
||||||
|
session_id = payload['session_id']
|
||||||
|
self.log.info("%s: %s%% done in state %s" % (session_id,
|
||||||
|
percent_done,
|
||||||
|
state))
|
||||||
|
if state in ['MAINTENANCE_FAILED', 'MAINTENANCE_DONE']:
|
||||||
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
|
input('--Press any key to remove %s session--' %
|
||||||
|
session_id)
|
||||||
|
self.log.info('Remove maintenance session %s....' % session_id)
|
||||||
|
|
||||||
|
url = ('%s/maintenance/%s' % (self.endpoint, session_id))
|
||||||
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
|
|
||||||
|
ret = requests.delete(url, data=None, headers=self.headers)
|
||||||
|
LOG.info('Press CTRL + C to quit')
|
||||||
|
if ret.status_code != 200:
|
||||||
|
raise Exception(ret.text)
|
||||||
|
|
||||||
|
return 'OK'
|
||||||
|
|
||||||
|
@app.route('/shutdown', methods=['POST'])
|
||||||
|
def shutdown():
|
||||||
|
self.log.info('shutdown InfraAdmin server at %s' % time.time())
|
||||||
|
func = request.environ.get('werkzeug.server.shutdown')
|
||||||
|
if func is None:
|
||||||
|
raise RuntimeError('Not running with the Werkzeug Server')
|
||||||
|
func()
|
||||||
|
return 'InfraAdmin shutting down...'
|
||||||
|
|
||||||
|
app.run(host=self.conf.host, port=self.conf.port)
|
||||||
|
|
||||||
|
if __name__ == '__main__':
|
||||||
|
parser = argparse.ArgumentParser(description='Workflow Admin tool')
|
||||||
|
|
||||||
|
parser.add_argument('--file', type=str, default=None,
|
||||||
|
help='Workflow session creation arguments file')
|
||||||
|
|
||||||
|
parser.add_argument('--host', type=str, default=None,
|
||||||
|
help='the ip of InfraAdmin')
|
||||||
|
|
||||||
|
parser.add_argument('--port', type=int, default=None,
|
||||||
|
help='the port of InfraAdmin')
|
||||||
|
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
opts = [
|
||||||
|
cfg.StrOpt('host',
|
||||||
|
default=(args.host or '127.0.0.1'),
|
||||||
|
help='the ip of InfraAdmin',
|
||||||
|
required=True),
|
||||||
|
cfg.IntOpt('port',
|
||||||
|
default=(args.port or 12349),
|
||||||
|
help='the port of InfraAdmin',
|
||||||
|
required=True),
|
||||||
|
cfg.StrOpt('workflow',
|
||||||
|
default=os.environ.get('WORKFLOW', 'vnf'),
|
||||||
|
help='Workflow to be used',
|
||||||
|
required=True),
|
||||||
|
cfg.StrOpt('cloud_type',
|
||||||
|
default=os.environ.get('CLOUD_TYPE', 'openstack'),
|
||||||
|
help='Cloud type for metadata',
|
||||||
|
required=True),
|
||||||
|
cfg.StrOpt('workflow_file',
|
||||||
|
default=(args.file or None),
|
||||||
|
help='Workflow session creation arguments file',
|
||||||
|
required=True)]
|
||||||
|
|
||||||
|
CONF = cfg.CONF
|
||||||
|
CONF.register_opts(opts)
|
||||||
|
CONF.register_opts(identity_auth.os_opts, group='service_user')
|
||||||
|
|
||||||
|
app = InfraAdmin(CONF, LOG)
|
||||||
|
app.start()
|
||||||
|
try:
|
||||||
|
LOG.info('Press CTRL + C to quit')
|
||||||
|
while True:
|
||||||
|
time.sleep(2)
|
||||||
|
except KeyboardInterrupt:
|
||||||
|
app.stop()
|
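A quick usage sketch of the tool's command line; the values are illustrative, and
WORKFLOW and CLOUD_TYPE are the environment variables read by the option defaults
above:

```sh
# run the admin tool against a local Fenix, giving an explicit callback address
. ~/devstack/openrc admin admin
WORKFLOW=vnf CLOUD_TYPE=openstack python infra_admin.py --host 127.0.0.1 --port 12349
```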
108 fenix/tools/maintenance_hot_tpl.yaml Normal file
@ -0,0 +1,108 @@

---
heat_template_version: 2017-02-24
description: Demo VNF test case

parameters:
  ext_net:
    type: string
    default: public
#  flavor_vcpus:
#    type: number
#    default: 24
  maint_image:
    type: string
    default: cirros-0.4.0-x86_64-disk
  ha_intances:
    type: number
    default: 2
  nonha_intances:
    type: number
    default: 10
  app_manager_alarm_url:
    type: string
    default: http://0.0.0.0:12348/maintenance


resources:
  int_net:
    type: OS::Neutron::Net

  int_subnet:
    type: OS::Neutron::Subnet
    properties:
      network_id: {get_resource: int_net}
      cidr: "9.9.9.0/24"
      dns_nameservers: ["8.8.8.8"]
      ip_version: 4

  int_router:
    type: OS::Neutron::Router
    properties:
      external_gateway_info: {network: {get_param: ext_net}}

  int_interface:
    type: OS::Neutron::RouterInterface
    properties:
      router_id: {get_resource: int_router}
      subnet: {get_resource: int_subnet}

#  maint_instance_flavor:
#    type: OS::Nova::Flavor
#    properties:
#      name: demo_maint_flavor
#      ram: 512
#      vcpus: {get_param: flavor_vcpus}
#      disk: 1

  ha_app_svrgrp:
    type: OS::Nova::ServerGroup
    properties:
      name: demo_ha_app_group
      policies: ['anti-affinity']

  floating_ip:
    type: OS::Nova::FloatingIP
    properties:
      pool: {get_param: ext_net}

  multi_ha_instances:
    type: OS::Heat::ResourceGroup
    properties:
      count: {get_param: ha_intances}
      resource_def:
        type: OS::Nova::Server
        properties:
          name: demo_ha_app_%index%
          flavor: demo_maint_flavor
          image: {get_param: maint_image}
          networks:
            - network: {get_resource: int_net}
          scheduler_hints:
            group: {get_resource: ha_app_svrgrp}

  multi_nonha_instances:
    type: OS::Heat::ResourceGroup
    properties:
      count: {get_param: nonha_intances}
      resource_def:
        type: OS::Nova::Server
        properties:
          name: demo_nonha_app_%index%
          flavor: demo_maint_flavor
          image: {get_param: maint_image}
          networks:
            - network: {get_resource: int_net}

  association:
    type: OS::Nova::FloatingIPAssociation
    properties:
      floating_ip: {get_resource: floating_ip}
      server_id: {get_attr: [multi_ha_instances, resource.0]}

  app_manager_alarm:
    type: OS::Aodh::EventAlarm
    properties:
      alarm_actions:
        - {get_param: app_manager_alarm_url}
      event_type: "maintenance.scheduled"
      repeat_actions: true
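The VNFM creates this stack automatically, but as a sketch it can also be launched
by hand for a quick check (the external network name is deployment specific and the
'demo_maint_flavor' flavor must exist first):

```sh
# hypothetical manual launch of the demo VNF stack as the demo user
. ~/devstack/openrc demo demo
openstack stack create -t maintenance_hot_tpl.yaml --parameter ext_net=public demo_stack
```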
6 fenix/tools/session.json Normal file
@ -0,0 +1,6 @@

{
    "state": "MAINTENANCE",
    "metadata": {"openstack": "upgrade"},
    "actions": [{"metadata": {"os": "upgrade"}, "type": "host", "plugin": "dummy"}],
    "workflow": "vnf"
}
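For illustration, such a session request could also be posted straight to the admin
API with curl; the endpoint URL here is a placeholder, and note that 'infra_admin.py'
resolves the real one from the Keystone catalog and adds 'maintenance_at' before
sending:

```sh
# hypothetical direct call to the maintenance API with session.json as the body
TOKEN=$(openstack token issue -f value -c id)
curl -X POST -H "Content-Type: application/json" -H "X-Auth-Token: $TOKEN" \
     -d @session.json http://127.0.0.1:12347/v1/maintenance
```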
185 fenix/tools/set_config.py Normal file
@ -0,0 +1,185 @@

# Copyright (c) 2020 ZTE and others.
|
||||||
|
# All Rights Reserved.
|
||||||
|
#
|
||||||
|
# Licensed under the Apache License, Version 2.0 (the "License"); you may
|
||||||
|
# not use this file except in compliance with the License. You may obtain
|
||||||
|
# a copy of the License at
|
||||||
|
#
|
||||||
|
# http://www.apache.org/licenses/LICENSE-2.0
|
||||||
|
#
|
||||||
|
# Unless required by applicable law or agreed to in writing, software
|
||||||
|
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
|
||||||
|
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||||
|
# License for the specific language governing permissions and limitations
|
||||||
|
# under the License.
|
||||||
|
import os
|
||||||
|
import shutil
|
||||||
|
import yaml
|
||||||
|
|
||||||
|
|
||||||
|
cbase = "/var/lib/config-data/puppet-generated/ceilometer"
|
||||||
|
if not os.path.isdir(cbase):
|
||||||
|
cbase = ""
|
||||||
|
|
||||||
|
|
||||||
|
def set_notifier_topic():
|
||||||
|
ep_file = cbase + '/etc/ceilometer/event_pipeline.yaml'
|
||||||
|
ep_file_bak = cbase + '/etc/ceilometer/event_pipeline.yaml.bak'
|
||||||
|
event_notifier_topic = 'notifier://?topic=alarm.all'
|
||||||
|
config_modified = False
|
||||||
|
|
||||||
|
if not os.path.isfile(ep_file):
|
||||||
|
raise Exception("File doesn't exist: %s." % ep_file)
|
||||||
|
|
||||||
|
with open(ep_file, 'r') as file:
|
||||||
|
config = yaml.safe_load(file)
|
||||||
|
|
||||||
|
sinks = config['sinks']
|
||||||
|
for sink in sinks:
|
||||||
|
if sink['name'] == 'event_sink':
|
||||||
|
publishers = sink['publishers']
|
||||||
|
if event_notifier_topic not in publishers:
|
||||||
|
print('Add event notifier in ceilometer')
|
||||||
|
publishers.append(event_notifier_topic)
|
||||||
|
config_modified = True
|
||||||
|
else:
|
||||||
|
print('NOTE: event notifier is configured'
|
||||||
|
' in ceilometer as we needed')
|
||||||
|
|
||||||
|
if config_modified:
|
||||||
|
shutil.copyfile(ep_file, ep_file_bak)
|
||||||
|
with open(ep_file, 'w+') as file:
|
||||||
|
file.write(yaml.safe_dump(config))
|
||||||
|
|
||||||
|
|
||||||
|
def set_event_definitions():
|
||||||
|
ed_file = cbase + '/etc/ceilometer/event_definitions.yaml'
|
||||||
|
ed_file_bak = cbase + '/etc/ceilometer/event_definitions.bak'
|
||||||
|
orig_ed_file_exist = True
|
||||||
|
modify_config = False
|
||||||
|
|
||||||
|
if not os.path.isfile(ed_file):
|
||||||
|
# Deployment did not modify file, so it did not exist
|
||||||
|
src_file = '/etc/ceilometer/event_definitions.yaml'
|
||||||
|
if not os.path.isfile(src_file):
|
||||||
|
config = []
|
||||||
|
orig_ed_file_exist = False
|
||||||
|
else:
|
||||||
|
shutil.copyfile('/etc/ceilometer/event_definitions.yaml', ed_file)
|
||||||
|
if orig_ed_file_exist:
|
||||||
|
with open(ed_file, 'r') as file:
|
||||||
|
config = yaml.safe_load(file)
|
||||||
|
|
||||||
|
et_list = [et['event_type'] for et in config]
|
||||||
|
|
||||||
|
if 'compute.instance.update' in et_list:
|
||||||
|
print('NOTE: compute.instance.update already configured')
|
||||||
|
else:
|
||||||
|
print('NOTE: add compute.instance.update to event_definitions.yaml')
|
||||||
|
modify_config = True
|
||||||
|
instance_update = {
|
||||||
|
'event_type': 'compute.instance.update',
|
||||||
|
'traits': {
|
||||||
|
'deleted_at': {'fields': 'payload.deleted_at',
|
||||||
|
'type': 'datetime'},
|
||||||
|
'disk_gb': {'fields': 'payload.disk_gb',
|
||||||
|
'type': 'int'},
|
||||||
|
'display_name': {'fields': 'payload.display_name'},
|
||||||
|
'ephemeral_gb': {'fields': 'payload.ephemeral_gb',
|
||||||
|
'type': 'int'},
|
||||||
|
'host': {'fields': 'publisher_id.`split(., 1, 1)`'},
|
||||||
|
'instance_id': {'fields': 'payload.instance_id'},
|
||||||
|
'instance_type': {'fields': 'payload.instance_type'},
|
||||||
|
'instance_type_id': {'fields': 'payload.instance_type_id',
|
||||||
|
'type': 'int'},
|
||||||
|
'launched_at': {'fields': 'payload.launched_at',
|
||||||
|
'type': 'datetime'},
|
||||||
|
'memory_mb': {'fields': 'payload.memory_mb',
|
||||||
|
'type': 'int'},
|
||||||
|
'old_state': {'fields': 'payload.old_state'},
|
||||||
|
'os_architecture': {
|
||||||
|
'fields':
|
||||||
|
"payload.image_meta.'org.openstack__1__architecture'"},
|
||||||
|
'os_distro': {
|
||||||
|
'fields':
|
||||||
|
"payload.image_meta.'org.openstack__1__os_distro'"},
|
||||||
|
'os_version': {
|
||||||
|
'fields':
|
||||||
|
"payload.image_meta.'org.openstack__1__os_version'"},
|
||||||
|
'resource_id': {'fields': 'payload.instance_id'},
|
||||||
|
'root_gb': {'fields': 'payload.root_gb',
|
||||||
|
'type': 'int'},
|
||||||
|
'service': {'fields': 'publisher_id.`split(., 0, -1)`'},
|
||||||
|
'state': {'fields': 'payload.state'},
|
||||||
|
'tenant_id': {'fields': 'payload.tenant_id'},
|
||||||
|
'user_id': {'fields': 'payload.user_id'},
|
||||||
|
'vcpus': {'fields': 'payload.vcpus', 'type': 'int'}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
config.append(instance_update)
|
||||||
|
|
||||||
|
if 'maintenance.scheduled' in et_list:
|
||||||
|
print('NOTE: maintenance.scheduled already configured')
|
||||||
|
else:
|
||||||
|
print('NOTE: add maintenance.scheduled to event_definitions.yaml')
|
||||||
|
modify_config = True
|
||||||
|
mscheduled = {
|
||||||
|
'event_type': 'maintenance.scheduled',
|
||||||
|
'traits': {
|
||||||
|
'allowed_actions': {'fields': 'payload.allowed_actions'},
|
||||||
|
'instance_ids': {'fields': 'payload.instance_ids'},
|
||||||
|
'reply_url': {'fields': 'payload.reply_url'},
|
||||||
|
'actions_at': {'fields': 'payload.actions_at',
|
||||||
|
'type': 'datetime'},
|
||||||
|
'reply_at': {'fields': 'payload.reply_at', 'type': 'datetime'},
|
||||||
|
'state': {'fields': 'payload.state'},
|
||||||
|
'session_id': {'fields': 'payload.session_id'},
|
||||||
|
'project_id': {'fields': 'payload.project_id'},
|
||||||
|
'metadata': {'fields': 'payload.metadata'}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
config.append(mscheduled)
|
||||||
|
|
||||||
|
if 'maintenance.host' in et_list:
|
||||||
|
print('NOTE: maintenance.host already configured')
|
||||||
|
else:
|
||||||
|
print('NOTE: add maintenance.host to event_definitions.yaml')
|
||||||
|
modify_config = True
|
||||||
|
mhost = {
|
||||||
|
'event_type': 'maintenance.host',
|
||||||
|
'traits': {
|
||||||
|
'host': {'fields': 'payload.host'},
|
||||||
|
'project_id': {'fields': 'payload.project_id'},
|
||||||
|
'state': {'fields': 'payload.state'},
|
||||||
|
'session_id': {'fields': 'payload.session_id'}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
config.append(mhost)
|
||||||
|
|
||||||
|
if 'maintenance.session' in et_list:
|
||||||
|
print('NOTE: maintenance.session already configured')
|
||||||
|
else:
|
||||||
|
print('NOTE: add maintenance.session to event_definitions.yaml')
|
||||||
|
modify_config = True
|
||||||
|
mhost = {
|
||||||
|
'event_type': 'maintenance.session',
|
||||||
|
'traits': {
|
||||||
|
'percent_done': {'fields': 'payload.percent_done'},
|
||||||
|
'project_id': {'fields': 'payload.project_id'},
|
||||||
|
'state': {'fields': 'payload.state'},
|
||||||
|
'session_id': {'fields': 'payload.session_id'}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
config.append(mhost)
|
||||||
|
|
||||||
|
if modify_config:
|
||||||
|
if orig_ed_file_exist:
|
||||||
|
shutil.copyfile(ed_file, ed_file_bak)
|
||||||
|
else:
|
||||||
|
with open(ed_file_bak, 'w+') as file:
|
||||||
|
file.close()
|
||||||
|
with open(ed_file, 'w+') as file:
|
||||||
|
file.write(yaml.safe_dump(config))
|
||||||
|
|
||||||
|
set_notifier_topic()
|
||||||
|
set_event_definitions()
|
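A usage sketch; restarting the Ceilometer notification agent afterwards is an
assumption so that the new event definitions and notifier topic are picked up, and
the service name varies per deployment:

```sh
# run on the controller node where the Ceilometer configuration lives
sudo python set_config.py
sudo systemctl restart devstack@ceilometer*
```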
@ -12,21 +12,25 @@
|
|||||||
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
|
||||||
# License for the specific language governing permissions and limitations
|
# License for the specific language governing permissions and limitations
|
||||||
# under the License.
|
# under the License.
|
||||||
import aodhclient.client as aodhclient
|
|
||||||
import datetime
|
|
||||||
from flask import Flask
|
from flask import Flask
|
||||||
from flask import request
|
from flask import request
|
||||||
|
import heatclient.client as heatclient
|
||||||
|
from heatclient.common.template_utils import get_template_contents
|
||||||
|
from heatclient import exc as heat_excecption
|
||||||
import json
|
import json
|
||||||
|
from keystoneauth1 import loading
|
||||||
from keystoneclient import client as ks_client
|
from keystoneclient import client as ks_client
|
||||||
from kubernetes import client
|
|
||||||
from kubernetes import config
|
|
||||||
import logging as lging
|
import logging as lging
|
||||||
|
from neutronclient.v2_0 import client as neutronclient
|
||||||
|
import novaclient.client as novaclient
|
||||||
|
import os
|
||||||
from oslo_config import cfg
|
from oslo_config import cfg
|
||||||
from oslo_log import log as logging
|
from oslo_log import log as logging
|
||||||
import requests
|
import requests
|
||||||
import sys
|
import sys
|
||||||
from threading import Thread
|
from threading import Thread
|
||||||
import time
|
import time
|
||||||
|
import uuid
|
||||||
import yaml
|
import yaml
|
||||||
|
|
||||||
try:
|
try:
|
||||||
@ -56,6 +60,120 @@ CONF.register_opts(opts)
|
|||||||
CONF.register_opts(identity_auth.os_opts, group='service_user')
|
CONF.register_opts(identity_auth.os_opts, group='service_user')
|
||||||
|
|
||||||
|
|
||||||
|
class Stack(object):
|
||||||
|
|
||||||
|
def __init__(self, conf, log, project='demo'):
|
||||||
|
self.conf = conf
|
||||||
|
self.log = log
|
||||||
|
self.project = project
|
||||||
|
self.auth = identity_auth.get_identity_auth(conf, project=self.project)
|
||||||
|
self.session = identity_auth.get_session(self.auth)
|
||||||
|
self.heat = heatclient.Client(version='1', session=self.session)
|
||||||
|
self.stack_name = None
|
||||||
|
self.stack_id = None
|
||||||
|
self.template = None
|
||||||
|
self.parameters = {}
|
||||||
|
self.files = {}
|
||||||
|
|
||||||
|
# standard yaml.load will not work for hot tpl because the date format in
|
||||||
|
# heat_template_version is not string
|
||||||
|
def get_hot_tpl(self, template_file):
|
||||||
|
if not os.path.isfile(template_file):
|
||||||
|
raise Exception('File(%s) does not exist' % template_file)
|
||||||
|
return get_template_contents(template_file=template_file)
|
||||||
|
|
||||||
|
def _wait_stack_action_complete(self, action):
|
||||||
|
action_in_progress = '%s_IN_PROGRESS' % action
|
||||||
|
action_complete = '%s_COMPLETE' % action
|
||||||
|
action_failed = '%s_FAILED' % action
|
||||||
|
|
||||||
|
status = action_in_progress
|
||||||
|
stack_retries = 160
|
||||||
|
while status == action_in_progress and stack_retries > 0:
|
||||||
|
time.sleep(2)
|
||||||
|
try:
|
||||||
|
stack = self.heat.stacks.get(self.stack_name)
|
||||||
|
except heat_excecption.HTTPNotFound:
|
||||||
|
if action == 'DELETE':
|
||||||
|
# Might happen you never get status as stack deleted
|
||||||
|
status = action_complete
|
||||||
|
break
|
||||||
|
else:
|
||||||
|
raise Exception('unable to get stack')
|
||||||
|
status = stack.stack_status
|
||||||
|
stack_retries = stack_retries - 1
|
||||||
|
if stack_retries == 0 and status != action_complete:
|
||||||
|
raise Exception("stack %s not completed within 5min, status:"
|
||||||
|
" %s" % (action, status))
|
||||||
|
elif status == action_complete:
|
||||||
|
self.log.info('stack %s %s' % (self.stack_name, status))
|
||||||
|
elif status == action_failed:
|
||||||
|
raise Exception("stack %s failed" % action)
|
||||||
|
else:
|
||||||
|
self.log.error('stack %s %s' % (self.stack_name, status))
|
||||||
|
raise Exception("stack %s unknown result" % action)
|
||||||
|
|
||||||
|
def wait_stack_delete(self):
|
||||||
|
self._wait_stack_action_complete('DELETE')
|
||||||
|
|
||||||
|
def wait_stack_create(self):
|
||||||
|
self._wait_stack_action_complete('CREATE')
|
||||||
|
|
||||||
|
def wait_stack_update(self):
|
||||||
|
self._wait_stack_action_complete('UPDATE')
|
||||||
|
|
||||||
|
def create(self, stack_name, template, parameters={}, files={}):
|
||||||
|
self.stack_name = stack_name
|
||||||
|
self.template = template
|
||||||
|
self.parameters = parameters
|
||||||
|
self.files = files
|
||||||
|
stack = self.heat.stacks.create(stack_name=self.stack_name,
|
||||||
|
files=files,
|
||||||
|
template=template,
|
||||||
|
parameters=parameters)
|
||||||
|
self.stack_id = stack['stack']['id']
|
||||||
|
try:
|
||||||
|
self.wait_stack_create()
|
||||||
|
except Exception:
|
||||||
|
# It might not always work at first
|
||||||
|
self.log.info('retry creating maintenance stack.......')
|
||||||
|
self.delete()
|
||||||
|
time.sleep(5)
|
||||||
|
stack = self.heat.stacks.create(stack_name=self.stack_name,
|
||||||
|
files=files,
|
||||||
|
template=template,
|
||||||
|
parameters=parameters)
|
||||||
|
self.stack_id = stack['stack']['id']
|
||||||
|
self.wait_stack_create()
|
||||||
|
|
||||||
|
def update(self, stack_name, stack_id, template, parameters={}, files={}):
|
||||||
|
self.heat.stacks.update(stack_name=stack_name,
|
||||||
|
stack_id=stack_id,
|
||||||
|
files=files,
|
||||||
|
template=template,
|
||||||
|
parameters=parameters)
|
||||||
|
self.wait_stack_update()
|
||||||
|
|
||||||
|
def delete(self):
|
||||||
|
if self.stack_id is not None:
|
||||||
|
self.heat.stacks.delete(self.stack_name)
|
||||||
|
self.wait_stack_delete()
|
||||||
|
else:
|
||||||
|
self.log.info('no stack to delete')
|
||||||
|
|
||||||
|
|
||||||
|
def get_identity_auth(conf, project=None, username=None, password=None):
|
||||||
|
loader = loading.get_plugin_loader('password')
|
||||||
|
return loader.load_from_options(
|
||||||
|
auth_url=conf.service_user.os_auth_url,
|
||||||
|
username=(username or conf.service_user.os_username),
|
||||||
|
password=(password or conf.service_user.os_password),
|
||||||
|
user_domain_name=conf.service_user.os_user_domain_name,
|
||||||
|
project_name=(project or conf.service_user.os_project_name),
|
||||||
|
tenant_name=(project or conf.service_user.os_project_name),
|
||||||
|
project_domain_name=conf.service_user.os_project_domain_name)
|
||||||
|
|
||||||
|
|
||||||
class VNFM(object):
|
class VNFM(object):
|
||||||
|
|
||||||
def __init__(self, conf, log):
|
def __init__(self, conf, log):
|
||||||
@ -64,16 +182,18 @@ class VNFM(object):
|
|||||||
self.app = None
|
self.app = None
|
||||||
|
|
||||||
def start(self):
|
def start(self):
|
||||||
LOG.info('VNFM start......')
|
self.log.info('VNFM start...')
|
||||||
self.app = VNFManager(self.conf, self.log)
|
self.app = VNFManager(self.conf, self.log)
|
||||||
self.app.start()
|
self.app.start()
|
||||||
|
|
||||||
def stop(self):
|
def stop(self):
|
||||||
LOG.info('VNFM stop......')
|
self.log.info('VNFM stop...')
|
||||||
if not self.app:
|
if not self.app:
|
||||||
return
|
return
|
||||||
self.app.headers['X-Auth-Token'] = self.app.session.get_token()
|
self.log.info('delete VNF constraints...')
|
||||||
self.app.delete_constraints()
|
self.app.delete_constraints()
|
||||||
|
self.log.info('VNF delete start...')
|
||||||
|
self.app.stack.delete()
|
||||||
headers = {
|
headers = {
|
||||||
'Content-Type': 'application/json',
|
'Content-Type': 'application/json',
|
||||||
'Accept': 'application/json',
|
'Accept': 'application/json',
|
||||||
@ -86,29 +206,38 @@ class VNFM(object):
|
|||||||
|
|
||||||
class VNFManager(Thread):
|
class VNFManager(Thread):
|
||||||
|
|
||||||
def __init__(self, conf, log):
|
def __init__(self, conf, log, project='demo'):
|
||||||
Thread.__init__(self)
|
Thread.__init__(self)
|
||||||
self.conf = conf
|
self.conf = conf
|
||||||
self.log = log
|
|
||||||
self.port = self.conf.port
|
self.port = self.conf.port
|
||||||
|
self.log = log
|
||||||
self.intance_ids = None
|
self.intance_ids = None
|
||||||
# VNFM is started with OS_* exported as admin user
|
self.project = project
|
||||||
# We need that to query Fenix endpoint url
|
|
||||||
# Still we work with our tenant/poroject/vnf as demo
|
|
||||||
self.project = "demo"
|
|
||||||
LOG.info('VNFM project: %s' % self.project)
|
|
||||||
self.auth = identity_auth.get_identity_auth(conf, project=self.project)
|
self.auth = identity_auth.get_identity_auth(conf, project=self.project)
|
||||||
self.session = identity_auth.get_session(auth=self.auth)
|
self.session = identity_auth.get_session(auth=self.auth)
|
||||||
self.ks = ks_client.Client(version='v3', session=self.session)
|
self.keystone = ks_client.Client(version='v3', session=self.session)
|
||||||
self.aodh = aodhclient.Client(2, self.session)
|
auth = get_identity_auth(conf,
|
||||||
# Subscribe to mainenance event alarm from Fenix via AODH
|
project='service',
|
||||||
self.create_alarm()
|
username='fenix',
|
||||||
config.load_kube_config()
|
password='admin')
|
||||||
self.kaapi = client.AppsV1Api()
|
session = identity_auth.get_session(auth=auth)
|
||||||
self.kapi = client.CoreV1Api()
|
keystone = ks_client.Client(version='v3', session=session)
|
||||||
|
self.nova = novaclient.Client(version='2.34', session=self.session)
|
||||||
|
self.neutron = neutronclient.Client(session=self.session)
|
||||||
self.headers = {
|
self.headers = {
|
||||||
'Content-Type': 'application/json',
|
'Content-Type': 'application/json',
|
||||||
'Accept': 'application/json'}
|
'Accept': 'application/json'}
|
||||||
|
self.project_id = self.session.get_project_id()
|
||||||
|
self.stack = Stack(self.conf, self.log, self.project)
|
||||||
|
files, template = self.stack.get_hot_tpl('maintenance_hot_tpl.yaml')
|
||||||
|
ext_net = self.get_external_network()
|
||||||
|
parameters = {'ext_net': ext_net}
|
||||||
|
self.log.info('creating VNF...')
|
||||||
|
self.log.info('parameters: %s' % parameters)
|
||||||
|
self.stack.create('%s_stack' % self.project,
|
||||||
|
template,
|
||||||
|
parameters=parameters,
|
||||||
|
files=files)
|
||||||
self.headers['X-Auth-Token'] = self.session.get_token()
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
self.orig_number_of_instances = self.number_of_instances()
|
self.orig_number_of_instances = self.number_of_instances()
|
||||||
# List of instances
|
# List of instances
|
||||||
@ -118,66 +247,58 @@ class VNFManager(Thread):
|
|||||||
self.instance_constraints = None
|
self.instance_constraints = None
|
||||||
# Update existing instances to instance lists
|
# Update existing instances to instance lists
|
||||||
self.update_instances()
|
self.update_instances()
|
||||||
# How many instances needs to exists (with current VNF load)
|
nonha_instances = len(self.nonha_instances)
|
||||||
# max_impacted_members need to be updated accordingly
|
if nonha_instances < 7:
|
||||||
# if number of instances is scaled. example for demo-ha:
|
self.scale = 2
|
||||||
# max_impacted_members = len(self.ha_instances) - ha_group_limit
|
else:
|
||||||
self.ha_group_limit = 2
|
self.scale = int((nonha_instances) / 2)
|
||||||
self.nonha_group_limit = 2
|
self.log.info('Init nonha_instances: %s scale: %s: max_impacted %s' %
|
||||||
|
(nonha_instances, self.scale, nonha_instances - 1))
|
||||||
# Different instance groups constraints dict
|
# Different instance groups constraints dict
|
||||||
self.ha_group = None
|
self.ha_group = None
|
||||||
self.nonha_group = None
|
self.nonha_group = None
|
||||||
# VNF project_id (VNF ID)
|
self.nonha_group_id = str(uuid.uuid4())
|
||||||
self.project_id = None
|
self.ha_group_id = [sg.id for sg in self.nova.server_groups.list()
|
||||||
# HA instance_id that is active has active label
|
if sg.name == "%s_ha_app_group" % self.project][0]
|
||||||
|
# Floating IP used in HA instance
|
||||||
|
self.floating_ip = None
|
||||||
|
# HA instance_id that is active / has floating IP
|
||||||
self.active_instance_id = self.active_instance_id()
|
self.active_instance_id = self.active_instance_id()
|
||||||
|
|
||||||
services = self.ks.services.list()
|
services = keystone.services.list()
|
||||||
for service in services:
|
for service in services:
|
||||||
if service.type == 'maintenance':
|
if service.type == 'maintenance':
|
||||||
LOG.info('maintenance service: %s:%s type %s'
|
self.log.info('maintenance service: %s:%s type %s'
|
||||||
% (service.name, service.id, service.type))
|
% (service.name, service.id, service.type))
|
||||||
maint_id = service.id
|
maint_id = service.id
|
||||||
self.maint_endpoint = [ep.url for ep in self.ks.endpoints.list()
|
self.maint_endpoint = [ep.url for ep in keystone.endpoints.list()
|
||||||
if ep.service_id == maint_id and
|
if ep.service_id == maint_id and
|
||||||
ep.interface == 'public'][0]
|
ep.interface == 'public'][0]
|
||||||
LOG.info('maintenance endpoint: %s' % self.maint_endpoint)
|
self.log.info('maintenance endpoint: %s' % self.maint_endpoint)
|
||||||
self.update_constraints_lock = False
|
self.update_constraints_lock = False
|
||||||
self.update_constraints()
|
self.update_constraints()
|
||||||
# Instances waiting action to be done
|
|
||||||
self.pending_actions = {}
|
|
||||||
|
|
||||||
def create_alarm(self):
|
def get_external_network(self):
|
||||||
alarms = {alarm['name']: alarm for alarm in self.aodh.alarm.list()}
|
ext_net = None
|
||||||
alarm_name = "%s_MAINTENANCE_ALARM" % self.project
|
networks = self.neutron.list_networks()['networks']
|
||||||
if alarm_name in alarms:
|
for network in networks:
|
||||||
return
|
if network['router:external']:
|
||||||
alarm_request = dict(
|
ext_net = network['name']
|
||||||
name=alarm_name,
|
break
|
||||||
description=alarm_name,
|
if ext_net is None:
|
||||||
enabled=True,
|
raise Exception("external network not defined")
|
||||||
alarm_actions=[u'http://%s:%d/maintenance'
|
return ext_net
|
||||||
% (self.conf.ip,
|
|
||||||
self.conf.port)],
|
|
||||||
repeat_actions=True,
|
|
||||||
severity=u'moderate',
|
|
||||||
type=u'event',
|
|
||||||
event_rule=dict(event_type=u'maintenance.scheduled'))
|
|
||||||
self.aodh.alarm.create(alarm_request)
|
|
||||||
|
|
||||||
def delete_remote_instance_constraints(self, instance_id):
|
def delete_remote_instance_constraints(self, instance_id):
|
||||||
url = "%s/instance/%s" % (self.maint_endpoint, instance_id)
|
url = "%s/instance/%s" % (self.maint_endpoint, instance_id)
|
||||||
LOG.info('DELETE: %s' % url)
|
self.log.info('DELETE: %s' % url)
|
||||||
ret = requests.delete(url, data=None, headers=self.headers)
|
ret = requests.delete(url, data=None, headers=self.headers)
|
||||||
if ret.status_code != 200 and ret.status_code != 204:
|
if ret.status_code != 200 and ret.status_code != 204:
|
||||||
if ret.status_code == 404:
|
raise Exception(ret.text)
|
||||||
LOG.info('Already deleted: %s' % instance_id)
|
|
||||||
else:
|
|
||||||
raise Exception(ret.text)
|
|
||||||
|
|
||||||
def update_remote_instance_constraints(self, instance):
|
def update_remote_instance_constraints(self, instance):
|
||||||
url = "%s/instance/%s" % (self.maint_endpoint, instance["instance_id"])
|
url = "%s/instance/%s" % (self.maint_endpoint, instance["instance_id"])
|
||||||
LOG.info('PUT: %s' % url)
|
self.log.info('PUT: %s' % url)
|
||||||
ret = requests.put(url, data=json.dumps(instance),
|
ret = requests.put(url, data=json.dumps(instance),
|
||||||
headers=self.headers)
|
headers=self.headers)
|
||||||
if ret.status_code != 200 and ret.status_code != 204:
|
if ret.status_code != 200 and ret.status_code != 204:
|
||||||
@ -186,7 +307,7 @@ class VNFManager(Thread):
|
|||||||
def delete_remote_group_constraints(self, instance_group):
|
def delete_remote_group_constraints(self, instance_group):
|
||||||
url = "%s/instance_group/%s" % (self.maint_endpoint,
|
url = "%s/instance_group/%s" % (self.maint_endpoint,
|
||||||
instance_group["group_id"])
|
instance_group["group_id"])
|
||||||
LOG.info('DELETE: %s' % url)
|
self.log.info('DELETE: %s' % url)
|
||||||
ret = requests.delete(url, data=None, headers=self.headers)
|
ret = requests.delete(url, data=None, headers=self.headers)
|
||||||
if ret.status_code != 200 and ret.status_code != 204:
|
if ret.status_code != 200 and ret.status_code != 204:
|
||||||
raise Exception(ret.text)
|
raise Exception(ret.text)
|
||||||
@ -194,13 +315,14 @@ class VNFManager(Thread):
|
|||||||
def update_remote_group_constraints(self, instance_group):
|
def update_remote_group_constraints(self, instance_group):
|
||||||
url = "%s/instance_group/%s" % (self.maint_endpoint,
|
url = "%s/instance_group/%s" % (self.maint_endpoint,
|
||||||
instance_group["group_id"])
|
instance_group["group_id"])
|
||||||
LOG.info('PUT: %s' % url)
|
self.log.info('PUT: %s' % url)
|
||||||
ret = requests.put(url, data=json.dumps(instance_group),
|
ret = requests.put(url, data=json.dumps(instance_group),
|
||||||
headers=self.headers)
|
headers=self.headers)
|
||||||
if ret.status_code != 200 and ret.status_code != 204:
|
if ret.status_code != 200 and ret.status_code != 204:
|
||||||
raise Exception(ret.text)
|
raise Exception(ret.text)
|
||||||
|
|
||||||
def delete_constraints(self):
|
def delete_constraints(self):
|
||||||
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
for instance_id in self.instance_constraints:
|
for instance_id in self.instance_constraints:
|
||||||
self.delete_remote_instance_constraints(instance_id)
|
self.delete_remote_instance_constraints(instance_id)
|
||||||
self.delete_remote_group_constraints(self.nonha_group)
|
self.delete_remote_group_constraints(self.nonha_group)
|
||||||
@ -208,73 +330,82 @@ class VNFManager(Thread):
|
|||||||
|
|
||||||
def update_constraints(self):
|
def update_constraints(self):
|
||||||
while self.update_constraints_lock:
|
while self.update_constraints_lock:
|
||||||
LOG.info('Waiting update_constraints_lock...')
|
self.log.info('Waiting update_constraints_lock...')
|
||||||
time.sleep(1)
|
time.sleep(1)
|
||||||
self.update_constraints_lock = True
|
self.update_constraints_lock = True
|
||||||
LOG.info('Update constraints')
|
self.log.info('Update constraints')
|
||||||
if self.project_id is None:
|
|
||||||
self.project_id = self.ks.projects.list(name=self.project)[0].id
|
# Nova does not support grouping instances that do not belong to
|
||||||
# Pods groupped by ReplicaSet, so we use that id
|
# anti-affinity server_groups. Anyhow all instances need grouping
|
||||||
rs = {r.metadata.name: r.metadata.uid for r in
|
|
||||||
self.kaapi.list_namespaced_replica_set('demo').items}
|
|
||||||
max_impacted_members = len(self.nonha_instances) - 1
|
max_impacted_members = len(self.nonha_instances) - 1
|
||||||
nonha_group = {
|
nonha_group = {
|
||||||
"group_id": rs['demo-nonha'],
|
"group_id": self.nonha_group_id,
|
||||||
"project_id": self.project_id,
|
"project_id": self.project_id,
|
||||||
"group_name": "demo-nonha",
|
"group_name": "%s_nonha_app_group" % self.project,
|
||||||
"anti_affinity_group": False,
|
"anti_affinity_group": False,
|
||||||
"max_instances_per_host": 0,
|
"max_instances_per_host": 0,
|
||||||
"max_impacted_members": max_impacted_members,
|
"max_impacted_members": max_impacted_members,
|
||||||
"recovery_time": 10,
|
"recovery_time": 2,
|
||||||
"resource_mitigation": True}
|
"resource_mitigation": True}
|
||||||
LOG.info('create demo-nonha constraints: %s'
|
self.log.info('create %s_nonha_app_group constraints: %s'
|
||||||
% nonha_group)
|
% (self.project, nonha_group))
|
||||||
|
|
||||||
ha_group = {
|
ha_group = {
|
||||||
"group_id": rs['demo-ha'],
|
"group_id": self.ha_group_id,
|
||||||
"project_id": self.project_id,
|
"project_id": self.project_id,
|
||||||
"group_name": "demo-ha",
|
"group_name": "%s_ha_app_group" % self.project,
|
||||||
"anti_affinity_group": True,
|
"anti_affinity_group": True,
|
||||||
"max_instances_per_host": 1,
|
"max_instances_per_host": 1,
|
||||||
"max_impacted_members": 1,
|
"max_impacted_members": 1,
|
||||||
"recovery_time": 10,
|
"recovery_time": 4,
|
||||||
"resource_mitigation": True}
|
"resource_mitigation": True}
|
||||||
LOG.info('create demo-ha constraints: %s'
|
self.log.info('create %s_ha_app_group constraints: %s'
|
||||||
% ha_group)
|
% (self.project, ha_group))
|
||||||
|
if not self.ha_group or self.ha_group != ha_group:
|
||||||
|
LOG.info('ha instance group need update')
|
||||||
|
self.update_remote_group_constraints(ha_group)
|
||||||
|
self.ha_group = ha_group.copy()
|
||||||
|
if not self.nonha_group or self.nonha_group != nonha_group:
|
||||||
|
LOG.info('nonha instance group need update')
|
||||||
|
self.update_remote_group_constraints(nonha_group)
|
||||||
|
self.nonha_group = nonha_group.copy()
|
||||||
|
|
||||||
instance_constraints = {}
|
instance_constraints = {}
|
||||||
for ha_instance in self.ha_instances:
|
for ha_instance in self.ha_instances:
|
||||||
instance = {
|
instance = {
|
||||||
"instance_id": ha_instance.metadata.uid,
|
"instance_id": ha_instance.id,
|
||||||
"project_id": self.project_id,
|
"project_id": self.project_id,
|
||||||
"group_id": ha_group["group_id"],
|
"group_id": ha_group["group_id"],
|
||||||
"instance_name": ha_instance.metadata.name,
|
"instance_name": ha_instance.name,
|
||||||
"max_interruption_time": 120,
|
"max_interruption_time": 120,
|
||||||
"migration_type": "EVICTION",
|
"migration_type": "MIGRATE",
|
||||||
"resource_mitigation": True,
|
"resource_mitigation": True,
|
||||||
"lead_time": 40}
|
"lead_time": 40}
|
||||||
LOG.info('create ha instance constraints: %s' % instance)
|
self.log.info('create ha instance constraints: %s'
|
||||||
instance_constraints[ha_instance.metadata.uid] = instance
|
% instance)
|
||||||
|
instance_constraints[ha_instance.id] = instance
|
||||||
for nonha_instance in self.nonha_instances:
|
for nonha_instance in self.nonha_instances:
|
||||||
instance = {
|
instance = {
|
||||||
"instance_id": nonha_instance.metadata.uid,
|
"instance_id": nonha_instance.id,
|
||||||
"project_id": self.project_id,
|
"project_id": self.project_id,
|
||||||
"group_id": nonha_group["group_id"],
|
"group_id": nonha_group["group_id"],
|
||||||
"instance_name": nonha_instance.metadata.name,
|
"instance_name": nonha_instance.name,
|
||||||
"max_interruption_time": 120,
|
"max_interruption_time": 120,
|
||||||
"migration_type": "EVICTION",
|
"migration_type": "MIGRATE",
|
||||||
"resource_mitigation": True,
|
"resource_mitigation": True,
|
||||||
"lead_time": 40}
|
"lead_time": 40}
|
||||||
LOG.info('create nonha instance constraints: %s' % instance)
|
self.log.info('create nonha instance constraints: %s'
|
||||||
instance_constraints[nonha_instance.metadata.uid] = instance
|
% instance)
|
||||||
|
instance_constraints[nonha_instance.id] = instance
|
||||||
if not self.instance_constraints:
|
if not self.instance_constraints:
|
||||||
# Initial instance constraints
|
# Initial instance constraints
|
||||||
LOG.info('create initial instances constraints...')
|
self.log.info('create initial instances constraints...')
|
||||||
for instance in [instance_constraints[i] for i
|
for instance in [instance_constraints[i] for i
|
||||||
in instance_constraints]:
|
in instance_constraints]:
|
||||||
self.update_remote_instance_constraints(instance)
|
self.update_remote_instance_constraints(instance)
|
||||||
self.instance_constraints = instance_constraints.copy()
|
self.instance_constraints = instance_constraints.copy()
|
||||||
else:
|
else:
|
||||||
LOG.info('check instances constraints changes...')
|
self.log.info('check instances constraints changes...')
|
||||||
added = [i for i in instance_constraints.keys()
|
added = [i for i in instance_constraints.keys()
|
||||||
if i not in self.instance_constraints]
|
if i not in self.instance_constraints]
|
||||||
deleted = [i for i in self.instance_constraints.keys()
|
deleted = [i for i in self.instance_constraints.keys()
|
||||||
@ -291,64 +422,55 @@ class VNFManager(Thread):
|
|||||||
if updated or deleted:
|
if updated or deleted:
|
||||||
# Some instance constraints have changed
|
# Some instance constraints have changed
|
||||||
self.instance_constraints = instance_constraints.copy()
|
self.instance_constraints = instance_constraints.copy()
|
||||||
if not self.ha_group or self.ha_group != ha_group:
|
|
||||||
LOG.info('ha instance group need update')
|
|
||||||
self.update_remote_group_constraints(ha_group)
|
|
||||||
self.ha_group = ha_group.copy()
|
|
||||||
if not self.nonha_group or self.nonha_group != nonha_group:
|
|
||||||
LOG.info('nonha instance group need update')
|
|
||||||
self.update_remote_group_constraints(nonha_group)
|
|
||||||
self.nonha_group = nonha_group.copy()
|
|
||||||
self.update_constraints_lock = False
|
self.update_constraints_lock = False
|
||||||
|
|
||||||
def active_instance_id(self):
|
def active_instance_id(self):
|
||||||
# We digtate the active in the beginning
|
# Need retry as it takes time after the heat template is done before
|
||||||
instance = self.ha_instances[0]
|
# Floating IP in place
|
||||||
LOG.info('Initially Active instance: %s %s' %
|
retry = 5
|
||||||
(instance.metadata.name, instance.metadata.uid))
|
while retry > 0:
|
||||||
name = instance.metadata.name
|
|
||||||
namespace = instance.metadata.namespace
|
|
||||||
body = {"metadata": {"labels": {"active": "True"}}}
|
|
||||||
self.kapi.patch_namespaced_pod(name, namespace, body)
|
|
||||||
self.active_instance_id = instance.metadata.uid
|
|
||||||
|
|
||||||
def switch_over_ha_instance(self, instance_id):
|
|
||||||
if instance_id == self.active_instance_id:
|
|
||||||
# Need to switchover as instance_id will be affected and is active
|
|
||||||
for instance in self.ha_instances:
|
for instance in self.ha_instances:
|
||||||
if instance_id == instance.metadata.uid:
|
network_interfaces = next(iter(instance.addresses.values()))
|
||||||
LOG.info('Active to Standby: %s %s' %
|
for network_interface in network_interfaces:
|
||||||
(instance.metadata.name, instance.metadata.uid))
|
_type = network_interface.get('OS-EXT-IPS:type')
|
||||||
name = instance.metadata.name
|
if _type == "floating":
|
||||||
namespace = instance.metadata.namespace
|
if not self.floating_ip:
|
||||||
body = client.UNKNOWN_BASE_TYPE()
|
self.floating_ip = network_interface.get('addr')
|
||||||
body.metadata.labels = {"ative": None}
|
self.log.debug('active_instance: %s %s' %
|
||||||
self.kapi.patch_namespaced_pod(name, namespace, body)
|
(instance.name, instance.id))
|
||||||
else:
|
return instance.id
|
||||||
LOG.info('Standby to Active: %s %s' %
|
time.sleep(2)
|
||||||
(instance.metadata.name, instance.metadata.uid))
|
|
||||||
name = instance.metadata.name
|
|
||||||
namespace = instance.metadata.namespace
|
|
||||||
body = client.UNKNOWN_BASE_TYPE()
|
|
||||||
body.metadata.labels = {"ative": "True"}
|
|
||||||
self.kapi.patch_namespaced_pod(name, namespace, body)
|
|
||||||
self.active_instance_id = instance.metadata.uid
|
|
||||||
self.update_instances()
|
self.update_instances()
|
||||||
|
retry -= 1
|
||||||
|
raise Exception("No active instance found")
|
||||||
|
|
||||||
|
def switch_over_ha_instance(self):
|
||||||
|
for instance in self.ha_instances:
|
||||||
|
if instance.id != self.active_instance_id:
|
||||||
|
self.log.info('Switch over to: %s %s' % (instance.name,
|
||||||
|
instance.id))
|
||||||
|
# Deprecated, need to use neutron instead
|
||||||
|
# instance.add_floating_ip(self.floating_ip)
|
||||||
|
port = self.neutron.list_ports(device_id=instance.id)['ports'][0]['id'] # noqa
|
||||||
|
floating_id = self.neutron.list_floatingips(floating_ip_address=self.floating_ip)['floatingips'][0]['id'] # noqa
|
||||||
|
self.neutron.update_floatingip(floating_id, {'floatingip': {'port_id': port}}) # noqa
|
||||||
|
# Have to update ha_instances as floating_ip changed
|
||||||
|
self.update_instances()
|
||||||
|
self.active_instance_id = instance.id
|
||||||
|
break
|
||||||
|
|
||||||
def get_instance_ids(self):
|
def get_instance_ids(self):
|
||||||
instances = self.kapi.list_pod_for_all_namespaces().items
|
ret = list()
|
||||||
return [i.metadata.uid for i in instances
|
for instance in self.nova.servers.list(detailed=False):
|
||||||
if i.metadata.name.startswith("demo-")
|
ret.append(instance.id)
|
||||||
and i.metadata.namespace == "demo"]
|
return ret
|
||||||
|
|
||||||
def update_instances(self):
|
def update_instances(self):
|
||||||
instances = self.kapi.list_pod_for_all_namespaces().items
|
instances = self.nova.servers.list(detailed=True)
|
||||||
self.ha_instances = [i for i in instances
|
self.ha_instances = [i for i in instances
|
||||||
if i.metadata.name.startswith("demo-ha")
|
if "%s_ha_app_" % self.project in i.name]
|
||||||
and i.metadata.namespace == "demo"]
|
|
||||||
self.nonha_instances = [i for i in instances
|
self.nonha_instances = [i for i in instances
|
||||||
if i.metadata.name.startswith("demo-nonha")
|
if "%s_nonha_app_" % self.project in i.name]
|
||||||
and i.metadata.namespace == "demo"]
|
|
||||||
|
|
||||||
def _alarm_data_decoder(self, data):
|
def _alarm_data_decoder(self, data):
|
||||||
if "[" in data or "{" in data:
|
if "[" in data or "{" in data:
|
||||||
@ -364,77 +486,38 @@ class VNFManager(Thread):
|
|||||||
ret = requests.get(url, data=None, headers=self.headers)
|
ret = requests.get(url, data=None, headers=self.headers)
|
||||||
if ret.status_code != 200:
|
if ret.status_code != 200:
|
||||||
raise Exception(ret.text)
|
raise Exception(ret.text)
|
||||||
LOG.info('get_instance_ids %s' % ret.json())
|
self.log.info('get_instance_ids %s' % ret.json())
|
||||||
return ret.json()['instance_ids']
|
return ret.json()['instance_ids']
|
||||||
|
|
||||||
def scale_instances(self, scale_instances):
|
def scale_instances(self, number_of_instances):
|
||||||
|
# number_of_instances_before = self.number_of_instances()
|
||||||
number_of_instances_before = len(self.nonha_instances)
|
number_of_instances_before = len(self.nonha_instances)
|
||||||
replicas = number_of_instances_before + scale_instances
|
parameters = self.stack.parameters
|
||||||
|
parameters['nonha_intances'] = (number_of_instances_before +
|
||||||
|
number_of_instances)
|
||||||
|
self.stack.update(self.stack.stack_name,
|
||||||
|
self.stack.stack_id,
|
||||||
|
self.stack.template,
|
||||||
|
parameters=parameters,
|
||||||
|
files=self.stack.files)
|
||||||
|
|
||||||
# We only scale nonha apps
|
# number_of_instances_after = self.number_of_instances()
|
||||||
namespace = "demo"
|
|
||||||
name = "demo-nonha"
|
|
||||||
body = {'spec': {"replicas": replicas}}
|
|
||||||
self.kaapi.patch_namespaced_replica_set_scale(name, namespace, body)
|
|
||||||
time.sleep(3)
|
|
||||||
|
|
||||||
# Let's check if scale has taken effect
|
|
||||||
self.update_instances()
|
self.update_instances()
|
||||||
|
self.update_constraints()
|
||||||
number_of_instances_after = len(self.nonha_instances)
|
number_of_instances_after = len(self.nonha_instances)
|
||||||
check = 20
|
if (number_of_instances_before + number_of_instances !=
|
||||||
while number_of_instances_after == number_of_instances_before:
|
number_of_instances_after):
|
||||||
if check == 0:
|
self.log.error('scale_instances with: %d from: %d ends up to: %d'
|
||||||
LOG.error('scale_instances with: %d failed, still %d instances'
|
% (number_of_instances, number_of_instances_before,
|
||||||
% (scale_instances, number_of_instances_after))
|
number_of_instances_after))
|
||||||
raise Exception('scale_instances failed')
|
raise Exception('scale_instances failed')
|
||||||
check -= 1
|
|
||||||
time.sleep(1)
|
|
||||||
self.update_instances()
|
|
||||||
number_of_instances_after = len(self.nonha_instances)
|
|
||||||
|
|
||||||
LOG.info('scaled instances from %d to %d' %
|
self.log.info('scaled nonha_intances from %d to %d' %
|
||||||
(number_of_instances_before, number_of_instances_after))
|
(number_of_instances_before,
|
||||||
|
number_of_instances_after))
|
||||||
|
|
||||||
def number_of_instances(self):
|
def number_of_instances(self):
|
||||||
instances = self.kapi.list_pod_for_all_namespaces().items
|
return len(self.nova.servers.list(detailed=False))
|
||||||
return len([i for i in instances
|
|
||||||
if i.metadata.name.startswith("demo-")])
|
|
||||||
|
|
||||||
def instance_action(self, instance_id, allowed_actions):
|
|
||||||
# We should keep instance constraint in our internal structur
|
|
||||||
# and match instance_id specific allowed action. Now we assume EVICTION
|
|
||||||
if 'EVICTION' not in allowed_actions:
|
|
||||||
LOG.error('Action for %s not foudn from %s' %
|
|
||||||
(instance_id, allowed_actions))
|
|
||||||
return None
|
|
||||||
return 'EVICTION'
|
|
||||||
|
|
||||||
def instance_action_started(self, instance_id, action):
|
|
||||||
time_now = datetime.datetime.utcnow()
|
|
||||||
max_interruption_time = (
|
|
||||||
self.instance_constraints[instance_id]['max_interruption_time'])
|
|
||||||
self.pending_actions[instance_id] = {
|
|
||||||
'started': time_now,
|
|
||||||
'max_interruption_time': max_interruption_time,
|
|
||||||
'action': action}
|
|
||||||
|
|
||||||
def was_instance_action_in_time(self, instance_id):
|
|
||||||
time_now = datetime.datetime.utcnow()
|
|
||||||
started = self.pending_actions[instance_id]['started']
|
|
||||||
limit = self.pending_actions[instance_id]['max_interruption_time']
|
|
||||||
action = self.pending_actions[instance_id]['action']
|
|
||||||
td = time_now - started
|
|
||||||
if td.total_seconds() > limit:
|
|
||||||
LOG.error('%s %s took too long: %ds' %
|
|
||||||
(instance_id, action, td.total_seconds()))
|
|
||||||
LOG.error('%s max_interruption_time %ds might be too short' %
|
|
||||||
(instance_id, limit))
|
|
||||||
raise Exception('%s %s took too long: %ds' %
|
|
||||||
(instance_id, action, td.total_seconds()))
|
|
||||||
else:
|
|
||||||
LOG.info('%s %s with recovery time took %ds' %
|
|
||||||
(instance_id, action, td.total_seconds()))
|
|
||||||
del self.pending_actions[instance_id]
|
|
||||||
|
|
||||||
def run(self):
|
def run(self):
|
||||||
app = Flask('VNFM')
|
app = Flask('VNFM')
|
||||||
@ -447,85 +530,86 @@ class VNFManager(Thread):
|
|||||||
except Exception:
|
except Exception:
|
||||||
payload = ({t[0]: t[2] for t in
|
payload = ({t[0]: t[2] for t in
|
||||||
data['reason_data']['event']['traits']})
|
data['reason_data']['event']['traits']})
|
||||||
LOG.error('cannot parse alarm data: %s' % payload)
|
self.log.error('cannot parse alarm data: %s' % payload)
|
||||||
raise Exception('VNFM cannot parse alarm.'
|
raise Exception('VNFM cannot parse alarm.'
|
||||||
'Possibly trait data over 256 char')
|
'Possibly trait data over 256 char')
|
||||||
|
|
||||||
LOG.info('VNFM received data = %s' % payload)
|
self.log.info('VNFM received data = %s' % payload)
|
||||||
|
|
||||||
state = payload['state']
|
state = payload['state']
|
||||||
reply_state = None
|
reply_state = None
|
||||||
reply = dict()
|
reply = dict()
|
||||||
|
|
||||||
LOG.info('VNFM state: %s' % state)
|
self.log.info('VNFM state: %s' % state)
|
||||||
|
|
||||||
if state == 'MAINTENANCE':
|
if state == 'MAINTENANCE':
|
||||||
self.headers['X-Auth-Token'] = self.session.get_token()
|
|
||||||
instance_ids = (self.get_session_instance_ids(
|
instance_ids = (self.get_session_instance_ids(
|
||||||
payload['instance_ids'],
|
payload['instance_ids'],
|
||||||
payload['session_id']))
|
payload['session_id']))
|
||||||
reply['instance_ids'] = instance_ids
|
my_instance_ids = self.get_instance_ids()
|
||||||
reply_state = 'ACK_MAINTENANCE'
|
invalid_instances = (
|
||||||
|
[instance_id for instance_id in instance_ids
|
||||||
|
if instance_id not in my_instance_ids])
|
||||||
|
if invalid_instances:
|
||||||
|
self.log.error('Invalid instances: %s' % invalid_instances)
|
||||||
|
reply_state = 'NACK_MAINTENANCE'
|
||||||
|
else:
|
||||||
|
reply_state = 'ACK_MAINTENANCE'
|
||||||
|
|
||||||
elif state == 'SCALE_IN':
|
elif state == 'SCALE_IN':
|
||||||
# scale down only nonha instances
|
# scale down "self.scale" instances that is VCPUS equaling
|
||||||
nonha_instances = len(self.nonha_instances)
|
# at least a single compute node
|
||||||
scale_in = nonha_instances / 2
|
self.scale_instances(-self.scale)
|
||||||
self.scale_instances(-scale_in)
|
|
||||||
self.update_constraints()
|
|
||||||
reply['instance_ids'] = self.get_instance_ids()
|
|
||||||
reply_state = 'ACK_SCALE_IN'
|
reply_state = 'ACK_SCALE_IN'
|
||||||
|
|
||||||
elif state == 'MAINTENANCE_COMPLETE':
|
elif state == 'MAINTENANCE_COMPLETE':
|
||||||
# possibly need to upscale
|
# possibly need to upscale
|
||||||
number_of_instances = self.number_of_instances()
|
self.scale_instances(self.scale)
|
||||||
if self.orig_number_of_instances > number_of_instances:
|
|
||||||
scale_instances = (self.orig_number_of_instances -
|
|
||||||
number_of_instances)
|
|
||||||
self.scale_instances(scale_instances)
|
|
||||||
self.update_constraints()
|
|
||||||
reply_state = 'ACK_MAINTENANCE_COMPLETE'
|
reply_state = 'ACK_MAINTENANCE_COMPLETE'
|
||||||
|
|
||||||
elif (state == 'PREPARE_MAINTENANCE'
|
elif state == 'PREPARE_MAINTENANCE':
|
||||||
or state == 'PLANNED_MAINTENANCE'):
|
# TBD from contraints
|
||||||
instance_id = payload['instance_ids'][0]
|
if "MIGRATE" not in payload['allowed_actions']:
|
||||||
instance_action = (self.instance_action(instance_id,
|
raise Exception('MIGRATE not supported')
|
||||||
payload['allowed_actions']))
|
instance_ids = payload['instance_ids'][0]
|
||||||
if not instance_action:
|
self.log.info('VNFM got instance: %s' % instance_ids)
|
||||||
raise Exception('Allowed_actions not supported for %s' %
|
if instance_ids == self.active_instance_id:
|
||||||
instance_id)
|
self.switch_over_ha_instance()
|
||||||
|
# optional also in contraints
|
||||||
|
reply['instance_action'] = "MIGRATE"
|
||||||
|
reply_state = 'ACK_PREPARE_MAINTENANCE'
|
||||||
|
|
||||||
LOG.info('VNFM got instance: %s' % instance_id)
|
elif state == 'PLANNED_MAINTENANCE':
|
||||||
self.switch_over_ha_instance(instance_id)
|
# TBD from contraints
|
||||||
|
if "MIGRATE" not in payload['allowed_actions']:
|
||||||
reply['instance_action'] = instance_action
|
raise Exception('MIGRATE not supported')
|
||||||
reply_state = 'ACK_%s' % state
|
instance_ids = payload['instance_ids'][0]
|
||||||
self.instance_action_started(instance_id, instance_action)
|
self.log.info('VNFM got instance: %s' % instance_ids)
|
||||||
|
if instance_ids == self.active_instance_id:
|
||||||
|
self.switch_over_ha_instance()
|
||||||
|
# optional also in contraints
|
||||||
|
reply['instance_action'] = "MIGRATE"
|
||||||
|
reply_state = 'ACK_PLANNED_MAINTENANCE'
|
||||||
|
|
||||||
elif state == 'INSTANCE_ACTION_DONE':
|
elif state == 'INSTANCE_ACTION_DONE':
|
||||||
# TBD was action done in max_interruption_time (live migration)
|
# TBD was action done in allowed window
|
||||||
# NOTE, in EVICTION instance_id reported that was in evicted
|
self.log.info('%s' % payload['instance_ids'])
|
||||||
# node. New instance_id might be different
|
|
||||||
LOG.info('%s' % payload['instance_ids'])
|
|
||||||
self.was_instance_action_in_time(payload['instance_ids'][0])
|
|
||||||
self.update_instances()
|
|
||||||
self.update_constraints()
|
|
||||||
else:
|
else:
|
||||||
raise Exception('VNFM received event with'
|
raise Exception('VNFM received event with'
|
||||||
' unknown state %s' % state)
|
' unknown state %s' % state)
|
||||||
|
|
||||||
if reply_state:
|
if reply_state:
|
||||||
reply['session_id'] = payload['session_id']
|
self.headers['X-Auth-Token'] = self.session.get_token()
|
||||||
reply['state'] = reply_state
|
reply['state'] = reply_state
|
||||||
url = payload['reply_url']
|
url = payload['reply_url']
|
||||||
LOG.info('VNFM reply: %s' % reply)
|
self.log.info('VNFM reply: %s' % reply)
|
||||||
requests.put(url, data=json.dumps(reply), headers=self.headers)
|
requests.put(url, data=json.dumps(reply), headers=self.headers)
|
||||||
|
|
||||||
return 'OK'
|
return 'OK'
|
||||||
|
|
||||||
@app.route('/shutdown', methods=['POST'])
|
@app.route('/shutdown', methods=['POST'])
|
||||||
def shutdown():
|
def shutdown():
|
||||||
LOG.info('shutdown VNFM server at %s' % time.time())
|
self.log.info('shutdown VNFM server at %s' % time.time())
|
||||||
func = request.environ.get('werkzeug.server.shutdown')
|
func = request.environ.get('werkzeug.server.shutdown')
|
||||||
if func is None:
|
if func is None:
|
||||||
raise RuntimeError('Not running with the Werkzeug Server')
|
raise RuntimeError('Not running with the Werkzeug Server')
|
||||||
@ -543,3 +627,5 @@ if __name__ == '__main__':
|
|||||||
time.sleep(2)
|
time.sleep(2)
|
||||||
except KeyboardInterrupt:
|
except KeyboardInterrupt:
|
||||||
app_manager.stop()
|
app_manager.stop()
|
||||||
|
except Exception:
|
||||||
|
app_manager.app.stack.delete()
|
||||||
561  fenix/tools/vnfm_k8s.py  (new file)
@@ -0,0 +1,561 @@
# Copyright (c) 2020 Nokia Corporation.
# All Rights Reserved.
#
#    Licensed under the Apache License, Version 2.0 (the "License"); you may
#    not use this file except in compliance with the License. You may obtain
#    a copy of the License at
#
#         http://www.apache.org/licenses/LICENSE-2.0
#
#    Unless required by applicable law or agreed to in writing, software
#    distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
#    WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
#    License for the specific language governing permissions and limitations
#    under the License.
import aodhclient.client as aodhclient
import datetime
from flask import Flask
from flask import request
import json
from keystoneauth1 import loading
from keystoneclient import client as ks_client
from kubernetes import client
from kubernetes import config
import logging as lging
from oslo_config import cfg
from oslo_log import log as logging
import requests
import sys
from threading import Thread
import time
import yaml

try:
    import fenix.utils.identity_auth as identity_auth
except ValueError:
    sys.path.append('../utils')
    import identity_auth

LOG = logging.getLogger(__name__)
streamlog = lging.StreamHandler(sys.stdout)
LOG.logger.addHandler(streamlog)
LOG.logger.setLevel(logging.INFO)

opts = [
    cfg.StrOpt('ip',
               default='127.0.0.1',
               help='the ip of VNFM',
               required=True),
    cfg.IntOpt('port',
               default='12348',
               help='the port of VNFM',
               required=True),
]

CONF = cfg.CONF
CONF.register_opts(opts)
CONF.register_opts(identity_auth.os_opts, group='service_user')


def get_identity_auth(conf, project=None, username=None, password=None):
    loader = loading.get_plugin_loader('password')
    return loader.load_from_options(
        auth_url=conf.service_user.os_auth_url,
        username=(username or conf.service_user.os_username),
        password=(password or conf.service_user.os_password),
        user_domain_name=conf.service_user.os_user_domain_name,
        project_name=(project or conf.service_user.os_project_name),
        tenant_name=(project or conf.service_user.os_project_name),
        project_domain_name=conf.service_user.os_project_domain_name)


class VNFM(object):

    def __init__(self, conf, log):
        self.conf = conf
        self.log = log
        self.app = None

    def start(self):
        LOG.info('VNFM start......')
        self.app = VNFManager(self.conf, self.log)
        self.app.start()

    def stop(self):
        LOG.info('VNFM stop......')
        if not self.app:
            return
        self.app.headers['X-Auth-Token'] = self.app.session.get_token()
        self.app.delete_constraints()
        headers = {
            'Content-Type': 'application/json',
            'Accept': 'application/json',
        }
        url = 'http://%s:%d/shutdown'\
            % (self.conf.ip,
               self.conf.port)
        requests.post(url, data='', headers=headers)


class VNFManager(Thread):

    def __init__(self, conf, log):
        Thread.__init__(self)
        self.conf = conf
        self.log = log
        self.port = self.conf.port
        self.intance_ids = None
        # VNFM is started with OS_* exported as admin user
        # We need that to query Fenix endpoint url
        # Still we work with our tenant/poroject/vnf as demo
        self.project = "demo"
        LOG.info('VNFM project: %s' % self.project)
        self.auth = identity_auth.get_identity_auth(conf, project=self.project)
        self.session = identity_auth.get_session(auth=self.auth)
        self.ks = ks_client.Client(version='v3', session=self.session)
        self.aodh = aodhclient.Client(2, self.session)
        # Subscribe to mainenance event alarm from Fenix via AODH
        self.create_alarm()
        config.load_kube_config()
        self.kaapi = client.AppsV1Api()
        self.kapi = client.CoreV1Api()
        self.headers = {
            'Content-Type': 'application/json',
            'Accept': 'application/json'}
        self.headers['X-Auth-Token'] = self.session.get_token()
        self.orig_number_of_instances = self.number_of_instances()
        # List of instances
        self.ha_instances = []
        self.nonha_instances = []
        # Different instance_id specific constraints {instanse_id: {},...}
        self.instance_constraints = None
        # Update existing instances to instance lists
        self.update_instances()
        # How many instances needs to exists (with current VNF load)
        # max_impacted_members need to be updated accordingly
        # if number of instances is scaled. example for demo-ha:
        # max_impacted_members = len(self.ha_instances) - ha_group_limit
        self.ha_group_limit = 2
        self.nonha_group_limit = 2
        # Different instance groups constraints dict
        self.ha_group = None
        self.nonha_group = None
        auth = get_identity_auth(conf,
                                 project='service',
                                 username='fenix',
                                 password='admin')
        session = identity_auth.get_session(auth=auth)
        keystone = ks_client.Client(version='v3', session=session)
        # VNF project_id (VNF ID)
        self.project_id = self.session.get_project_id()
        # HA instance_id that is active has active label
        self.active_instance_id = self.active_instance_id()
        services = keystone.services.list()
        for service in services:
            if service.type == 'maintenance':
                LOG.info('maintenance service: %s:%s type %s'
                         % (service.name, service.id, service.type))
                maint_id = service.id
        self.maint_endpoint = [ep.url for ep in keystone.endpoints.list()
                               if ep.service_id == maint_id and
                               ep.interface == 'public'][0]
        LOG.info('maintenance endpoint: %s' % self.maint_endpoint)
        self.update_constraints_lock = False
        self.update_constraints()
        # Instances waiting action to be done
        self.pending_actions = {}

    def create_alarm(self):
        alarms = {alarm['name']: alarm for alarm in self.aodh.alarm.list()}
        alarm_name = "%s_MAINTENANCE_ALARM" % self.project
        if alarm_name in alarms:
            return
        alarm_request = dict(
            name=alarm_name,
            description=alarm_name,
            enabled=True,
            alarm_actions=[u'http://%s:%d/maintenance'
                           % (self.conf.ip,
                              self.conf.port)],
            repeat_actions=True,
            severity=u'moderate',
            type=u'event',
            event_rule=dict(event_type=u'maintenance.scheduled'))
        self.aodh.alarm.create(alarm_request)

    def delete_remote_instance_constraints(self, instance_id):
        url = "%s/instance/%s" % (self.maint_endpoint, instance_id)
        LOG.info('DELETE: %s' % url)
        ret = requests.delete(url, data=None, headers=self.headers)
        if ret.status_code != 200 and ret.status_code != 204:
            if ret.status_code == 404:
                LOG.info('Already deleted: %s' % instance_id)
            else:
                raise Exception(ret.text)

    def update_remote_instance_constraints(self, instance):
        url = "%s/instance/%s" % (self.maint_endpoint, instance["instance_id"])
        LOG.info('PUT: %s' % url)
        ret = requests.put(url, data=json.dumps(instance),
                           headers=self.headers)
        if ret.status_code != 200 and ret.status_code != 204:
            raise Exception(ret.text)

    def delete_remote_group_constraints(self, instance_group):
        url = "%s/instance_group/%s" % (self.maint_endpoint,
                                        instance_group["group_id"])
        LOG.info('DELETE: %s' % url)
        ret = requests.delete(url, data=None, headers=self.headers)
        if ret.status_code != 200 and ret.status_code != 204:
            raise Exception(ret.text)

    def update_remote_group_constraints(self, instance_group):
        url = "%s/instance_group/%s" % (self.maint_endpoint,
                                        instance_group["group_id"])
        LOG.info('PUT: %s' % url)
        ret = requests.put(url, data=json.dumps(instance_group),
                           headers=self.headers)
        if ret.status_code != 200 and ret.status_code != 204:
            raise Exception(ret.text)

    def delete_constraints(self):
        for instance_id in self.instance_constraints:
            self.delete_remote_instance_constraints(instance_id)
        self.delete_remote_group_constraints(self.nonha_group)
        self.delete_remote_group_constraints(self.ha_group)

    def update_constraints(self):
        while self.update_constraints_lock:
            LOG.info('Waiting update_constraints_lock...')
            time.sleep(1)
        self.update_constraints_lock = True
        LOG.info('Update constraints')
        # Pods groupped by ReplicaSet, so we use that id
        rs = {r.metadata.name: r.metadata.uid for r in
              self.kaapi.list_namespaced_replica_set('demo').items}
        max_impacted_members = len(self.nonha_instances) - 1
        nonha_group = {
            "group_id": rs['demo-nonha'],
            "project_id": self.project_id,
            "group_name": "demo-nonha",
            "anti_affinity_group": False,
            "max_instances_per_host": 0,
            "max_impacted_members": max_impacted_members,
            "recovery_time": 10,
            "resource_mitigation": True}
        LOG.info('create demo-nonha constraints: %s'
                 % nonha_group)
        ha_group = {
            "group_id": rs['demo-ha'],
            "project_id": self.project_id,
            "group_name": "demo-ha",
            "anti_affinity_group": True,
            "max_instances_per_host": 1,
            "max_impacted_members": 1,
            "recovery_time": 10,
            "resource_mitigation": True}
        LOG.info('create demo-ha constraints: %s'
                 % ha_group)
        if not self.ha_group or self.ha_group != ha_group:
            LOG.info('ha instance group need update')
            self.update_remote_group_constraints(ha_group)
            self.ha_group = ha_group.copy()
        if not self.nonha_group or self.nonha_group != nonha_group:
            LOG.info('nonha instance group need update')
            self.update_remote_group_constraints(nonha_group)
            self.nonha_group = nonha_group.copy()

        instance_constraints = {}
        for ha_instance in self.ha_instances:
            instance = {
                "instance_id": ha_instance.metadata.uid,
                "project_id": self.project_id,
                "group_id": ha_group["group_id"],
                "instance_name": ha_instance.metadata.name,
                "max_interruption_time": 120,
                "migration_type": "EVICTION",
                "resource_mitigation": True,
                "lead_time": 40}
            LOG.info('create ha instance constraints: %s' % instance)
            instance_constraints[ha_instance.metadata.uid] = instance
        for nonha_instance in self.nonha_instances:
            instance = {
                "instance_id": nonha_instance.metadata.uid,
                "project_id": self.project_id,
                "group_id": nonha_group["group_id"],
                "instance_name": nonha_instance.metadata.name,
                "max_interruption_time": 120,
                "migration_type": "EVICTION",
                "resource_mitigation": True,
                "lead_time": 40}
            LOG.info('create nonha instance constraints: %s' % instance)
            instance_constraints[nonha_instance.metadata.uid] = instance
        if not self.instance_constraints:
            # Initial instance constraints
            LOG.info('create initial instances constraints...')
            for instance in [instance_constraints[i] for i
                             in instance_constraints]:
                self.update_remote_instance_constraints(instance)
            self.instance_constraints = instance_constraints.copy()
        else:
            LOG.info('check instances constraints changes...')
            added = [i for i in instance_constraints.keys()
                     if i not in self.instance_constraints]
            deleted = [i for i in self.instance_constraints.keys()
                       if i not in instance_constraints]
            modified = [i for i in instance_constraints.keys()
                        if (i not in added and i not in deleted and
                            instance_constraints[i] !=
                            self.instance_constraints[i])]
            for instance_id in deleted:
                self.delete_remote_instance_constraints(instance_id)
            updated = added + modified
            for instance in [instance_constraints[i] for i in updated]:
                self.update_remote_instance_constraints(instance)
            if updated or deleted:
                # Some instance constraints have changed
                self.instance_constraints = instance_constraints.copy()
        self.update_constraints_lock = False

    def active_instance_id(self):
        # We digtate the active in the beginning
        instance = self.ha_instances[0]
        LOG.info('Initially Active instance: %s %s' %
                 (instance.metadata.name, instance.metadata.uid))
        name = instance.metadata.name
        namespace = instance.metadata.namespace
        body = {"metadata": {"labels": {"active": "True"}}}
        self.kapi.patch_namespaced_pod(name, namespace, body)
        self.active_instance_id = instance.metadata.uid

    def switch_over_ha_instance(self, instance_id):
        if instance_id == self.active_instance_id:
            # Need to switchover as instance_id will be affected and is active
            for instance in self.ha_instances:
                if instance_id == instance.metadata.uid:
                    LOG.info('Active to Standby: %s %s' %
                             (instance.metadata.name, instance.metadata.uid))
                    name = instance.metadata.name
                    namespace = instance.metadata.namespace
                    body = client.UNKNOWN_BASE_TYPE()
                    body.metadata.labels = {"ative": None}
                    self.kapi.patch_namespaced_pod(name, namespace, body)
                else:
                    LOG.info('Standby to Active: %s %s' %
                             (instance.metadata.name, instance.metadata.uid))
                    name = instance.metadata.name
                    namespace = instance.metadata.namespace
                    body = client.UNKNOWN_BASE_TYPE()
                    body.metadata.labels = {"ative": "True"}
                    self.kapi.patch_namespaced_pod(name, namespace, body)
                    self.active_instance_id = instance.metadata.uid
            self.update_instances()

    def get_instance_ids(self):
        instances = self.kapi.list_pod_for_all_namespaces().items
        return [i.metadata.uid for i in instances
                if i.metadata.name.startswith("demo-") and
                i.metadata.namespace == "demo"]

    def update_instances(self):
        instances = self.kapi.list_pod_for_all_namespaces().items
        self.ha_instances = [i for i in instances
                             if i.metadata.name.startswith("demo-ha") and
                             i.metadata.namespace == "demo"]
        self.nonha_instances = [i for i in instances
                                if i.metadata.name.startswith("demo-nonha") and
                                i.metadata.namespace == "demo"]

    def _alarm_data_decoder(self, data):
        if "[" in data or "{" in data:
            # string to list or dict removing unicode
            data = yaml.load(data.replace("u'", "'"))
        return data

    def _alarm_traits_decoder(self, data):
        return ({str(t[0]): self._alarm_data_decoder(str(t[2]))
                for t in data['reason_data']['event']['traits']})

    def get_session_instance_ids(self, url, session_id):
        ret = requests.get(url, data=None, headers=self.headers)
        if ret.status_code != 200:
            raise Exception(ret.text)
        LOG.info('get_instance_ids %s' % ret.json())
        return ret.json()['instance_ids']

    def scale_instances(self, scale_instances):
        number_of_instances_before = len(self.nonha_instances)
        replicas = number_of_instances_before + scale_instances

        # We only scale nonha apps
        namespace = "demo"
        name = "demo-nonha"
        body = {'spec': {"replicas": replicas}}
        self.kaapi.patch_namespaced_replica_set_scale(name, namespace, body)
        time.sleep(3)

        # Let's check if scale has taken effect
        self.update_instances()
        number_of_instances_after = len(self.nonha_instances)
        check = 20
        while number_of_instances_after == number_of_instances_before:
            if check == 0:
                LOG.error('scale_instances with: %d failed, still %d instances'
                          % (scale_instances, number_of_instances_after))
                raise Exception('scale_instances failed')
            check -= 1
            time.sleep(1)
            self.update_instances()
            number_of_instances_after = len(self.nonha_instances)

        LOG.info('scaled instances from %d to %d' %
                 (number_of_instances_before, number_of_instances_after))

    def number_of_instances(self):
        instances = self.kapi.list_pod_for_all_namespaces().items
        return len([i for i in instances
                    if i.metadata.name.startswith("demo-")])

    def instance_action(self, instance_id, allowed_actions):
        # We should keep instance constraint in our internal structur
        # and match instance_id specific allowed action. Now we assume EVICTION
        if 'EVICTION' not in allowed_actions:
            LOG.error('Action for %s not foudn from %s' %
                      (instance_id, allowed_actions))
            return None
        return 'EVICTION'

    def instance_action_started(self, instance_id, action):
        time_now = datetime.datetime.utcnow()
        max_interruption_time = (
            self.instance_constraints[instance_id]['max_interruption_time'])
        self.pending_actions[instance_id] = {
            'started': time_now,
            'max_interruption_time': max_interruption_time,
            'action': action}

    def was_instance_action_in_time(self, instance_id):
        time_now = datetime.datetime.utcnow()
        started = self.pending_actions[instance_id]['started']
        limit = self.pending_actions[instance_id]['max_interruption_time']
        action = self.pending_actions[instance_id]['action']
        td = time_now - started
        if td.total_seconds() > limit:
            LOG.error('%s %s took too long: %ds' %
                      (instance_id, action, td.total_seconds()))
            LOG.error('%s max_interruption_time %ds might be too short' %
                      (instance_id, limit))
            raise Exception('%s %s took too long: %ds' %
                            (instance_id, action, td.total_seconds()))
        else:
            LOG.info('%s %s with recovery time took %ds' %
                     (instance_id, action, td.total_seconds()))
        del self.pending_actions[instance_id]

    def run(self):
        app = Flask('VNFM')

        @app.route('/maintenance', methods=['POST'])
        def maintenance_alarm():
            data = json.loads(request.data.decode('utf8'))
            try:
                payload = self._alarm_traits_decoder(data)
            except Exception:
                payload = ({t[0]: t[2] for t in
                           data['reason_data']['event']['traits']})
                LOG.error('cannot parse alarm data: %s' % payload)
                raise Exception('VNFM cannot parse alarm.'
                                'Possibly trait data over 256 char')

            LOG.info('VNFM received data = %s' % payload)

            state = payload['state']
            reply_state = None
            reply = dict()

            LOG.info('VNFM state: %s' % state)

            if state == 'MAINTENANCE':
                self.headers['X-Auth-Token'] = self.session.get_token()
                instance_ids = (self.get_session_instance_ids(
                    payload['instance_ids'],
                    payload['session_id']))
                reply['instance_ids'] = instance_ids
                reply_state = 'ACK_MAINTENANCE'

            elif state == 'SCALE_IN':
                # scale down only nonha instances
                nonha_instances = len(self.nonha_instances)
                scale_in = nonha_instances / 2
                self.scale_instances(-scale_in)
                self.update_constraints()
                reply['instance_ids'] = self.get_instance_ids()
                reply_state = 'ACK_SCALE_IN'

            elif state == 'MAINTENANCE_COMPLETE':
                # possibly need to upscale
                number_of_instances = self.number_of_instances()
                if self.orig_number_of_instances > number_of_instances:
                    scale_instances = (self.orig_number_of_instances -
                                       number_of_instances)
                    self.scale_instances(scale_instances)
                self.update_constraints()
                reply_state = 'ACK_MAINTENANCE_COMPLETE'

            elif (state == 'PREPARE_MAINTENANCE' or
                  state == 'PLANNED_MAINTENANCE'):
                instance_id = payload['instance_ids'][0]
                instance_action = (self.instance_action(instance_id,
                                   payload['allowed_actions']))
                if not instance_action:
                    raise Exception('Allowed_actions not supported for %s' %
                                    instance_id)

                LOG.info('VNFM got instance: %s' % instance_id)
                self.switch_over_ha_instance(instance_id)

                reply['instance_action'] = instance_action
                reply_state = 'ACK_%s' % state
                self.instance_action_started(instance_id, instance_action)

            elif state == 'INSTANCE_ACTION_DONE':
                # TBD was action done in max_interruption_time (live migration)
                # NOTE, in EVICTION instance_id reported that was in evicted
                # node. New instance_id might be different
                LOG.info('%s' % payload['instance_ids'])
                self.was_instance_action_in_time(payload['instance_ids'][0])
                self.update_instances()
                self.update_constraints()
            else:
                raise Exception('VNFM received event with'
                                ' unknown state %s' % state)

            if reply_state:
                reply['session_id'] = payload['session_id']
                reply['state'] = reply_state
                url = payload['reply_url']
                LOG.info('VNFM reply: %s' % reply)
                requests.put(url, data=json.dumps(reply), headers=self.headers)

            return 'OK'

        @app.route('/shutdown', methods=['POST'])
        def shutdown():
            LOG.info('shutdown VNFM server at %s' % time.time())
            func = request.environ.get('werkzeug.server.shutdown')
            if func is None:
                raise RuntimeError('Not running with the Werkzeug Server')
            func()
            return 'VNFM shutting down...'

        app.run(host="0.0.0.0", port=self.port)


if __name__ == '__main__':
    app_manager = VNFM(CONF, LOG)
    app_manager.start()
    try:
        LOG.info('Press CTRL + C to quit')
        while True:
            time.sleep(2)
    except KeyboardInterrupt:
        app_manager.stop()
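Both sample VNFMs receive the Fenix project messages through an AODH event alarm on the 'maintenance.scheduled' event type, as in create_alarm() above. A standalone sketch of the same subscription, assuming an authenticated Keystone session and a reachable callback URL (the names and values below are placeholders, not part of this change):

    import aodhclient.client as aodhclient


    def subscribe_to_maintenance(session, callback_url,
                                 name='demo_MAINTENANCE_ALARM'):
        aodh = aodhclient.Client(2, session)
        if name in [alarm['name'] for alarm in aodh.alarm.list()]:
            return
        aodh.alarm.create(dict(
            name=name,
            description=name,
            enabled=True,
            alarm_actions=[callback_url],   # e.g. the VNFM /maintenance route
            repeat_actions=True,
            severity='moderate',
            type='event',
            event_rule=dict(event_type='maintenance.scheduled')))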
@@ -94,7 +94,36 @@ class RPCClient(object):
 class EngineEndpoint(object):

     def __init__(self):
+        sessions = db_api.get_sessions()
         self.workflow_sessions = {}
+        if sessions:
+            LOG.info("Initialize workflows from DB")
+            for session in sessions:
+                session_id = session.session_id
+                LOG.info("Session %s from DB" % session.session_id)
+                workflow = "fenix.workflow.workflows.%s" % session.workflow
+                LOG.info("Workflow plugin module: %s" % workflow)
+                try:
+                    wf_plugin = getattr(import_module(workflow), 'Workflow')
+                    self.workflow_sessions[session_id] = wf_plugin(CONF,
+                                                                   session_id,
+                                                                   None)
+                except ImportError:
+                    session_dir = "%s/%s" % (CONF.local_cache_dir, session_id)
+                    download_plugin_dir = session_dir + "/workflow/"
+                    download_plugin_file = "%s/%s.py" % (download_plugin_dir,
+                                                         session.workflow)
+                    if os.path.isfile(download_plugin_file):
+                        self.workflow_sessions[session_id] = (
+                            source_loader_workflow_instance(
+                                workflow,
+                                download_plugin_file,
+                                CONF,
+                                session_id,
+                                None))
+                    else:
+                        raise Exception('%s: could not find workflow plugin %s'
+                                        % (session_id, session.workflow))

     def _validate_session(self, session_id):
         if session_id not in self.workflow_sessions.keys():
@@ -144,7 +173,7 @@ class EngineEndpoint(object):
                                             data))
             else:
                 raise Exception('%s: could not find workflow plugin %s' %
-                                (self.session_id, data["workflow"]))
+                                (session_id, data["workflow"]))

         self.workflow_sessions[session_id].start()
         return {"session_id": session_id}
@@ -154,8 +183,23 @@ class EngineEndpoint(object):
         if not self._validate_session(session_id):
             return None
         LOG.info("EngineEndpoint: admin_get_session")
-        return ({"session_id": session_id, "state":
-                 self.workflow_sessions[session_id].session.state})
+        return {"session_id": session_id, "state":
+                self.workflow_sessions[session_id].session.state}
+
+    def admin_get_session_detail(self, ctx, session_id):
+        """Get maintenance workflow session details"""
+        if not self._validate_session(session_id):
+            return None
+        LOG.info("EngineEndpoint: admin_get_session_detail")
+        sess = self.workflow_sessions[session_id]
+        return {"session_id": session_id,
+                "state": sess.session.state,
+                "percent_done": sess.session_report["last_percent"],
+                "session": sess.session,
+                "hosts": sess.hosts,
+                "instances": sess.instances,
+                "action_plugin_instances": db_api.get_action_plugin_instances(
+                    session_id)}

     def admin_delete_session(self, ctx, session_id):
         """Delete maintenance workflow session thread"""
@@ -198,6 +242,7 @@ class EngineEndpoint(object):
         session_obj = self.workflow_sessions[session_id]
         project = session_obj.project(project_id)
         project.state = data["state"]
+        db_api.update_project(project)
         if "instance_actions" in data:
             session_obj.proj_instance_actions[project_id] = (
                 data["instance_actions"].copy())
@@ -212,6 +257,7 @@ class EngineEndpoint(object):
         instance.project_state = data["state"]
         if "instance_action" in data:
             instance.action = data["instance_action"]
+        db_api.update_instance(instance)
         return data

     def get_instance(self, ctx, instance_id):
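The new admin_get_session_detail endpoint above lets an admin tool poll workflow progress instead of only the coarse session state. A minimal polling sketch, assuming the engine's admin REST API is reachable at the URL below (the URL, port and token handling are illustrative placeholders, not part of this change):

    import time

    import requests

    # Hypothetical values; resolve the real endpoint and token via Keystone.
    DETAIL_URL = 'http://127.0.0.1:12347/v1/maintenance/%s/detail'
    HEADERS = {'Accept': 'application/json', 'X-Auth-Token': '<admin-token>'}


    def wait_for_maintenance(session_id):
        while True:
            detail = requests.get(DETAIL_URL % session_id,
                                  headers=HEADERS).json()
            # 'state' and 'percent_done' mirror admin_get_session_detail()
            print('%s: %s (%d%% of hosts maintained)' %
                  (session_id, detail['state'], detail['percent_done']))
            if detail['state'] in ('MAINTENANCE_DONE', 'MAINTENANCE_FAILED'):
                return detail
            time.sleep(10)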
@@ -12,8 +12,10 @@
 # WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 # License for the specific language governing permissions and limitations
 # under the License.
+from fenix.db import api as db_api
 from oslo_log import log as logging
 import subprocess
+import time

 LOG = logging.getLogger(__name__)

@@ -32,10 +34,12 @@ class ActionPlugin(object):
             output = subprocess.check_output("echo Dummy running in %s" %
                                              self.hostname,
                                              shell=True)
+            time.sleep(1)
             self.ap_dbi.state = "DONE"
         except subprocess.CalledProcessError:
             self.ap_dbi.state = "FAILED"
         finally:
+            db_api.update_action_plugin_instance(self.ap_dbi)
             LOG.debug("%s: OUTPUT: %s" % (self.wf.session_id, output))
             LOG.info("%s: Dummy action plugin state: %s" % (self.wf.session_id,
                                                             self.ap_dbi.state))
@@ -34,31 +34,55 @@ LOG = logging.getLogger(__name__)

 class BaseWorkflow(Thread):

-    def __init__(self, conf, session_id, data):
+    def __init__(self, conf, session_id, data=None):
+        # if data not set, we initialize from DB
         Thread.__init__(self)
         self.conf = conf
         self.session_id = session_id
         self.stopped = False
         self.thg = threadgroup.ThreadGroup()
         self.timer = {}
-        self.session = self._init_session(data)
+        if data:
+            self.session = self._init_session(data)
+        else:
+            self.session = db_api.get_session(session_id)
+            LOG.info('%s session from DB: %s' % (self.session_id,
+                                                 self.session.state))
+
         self.hosts = []
-        if "hosts" in data and data['hosts']:
+        if not data:
+            self.hosts = db_api.get_hosts(session_id)
+        elif "hosts" in data and data['hosts']:
             # Hosts given as input, not to be discovered in workflow
             self.hosts = self.init_hosts(self.convert(data['hosts']))
         else:
             LOG.info('%s: No hosts as input' % self.session_id)
-        if "actions" in data:
+
+        if not data:
+            self.actions = db_api.get_action_plugins(session_id)
+        elif "actions" in data:
             self.actions = self._init_action_plugins(data["actions"])
         else:
             self.actions = []
-        if "download" in data:
+
+        if not data:
+            self.downloads = db_api.get_downloads(session_id)
+        elif "download" in data:
             self.downloads = self._init_downloads(data["download"])
         else:
             self.downloads = []
-        self.projects = []
-        self.instances = []
+
+        if not data:
+            self.projects = db_api.get_projects(session_id)
+        else:
+            self.projects = []
+
+        if not data:
+            self.instances = db_api.get_instances(session_id)
+        else:
+            self.instances = []
+
         self.proj_instance_actions = {}

         self.states_methods = {'MAINTENANCE': 'maintenance',
@@ -72,6 +96,7 @@ class BaseWorkflow(Thread):
         self.url = "http://%s:%s" % (conf.host, conf.port)
         self.auth = get_identity_auth(conf)
         self.auth_session = get_session(auth=self.auth)
+        self.project_id = self.auth_session.get_project_id()
         self.aodh = aodhclient.Client('2', self.auth_session)
         transport = messaging.get_transport(self.conf)
         self.notif_proj = messaging.Notifier(transport,
@@ -84,6 +109,13 @@ class BaseWorkflow(Thread):
                                              driver='messaging',
                                              topics=['notifications'])
         self.notif_admin = self.notif_admin.prepare(publisher_id='fenix')
+        self.notif_sess = messaging.Notifier(transport,
+                                             'maintenance.session',
+                                             driver='messaging',
+                                             topics=['notifications'])
+        self.notif_sess = self.notif_sess.prepare(publisher_id='fenix')
+
+        self.session_report = {'last_percent': 0, 'last_state': None}

     def init_hosts(self, hostnames):
         LOG.info('%s: init_hosts: %s' % (self.session_id, hostnames))
@@ -174,6 +206,12 @@ class BaseWorkflow(Thread):
         return [host.hostname for host in self.hosts if host.maintained and
                 host.type == host_type]

+    def get_maintained_percent(self):
+        maintained_hosts = float(len([host for host in self.hosts
+                                      if host.maintained]))
+        all_hosts = float(len(self.hosts))
+        return int(maintained_hosts / all_hosts * 100)
+
     def get_disabled_hosts(self):
         return [host for host in self.hosts if host.disabled]

@@ -195,6 +233,7 @@ class BaseWorkflow(Thread):
         if host_obj:
             if len(host_obj) == 1:
                 host_obj[0].maintained = True
+                db_api.update_host(host_obj[0])
             else:
                 raise Exception('host_maintained: %s has duplicate entries' %
                                 hostname)
@@ -230,8 +269,10 @@ class BaseWorkflow(Thread):
     def set_projets_state(self, state):
         for project in self.projects:
             project.state = state
+            db_api.update_project(project)
         for instance in self.instances:
             instance.project_state = None
+            db_api.update_instance(instance)

     def project_has_state_instances(self, project_id):
         instances = ([instance.instance_id for instance in self.instances if
@@ -254,11 +295,13 @@ class BaseWorkflow(Thread):
                 instance.project_state = state
             else:
                 instance.project_state = None
+            db_api.update_instance(instance)
             if state_instances:
                 some_project_has_instances = True
                 project.state = state
             else:
                 project.state = None
+            db_api.update_project(project)
         if not some_project_has_instances:
             LOG.error('%s: No project has instances on hosts %s' %
                       (self.session_id, hosts))
@@ -410,6 +453,10 @@ class BaseWorkflow(Thread):
         # TBD we could notify admin for workflow state change
         self.session.prev_state = self.session.state
         self.session.state = state
+        self.session = db_api.update_session(self.session)
+        self._session_notify(state,
+                             self.get_maintained_percent(),
+                             self.session_id)
         if state in ["MAINTENANCE_DONE", "MAINTENANCE_FAILED"]:
             try:
                 statefunc = (getattr(self,
@@ -481,14 +528,35 @@ class BaseWorkflow(Thread):
         self.notif_proj.info({'some': 'context'}, 'maintenance.scheduled',
                              payload)

-    def _admin_notify(self, project, host, state, session_id):
-        payload = dict(project_id=project, host=host, state=state,
+    def _admin_notify(self, host, state, session_id):
+        payload = dict(project_id=self.project_id, host=host, state=state,
                        session_id=session_id)

         LOG.info('Sending "maintenance.host": %s' % payload)

         self.notif_admin.info({'some': 'context'}, 'maintenance.host', payload)

+    def _session_notify(self, state, percent_done, session_id):
+        # There is race in threads to send this message
+        # Maintenance can be further away with other thread
+        if self.session_report['last_percent'] > percent_done:
+            percent_done = self.session_report['last_percent']
+            if self.session_report['last_state'] == state:
+                return
+        else:
+            self.session_report['last_percent'] = percent_done
+        self.session_report['last_state'] = state
+        payload = dict(project_id=self.project_id,
+                       state=state,
+                       percent_done=percent_done,
+                       session_id=session_id)
+
+        LOG.info('Sending "maintenance.session": %s' % payload)
+
+        self.notif_sess.info({'some': 'context'},
+                             'maintenance.session',
+                             payload)
+
     def projects_answer(self, state, projects):
         state_ack = 'ACK_%s' % state
         state_nack = 'NACK_%s' % state
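With the addition above, the workflow now emits a 'maintenance.session' notification carrying the session state and percent_done alongside the existing 'maintenance.host' event. A minimal listener sketch with oslo.messaging, assuming the same 'notifications' topic as in the workflow code (the transport configuration and executor choice here are illustrative):

    from oslo_config import cfg
    import oslo_messaging


    class SessionEndpoint(object):
        def info(self, ctxt, publisher_id, event_type, payload, metadata):
            if event_type == 'maintenance.session':
                # payload matches _session_notify(): project_id, state,
                # percent_done and session_id
                print('%s: %s, %s%% done' % (payload['session_id'],
                                             payload['state'],
                                             payload['percent_done']))


    transport = oslo_messaging.get_notification_transport(cfg.CONF)
    targets = [oslo_messaging.Target(topic='notifications')]
    listener = oslo_messaging.get_notification_listener(
        transport, targets, [SessionEndpoint()], executor='threading')
    listener.start()
    listener.wait()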
@ -140,6 +140,7 @@ class Workflow(BaseWorkflow):
|
|||||||
host.type = 'controller'
|
host.type = 'controller'
|
||||||
continue
|
continue
|
||||||
host.type = 'other'
|
host.type = 'other'
|
||||||
|
db_api.update_host(host)
|
||||||
|
|
||||||
def disable_host_nova_compute(self, hostname):
|
def disable_host_nova_compute(self, hostname):
|
||||||
LOG.info('%s: disable nova-compute on host %s' % (self.session_id,
|
LOG.info('%s: disable nova-compute on host %s' % (self.session_id,
|
||||||
@ -153,6 +154,7 @@ class Workflow(BaseWorkflow):
|
|||||||
self.nova.services.disable_log_reason(hostname, "nova-compute",
|
self.nova.services.disable_log_reason(hostname, "nova-compute",
|
||||||
"maintenance")
|
"maintenance")
|
||||||
host.disabled = True
|
host.disabled = True
|
||||||
|
db_api.update_host(host)
|
||||||
|
|
||||||
def enable_host_nova_compute(self, hostname):
|
def enable_host_nova_compute(self, hostname):
|
||||||
LOG.info('%s: enable nova-compute on host %s' % (self.session_id,
|
LOG.info('%s: enable nova-compute on host %s' % (self.session_id,
|
||||||
@ -165,6 +167,7 @@ class Workflow(BaseWorkflow):
|
|||||||
(self.session_id, hostname))
|
(self.session_id, hostname))
|
||||||
self.nova.services.enable(hostname, "nova-compute")
|
self.nova.services.enable(hostname, "nova-compute")
|
||||||
host.disabled = False
|
host.disabled = False
|
||||||
|
db_api.update_host(host)
|
||||||
|
|
||||||
def get_compute_hosts(self):
|
def get_compute_hosts(self):
|
||||||
return [host.hostname for host in self.hosts
|
return [host.hostname for host in self.hosts
|
||||||
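The hunks above disable and re-enable the nova-compute service around host maintenance and now persist the flag through db_api. A hedged sketch of verifying that from outside the workflow with python-novaclient; the auth options and host name are assumptions:

from keystoneauth1 import loading, session
from novaclient import client as novaclient

loader = loading.get_plugin_loader('password')
auth = loader.load_from_options(auth_url='http://controller:5000/v3',  # assumed
                                username='admin', password='secret',
                                project_name='admin',
                                user_domain_name='Default',
                                project_domain_name='Default')
nova = novaclient.Client('2.53', session=session.Session(auth=auth))
# During maintenance the service shows status 'disabled' with reason
# 'maintenance'; after enable_host_nova_compute() it is 'enabled' again.
for svc in nova.services.list(host='compute-1', binary='nova-compute'):
    print(svc.host, svc.status, getattr(svc, 'disabled_reason', None))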
@ -408,8 +411,8 @@ class Workflow(BaseWorkflow):

     def get_free_vcpus_by_host(self, host, hvisors):
         hvisor = ([h for h in hvisors if
-                   h.__getattr__('hypervisor_hostname').split(".", 1)[0]
-                   == host][0])
+                   h.__getattr__(
+                       'hypervisor_hostname').split(".", 1)[0] == host][0])
         vcpus = hvisor.__getattr__('vcpus')
         vcpus_used = hvisor.__getattr__('vcpus_used')
         return vcpus - vcpus_used
@ -547,6 +550,7 @@ class Workflow(BaseWorkflow):
         reply_at = None
         state = "INSTANCE_ACTION_DONE"
         instance.project_state = state
+        db_api.update_instance(instance)
         metadata = "{}"
         self._project_notify(project, instance_ids, allowed_actions,
                              actions_at, reply_at, state, metadata)
@ -561,6 +565,7 @@ class Workflow(BaseWorkflow):
                       project, instance.instance_id))
             LOG.info('Action %s instance %s ' % (instance.action,
                                                  instance.instance_id))
+            db_api.update_instance(instance)
             if instance.action == 'MIGRATE':
                 if not self.migrate_server(instance):
                     return False
@ -576,6 +581,12 @@ class Workflow(BaseWorkflow):
                           '%s not supported' %
                           (self.session_id, instance.instance_id,
                            instance.action))
+            server = self.nova.servers.get(instance.instance_id)
+            instance.host = (
+                str(server.__dict__.get('OS-EXT-SRV-ATTR:host')))
+            instance.state = server.__dict__.get('OS-EXT-STS:vm_state')
+            instance.action = None
+            db_api.update_instance(instance)
         return self._wait_host_empty(host)

     def _wait_host_empty(self, host):
@ -625,6 +636,7 @@ class Workflow(BaseWorkflow):
                 if instance.state == 'error':
                     LOG.error('instance %s live migration failed'
                               % server_id)
+                    db_api.update_instance(instance)
                     return False
                 elif orig_vm_state != instance.state:
                     LOG.info('instance %s state changed: %s' % (server_id,
@ -632,6 +644,7 @@ class Workflow(BaseWorkflow):
                 elif host != orig_host:
                     LOG.info('instance %s live migrated to host %s' %
                              (server_id, host))
+                    db_api.update_instance(instance)
                     return True
                 migration = (
                     self.nova.migrations.list(instance_uuid=server_id)[0])
@ -664,6 +677,7 @@ class Workflow(BaseWorkflow):
         except Exception as e:
             LOG.error('server %s live migration failed, Exception=%s' %
                       (server_id, e))
+            db_api.update_instance(instance)
             return False

     def migrate_server(self, instance):
@ -693,6 +707,7 @@ class Workflow(BaseWorkflow):
                     LOG.info('instance %s migration resized to host %s' %
                              (server_id, host))
                     instance.host = host
+                    db_api.update_instance(instance)
                     return True
                 if last_vm_state != instance.state:
                     LOG.info('instance %s state changed: %s' % (server_id,
@ -701,6 +716,7 @@ class Workflow(BaseWorkflow):
                     LOG.error('instance %s migration failed, state: %s'
                               % (server_id, instance.state))
                     instance.host = host
+                    db_api.update_instance(instance)
                     return False
                 time.sleep(5)
                 retries = retries - 1
@ -712,6 +728,7 @@ class Workflow(BaseWorkflow):
                 if retry_migrate == 0:
                     LOG.error('server %s migrate failed after retries' %
                               server_id)
+                    db_api.update_instance(instance)
                     return False
                 # Might take time for scheduler to sync inconsistent instance
                 # list for host
@ -723,11 +740,13 @@ class Workflow(BaseWorkflow):
             except Exception as e:
                 LOG.error('server %s migration failed, Exception=%s' %
                           (server_id, e))
+                db_api.update_instance(instance)
                 return False
             finally:
                 retry_migrate = retry_migrate - 1
         LOG.error('instance %s migration timeout, state: %s' %
                   (server_id, instance.state))
+        db_api.update_instance(instance)
         return False

     def maintenance_by_plugin_type(self, hostname, plugin_type):
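The live-migration handling above polls nova.migrations.list() and now writes the instance row on every outcome. A standalone sketch of that polling idea; the nova handle, server_id and timings are assumptions, not the workflow's actual retry logic:

import time


def wait_live_migration(nova, server_id, timeout=300, interval=5):
    # Poll the newest migration record for the instance until it settles.
    waited = 0
    while waited < timeout:
        migration = nova.migrations.list(instance_uuid=server_id)[0]
        if migration.status in ('completed', 'error'):
            return migration.status
        time.sleep(interval)
        waited += interval
    return 'timeout'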
@ -889,13 +908,11 @@ class Workflow(BaseWorkflow):
             self.disable_host_nova_compute(compute)
         for host in self.get_controller_hosts():
             LOG.info('IN_MAINTENANCE controller %s' % host)
-            self._admin_notify(self.conf.service_user.os_project_name,
-                               host,
+            self._admin_notify(host,
                                'IN_MAINTENANCE',
                                self.session_id)
             self.host_maintenance(host)
-            self._admin_notify(self.conf.service_user.os_project_name,
-                               host,
+            self._admin_notify(host,
                                'MAINTENANCE_COMPLETE',
                                self.session_id)
             LOG.info('MAINTENANCE_COMPLETE controller %s' % host)
@ -908,13 +925,11 @@ class Workflow(BaseWorkflow):
             self._wait_host_empty(host)

             LOG.info('IN_MAINTENANCE compute %s' % host)
-            self._admin_notify(self.conf.service_user.os_project_name,
-                               host,
+            self._admin_notify(host,
                                'IN_MAINTENANCE',
                                self.session_id)
             self.host_maintenance(host)
-            self._admin_notify(self.conf.service_user.os_project_name,
-                               host,
+            self._admin_notify(host,
                                'MAINTENANCE_COMPLETE',
                                self.session_id)

@ -929,13 +944,11 @@ class Workflow(BaseWorkflow):
             self._wait_host_empty(host)

             LOG.info('IN_MAINTENANCE host %s' % host)
-            self._admin_notify(self.conf.service_user.os_project_name,
-                               host,
+            self._admin_notify(host,
                                'IN_MAINTENANCE',
                                self.session_id)
             self.host_maintenance(host)
-            self._admin_notify(self.conf.service_user.os_project_name,
-                               host,
+            self._admin_notify(host,
                                'MAINTENANCE_COMPLETE',
                                self.session_id)

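A pattern repeated through these hunks: every host and instance transition is written through db_api.update_host() / db_api.update_instance() as it happens, so the rows always reflect the live workflow. A rough sketch of the idea; the import path mirrors the module the workflows use and is an assumption here:

from fenix.db import api as db_api  # assumed import path


def mark_host_maintained(host):
    # persist the transition immediately instead of only at workflow end
    host.maintained = True
    db_api.update_host(host)


def resume_hosts(session_id):
    # a restarted engine re-reads the rows instead of re-discovering hosts
    return db_api.get_hosts(session_id)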
@ -63,11 +63,12 @@ class Workflow(BaseWorkflow):
         LOG.info("%s: initialized with Kubernetes: %s" %
                  (self.session_id,
                   v_api.get_code_with_http_info()[0].git_version))
-        self.hosts = self._init_hosts_by_services()
+        if not data:
+            self.hosts = db_api.get_hosts(session_id)
+        else:
+            self.hosts = self._init_hosts_by_services()
         LOG.info('%s: Execute pre action plugins' % (self.session_id))
         self.maintenance_by_plugin_type("localhost", "pre")
         self.group_impacted_members = {}

     def _init_hosts_by_services(self):
@ -106,6 +107,7 @@ class Workflow(BaseWorkflow):
         body = {"apiVersion": "v1", "spec": {"unschedulable": True}}
         self.kapi.patch_node(node_name, body)
         host.disabled = True
+        db_api.update_host(host)

     def uncordon(self, node_name):
         LOG.info("%s: uncordon %s" % (self.session_id, node_name))
@ -113,6 +115,7 @@ class Workflow(BaseWorkflow):
         body = {"apiVersion": "v1", "spec": {"unschedulable": None}}
         self.kapi.patch_node(node_name, body)
         host.disabled = False
+        db_api.update_host(host)

     def _pod_by_id(self, pod_id):
         return [p for p in self.kapi.list_pod_for_all_namespaces().items
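cordon() and uncordon() above toggle the node's unschedulable spec through the Kubernetes Python client. The same patch issued standalone, as a hedged sketch; loading the default kubeconfig is an assumption:

from kubernetes import client, config

config.load_kube_config()  # assumed: local kubeconfig access
kapi = client.CoreV1Api()


def set_unschedulable(node_name, unschedulable):
    # unschedulable=True cordons the node, None uncordons it
    body = {"apiVersion": "v1",
            "spec": {"unschedulable": True if unschedulable else None}}
    kapi.patch_node(node_name, body)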
@ -667,6 +670,7 @@ class Workflow(BaseWorkflow):
         actions_at = reply_time_str(wait_time)
         reply_at = actions_at
         instance.project_state = state
+        db_api.update_instance(instance)
         metadata = self.session.meta
         retry = 2
         replied = False
@ -737,6 +741,7 @@ class Workflow(BaseWorkflow):
         reply_at = None
         state = "INSTANCE_ACTION_DONE"
         instance.project_state = state
+        db_api.update_instance(instance)
         metadata = "{}"
         self._project_notify(project, instance_ids, allowed_actions,
                              actions_at, reply_at, state, metadata)
@ -814,22 +819,24 @@ class Workflow(BaseWorkflow):
         if host.type == "compute":
             self._wait_host_empty(hostname)
         LOG.info('IN_MAINTENANCE %s' % hostname)
-        self._admin_notify(self.conf.service_user.os_project_name,
-                           hostname,
+        self._admin_notify(hostname,
                            'IN_MAINTENANCE',
                            self.session_id)
         for plugin_type in ["host", host.type]:
             LOG.info('%s: Execute %s action plugins' % (self.session_id,
                                                         plugin_type))
             self.maintenance_by_plugin_type(hostname, plugin_type)
-        self._admin_notify(self.conf.service_user.os_project_name,
-                           hostname,
+        self._admin_notify(hostname,
                            'MAINTENANCE_COMPLETE',
                            self.session_id)
         if host.type == "compute":
             self.uncordon(hostname)
         LOG.info('MAINTENANCE_COMPLETE %s' % hostname)
         host.maintained = True
+        db_api.update_host(host)
+        self._session_notify(self.session.state,
+                             self.get_maintained_percent(),
+                             self.session_id)

     def maintenance(self):
         LOG.info("%s: maintenance called" % self.session_id)
@ -919,6 +926,10 @@ class Workflow(BaseWorkflow):
             return
         for host_name in self.get_compute_hosts():
             self.cordon(host_name)
+        for host in self.get_controller_hosts():
+            # TBD one might need to change this. Now all controllers
+            # maintenance serialized
+            self.host_maintenance(host)
         thrs = []
         for host_name in empty_hosts:
             # LOG.info("%s: Maintaining %s" % (self.session_id, host_name))
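host_maintenance() now reports progress through _session_notify() with get_maintained_percent(). That helper is not part of the hunks shown here; a plausible reading of it, for illustration only, based on the maintained flag persisted above:

def get_maintained_percent(hosts):
    # hosts carry the 'maintained' flag written via db_api.update_host()
    return len([h for h in hosts if h.maintained]) * 100 / len(hosts)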
@ -66,15 +66,20 @@ class Workflow(BaseWorkflow):
             nova_version = max_nova_server_ver
         self.nova = novaclient.Client(nova_version,
                                       session=self.auth_session)
-        if not self.hosts:
+        if not data:
+            self.hosts = db_api.get_hosts(session_id)
+        elif not self.hosts:
             self.hosts = self._init_hosts_by_services()
         else:
             self._init_update_hosts()
         LOG.info("%s: initialized. Nova version %f" % (self.session_id,
                                                        nova_version))

-        LOG.info('%s: Execute pre action plugins' % (self.session_id))
-        self.maintenance_by_plugin_type("localhost", "pre")
+        if data:
+            # We expect this is done if initialized from DB
+            LOG.info('%s: Execute pre action plugins' % (self.session_id))
+            self.maintenance_by_plugin_type("localhost", "pre")
         # How many members of each instance group are currently affected
         self.group_impacted_members = {}

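The constructor above resumes from the database when no request data is given and only runs the one-time pre action plugins for a fresh session. A hedged usage sketch; the class name, argument order and engine wiring are assumptions here, only the data=None behaviour comes from the hunk:

# Illustrative only: resuming an ongoing session after an engine restart.
session_id = 'ongoing-session-uuid'        # read back from the database
workflow = Workflow(conf, session_id, data=None)
workflow.start()                           # BaseWorkflow runs as a Thread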
@ -144,6 +149,7 @@ class Workflow(BaseWorkflow):
                 host.type = 'controller'
                 continue
             host.type = 'other'
+            db_api.update_host(host)

     def disable_host_nova_compute(self, hostname):
         LOG.info('%s: disable nova-compute on host %s' % (self.session_id,
@ -157,6 +163,7 @@ class Workflow(BaseWorkflow):
         self.nova.services.disable_log_reason(hostname, "nova-compute",
                                               "maintenance")
         host.disabled = True
+        db_api.update_host(host)

     def enable_host_nova_compute(self, hostname):
         LOG.info('%s: enable nova-compute on host %s' % (self.session_id,
@ -169,6 +176,7 @@ class Workflow(BaseWorkflow):
                  (self.session_id, hostname))
         self.nova.services.enable(hostname, "nova-compute")
         host.disabled = False
+        db_api.update_host(host)

     def get_instance_details(self, instance):
         network_interfaces = next(iter(instance.addresses.values()))
@ -413,17 +421,17 @@ class Workflow(BaseWorkflow):
             prev_hostname = hostname
         if free_vcpus >= vcpus:
             # TBD vcpu capacity might be too scattered so moving instances from
-            # one host to other host still might not succeed. At least with
+            # one host to another host still might not succeed. At least with
             # NUMA and CPU pinning, one should calculate and ask specific
-            # instances
+            # instances to be moved so can get empty host obeying pinning.
             return False
         else:
             return True

     def get_vcpus_by_host(self, host, hvisors):
         hvisor = ([h for h in hvisors if
-                   h.__getattr__('hypervisor_hostname').split(".", 1)[0]
-                   == host][0])
+                   h.__getattr__(
+                       'hypervisor_hostname').split(".", 1)[0] == host][0])
         vcpus = hvisor.__getattr__('vcpus')
         vcpus_used = hvisor.__getattr__('vcpus_used')
         return vcpus, vcpus_used
@ -535,6 +543,7 @@ class Workflow(BaseWorkflow):
         actions_at = reply_time_str(wait_time)
         reply_at = actions_at
         instance.project_state = state
+        db_api.update_instance(instance)
         metadata = self.session.meta
         retry = 2
         replied = False
@ -605,6 +614,7 @@ class Workflow(BaseWorkflow):
         reply_at = None
         state = "INSTANCE_ACTION_DONE"
         instance.project_state = state
+        db_api.update_instance(instance)
         metadata = "{}"
         self._project_notify(project, instance_ids, allowed_actions,
                              actions_at, reply_at, state, metadata)
@ -697,6 +707,11 @@ class Workflow(BaseWorkflow):
                           % (instance.instance_id,
                              self.group_impacted_members[group_id],
                              max_parallel))
+        server = self.nova.servers.get(instance.instance_id)
+        instance.host = str(server.__dict__.get('OS-EXT-SRV-ATTR:host'))
+        instance.state = server.__dict__.get('OS-EXT-STS:vm_state')
+        instance.action = None
+        db_api.update_instance(instance)

     @run_async
     def actions_to_have_empty_host(self, host, state, target_host=None):
@ -759,6 +774,7 @@ class Workflow(BaseWorkflow):
                 if instance.state == 'error':
                     LOG.error('instance %s live migration failed'
                               % server_id)
+                    db_api.update_instance(instance)
                     return False
                 elif orig_vm_state != instance.state:
                     LOG.info('instance %s state changed: %s' % (server_id,
@ -766,6 +782,7 @@ class Workflow(BaseWorkflow):
                 elif host != orig_host:
                     LOG.info('instance %s live migrated to host %s' %
                              (server_id, host))
+                    db_api.update_instance(instance)
                     return True
                 migration = (
                     self.nova.migrations.list(instance_uuid=server_id)[0])
@ -775,6 +792,7 @@ class Workflow(BaseWorkflow):
                               '%d retries' %
                               (server_id,
                                self.conf.live_migration_retries))
+                    db_api.update_instance(instance)
                     return False
                 # When live migrate fails it can fail fast after calling
                 # To have Nova time to be ready for next live migration
@ -793,17 +811,20 @@ class Workflow(BaseWorkflow):
                 waited = waited + 1
                 last_migration_status = migration.status
                 last_vm_status = vm_status
+            db_api.update_instance(instance)
             LOG.error('instance %s live migration did not finish in %ss, '
                       'state: %s' % (server_id, waited, instance.state))
         except Exception as e:
             LOG.error('server %s live migration failed, Exception=%s' %
                       (server_id, e))
+            db_api.update_instance(instance)
             return False

     def migrate_server(self, instance, target_host=None):
         server_id = instance.instance_id
         server = self.nova.servers.get(server_id)
-        instance.state = server.__dict__.get('OS-EXT-STS:vm_state')
+        orig_state = server.__dict__.get('OS-EXT-STS:vm_state')
+        instance.state = orig_state
         orig_host = str(server.__dict__.get('OS-EXT-SRV-ATTR:host'))
         LOG.info('migrate_server %s state %s host %s to %s' %
                  (server_id, instance.state, orig_host, target_host))
@ -823,7 +844,12 @@ class Workflow(BaseWorkflow):
                     server.confirm_resize()
                     LOG.info('instance %s migration resized to host %s' %
                              (server_id, host))
-                    instance.host = host
+                    server = self.nova.servers.get(server_id)
+                    instance.host = (
+                        str(server.__dict__.get('OS-EXT-SRV-ATTR:host')))
+                    instance.state = (
+                        server.__dict__.get('OS-EXT-STS:vm_state'))
+                    db_api.update_instance(instance)
                     return True
                 if last_vm_state != instance.state:
                     LOG.info('instance %s state changed: %s' % (server_id,
@ -832,6 +858,7 @@ class Workflow(BaseWorkflow):
                     LOG.error('instance %s migration failed, state: %s'
                               % (server_id, instance.state))
                     instance.host = host
+                    db_api.update_instance(instance)
                     return False
                 time.sleep(5)
                 retries = retries - 1
@ -843,6 +870,7 @@ class Workflow(BaseWorkflow):
                 if retry_migrate == 0:
                     LOG.error('server %s migrate failed after retries' %
                               server_id)
+                    db_api.update_instance(instance)
                     return False
                 # Might take time for scheduler to sync inconsistent instance
                 # list for host.
@ -855,11 +883,13 @@ class Workflow(BaseWorkflow):
             except Exception as e:
                 LOG.error('server %s migration failed, Exception=%s' %
                           (server_id, e))
+                db_api.update_instance(instance)
                 return False
             finally:
                 retry_migrate = retry_migrate - 1
         LOG.error('instance %s migration timeout, state: %s' %
                   (server_id, instance.state))
+        db_api.update_instance(instance)
         return False

     def maintenance_by_plugin_type(self, hostname, plugin_type):
@ -922,22 +952,24 @@ class Workflow(BaseWorkflow):
         if host.type == "compute":
             self._wait_host_empty(hostname)
         LOG.info('IN_MAINTENANCE %s' % hostname)
-        self._admin_notify(self.conf.service_user.os_project_name,
-                           hostname,
+        self._admin_notify(hostname,
                            'IN_MAINTENANCE',
                            self.session_id)
         for plugin_type in ["host", host.type]:
             LOG.info('%s: Execute %s action plugins' % (self.session_id,
                                                         plugin_type))
             self.maintenance_by_plugin_type(hostname, plugin_type)
-        self._admin_notify(self.conf.service_user.os_project_name,
-                           hostname,
+        self._admin_notify(hostname,
                            'MAINTENANCE_COMPLETE',
                            self.session_id)
         if host.type == "compute":
             self.enable_host_nova_compute(hostname)
         LOG.info('MAINTENANCE_COMPLETE %s' % hostname)
         host.maintained = True
+        db_api.update_host(host)
+        self._session_notify(self.session.state,
+                             self.get_maintained_percent(),
+                             self.session_id)

     def maintenance(self):
         LOG.info("%s: maintenance called" % self.session_id)
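Several hunks above stop assuming the requested target and instead read the instance's actual host and vm_state back from Nova before updating the database. That lookup as a standalone sketch; the nova handle and instance object are assumed:

def refresh_from_nova(nova, instance):
    # trust Nova, not the requested target, for the final placement
    server = nova.servers.get(instance.instance_id)
    instance.host = str(server.__dict__.get('OS-EXT-SRV-ATTR:host'))
    instance.state = server.__dict__.get('OS-EXT-STS:vm_state')
    return instance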