# fenix.tools

This directory contains tools and instructions to test Fenix workflows. Currently OPNFV Doctor has been used to test OpenStack related workflows. As Doctor is at the moment only for OpenStack and Fenix itself needs a way to be tested, the Kubernetes workflow (fenix/workflow/workflows/k8s.py) testing is implemented here.

Files:

- 'demo-ha.yaml': demo-ha ReplicaSet to make 2 anti-affinity PODs.
- 'demo-nonha.yaml': demo-nonha ReplicaSet to make n non-HA PODs.
- 'vnfm_k8s.py': VNFM to test the k8s.py (Kubernetes example) workflow.
- 'vnfm.py': VNFM to test the nfv.py (OpenStack example) workflow.
- 'infra_admin.py': Tool to act as infrastructure admin. The tool also catches the 'maintenance.session' and 'maintenance.host' events to keep track of where the maintenance is going. You will see when a certain host is maintained and what percentage of hosts have been maintained.
- 'session.json': Example of maintenance session parameters defined as a JSON file to be given as input to 'infra_admin.py'. The example is for the nfv.py workflow. This could be used for any advanced workflow testing, giving software downloads and real action plugins.
- 'set_config.py': You can use this to set the Fenix AODH/Ceilometer configuration.
- 'fenix_db_reset': Flush the Fenix database.

## Kubernetes workflow (k8s.py)

First version of the workflow towards Kubernetes use cases.

### Requirements for testing

This workflow assumes a ReplicaSet is used for PODs. A Kubernetes cluster with 1 master and at least 3 workers is required for testing. The master node needs DevStack to have Fenix and the OpenStack services it still uses. Later on there can be a version of Fenix not needing Keystone and AODH event alarming, but using native Kubernetes services for RBAC and events.

As in Doctor testing, there is a pair of anti-affinity PODs (demo-ha) and the rest of the worker node capacity is filled with demo-nonha PODs. Scaling of PODs is done via the ReplicaSet number of replicas.
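As a rough illustration of the anti-affinity setup described above, here is a minimal ReplicaSet sketch with a `podAntiAffinity` rule. The image, label names and CPU request are illustrative assumptions, not necessarily the contents of the shipped demo-ha.yaml:

```yaml
apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: demo-ha
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-ha
  template:
    metadata:
      labels:
        app: demo-ha
    spec:
      affinity:
        podAntiAffinity:
          # Never schedule two demo-ha PODs on the same worker node
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: demo-ha
            topologyKey: kubernetes.io/hostname
      containers:
      - name: demo-ha
        image: busybox              # illustrative image
        command: ["sleep", "infinity"]
        resources:
          requests:
            cpu: "15"               # (worker CPUs - 2) / 2, here for 32 CPU nodes
```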
The idea is that each POD takes ("number of worker node CPUs" - 2) / 2. That makes sure the scheduler fits 2 PODs on each node, while 2 CPUs of capacity are assumed for other node services. This means at least 6 CPUs are needed on each node.

### Install Kubernetes cluster with 1 master and 3 worker nodes

Here are instructions:

- https://docs.openstack.org/openstack-helm/latest/install/kubernetes-gate.html
- https://phoenixnap.com/kb/how-to-install-kubernetes-on-a-bare-metal-server
- https://phoenixnap.com/kb/how-to-install-kubernetes-on-centos

### On master node, install DevStack including Fenix and its minimum services

Note! There is no conflict with Kubernetes, as DevStack is limited to only the services Fenix needs.

Clone DevStack. Tested to work with the latest stable release Train.

```sh
git clone https://github.com/openstack/devstack -b stable/train
```

Make local.conf. 'HOST_IP' should be the master node IP.

```sh
cd devstack
vi local.conf
```

```sh
[[local|localrc]]
GIT_BASE=https://git.openstack.org
HOST_IP=192.0.2.4
ADMIN_PASSWORD=admin
DATABASE_PASSWORD=admin
RABBIT_PASSWORD=admin
SERVICE_PASSWORD=admin
LOGFILE=/opt/stack/stack.sh.log
PUBLIC_INTERFACE=eth0

CEILOMETER_EVENT_ALARM=True

ENABLED_SERVICES=key,rabbit,mysql,fenix-engine,fenix-api,aodh-evaluator,aodh-notifier,aodh-api

enable_plugin ceilometer https://git.openstack.org/openstack/ceilometer stable/train
enable_plugin aodh https://git.openstack.org/openstack/aodh stable/train
enable_plugin gnocchi https://github.com/openstack/gnocchi
enable_plugin fenix https://opendev.org/x/fenix master
```

Deploy the needed OpenStack services with Fenix:

```sh
./stack.sh
```

Now you should have a Kubernetes cluster and Fenix via DevStack. Any hacking of Fenix can be done under '/opt/stack/fenix'.

### Running test

Use 3 terminal windows (Term1, Term2 and Term3) to test Fenix with the Kubernetes cluster. Below is what you can run in the different terminals. Terminals should be running on the master node.
Here is a short description:

- Term1: Used for logging Fenix
- Term2: Infrastructure admin
- Term3: VNFM logging for testing and setting up the VNF

#### Term1: Fenix-engine logging

If you make any changes to Fenix, make them under '/opt/stack/fenix'; restart Fenix and see the logs:

```sh
sudo systemctl restart devstack@fenix*;sudo journalctl -f --unit devstack@fenix-engine
```

API logs can also be seen:

```sh
sudo journalctl -f --unit devstack@fenix-api
```

Debugging and other configuration changes go to the '.conf' files under '/etc/fenix'.

#### Term2: Infrastructure admin window

##### Admin commands as command line and curl

Use the DevStack admin user. Set the variables needed accordingly:

```sh
. ~/devstack/openrc admin admin
USER_ID=`openstack user list | grep admin | awk '{print $2}'`
HOST=192.0.2.4
PORT=12347
```

Authenticate to Keystone as the admin user before calling Fenix. If you get a not authorized error later on, you need to do this again:

```sh
OS_AUTH_TOKEN=`openstack token issue | grep " id " |awk '{print $4}'`
```

After you have, first, Fenix running in Term1; next, a VNF created in Term3; and next, the VNFM running in Term3, you can create a maintenance session utilizing those:

```sh
DATE=`date -d "+15 sec" "+%Y-%m-%d %H:%M:%S"`;MSESSION=`curl -g -i -X POST http://$HOST:$PORT/v1/maintenance -H "Accept: application/json" -H "Content-Type: application/json" -d '{"workflow": "k8s", "state": "MAINTENANCE","metadata": {} ,"maintenance_at": "'"$DATE"'"}' -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-User-Id: $USER_ID" | grep session_id | jq -r '.session_id'`
```

After the maintenance workflow reaches 'MAINTENANCE_DONE', you should first press "ctrl + c" in the VNFM window (Term3), so it removes its constraints from Fenix and dies.
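While the workflow runs, you can check how far the session has progressed. Assuming the maintenance session also answers a GET with its current state (an assumption here; check the Fenix API documentation), a sketch using the same variables as above:

```sh
# Sketch: query the current maintenance session state.
# Assumes $HOST, $PORT, $MSESSION, $OS_AUTH_TOKEN and $USER_ID are set
# as in the commands above.
check_state () {
  curl -g -s http://$HOST:$PORT/v1/maintenance/$MSESSION \
       -H "Accept: application/json" \
       -H "X-Auth-Token: $OS_AUTH_TOKEN" \
       -H "X-User-Id: $USER_ID" | jq -r '.state'
}
# Example: wait until the workflow has finished one way or the other
# while true; do
#   S=`check_state`
#   echo "state: $S"
#   case $S in MAINTENANCE_DONE|MAINTENANCE_FAILED) break;; esac
#   sleep 10
# done
```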
Then you can remove the finished session from Fenix:

```sh
curl -g -i -X DELETE http://$HOST:$PORT/v1/maintenance/$MSESSION -H "Accept: application/json" -H "Content-Type: application/json" -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-User-Id: $USER_ID"
```

If the maintenance ran to the end with 'MAINTENANCE_DONE', you are ready to run it again if you wish. On 'MAINTENANCE_FAILED', or in case of exceptions, you should recover the system before trying to test again. This is covered in Term3 below.

##### Admin commands using admin tool

Go to the Fenix tools directory:

```sh
cd /opt/stack/fenix/fenix/tools
```

Call the admin tool and it will run the maintenance workflow. The admin tool defaults to 'OpenStack' and the 'nfv' workflow, so you can override those by exporting environment variables:

```sh
. ~/devstack/openrc admin admin
export WORKFLOW=k8s
export CLOUD_TYPE=k8s
python infra_admin.py
```

If you want to choose the maintenance workflow session parameters freely, you can give a session.json file as input. With this option infra_admin.py will only override 'maintenance_at' to be 20 seconds in the future when Fenix is called.

```sh
python infra_admin.py --file session.json
```

Maintenance will start by pressing enter, just follow the instructions on the console.

#### Term3: VNFM (fenix/tools/vnfm_k8s.py)

Use DevStack as the demo user for testing the demo application:

```sh
. ~/devstack/openrc demo demo
```

Go to the Fenix Kubernetes tool directory for testing:

```sh
cd /opt/stack/fenix/fenix/tools
```

Create the demo namespace (we use the demo namespace, and the demo user and project in Keystone):

```sh
kubectl create namespace demo
```

Start the VNF (when done in this order, we make sure demo-ha has nodes for anti-affinity):

```sh
kubectl apply -f demo-ha.yaml --namespace=demo;sleep 1;kubectl apply -f demo-nonha.yaml --namespace=demo
```

Note: you should modify the above yaml files so that "cpu:" has the value '(workernode.status.capacity["cpu"] - 2) / 2'. The default expects that there are 32 CPUs, so the value is "15" in both yaml files.
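That note can be turned into a small calculation. A sketch that derives the per-POD "cpu:" value from the worker node capacity (the node name and the hard-coded example capacity are assumptions; on a live cluster the commented kubectl line would read the real value):

```sh
# Per-POD CPU request: ("number of worker node CPUs" - 2) / 2.
# On a live cluster the capacity could be read with e.g.:
#   CPUS=`kubectl get node worker-node1 -o jsonpath='{.status.capacity.cpu}'`
CPUS=32                        # example value: a 32 CPU worker node
POD_CPU=$(( (CPUS - 2) / 2 ))
echo "use cpu: \"$POD_CPU\" in demo-ha.yaml and demo-nonha.yaml"
```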
Replicas can be changed in demo-nonha.yaml: minimum 2 (with the minimum of 3 worker nodes) up to maximum '(amount_of_worker_nodes-1)*2'. A greater amount means more scaling is needed and a longer maintenance window, as fewer parallel actions are possible. The constraints in vnfm_k8s.py can surely also be changed for different behavior.

You can delete the PODs used like this:

```sh
kubectl delete replicaset.apps demo-ha demo-nonha --namespace=demo
```

Start the Kubernetes VNFM that we need for testing:

```sh
python vnfm_k8s.py
```

Now you can start the maintenance session in Term2. When the workflow has failed or completed, you first kill vnfm_k8s.py with "ctrl+c" and delete the maintenance session in Term2. If the workflow failed, something might need to be manually fixed.

Here you uncordon your 3 worker nodes, if the maintenance workflow did not run to the end:

```sh
kubectl uncordon worker-node3 worker-node2 worker-node1
```

You can check that your PODs match the amount of replicas mentioned in demo-nonha.yaml and demo-ha.yaml:

```sh
kubectl get pods --all-namespaces --output=wide
```

If they do not match, delete and create them again as the easiest solution:

```sh
kubectl delete replicaset.apps demo-ha demo-nonha --namespace=demo;sleep 15;kubectl apply -f demo-ha.yaml --namespace=demo;sleep 1;kubectl apply -f demo-nonha.yaml --namespace=demo
```

## OpenStack workflows (default.py and nfv.py)

OpenStack workflows can be tested by using the OPNFV Doctor project or by using Fenix's own tools.

Workflows:

- default.py is the first example workflow with VNFM interaction.
- nfv.py is an enhanced Telco workflow that utilizes the ETSI FEAT03 constraints. This workflow can optimize parallel nodes and migrations according to the ETSI constraints.

### Requirements for testing

- Multinode DevStack environment with 1 controller and at least 3 computes.
- Each compute needs to have at least 2 VCPUs.
- DevStack local.conf for the controller node needs Fenix enabled. Heat, AODH and Ceilometer are also needed. VCPUs need a one-to-one mapping to CPUs.
Example controller baremetal local.conf:

```sh
[[local|localrc]]
GIT_BASE=https://git.openstack.org
HOST_IP=192.168.173.3
ADMIN_PASSWORD=admin
DATABASE_PASSWORD=admin
RABBIT_PASSWORD=admin
SERVICE_PASSWORD=admin
LOGFILE=/opt/stack/stack.sh.log
#USE_PYTHON3=True
#PYTHON3_VERSION=3.6

## Neutron options
Q_USE_SECGROUP=True
FLOATING_RANGE="192.168.37.0/24"
#FLOATING_RANGE="192.168.38.0/24"
IPV4_ADDRS_SAFE_TO_USE="10.0.0.0/22"
Q_FLOATING_ALLOCATION_POOL=start=192.168.37.200,end=192.168.37.220
#Q_FLOATING_ALLOCATION_POOL=start=192.168.38.200,end=192.168.38.220
PUBLIC_NETWORK_GATEWAY="192.168.37.199"
#This is wrong, but if right devstack deletes the host ip
PUBLIC_INTERFACE=enp3s0f1

# Open vSwitch provider networking configuration
Q_USE_PROVIDERNET_FOR_PUBLIC=True
OVS_PHYSICAL_BRIDGE=br-ex
PUBLIC_BRIDGE=br-ex
OVS_BRIDGE_MAPPINGS=public:br-ex

MULTI_HOST=1

CEILOMETER_EVENT_ALARM=True
disable_service ceilometer-alarm-notifier,ceilometer-alarm-evaluator,ceilometer-acompute
enable_service aodh-evaluator,aodh-notifier,aodh-api

enable_plugin heat https://git.openstack.org/openstack/heat stable/train
enable_plugin ceilometer https://git.openstack.org/openstack/ceilometer stable/train
enable_plugin aodh https://git.openstack.org/openstack/aodh stable/train
enable_plugin gnocchi https://github.com/openstack/gnocchi
enable_plugin fenix https://opendev.org/x/fenix master
enable_service fenix-engine
enable_service fenix-api
disable_service n-cpu

[[post-config|$NOVA_CONF]]
[DEFAULT]
cpu_allocation_ratio = 1.0
allow_resize_to_same_host = False
```

### Workflow default.py testing with Doctor

On the controller node, clone Doctor to be able to test. Doctor currently requires Python 3.6:

```sh
git clone https://gerrit.opnfv.org/gerrit/doctor
```

```sh
export INSTALLER_TYPE=devstack
export ADMIN_TOOL_TYPE=fenix
export TEST_CASE=maintenance
```

Use the DevStack admin user:

```sh
. ~/devstack/openrc admin admin
```

Go to Doctor and start testing:

```sh
cd doctor
sudo -E tox -e py36
```

Use journalctl to track the progress in Fenix:

```sh
sudo journalctl -f --unit devstack@fenix-engine
```

If you make any changes to Fenix, make them under '/opt/stack/fenix' and restart Fenix:

```sh
sudo systemctl restart devstack@fenix*
```

You can also make changes to Doctor before running the Doctor test.

### Workflow nfv.py testing with Doctor

This workflow differs from the above as it expects the ETSI FEAT03 constraints. In Doctor testing this means we also need to use a different application manager (VNFM). Where the default.py workflow used the sample.py application manager, the nfv.py workflow uses the vnfm_k8s.py application manager (doctor/doctor_tests/app_manager/vnfm_k8s.py).

The only change to testing is that you should export a variable to use the different application manager:

```sh
export APP_MANAGER_TYPE=vnfm
```

If you again want to use default.py, you can export the default value for the application manager:

```sh
export APP_MANAGER_TYPE=sample
```

Doctor modifies the message where it calls maintenance accordingly, to use either 'default' or 'nfv' as the workflow on the Fenix side.

### Workflow nfv.py testing with Fenix

Where Doctor is made to automate everything as a test case, Fenix provides different tools for the admin and the VNFM:

- 'vnfm.py': VNFM to test nfv.py.
- 'infra_admin.py': Tool to act as infrastructure admin.

Use 3 terminal windows (Term1, Term2 and Term3) to test Fenix. Below is what you can run in the different terminals. Terminals should be running on the controller node.
Here is a short description:

- Term1: Used for logging Fenix
- Term2: Infrastructure admin
- Term3: VNFM logging for testing and setting up the VNF

#### Term1: Fenix-engine logging

If you make any changes to Fenix, make them under '/opt/stack/fenix'; restart Fenix and see the logs:

```sh
sudo systemctl restart devstack@fenix*;sudo journalctl -f --unit devstack@fenix-engine
```

API logs can also be seen:

```sh
sudo journalctl -f --unit devstack@fenix-api
```

Debugging and other configuration changes go to the '.conf' files under '/etc/fenix'.

#### Term2: Infrastructure admin window

Go to the Fenix tools directory for testing:

```sh
cd /opt/stack/fenix/fenix/tools
```

Make a flavor for testing that takes half of the amount of VCPUs on a single compute node (here we have 48 VCPUs on each compute). This is required by the current example 'vnfm.py' and the VNF 'maintenance_hot_tpl.yaml' that is used in testing. The 'nfv.py' workflow is not bound to these in any way, but can be used with different VNFs and VNFMs.

```sh
openstack flavor create --ram 512 --vcpus 24 --disk 1 --public demo_maint_flavor
```

Call the admin tool and it will run the nfv.py workflow:

```sh
. ~/devstack/openrc admin admin
python infra_admin.py
```

If you want to choose the maintenance workflow session parameters freely, you can give a 'session.json' file as input. With this option 'infra_admin.py' will only override 'maintenance_at' to be 20 seconds in the future when Fenix is called.

```sh
python infra_admin.py --file session.json
```

Maintenance will start by pressing enter, just follow the instructions on the console.

In case you failed to remove the maintenance workflow session, you can do it manually as instructed above in 'Admin commands as command line and curl'.

#### Term3: VNFM (fenix/tools/vnfm.py)

Use DevStack as the demo user for testing the demo application:

```sh
. ~/devstack/openrc demo demo
```

Go to the Fenix tools directory for testing:

```sh
cd /opt/stack/fenix/fenix/tools
```

Start the VNFM that we need for testing:

```sh
python vnfm.py
```

Now you can start the maintenance session in Term2. When the workflow has failed or completed, you first kill vnfm.py with "ctrl+c" and then delete the maintenance session in Term2. If the workflow failed, something might need to be manually fixed.

Here you can remove the Heat stack if vnfm.py failed to do that:

```sh
openstack stack delete -y --wait demo_stack
```

It may also be that the workflow failed somewhere in the middle and some 'nova-compute' services are left disabled. You can enable those. Here you can see the states:

```sh
openstack compute service list
```
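The disabled services can be picked out of that listing and re-enabled one by one. A sketch with hypothetical host names and canned output (on a live system the commented openstack commands would be used instead):

```sh
# Example output of:
#   openstack compute service list --service nova-compute -f value -c Host -c Status
SERVICES='compute-node1 disabled
compute-node2 enabled
compute-node3 disabled'
# Pick the hosts whose nova-compute is disabled...
DISABLED=`echo "$SERVICES" | awk '$2=="disabled" {print $1}'`
echo "$DISABLED"
# ...and each of them would then be re-enabled with:
#   openstack compute service set --enable <host> nova-compute
```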