# fenix.tools
This directory contains tools and instructions to test Fenix workflows.
Currently OPNFV Doctor has been used to test OpenStack related workflows.
As Doctor currently supports only OpenStack and Fenix itself needs a way to be
tested, testing of the Kubernetes workflow (fenix/workflow/workflows/k8s.py)
is implemented here.
Files:
- 'demo-ha.yaml': demo-ha ReplicaSet to make 2 anti-affinity PODs.
- 'demo-nonha.yaml': demo-nonha ReplicaSet to make n nonha PODs.
- 'vnfm.py': VNFM to test k8s.py workflow.
## Kubernetes workflow (k8s.py)
First version of the workflow towards Kubernetes use cases.
### Requirements for testing
This workflow assumes a ReplicaSet is used for the PODs. A Kubernetes cluster with
1 master and at least 3 workers is required for testing. The master node needs
DevStack to provide Fenix and the OpenStack services it still uses. Later on there
can be a version of Fenix that does not need Keystone and AODH event alarming, but
uses native Kubernetes services for RBAC and events.
As in Doctor testing, there is a pair of anti-affinity PODs (demo-ha) and the rest
of the worker node capacity is filled with demo-nonha PODs. PODs are scaled via the
ReplicaSet number of replicas. The idea is that each POD requests
("number of worker node CPUs" - 2) / 2 CPUs. That makes sure the scheduler fits
2 PODs on each node, while 2 CPUs of capacity are left for other node services.
This means each node needs at least 6 CPUs for the scheme to work.
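For example, you can check the worker node CPU capacity with kubectl and compute
the per-POD CPU value from it. A quick sketch, assuming 32 CPUs per worker as in
the default yaml files further below:
```sh
# List the CPU capacity of each node
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.capacity.cpu
# With 32 CPUs: (32 - 2) / 2 = 15 CPUs per POD, so 2 PODs fit on a node
# and 2 CPUs are left for other node services
echo $(( (32 - 2) / 2 ))
```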
### Install Kubernetes cluster with 1 Manager and 3 Worker nodes
Here are instructions:
- https://docs.openstack.org/openstack-helm/latest/install/kubernetes-gate.html
- https://phoenixnap.com/kb/how-to-install-kubernetes-on-a-bare-metal-server
- https://phoenixnap.com/kb/how-to-install-kubernetes-on-centos
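If you follow the kubeadm based guides above, the flow is roughly as below. This is
only a rough sketch; the pod network CIDR, token, hash and the CNI plugin are
placeholders, so use the exact values and manifests from the guides:
```sh
# On the manager (master) node: initialize the control plane
sudo kubeadm init --pod-network-cidr=10.244.0.0/16
# Make kubectl usable for your user
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
# Install a CNI network plugin of your choice (see the guides above)
# On each worker node: join the cluster with the values printed by 'kubeadm init'
sudo kubeadm join <MASTER_IP>:6443 --token <TOKEN> --discovery-token-ca-cert-hash sha256:<HASH>
```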
### On Manager node, install DevStack including Fenix and its minimum services
Note! There is no conflict with Kubernetes, as DevStack is limited to only the
services Fenix needs.
Clone DevStack. Tested to work with the latest stable release, Train.
```sh
git clone https://github.com/openstack/devstack -b stable/train
```
Make local.conf. 'HOST_IP' should be the master node IP.
```sh
cd devstack
vi local.conf
```
```sh
[[local|localrc]]
GIT_BASE=https://git.openstack.org
HOST_IP=192.0.2.4
ADMIN_PASSWORD=admin
DATABASE_PASSWORD=admin
RABBIT_PASSWORD=admin
SERVICE_PASSWORD=admin
LOGFILE=/opt/stack/stack.sh.log
PUBLIC_INTERFACE=eth0
CEILOMETER_EVENT_ALARM=True
ENABLED_SERVICES=key,rabbit,mysql,fenix-engine,fenix-api,aodh-evaluator,aodh-notifier,aodh-api
enable_plugin ceilometer https://git.openstack.org/openstack/ceilometer stable/train
enable_plugin aodh https://git.openstack.org/openstack/aodh stable/train
enable_plugin gnocchi https://github.com/openstack/gnocchi
enable_plugin fenix https://opendev.org/x/fenix master
```
Deploy the needed OpenStack services with Fenix
```sh
./stack.sh
```
Now you should have the Kubernetes cluster and Fenix installed via DevStack. Any
hacking of Fenix can be done under '/opt/stack/fenix'.
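A quick way to verify both sides are up, using the DevStack unit names referenced
below:
```sh
# Kubernetes side: all nodes should be Ready
kubectl get nodes
# Fenix side: both DevStack units should be active
sudo systemctl status devstack@fenix-api devstack@fenix-engine
```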
### Running test
Use 3 terminal windows (Term1, Term2 and Term3) to test Fenix with the Kubernetes
cluster. Below is what you can run in the different terminals. All terminals
should be running on the master node. Here is a short description:
- Term1: Used for logging Fenix
- Term2: Infrastructure admin commands
- Term3: VNFM logging for testing and setting up the VNF
#### Term1: Fenix-engine logging
If you make any changes to Fenix, make them under '/opt/stack/fenix'; then restart
Fenix and follow the logs
```sh
sudo systemctl restart devstack@fenix*;sudo journalctl -f --unit devstack@fenix-engine
```
API logs can also be seen
```sh
sudo journalctl -f --unit devstack@fenix-api
```
Debugging and other configuration changes can be made via the conf files under
'/etc/fenix'.
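For example, assuming the default DevStack layout, you can list the conf files and
restart the services to pick up any changes:
```sh
ls /etc/fenix/
sudo systemctl restart devstack@fenix*
```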
#### Term2: Infrastructure admin window
Use the DevStack admin user. Set the variables you need accordingly
```sh
. ~/devstack/openrc admin admin
USER_ID=`openstack user list | grep admin | awk '{print $2}'`
HOST=192.0.2.4
PORT=12347
```
Authenticate to Keystone as the admin user before calling Fenix. If you get a not
authorized error later on, you need to do this again.
```sh
OS_AUTH_TOKEN=`openstack token issue | grep " id " |awk '{print $4}'`
```
Once you have Fenix running in Term1, the VNF created in Term3 and the VNFM
running in Term3, you can create a maintenance session utilizing those
```sh
DATE=`date -d "+15 sec" "+%Y-%m-%d %H:%M:%S"`;MSESSION=`curl -g -i -X POST http://$HOST:$PORT/v1/maintenance -H "Accept: application/json" -H "Content-Type: application/json" -d '{"workflow": "k8s", "state": "MAINTENANCE","metadata": {} ,"maintenance_at": "'"$DATE"'"}' -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-User-Id: $USER_ID" | grep session_id | jq -r '.session_id'`
```
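While the workflow runs, the session state can be polled with the same variables.
A sketch, assuming the maintenance session can be read back with GET:
```sh
curl -g -i -X GET http://$HOST:$PORT/v1/maintenance/$MSESSION -H "Accept: application/json" -H "Content-Type: application/json" -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-User-Id: $USER_ID"
```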
After the maintenance workflow reaches 'MAINTENANCE_DONE', you should first press
"Ctrl+C" in the VNFM window (Term3), so it removes its constraints from Fenix and
exits. Then you can remove the finished session from Fenix
```sh
curl -g -i -X DELETE http://$HOST:$PORT/v1/maintenance/$MSESSION -H "Accept: application/json" -H "Content-Type: application/json" -H "X-Auth-Token: $OS_AUTH_TOKEN" -H "X-User-Id: $USER_ID"
```
If the maintenance ran to the end with 'MAINTENANCE_DONE', you are ready to run it
again if you wish. On 'MAINTENANCE_FAILED', or in case of exceptions, you should
recover the system before trying to test again. This is covered in Term3 below.
#### Term3: VNFM (fenix/tools/vnfm.py)
Use the DevStack admin user.
```sh
. ~/devstack/openrc admin admin
```
Go to the Fenix Kubernetes tool directory for testing
```sh
cd /opt/stack/fenix/fenix/tools
```
Create the demo namespace (we use the demo namespace, and the demo user and
project in Keystone)
```sh
kubectl create namespace demo
```
Create the VNF (when applied in this order, we make sure demo-ha PODs get nodes for anti-affinity):
```sh
kubectl apply -f demo-ha.yaml --namespace=demo;sleep 1;kubectl apply -f demo-nonha.yaml --namespace=demo
```
Note that you should modify the above yaml files so that "cpu:" has the value of
'(workernode.status.capacity["cpu"] - 2) / 2'. The default expects that there
are 32 CPUs, so the value is "15" in both yaml files. Replicas can be changed in
demo-nonha.yaml: minimum 2 (with the minimum of 3 worker nodes), maximum
'(amount_of_worker_nodes-1)*2'. A greater amount means more scaling is needed and
a longer maintenance window, as fewer parallel actions are possible. The constraints
in vnfm.py can of course also be changed for different behavior.
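If you only want a different amount of nonha PODs on an already running setup, the
ReplicaSet can also be scaled directly instead of editing the yaml file (the replica
count here is just an example):
```sh
kubectl scale replicaset demo-nonha --replicas=4 --namespace=demo
```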
You can delete the PODs like this
```sh
kubectl delete replicaset.apps demo-ha demo-nonha --namespace=demo
```
Start the Kubernetes VNFM that we need for testing
```sh
python vnfm.py
```
Now you can start the maintenance session in Term2. When the workflow has failed or
completed, you first kill vnfm.py with "Ctrl+C" and delete the maintenance session
in Term2.
If the workflow failed, something might need to be fixed manually. Here you uncordon
your 3 worker nodes, in case the maintenance workflow did not run to the end.
```sh
kubectl uncordon worker-node3 worker-node2 worker-node1
```
You can check that your PODs match the number of replicas mentioned in
demo-nonha.yaml and demo-ha.yaml:
```sh
kubectl get pods --all-namespaces --output=wide
```
If they do not match, the easiest solution is to delete and create them again
```sh
kubectl delete replicaset.apps demo-ha demo-nonha --namespace=demo;sleep 15;kubectl apply -f demo-ha.yaml --namespace=demo;sleep 1;kubectl apply -f demo-nonha.yaml --namespace=demo
```