Tools and automation to achieve Disaster Recovery of OpenStack Cloud Platforms

Saad Zaher 5cbe9d4ff4 Adding CREDITS.rst Change-Id: I3fdee0050a82a7f5ed4de9e8f9d40038eb76e4d3		2016-02-12 16:28:58 +00:00
config-generator	Adding Fencing plugable system	2015-12-23 14:51:09 +00:00
doc	Modified diagrams	2015-12-23 14:56:03 +00:00
etc	Adding Osha description and enhancing templates	2016-02-12 15:01:11 +00:00
osha	Some small improvements to pass pep8 and pep257.	2016-02-02 12:26:42 +00:00
.gitignore	Adding support for plugable monitoring systems	2015-12-22 16:24:20 +00:00
.gitreview	Added .gitreview	2015-12-18 13:07:03 +00:00
CREDITS.rst	Adding CREDITS.rst	2016-02-12 16:28:58 +00:00
HACKING.rst	Adding HACKING.rst to follow Openstack Guidelines	2016-01-14 15:01:54 +00:00
README.rst	Adding Osha description and enhancing templates	2016-02-12 15:01:11 +00:00
requirements.txt	Add oslo.log dependency in requirements.txt	2016-01-15 13:19:23 +00:00
setup.cfg	Adding support for plugable monitoring systems	2015-12-22 16:24:20 +00:00
setup.py	Adding support for plugable monitoring systems	2015-12-22 16:24:20 +00:00

README.rst

OSHA

Osha (OpenStack Compute node High Availability) provides compute node high availability for OpenStack. Osha monitors all compute nodes running in a cloud deployment. If one of the compute nodes fails, Osha fences that node and then tries to evacuate all instances that were running on it. Finally, Osha notifies all users who have workloads/instances running on that compute node, as well as the cloud administrators.

Osha has a pluggable architecture so it can be used with:

Any monitoring system to monitor the compute nodes (currently only the native OpenStack service status is supported)
Any fencing driver (currently supports IPMI, libvirt, ...)
Any evacuation driver (currently supports the evacuate API call; possibly migrate as well)
Any notification system (currently supports email based notifications, ...)

Using any of these only requires adding a simple plugin and adjusting the configuration file to use that plugin or, in the future, a combination of plugins if required.
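The plugin mechanism described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the Osha source: the `FencerDriver` base class, the `register_fencer` decorator, the `load_fencer` helper, the `dummy` driver name, and the layout of the `config` dictionary are all hypothetical. It only shows the general idea of selecting a driver by the name given in the configuration.

```python
from abc import ABC, abstractmethod


class FencerDriver(ABC):
    """Hypothetical base class that every fencing plugin would implement."""

    def __init__(self, options):
        self.options = options

    @abstractmethod
    def fence(self, host):
        """Power off or isolate the host; return True on success."""


# Hypothetical registry mapping a configuration name to a driver class.
FENCER_DRIVERS = {}


def register_fencer(name):
    """Class decorator that makes a driver selectable by name."""
    def decorator(cls):
        FENCER_DRIVERS[name] = cls
        return cls
    return decorator


@register_fencer("dummy")
class DummyFencer(FencerDriver):
    """Illustrative plugin that only pretends to fence a host."""

    def fence(self, host):
        print("pretending to fence %s" % host)
        return True


def load_fencer(config):
    """Instantiate the fencer named in the (assumed) configuration layout."""
    section = config["fencer"]
    name = section["driver"]
    try:
        driver_cls = FENCER_DRIVERS[name]
    except KeyError:
        raise ValueError("unknown fencer driver: %r" % name)
    return driver_cls(section.get("options", {}))


# Switching plugins is then only a configuration change.
config = {"fencer": {"driver": "dummy", "options": {"retries": 1}}}
fencer = load_fencer(config)
fencer.fence("compute-1")
```

With this shape, adding a new fencer means writing one class and changing the `driver` value in the configuration. The same pattern would apply to monitoring, evacuation, and notification drivers.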

Osha should run in the control plane; however, the architecture supports other deployment scenarios. To run Osha in high availability mode, deploy it in an active/passive configuration.

How it works

Starting Osha

1. The Osha monitoring manager loads the required monitoring driver according to the configuration.
2. Osha queries the monitoring system to check whether it considers any compute nodes to be down.
3.1. If no, Osha exits, displaying "No failed nodes".
3.2. If yes, Osha calls the fencing manager to fence the failed compute node.
4. The fencing manager loads the correct fencer according to the configuration.
5. Once the compute node is fenced and powered off, the evacuation process starts.
6. Osha loads the correct evacuation driver.
7. Osha evacuates all instances to other compute nodes.
8. Once the evacuation process has completed, Osha calls the notification manager.
9. The notification manager loads the correct driver based on the configuration.
10. Osha starts the notification process ...
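The sequence above can be sketched as a single function. This is an illustrative outline under stated assumptions, not Osha's actual code: `run_once`, `StaticMonitor`, `RecordingDriver`, and the method names `get_failed_nodes`, `fence`, `evacuate`, and `notify` are hypothetical. Handling failed nodes one at a time, and skipping evacuation when fencing fails, are also assumptions made for the sketch.

```python
class StaticMonitor:
    """Stand-in monitoring driver that reports a fixed list of failed nodes."""

    def __init__(self, failed):
        self.failed = list(failed)

    def get_failed_nodes(self):
        return list(self.failed)


class RecordingDriver:
    """Stand-in for the fencing, evacuation and notification drivers."""

    def __init__(self, log, unfenceable=()):
        self.log = log
        self.unfenceable = set(unfenceable)

    def fence(self, node):
        ok = node not in self.unfenceable
        self.log.append(("fence", node, ok))
        return ok

    def evacuate(self, node):
        self.log.append(("evacuate", node))

    def notify(self, node):
        self.log.append(("notify", node))


def run_once(monitor, fencer, evacuator, notifier):
    """Monitor, then fence, evacuate and notify for each failed node."""
    failed = monitor.get_failed_nodes()
    if not failed:
        print("No failed nodes")
        return []
    handled = []
    for node in failed:
        # Assumption: evacuate only after the node is confirmed fenced,
        # so an instance is never started twice.
        if not fencer.fence(node):
            continue
        evacuator.evacuate(node)
        notifier.notify(node)
        handled.append(node)
    return handled


log = []
driver = RecordingDriver(log, unfenceable={"compute-2"})
handled = run_once(StaticMonitor(["compute-1", "compute-2"]),
                   driver, driver, driver)
```

In this run `compute-1` goes through all three stages, while `compute-2` stops after the failed fencing attempt and is neither evacuated nor reported as handled.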