Tools and automation to achieve Disaster Recovery of OpenStack Cloud Platforms
Saad Zaher 5cbe9d4ff4 Adding CREDITS.rst
Change-Id: I3fdee0050a82a7f5ed4de9e8f9d40038eb76e4d3
2016-02-12 16:28:58 +00:00
config-generator Adding Fencing plugable system 2015-12-23 14:51:09 +00:00
doc Modified diagrams 2015-12-23 14:56:03 +00:00
etc Adding Osha description and enhancing templates 2016-02-12 15:01:11 +00:00
osha Some small improvements to pass pep8 and pep257. 2016-02-02 12:26:42 +00:00
.gitignore Adding support for plugable monitoring systems 2015-12-22 16:24:20 +00:00
.gitreview Added .gitreview 2015-12-18 13:07:03 +00:00
CREDITS.rst Adding CREDITS.rst 2016-02-12 16:28:58 +00:00
HACKING.rst Adding HACKING.rst to follow Openstack Guidelines 2016-01-14 15:01:54 +00:00
README.rst Adding Osha description and enhancing templates 2016-02-12 15:01:11 +00:00
requirements.txt Add oslo.log dependency in requirements.txt 2016-01-15 13:19:23 +00:00
setup.cfg Adding support for plugable monitoring systems 2015-12-22 16:24:20 +00:00
setup.py Adding support for plugable monitoring systems 2015-12-22 16:24:20 +00:00

README.rst

OSHA

Osha (OpenStack Compute node High Available) provides compute node high availability for OpenStack. Osha monitors all compute nodes running in a cloud deployment. If one of the compute nodes fails, Osha fences that node and then tries to evacuate all instances running on it. Finally, Osha notifies all users who have workloads or instances running on that node, as well as the cloud administrators.

Osha has a pluggable architecture so it can be used with:

  1. Any monitoring system to monitor the compute nodes (currently only the native OpenStack services status is supported)
  2. Any fencing driver (currently supports IPMI, libvirt, ...)
  3. Any evacuation driver (currently supports the evacuate API call, and possibly migrate)
  4. Any notification system (currently supports email-based notifications, ...)

Each of these only requires adding a simple plugin and adjusting the configuration file to use it. In the future, a combination of plugins may be configured if required.
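The plugin idea can be illustrated with a minimal Python sketch. Everything named here (`Fencer`, `FENCERS`, `register`, `load_driver`, `DummyFencer`) is hypothetical and chosen for illustration only; it is not Osha's actual driver interface or loading mechanism. It assumes a driver is selected by a name taken from the configuration file.

```python
from abc import ABC, abstractmethod


class Fencer(ABC):
    """Hypothetical base class that every fencing plugin would implement."""

    @abstractmethod
    def fence(self, node):
        """Isolate or power off the given compute node; return True on success."""


# Hypothetical registry mapping a configuration name to a driver class.
FENCERS = {}


def register(registry, name):
    """Class decorator that records a plugin under the given name."""
    def decorator(cls):
        registry[name] = cls
        return cls
    return decorator


@register(FENCERS, "dummy")
class DummyFencer(Fencer):
    """Stand-in plugin that only records which nodes it was asked to fence."""

    def __init__(self):
        self.fenced = []

    def fence(self, node):
        self.fenced.append(node)
        return True


def load_driver(registry, name):
    """Instantiate the driver selected by name, as a config file might request."""
    try:
        driver_class = registry[name]
    except KeyError:
        raise ValueError("unknown driver: %s" % name)
    return driver_class()
```

With this layout, adding a new fencer means writing one class and registering it under a new name; the code that calls `load_driver` does not change. The same pattern would apply to monitoring, evacuation, and notification drivers.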

Osha should run in the control plane; however, the architecture supports other deployment scenarios. To run Osha itself in high availability mode, deploy it in an active/passive configuration.

How it works

Starting Osha

  1. The Osha monitoring manager loads the required monitoring driver according to the configuration.
  2. Osha queries the monitoring system to check whether it considers any compute nodes to be down.
     3.1. If no, Osha exits, displaying "No failed nodes".
     3.2. If yes, Osha calls the fencing manager to fence the failed compute node.
  4. The fencing manager loads the correct fencer according to the configuration.
  5. Once the compute node is fenced and powered off, the evacuation process starts.
  6. Osha loads the correct evacuation driver.
  7. Osha evacuates all instances to other compute nodes.
  8. Once the evacuation process has completed, Osha calls the notification manager.
  9. The notification manager loads the correct driver based on the configuration.
  10. Osha starts the notification process ...
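The flow above can be sketched in Python as a single pass over the failed nodes. This is an illustrative outline under stated assumptions, not Osha's implementation: the stub classes and the `run_once` function are hypothetical names, and skipping evacuation when fencing fails is an assumption based on the requirement that a node is fenced before evacuation starts.

```python
class StaticMonitor:
    """Stub monitoring driver that reports a fixed list of failed nodes."""

    def __init__(self, failed_nodes):
        self.failed_nodes = list(failed_nodes)

    def get_failed_nodes(self):
        return list(self.failed_nodes)


class RecordingFencer:
    """Stub fencer; nodes listed in 'unfenceable' simulate a fencing failure."""

    def __init__(self, unfenceable=()):
        self.unfenceable = set(unfenceable)
        self.fenced = []

    def fence(self, node):
        if node in self.unfenceable:
            return False
        self.fenced.append(node)
        return True


class RecordingEvacuator:
    """Stub evacuation driver that returns the instances it 'moved'."""

    def __init__(self, instances_by_node):
        self.instances_by_node = instances_by_node

    def evacuate(self, node):
        return list(self.instances_by_node.get(node, []))


class RecordingNotifier:
    """Stub notification driver that stores messages instead of sending them."""

    def __init__(self):
        self.sent = []

    def notify(self, node, instances):
        self.sent.append((node, instances))


def run_once(monitor, fencer, evacuator, notifier):
    """Monitor, fence, evacuate, notify; return the nodes that were handled."""
    failed = monitor.get_failed_nodes()
    if not failed:
        print("No failed nodes")
        return []
    handled = []
    for node in failed:
        # Assumption: never evacuate a node that could not be fenced,
        # because it may still be running its instances.
        if not fencer.fence(node):
            continue
        instances = evacuator.evacuate(node)
        notifier.notify(node, instances)
        handled.append(node)
    return handled
```

Passing the four drivers in as arguments keeps the flow independent of any particular monitoring, fencing, evacuation, or notification backend, which matches the pluggable design described earlier.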