From 5941705c61cc49c8ce3b6f26540290a01d8a43cc Mon Sep 17 00:00:00 2001
From: Ifat Afek
Date: Thu, 3 Mar 2016 14:58:38 +0000
Subject: [PATCH] Aodh notifier specification

Change-Id: I2aab970caf94b97e8ab963e13d16ebc71aac85b1
implements: blueprint aodh-notifier
---
 specs/mitaka/aodh-notifier.rst | 189 +++++++++++++++++++++++++++++++++
 1 file changed, 189 insertions(+)
 create mode 100644 specs/mitaka/aodh-notifier.rst

diff --git a/specs/mitaka/aodh-notifier.rst b/specs/mitaka/aodh-notifier.rst
new file mode 100644
index 0000000..3a7b577
--- /dev/null
+++ b/specs/mitaka/aodh-notifier.rst
@@ -0,0 +1,189 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+=============
+Aodh Notifier
+=============
+
+launchpad blueprint:
+https://blueprints.launchpad.net/vitrage/+spec/aodh-notifier
+
+The Evaluator performs root cause analysis on the Vitrage Graph and may determine that an alarm should be created, deleted or otherwise updated.
+Other components are notified of such changes by the Vitrage Notifier service. Among others, Vitrage Notifier is responsible for handling Aodh Alarms.
+
+This blueprint describes the implementation of Vitrage Notifier for notifying Aodh on Vitrage alarms.
+
+::
+
+    +------------------+           +------------------+          +------------------+
+    | Aodh             <--+        |                  |          |                  |
+    +------------------+  | Update |     Vitrage      |  Raise   |     Vitrage      |
+                          +--------|                  <----------|                  |
+    +------------------+  | Alarm  |     Notifier     |  Alarm   |    Evaluator     |
+    | Other components <--+        |                  |          |                  |
+    +------------------+           +------------------+          +------------------+
+
+
+Problem description
+===================
+
+Vitrage should be capable of creating, deleting and otherwise updating alarms as requested by the Evaluator Engine.
+The notifier is responsible for ensuring these updates are executed. Specifically we will start here with Aodh alarms.
+
+Main challenges:
+
+* There is no way to define a 'custom alarm' in Aodh
+* Vitrage alarms are based on resources. There is a need to pass the resource information to Aodh
+* Several alarms of the same type can be triggered at the same time, each for a different resource. For example, if there is an alarm on a host, Vitrage will raise a deduced alarm on every instance on that host.
+* How can someone ask for notifications on updates of Vitrage alarms?
+
+
+Proposed change
+===============
+
+The Vitrage Notifier will be separate from the Evaluator, as the two will have different demands of scale and other performance considerations.
+The Vitrage Notifier will supply an API used by the Vitrage Evaluator, containing create/delete/update alarm.
+
+In Aodh, Vitrage alarms will be defined as event alarms, as this seems like the most appropriate option. The resource id will be defined in the alarm query.
+
+Vitrage deduced alarms will look like this:
+
++---------------------------+---------------------------------------------------------+
+| Property                  | Value                                                   |
++---------------------------+---------------------------------------------------------+
+| alarm_actions             | []                                                      |
++---------------------------+---------------------------------------------------------+
+| alarm_id                  | 4a3cb988-a620-4bf3-87f7-077c751c408f                    |
++---------------------------+---------------------------------------------------------+
+| description               | Instance is unreachable                                 |
++---------------------------+---------------------------------------------------------+
+| enabled                   | True                                                    |
++---------------------------+---------------------------------------------------------+
+| event_type                | vitrage.alarm.instance_unreachable                      |
++---------------------------+---------------------------------------------------------+
+| insufficient_data_actions | []                                                      |
++---------------------------+---------------------------------------------------------+
+| name                      | vitrage_instance_unreachable_1                          |
++---------------------------+---------------------------------------------------------+
+| ok_actions                | []                                                      |
++---------------------------+---------------------------------------------------------+
+| project_id                | 5542b27142154f30b32dea6238aa81aa                        |
++---------------------------+---------------------------------------------------------+
+| query                     | [{u'field': u'resource_id', u'type': u'', u'value':     |
+|                           | u'b0bf3635-d9e8-4624-9793-7aac82948c0a', u'op': u'eq'}] |
++---------------------------+---------------------------------------------------------+
+| repeat_actions            | False                                                   |
++---------------------------+---------------------------------------------------------+
+| severity                  | moderate                                                |
++---------------------------+---------------------------------------------------------+
+| state                     | alarm                                                   |
++---------------------------+---------------------------------------------------------+
+| type                      | event                                                   |
++---------------------------+---------------------------------------------------------+
+| user_id                   | 8ab65ef808b245e3ba234b7b3554cb94                        |
++---------------------------+---------------------------------------------------------+
+
+In this example, Vitrage triggers a deduced alarm that an instance is unreachable due to a failure in the public switch (which was detected by Nagios).
+There will be several alarms with the same event_type and different instance ids in their query.
+
+
+There are two options for triggering Vitrage alarms in Aodh; neither is perfect.
+
+
+Alternative 1
+-------------
+
+Vitrage will create an event alarm in Aodh.
+Then, it will send a notification to the message bus. The notification will be converted to a Ceilometer event, which will trigger the Aodh alarm.
+
+The exact notification and event format are still TBD.
+
+The main problem with this solution is that the Aodh alarm will be created on-the-fly and triggered immediately, so it will be impossible for another project to register a web-hook on the alarm before it is triggered.
+It will be possible to see Vitrage alarms in list-alarms, but not to be notified when they are first triggered.
+
+
+Alternative 2
+-------------
+
+Vitrage will create an event alarm in Aodh, with 'alarm' state. The event itself will never be sent, so the alarm state will remain 'alarm'.
+
+The problem with this solution is that Aodh will not send a notification about the alarm being triggered. But since in Alternative 1 it is also impossible to register on the alarm, there is no real difference between the two options.
+
+
+Data model impact
+-----------------
+
+None
+
+REST API impact
+---------------
+
+None
+
+Versioning impact
+-----------------
+
+None
+
+Other end user impact
+---------------------
+
+None
+
+Deployer impact
+---------------
+
+For Alternative 1 - there is a need to define the notification->event configuration
+
+For Alternative 2 - None
+
+Developer impact
+----------------
+
+None
+
+Horizon impact
+--------------
+
+None
+
+Implementation
+==============
+
+Assignee(s)
+-----------
+
+Primary assignee:
+  idan-hefetz
+
+Work Items
+----------
+
+None
+
+Dependencies
+============
+
+None
+
+Testing
+=======
+
+This blueprint requires unit tests and Tempest tests.
+
+Documentation Impact
+====================
+
+For Alternative 1 - there is a need to document the notification->event configuration
+
+For Alternative 2 - None
+
+References
+==========
+
+Vitrage wiki page: https://wiki.openstack.org/wiki/Vitrage
+
+Vitrage use cases: https://github.com/openstack/vitrage/blob/master/doc/source/vitrage-use-cases.rst