7.8 KiB
Add central agent HA and workload partitioning
https://blueprints.launchpad.net/fuel/+spec/ceilometer-central-agent-ha
Implement Redis installation and using it as a coordination backend for ceilometer central agents
Problem description
A detailed description of the problem:
- Currently there are several Ceilometer services which do not support workload partitioning in MOS: central agent, alarm evaluator and agent notification. During Juno release cycle workload partitioning for central agent was implemented. In Kilo, partition coordination was introduced to alarm evaluator and agent notification. In Liberty, coordination for notification agent was further improved. Thus, it should be supported in MOS. For this purpose we should provide tooz library support. This library is responsible for coordination between services and supports several backends: zookeeper, redis, memcached. Redis was chosen as the tooz backend in MOS.
Proposed change
Support for Ceilometer services coordination is an experimental feature and it was decided to implement it as a fuel plugin.
Its implementation requires the following things to be done: * Implement Redis installation on controller nodes in HA mode * Prepare Redis packages and their dependencies * Enable partitioning in config for ceilometer central agents * Enable partitioning in config for ceilometer alarm evaluator * Enable partitioning in config for ceilometer notification agents
Installation diagram for central agent is below. The schemas for alarm-evaluator and notification agent are similar
+---------------------+
| |
| +---------------+ |
| | ceilometer +-------------------------+
| | central agent | | |
| +---------------+ | |
| | |
| Primary controller | |
| | |
| +---------------+ | |
| | redis <------------------------------+
| | master | | | |
| +---------------+ | | |
| | | |
+---------------------+ | |
| |
+---------------------+ | |
| | | |
| +---------------+ | | |
| | ceilometer +-------------------------+ |
| | central agent | | | |
| +---------------+ | | |
| | +------v----+--+
| controller 1 | | |
| | | Coordination |
| +---------------+ | | |
| | redis | | +------^----+--+
| | slave1 | | | |
| | <------------------------------+
| +---------------+ | | |
| | | |
+---------------------+ | |
| |
+---------------------+ | |
| | | |
| +---------------+ | | |
| | ceilometer +-------------------------+ |
| | central agent | | |
| +---------------+ | |
| | |
| controller 2 | |
| | |
| +---------------+ | |
| | redis | | |
| | slave2 <------------------------------+
| | | |
| +---------------+ |
| |
+---------------------+
Alternatives
- We may use MQ queues for task ditribution between the services. The problem is that MQ is one of the most weak point in OpenStack now and it may be not safe to make it responsible for HA and coordination.
Data model impact
None
REST API impact
None
Upgrade impact
These changes will be needed in puppet scripts:
- Add redis module
- Configure ceilometer agents to be partitioned
This change will be needed in packages:
- Use upstream Redis packages and its dependencies
Security impact
None
Notifications impact
None
Other end user impact
None
Performance Impact
Performance should become better because the same amount of work will be done using several workers
Other deployer impact
This could be installed only in HA mode with ceilometer
Developer impact
None
Implementation
Assignee(s)
- Primary assignee:
-
Ivan Berezovskiy
- Other contributors:
-
Nadya Shakhat, Ilya Tyaptin, Igor Degtiarov
- Reviewer:
-
Vladimir Kuklin Sergii Golovatiuk
- QA:
-
Vitaly Gusev
Work Items
- Implement redis installation from puppet (iberezovskiy)
- Configure ceilometer central agent (iberezovskiy)
- Configure alarm evaluator (Nadya Shakhat)
- Configure notification agents (Nadya Shakhat)
- Write a documentation (Nadya Shakhat)
Dependencies
None
Testing
General testing approach:
- Environment with ceilometer in HA mode should be successfully deployed
- Redis cluster should be with one master and two slaves
- Ensure that after node with redis master was broken ceilometer services can work with new redis master
Testing approach for central agent:
- Ceilometer should collect all enabled polling meters for deployed environment
- Ensure that the sets of meters to be polled by each central agent are disjoint
- Ensure that after one central agent is broken, during the next polling cycle all measurements will be rescheduled between two another, and all of meters will be collected
Testing approach for alarm evaluator:
- Ensure that alarms can be successfully created
- Ensure that after one alarm evaluator is broken, during the next alarm evaluation cycle all alarms will be rescheduled between two another for further evaluation and all of alarms will be successfully evaluated
- Ensure that the sets of alarms for each alarm evaluator are disjoint
Testing approach for notification agent:
- Ensure that messages don't not stuck in notification.info queue
- Ensure that IPC queues are created in MQ, chech that list of IPC queues corresponds to pipeline.yaml and each queue has the one consumer
- Ensure that after one alarm evaluator was broken, during the next alarm evaluation cycle all alarms will be rescheduled between two another for further evaluation and all of them will be successfully evaluated
Documentation Impact
A Plugin Guide about redis plugin installation should be created. Also, the document about ceilometer HA and partitioning should be done.
For validation and testing purpose, the test plan and test report should be provided.
References
- Central agent: https://github.com/openstack/ceilometer-specs/blob/master/specs/juno/central-agent-partitioning.rst
- Notification agent: https://github.com/openstack/ceilometer-specs/blob/master/specs/kilo/notification-coordiation.rst
- Notification agent cont.: https://github.com/openstack/ceilometer-specs/blob/master/specs/liberty/distributed-coordinated-notifications.rst