..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

===================================================
Dynamic pipeline configuration using File Reloading
===================================================

https://blueprints.launchpad.net/ceilometer/+spec/reload-file-based-pipeline-configuration

Currently there is no way to enable or disable meters without restarting
Ceilometer, yet there are cases where operators do not want to run all the
meters continuously.

By enhancing Ceilometer to monitor the pipeline configuration file, the
application could dynamically activate or deactivate meters, collection
targets or similar functions, have distinctly different configurations for
multiple nodes in different environments (e.g. dev/test/prod or HA scenarios)
and could ultimately be updated with new collection targets "on-the-fly".

Problem description
===================

Currently, Ceilometer relies on a pipeline configuration file to determine
the run-time parameters used in polling and notification handling.

There is no way to enable or disable meters without restarting Ceilometer.
There are cases where operators do not want to run all the meters
continuously. In these cases, there should be a way to disable or enable them
dynamically (without restarting Ceilometer).

Meters in Ceilometer are enabled by adding them to the entry_points section
of the setup.cfg configuration file and restarting the Ceilometer agent.

When the Ceilometer agent restarts, during initialization it reads the
entry_points and creates and loads all pollster objects corresponding to
the meters present in setup.cfg.

There are disadvantages with this implementation:

Ceilometer might be running several hundred meters continuously, and
restarting it might impact other meters because it involves deleting and
re-creating hundreds of pollster objects, even when the need is only to start
or stop polling a few meters.

The subsequent sections cover an approach to dynamically reload and
update the agent pipeline configuration (pipeline.yaml).

Proposed change
===============

Summary:

File polling by agent - Each agent polls its local pipeline.yaml file for
changes and re-configures pollsters/listeners on detecting a change.

By default, this will be turned off. It is an optional feature and can be
turned on by specifying reload_pipeline_config = True in ceilometer.conf
under the [DEFAULT] section.

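As an illustration of the switch described above, the following minimal
sketch reads the flag with Python's standard configparser. This is only a
stand-in: the spec does not say how the option is registered or read, and the
sample file content and the reload_enabled helper are hypothetical. Only the
option name reload_pipeline_config and the file name ceilometer.conf come
from this spec.

```python
import configparser

# Hypothetical ceilometer.conf fragment; only the option name is from the spec.
SAMPLE_CONF = """
[DEFAULT]
reload_pipeline_config = True
"""


def reload_enabled(conf_text):
    """Return True only if reload_pipeline_config is explicitly switched on."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # The feature stays off unless the option is present and truthy.
    return parser.getboolean("DEFAULT", "reload_pipeline_config", fallback=False)
```

With the option absent, the helper returns False, which matches the
off-by-default behaviour proposed here.
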
The proposed high-level flow:

1. Each agent daemon polls its local pipeline configuration file
   at a configurable interval.
2. On detecting a change, the agent reloads and validates the configuration.
3. If validation succeeds, the new configuration is used.

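The three steps above could be sketched roughly as follows. The
PipelineWatcher class, its check method and the validate callable are
hypothetical names introduced for illustration. The spec does not prescribe
how a change is detected, so comparing a content hash is an assumption
(checking the file modification time would be another option).

```python
import hashlib


def file_digest(path):
    """Hash the file contents so that any edit to the file is detected."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


class PipelineWatcher:
    """Hypothetical helper that remembers the last accepted pipeline file."""

    def __init__(self, path, validate):
        self.path = path
        # validate(text) returns a parsed config or raises ValueError.
        self.validate = validate
        self.digest = None
        self.config = None

    def check(self):
        """Run once per polling interval; True if a new config was adopted."""
        digest = file_digest(self.path)
        if digest == self.digest:
            return False  # step 1: file unchanged since the last accepted one
        with open(self.path) as handle:
            text = handle.read()
        try:
            candidate = self.validate(text)  # step 2: reload and validate
        except ValueError:
            return False  # invalid file: keep running with the old config
        # Step 3: validation succeeded, so the new configuration is used.
        self.digest, self.config = digest, candidate
        return True
```

An agent would call check() from a timer at the configured interval and
rebuild its pollsters or listeners only when it returns True.
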
Next steps for central-agent, compute-agent and ipmi-agent:

3. For the polling agent, determine which pollsters are needed based on the
   new configuration and at what frequency (polling interval) each must run.
4. For the polling agent, all pollsters in flight are allowed to complete
   gracefully.
5. In HA mode, re-initialize the partition coordinator with the new groups.
6. Start the new set of pollsters.

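A rough sketch of step 3 for the polling agent, assuming the new
configuration has already been parsed into a list of source dictionaries with
interval and meters keys (mirroring the sources in pipeline.yaml). The
function name is hypothetical, and wildcard or negated meter names are
deliberately not handled.

```python
def pollsters_by_interval(sources):
    """Group meter names by polling interval (in seconds).

    `sources` is assumed to be a list of dicts, each with an "interval"
    and a "meters" list, as parsed from the reloaded pipeline file.
    """
    schedule = {}
    for source in sources:
        # Meters sharing an interval can be polled by the same timer.
        schedule.setdefault(source["interval"], set()).update(source["meters"])
    return schedule
```

The resulting mapping tells the agent which pollsters to start in step 6 and
how often each group must run.
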
Next steps for notification agent:

3. Existing listeners are stopped to allow for graceful termination.
4. Configure new listeners with the changed pipeline configuration.

Next steps for HA notification agent:

3. Pipeline listeners (pipeline sink-based internal queue listeners) are
   gracefully terminated.
4. Pipeline listeners are re-configured with the new pipeline.

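The stop-then-reconfigure sequence for the notification agent could look like
the sketch below. FakeListener is a stand-in written for this example, not a
real listener class, and its start/stop/wait method names are assumptions. A
real listener would block in wait() until in-flight notifications have been
processed.

```python
class FakeListener:
    """Stand-in for a notification listener, used only for illustration."""

    def __init__(self, name):
        self.name = name
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        # Stop accepting new notifications.
        self.running = False

    def wait(self):
        # A real listener would block here until in-flight work is done.
        pass


def swap_listeners(current, new_names, factory=FakeListener):
    """Gracefully stop the current listeners, then start a fresh set."""
    for listener in current:
        listener.stop()
        listener.wait()
    replacements = [factory(name) for name in new_names]
    for listener in replacements:
        listener.start()
    return replacements
```

Between the last stop() and the first start() no listener is consuming, which
is the window in which messages can build up in the queues.
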
Pros:
  - Preferred and easiest approach.

Cons:
  - It doesn't centralize the pipeline definition and runs the risk of agents
    diverging on their pipeline definitions.

    This opens the door to errors of any kind, because the file might be
    changed for one agent and not for another one in the same coordination
    group.

Alternatives
------------

1. File polling by separate daemon - Each agent uses a separate daemon
   that polls pipeline.yaml for changes. On detecting a change, it signals
   (HUP) the agents to reload the pipeline configuration.

   Pros:
     - Allows for future extensibility while keeping the agent-to-pipeline-file
       contract unchanged.

   Cons:
     - Same as the previous approach.
     - HA is needed for the new daemon.
     - Using SIGHUP is a potential loophole, since anyone able to send a
       SIGHUP can make the agent reload its configuration.
     - More work for less gain.

2. HUP-based pipeline reload by agent - The agent uses a signal handler to
   handle the HUP signal and reloads the pipeline from the file in response
   to the signal. The assumption on receiving a signal is that the pipeline
   file has changed, so the pipeline file can be changed manually or using
   config management tools such as Chef or Puppet.

   Pros:
     - No API.

   Cons:
     - File synchronization within the coordination group is the deployer's
       responsibility.
     - Same as approach 1.
     - Using SIGHUP is a potential loophole, since anyone able to send a
       SIGHUP can make the agent reload its configuration.

3. Use automated deployment tools (Puppet, Chef, Ansible) to change pipeline
   definitions. While this automates changing pipeline definitions across
   multiple agents, it doesn't bring the value-add of on-the-fly updates to
   the agent without incurring a restart of the daemons.

   Pros:
     - No API.

   Cons:
     - This looks like an approach for admins only. It will be trickier for
       anyone not familiar with these tools (devs, DevOps, etc.).
     - Not in Ceilometer scope.

4. Use a combination of 2 and 3 - Use a configuration management tool to
   automate the change in pipeline configuration across multiple remote agent
   nodes, but do not restart any processes. Agents on each node poll the
   pipeline file and, on detecting a change at the next poll, update the
   pollsters and listeners. This approach makes operations depend on the
   config management tools.

   Pros:
     - No API.

   Cons:
     - File synchronization within the coordination group is the deployer's
       responsibility.

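For alternative 2, a HUP-based reload could be wired up roughly as below.
The ReloadFlag class and the install function are hypothetical. The sketch
assumes a POSIX platform and an agent main loop that calls consume()
periodically and reloads the pipeline file when it returns True.

```python
import signal


class ReloadFlag:
    """Records that a reload was requested; the main loop acts on it later."""

    def __init__(self):
        self.pending = False

    def handle_hup(self, signum, frame):
        # Keep the handler tiny: only note the request, do no file I/O here.
        self.pending = True

    def consume(self):
        """Return True once per requested reload, then reset the flag."""
        pending, self.pending = self.pending, False
        return pending


def install(flag):
    """Register the handler for SIGHUP; call this from the main thread."""
    signal.signal(signal.SIGHUP, flag.handle_hup)
```

Deferring the actual reload to the main loop keeps the signal handler short
and avoids re-reading the file while a polling cycle is in progress.
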
Data model impact
-----------------

REST API impact
---------------

Security impact
---------------

Pipeline impact
---------------

The use of the pipeline configuration will change from static to dynamic.
This will affect the supported meters and their datapoint collection.

The impact on the system is expected to be minimal since changes to the
pipeline are expected to be infrequent.

Other end user impact
---------------------

Performance/Scalability Impacts
-------------------------------

There could be a small window in which messages build up in the oslo
messaging bus or the internal queues while the notification listeners are
restarted.

Other deployer impact
---------------------

Depending on the approach chosen, the deployer will have to synchronize
pipeline file updates across multiple agent instances and then signal the
agents to reload the configuration.

Developer impact
----------------

Implementation
==============

Assignee(s)
-----------

Primary assignee:
  rjaiswal

Other contributors:
  TBD

Ongoing maintainer:
  rjaiswal

Work Items
----------

- Implement timer to poll pipeline configuration file
- Implement graceful termination of pollsters and listeners
- Implement reconfiguring and reloading of pollsters in polling-agent
- Implement reconfiguring and reloading of listeners in notification-agent
- Unit Tests
- Doc updates

Future lifecycle
================

We'll want to continue to iterate on this in future cycles to gain additional
functionality:

- Use Tooz with publish-subscribe functionality to enable synchronization of
  pipeline configuration across multiple agent instances in a single
  coordination group.
- Migration of event traits configuration.

Dependencies
============

Relates to:

- https://blueprints.launchpad.net/ceilometer/+spec/dedicated-event-db
- https://review.openstack.org/#/c/119077/ (central and compute agents merge)

Testing
=======

Add unit tests to exercise reloading of the pipeline in agents.

Documentation Impact
====================

Update the relevant documentation to describe the new pipeline configuration
reloading feature.

References
==========

- https://etherpad.openstack.org/p/liberty-ceilometer-pipeline-config
- https://etherpad.openstack.org/p/configuration_via_data_store
- https://review.openstack.org/#/c/171826/
- https://wiki.openstack.org/wiki/Ceilometer/blueprints/Configuration-via-data-store
- http://docs.openstack.org/developer/oslo.config/configopts.html#oslo_config.cfg.ConfigOpts.reload_config_files