Update the high level architecture
Updated the architecture document to include new features since folsom. Change-Id: I8f3bee2f881341a18ad20063d081f0fb7d63c3ad
BIN doc/source/1-Collectorandagents.png (new file, 37 KiB)
BIN doc/source/2-accessmodel.png (new file, 34 KiB)
BIN doc/source/3-Pipeline.png (new file, 30 KiB)
BIN doc/source/4-Transformer.png (new file, 41 KiB)
BIN doc/source/5-multi-publish.png (new file, 27 KiB)
BIN doc/source/6-storagemodel.png (new file, 46 KiB)
BIN doc/source/7-overallarchi.png (new file, 58 KiB)
@ -15,42 +15,250 @@ High Level Description
|
||||
double: database; architecture
|
||||
double: API; architecture
|
||||
|
||||
The following diagram summarizes ceilometer logical architecture:
|
||||
Objectives
|
||||
----------
|
||||
|
||||
.. The image source can be found at https://docs.google.com/drawings/d/1-6-DxU5ITyRcVJtJtPsc_zeiqzafZlir0AF7AkG4ZeQ/edit
|
||||
The Ceilometer project was started in 2012 with one simple goal in mind: to
|
||||
provide an infrastructure to collect any information needed regarding
|
||||
OpenStack projects. It was designed so that rating engines could use this
|
||||
single source to transform events into billable items which we
|
||||
label as "metering".
|
||||
|
||||
.. image:: ./Ceilometer_Architecture.png
|
||||
As the project started to come to life, collecting an
|
||||
`increasing number of metrics`_ across multiple projects, the OpenStack
|
||||
community started to realize that a secondary goal could be added to
|
||||
Ceilometer: become a standard way to collect metrics, regardless of the
|
||||
purpose of the collection. For example, Ceilometer can now publish information for
|
||||
monitoring, debugging and graphing tools in addition or in parallel to the
|
||||
metering backend. We labelled this effort as “multi-publisher”.
|
||||
|
||||
As shown in the above diagram, there are 5 basic components to the system:
|
||||
.. _increasing number of metrics: http://docs.openstack.org/developer/ceilometer/measurements.html
|
||||
|
||||
1. A :term:`compute agent` runs on each compute node and polls for
|
||||
resource utilization statistics. There may be other types of agents
|
||||
in the future, but for now we will focus on creating the compute
|
||||
agent.
|
||||
Most recently, as the Heat project started to come to
|
||||
life, it soon became clear that the OpenStack project needed a tool to watch for
|
||||
variations in key values in order to trigger various reactions.
|
||||
As Ceilometer already had the tooling to collect vast quantities of data, it
|
||||
seemed logical to add this as an extension of the Ceilometer project, which we
|
||||
tagged as “alarming”.
|
||||
|
||||
2. A :term:`central agent` runs on a central management server to
|
||||
poll for resource utilization statistics for resources not tied
|
||||
to instances or compute nodes.
|
||||
Metering
|
||||
--------
|
||||
|
||||
3. A :term:`collector` runs on one or more central management
|
||||
servers to monitor the message queues (for notifications and for
|
||||
metering data coming from the agent). Notification messages are
|
||||
processed and turned into metering messages and sent back out onto
|
||||
the message bus using the appropriate topic. Metering messages are
|
||||
written to the data store without modification.
|
||||
If you divide a billing process into three steps, as is commonly done in
|
||||
the telco industry, the steps are:
|
||||
|
||||
4. A :term:`data store` is a database capable of handling concurrent
|
||||
writes (from one or more collector instances) and reads (from the
|
||||
API server).
|
||||
1. :term:`Metering` is the process of collecting information about what,
|
||||
who, when and how much regarding anything that can be billed. The result of
|
||||
this is a collection of “tickets” (a.k.a. samples) which are ready to be
|
||||
processed in any way you want.
|
||||
2. :term:`Rating` is the process of analysing a series of tickets,
|
||||
according to business rules defined by marketing, in order to transform
|
||||
them into bill line items with a currency value.
|
||||
3. :term:`Billing` is the process of assembling bill line items into a
|
||||
single per customer bill, emitting the bill to start the payment collection.
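The three steps above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the ``Ticket`` class, the meter names and the price list are illustrative assumptions, not part of Ceilometer, which only covers the first step (producing the tickets).

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Ticket:
    """One metering sample: who used how much of what, and when."""
    customer: str
    meter: str
    volume: Decimal
    timestamp: str


# Hypothetical price list; real business rules live outside Ceilometer.
PRICE_PER_UNIT = {
    "instance.hours": Decimal("0.05"),
    "disk.gb": Decimal("0.10"),
}


def rate(tickets):
    """Rating: turn tickets into (customer, meter, amount) bill line items."""
    return [
        (t.customer, t.meter, t.volume * PRICE_PER_UNIT[t.meter])
        for t in tickets
    ]


def bill(line_items):
    """Billing: assemble line items into one total per customer."""
    totals = defaultdict(Decimal)
    for customer, _meter, amount in line_items:
        totals[customer] += amount
    return dict(totals)


# Metering output (step one) feeds rating, which feeds billing.
tickets = [
    Ticket("acme", "instance.hours", Decimal("10"), "2013-09-01T00:00:00"),
    Ticket("acme", "disk.gb", Decimal("20"), "2013-09-01T00:00:00"),
    Ticket("globex", "instance.hours", Decimal("4"), "2013-09-01T00:00:00"),
]
invoices = bill(rate(tickets))
```

Keeping ``rate`` and ``bill`` as separate functions mirrors the separation described above: the pricing rules can change without touching how tickets are collected.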
|
||||
|
||||
5. An :term:`API server` runs on one or more central management
|
||||
servers to provide access to the data from the data store. See
|
||||
`API Description`_ for details.
|
||||
Ceilometer’s initial goal was, and still is, strictly limited to step
|
||||
one. This is a choice made from the beginning not to go into rating or billing,
|
||||
as the variety of possibilities seemed too huge for the project to ever deliver
|
||||
a solution that would fit everyone’s needs, from private to public clouds. This
|
||||
means that if you are looking at this project to solve your billing needs, this
|
||||
is the right way to go, but certainly not the end of the road for you. Once
|
||||
Ceilometer is in place on your OpenStack deployment, you will still have
|
||||
quite a few things to do before you can produce a bill for your customers.
|
||||
One of your first tasks could be finding the right queries within the Ceilometer
|
||||
API to extract the information you need for your very own rating engine.
|
||||
|
||||
.. _API Description: api.html
|
||||
You can of course use the same API to satisfy other needs, such as a data mining
|
||||
solution to help you identify unexpected or new usage types, or a capacity
|
||||
planning solution. In general, it is recommended to download the data from the API in
|
||||
order to work on it in a separate database to avoid overloading the one which
|
||||
should be dedicated to storing tickets. It is also often found that the
|
||||
Ceilometer metering DB only keeps a couple of months' worth of data while data is
|
||||
regularly offloaded into a long term store connected to the billing system,
|
||||
but this is fully left up to the implementor.
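As a starting point for such an extraction job, the sketch below only builds a statistics query URL; it does not contact any server. The host name and the helper ``statistics_url`` are assumptions made for illustration, and the ``q.field`` / ``q.op`` / ``q.value`` parameters follow the v2 query syntax as we understand it, so check them against the API documentation for your release. A real request would also need to carry a valid authentication token.

```python
from urllib.parse import urlencode


def statistics_url(endpoint, meter, filters, period=None):
    """Build a v2 statistics query URL from (field, op, value) filters."""
    params = []
    for field, op, value in filters:
        params += [("q.field", field), ("q.op", op), ("q.value", value)]
    if period is not None:
        # Period length in seconds over which statistics are grouped.
        params.append(("period", str(period)))
    return "%s/v2/meters/%s/statistics?%s" % (
        endpoint.rstrip("/"), meter, urlencode(params))


# Hypothetical endpoint; replace with the API server of your deployment.
url = statistics_url(
    "http://ceilometer.example.com:8777",
    "cpu_util",
    [("timestamp", "ge", "2013-09-01T00:00:00")],
    period=3600,
)
```

The resulting URL can be fetched on a schedule and the response loaded into the separate database mentioned above.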
|
||||
|
||||
These services communicate using the standard OpenStack messaging
|
||||
bus. Only the collector and API server have access to the data store.
|
||||
.. note::
|
||||
|
||||
We do not guarantee that we won’t change the DB schema, so it is
|
||||
highly recommended to access the database through the API and not use
|
||||
direct queries.
|
||||
|
||||
|
||||
How is data collected?
|
||||
----------------------
|
||||
.. The source for the 7 diagrams below can be found at: https://docs.google.com/presentation/d/1P50qO9BSAdGxRSbgHSbxLo0dKWx4HDIgjhDVa8KBR-Q/edit?usp=sharing
|
||||
|
||||
.. figure:: ./1-Collectorandagents.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: Collectors and agents
|
||||
|
||||
This is a representation of how the collectors and agents gather data from multiple sources.
|
||||
|
||||
In a perfect world, each and every project that you want to instrument should
|
||||
send events on the Oslo bus about anything that could be of interest to
|
||||
you. Unfortunately, not all
|
||||
projects have implemented this and you will often need to instrument
|
||||
other tools which may not use the same bus as OpenStack has defined. To
|
||||
circumvent this, the Ceilometer project created 3 independent methods to
|
||||
collect data:
|
||||
|
||||
1. :term:`Bus listener agent` which takes events generated on the Oslo
|
||||
notification bus and transforms them into Ceilometer samples. This
|
||||
is the preferred method of data collection. If you are working on some
|
||||
OpenStack related project and are using the Oslo library, you are kindly
|
||||
invited to come and talk to one of the project members to learn how you
|
||||
could quickly add instrumentation for your project.
|
||||
2. :term:`Push agents` which are the only solution to fetch data within projects
|
||||
which do not expose the required data in a remotely useable way. This is not
|
||||
the preferred method as it makes deployment a bit more complex having to add
|
||||
a component to each of the nodes that need to be monitored. However, we do
|
||||
prefer this compared to a polling agent method as resilience (high
|
||||
availability) will not be a problem with this method.
|
||||
3. :term:`Polling agents` which are the least preferred method, that will poll
|
||||
some API or other tool to collect information at a regular interval. The main
|
||||
reason why we do not like this method is the inherent difficulty to make such
|
||||
a component be resilient.
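The difference between reacting to notifications and polling can be sketched as follows. This is not Ceilometer code: the ``Sample`` class, the notification layout (``event_type`` and ``payload`` with an ``instance_id``) and the ``read_counter`` callback are simplified assumptions used only to contrast the two styles.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    name: str
    volume: float
    resource_id: str
    source: str


def sample_from_notification(notification):
    """Bus listener style: react to an event another service emitted."""
    payload = notification["payload"]
    return Sample(
        name=notification["event_type"],
        volume=1,
        resource_id=payload["instance_id"],
        source="notification",
    )


def poll(resources, read_counter):
    """Polling style: ask for each resource's current counter value."""
    return [
        Sample("cpu", read_counter(rid), rid, "polling")
        for rid in resources
    ]


# Hypothetical inputs for illustration.
event = {"event_type": "compute.instance.create.end",
         "payload": {"instance_id": "vm-1"}}
listened = sample_from_notification(event)
polled = poll(["vm-1", "vm-2"], read_counter=lambda rid: 42.0)
```

In the listener case the sample only exists because the instrumented project emitted an event, while the polling case has to be scheduled and kept alive by the collecting side, which is where the resilience concern comes from.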
|
||||
|
||||
How to access collected data?
|
||||
-----------------------------
|
||||
|
||||
Once collected, the data is stored in a database. There can be multiple types of
|
||||
databases through the use of different database plugins (see the section
|
||||
`which database to use`_). Moreover, the schema and dictionary of
|
||||
this database can also evolve over time. For both reasons, we offer a REST API
|
||||
that should be the only way for you to access the collected data rather than
|
||||
accessing the underlying database directly. It is possible that the way
|
||||
you’d like to access your data is not yet supported by the API. If you think
|
||||
this is the case, please contact us with your feedback as this will certainly
|
||||
lead us to improve the API.
|
||||
|
||||
.. figure:: ./2-accessmodel.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: data access model
|
||||
|
||||
This is a representation of how to access data stored by Ceilometer.
|
||||
|
||||
The :ref:`list of currently built in meters <measurements>` is
|
||||
available in the developer documentation,
|
||||
and it is also relatively easy to add your own (and eventually contribute it).
|
||||
|
||||
Ceilometer is part of OpenStack, but is not tied to OpenStack's definition of
|
||||
"users" and "tenants." The "source" field of each sample refers to the authority
|
||||
defining the user and tenant associated with the sample. Deployers can define
|
||||
custom sources through a configuration file, and then create agents to collect
|
||||
samples for new meters using those sources. This means that you can collect
|
||||
data for applications running on top of OpenStack, such as a PaaS or SaaS
|
||||
layer, and use the same tools for metering your entire cloud.
|
||||
|
||||
Moreover, end users can also :ref:`send their own application centric data <user-defined-data>` into the
|
||||
database through the REST API for a variety of use cases (see the section
|
||||
“Alarming” later in this article).
|
||||
|
||||
.. _send their own application centric data: ./webapi/v2.html#user-defined-data
|
||||
|
||||
Multi-Publisher
|
||||
---------------
|
||||
|
||||
.. figure:: ./3-Pipeline.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: Ceilometer pipeline
|
||||
|
||||
The assembly of components making up the Ceilometer pipeline.
|
||||
|
||||
Publishing meters for different uses is actually a two dimensional problem.
|
||||
The first variable is the frequency of publication. Typically a meter that
|
||||
you publish for billing purposes needs to be updated every 30 minutes, while the
|
||||
same meter needed for performance tuning may be needed every 10 seconds.
|
||||
|
||||
The second variable is the transport. In the case of data intended for a
|
||||
monitoring system, losing an update or not ensuring security
|
||||
(non-repudiability) of a message is not really a problem while the same meter
|
||||
will need both security and guaranteed delivery in the case of data intended
|
||||
for rating and billing systems.
|
||||
|
||||
To solve this, the notion of multi-publisher can now be configured for each
|
||||
meter within Ceilometer, allowing the same technical meter to be published
|
||||
multiple times to multiple destinations, each potentially using a different
|
||||
transport and frequency of publication. At the time of writing, two
|
||||
transports have been implemented: the original and relatively secure
|
||||
Oslo RPC queue-based transport, and one using UDP packets.
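The two dimensions (frequency and transport) can be modelled with a small sketch. The ``PIPELINES`` structure below is a hypothetical Python rendering, not the actual configuration file format, and the pipeline names, host and port are assumptions; only the idea of the same meter appearing with different intervals and publishers comes from the text above.

```python
from urllib.parse import urlsplit

# Hypothetical model: the same "cpu" meter is published twice, once
# every 30 minutes over the RPC queue and once every 10 seconds over UDP.
PIPELINES = [
    {"name": "billing", "meters": ["cpu"], "interval": 1800,
     "publishers": ["rpc://"]},
    {"name": "tuning", "meters": ["cpu"], "interval": 10,
     "publishers": ["udp://monitor.example.com:4952"]},
]


def destinations(meter):
    """Return (interval, transport scheme) pairs a meter is published to."""
    result = []
    for pipeline in PIPELINES:
        if meter in pipeline["meters"]:
            for publisher in pipeline["publishers"]:
                result.append(
                    (pipeline["interval"], urlsplit(publisher).scheme))
    return result


cpu_destinations = destinations("cpu")
```

Describing each publisher as a URL keeps the transport choice (reliable queue versus lossy UDP) independent from the publication interval.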
|
||||
|
||||
.. figure:: ./4-Transformer.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: Transformer example
|
||||
|
||||
Example of aggregation of multiple CPU time usage samples into a single
|
||||
CPU percentage sample.
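The arithmetic behind such a transformation can be sketched as a rate of change between two cumulative samples. The function name, the sample layout (timestamp in seconds, CPU time in nanoseconds) and the ``ncpus`` parameter are assumptions for illustration and do not describe the actual transformer implementation.

```python
def cpu_util(prev, curr, ncpus=1):
    """Turn two cumulative CPU-time samples into a utilisation percentage.

    Each sample is a (timestamp_seconds, cpu_time_nanoseconds) pair.
    """
    (t0, ns0), (t1, ns1) = prev, curr
    elapsed = t1 - t0
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    # CPU time consumed, divided by the CPU time that was available.
    return 100.0 * (ns1 - ns0) / (elapsed * 1e9 * ncpus)


# 5 s of CPU time consumed over a 10 s window on one CPU.
utilisation = cpu_util((0, 0), (10, 5_000_000_000))
```

With two CPUs the same consumption would count for half as much, since twice the CPU time was available over the window.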
|
||||
|
||||
.. figure:: ./5-multi-publish.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: Multi-publish
|
||||
|
||||
This figure shows how a sample can be published to multiple destinations.
|
||||
|
||||
Alarming
|
||||
--------
|
||||
|
||||
The Alarming component of Ceilometer, which is being delivered in the Havana
|
||||
version, allows you to set alarms based on threshold evaluation for a collection
|
||||
of samples. An alarm can be set on a single meter, or on a combination. For
|
||||
example, you may want to trigger an alarm when the memory consumption
|
||||
reaches 70% on a given instance if the instance has been up for more than
|
||||
10 min. To set up an alarm, you will call :ref:`Ceilometer’s API server <alarms-api>` specifying
|
||||
the alarm conditions and an action to take.
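Threshold evaluation over a collection of samples can be sketched as below. This is a simplified illustration rather than the evaluator shipped with Ceilometer: the function signature, the ``periods`` parameter and the comparison names are assumptions.

```python
import operator

COMPARATORS = {
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def evaluate(samples, threshold, comparison="ge", periods=1):
    """Return 'alarm' if the last `periods` values all cross the threshold."""
    if len(samples) < periods:
        return "insufficient data"
    compare = COMPARATORS[comparison]
    recent = samples[-periods:]
    return "alarm" if all(compare(v, threshold) for v in recent) else "ok"


# Memory usage in percent, one value per evaluation period.
state = evaluate([65, 72, 75], threshold=70, comparison="ge", periods=2)
```

Requiring several consecutive periods above the threshold is one way to avoid triggering an action on a single short spike.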
|
||||
|
||||
Of course, if you are not an administrator of the cloud itself, you can only
|
||||
set alarms on meters for your own components. The good news is that you can also
|
||||
:ref:`send your own meters <user-defined-data>` from within your instances,
|
||||
meaning that you can trigger
|
||||
alarms based on application centric data.
|
||||
|
||||
|
||||
There can be multiple forms of action, but two have been implemented so far:
|
||||
1. HTTP callback: you provide a URL to be called whenever the alarm has been set
|
||||
off. The payload of the request contains all the details of why the alarm went
|
||||
off.
|
||||
2. log: mostly useful for debugging, stores alarms in a log file.
|
||||
|
||||
For more details on this, we recommend reading the blog post by
|
||||
Mehdi Abaakouk `Autoscaling with Heat and Ceilometer`_. Particular attention
|
||||
should be given to the section “Some notes about deploying alarming” as the
|
||||
database setup (using a separate database from the one used for metering)
|
||||
will be critical in all cases of production deployment.
|
||||
|
||||
.. _Autoscaling with Heat and Ceilometer: http://techs.enovance.com/5991/autoscaling-with-heat-and-ceilometer
|
||||
|
||||
.. _which database to use:
|
||||
Which database to use
|
||||
---------------------
|
||||
|
||||
.. figure:: ./6-storagemodel.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: Storage model
|
||||
|
||||
An overview of the Ceilometer storage model.
|
||||
|
||||
Since the beginning of the project, a plugin model has been put in place
|
||||
to allow for various types of database backends to be used. However, not
|
||||
all implementations are equal and, at the time of writing, MongoDB
|
||||
is the recommended backend because it is the most tested. Have a look
|
||||
at the “choosing a database backend” section of the documentation for more
|
||||
details. In short, ensure a dedicated database is used when deploying
|
||||
Ceilometer as the volume of data generated can be extensive in a production
|
||||
environment and will generally use a lot of I/O.
|
||||
|
||||
.. figure:: ./7-overallarchi.png
|
||||
:figwidth: 100%
|
||||
:align: center
|
||||
:alt: Architecture summary
|
||||
|
||||
An overall summary of Ceilometer's logical architecture.
|
||||
|
||||
Detailed Description
|
||||
====================
|
||||
|
@ -32,6 +32,7 @@ Samples and Statistics
|
||||
.. autotype:: ceilometer.api.controllers.v2.Statistics
|
||||
:members:
|
||||
|
||||
.. _alarms-api:
|
||||
Alarms
|
||||
======
|
||||
|
||||
@ -292,6 +293,7 @@ parameter to the query::
|
||||
|
||||
This query would only return the last 3 samples.
|
||||
|
||||
.. _user-defined-data:
|
||||
User-defined data
|
||||
+++++++++++++++++
|
||||
|
||||
|