Merge "Convert telemetry files to RST"

Jenkins 2015-07-02 08:19:05 +00:00 committed by Gerrit Code Review
commit 6bd6a6c5bc
6 changed files with 2891 additions and 22 deletions


@ -0,0 +1,917 @@
.. highlight:: guess
   :linenothreshold: 5

.. _telemetry-data-collection:

===============
Data collection
===============

The main responsibility of Telemetry in OpenStack is to collect
information about the system that can be used by billing systems or
interpreted by analytic tooling. The original focus of data collection
was on the counters that can be used for billing, but the range of
collected data is continuously widening.
Collected data can be stored in the form of samples or events in the
supported databases, listed in :ref:`telemetry-supported-databases`.
Samples can come from various sources, depending on the needs and
configuration of Telemetry, so multiple methods are required to collect
data.
The available data collection mechanisms are:

Notifications
   Processing notifications from other OpenStack services, by consuming
   messages from the configured message queue system.

Polling
   Retrieving information directly from the hypervisor or from the host
   machine using SNMP, or by using the APIs of other OpenStack
   services.

RESTful API
   Pushing samples via the RESTful API of Telemetry.

Notifications
~~~~~~~~~~~~~
All the services send notifications about the executed operations or
system state in OpenStack. Several notifications carry information that
can be metered, like the CPU time of a VM instance created by the
OpenStack Compute service.
The Telemetry module has a separate agent, the notification agent, that
consumes notifications from the message bus and transforms them into
events and measurement samples.
The different OpenStack services emit several notifications about the
various types of events that happen in the system during normal
operation. Not all of these notifications are consumed by the Telemetry
module, as the intention is only to capture the billable events and
notifications that can be used for monitoring or profiling purposes. The
notification agent filters by the event type contained in each
notification message. The following table lists, for each OpenStack
service, the event types that Telemetry transforms into samples.

+--------------------+------------------------+-------------------------------+
| OpenStack service  | Event types            | Note                          |
+====================+========================+===============================+
| OpenStack Compute  | scheduler.run\_insta\  | For a more detailed list of   |
|                    | nce.scheduled          | Compute notifications please  |
|                    |                        | check the `System Usage Data  |
|                    | scheduler.select\_\    | wiki page <https://wiki       |
|                    | destinations           | .openstack.org/wiki/          |
|                    |                        | SystemUsageData>`__.          |
|                    | compute.instance.\*    |                               |
+--------------------+------------------------+-------------------------------+
| Bare metal service | hardware.ipmi.\*       |                               |
+--------------------+------------------------+-------------------------------+
| OpenStack Image    | image.update           | The required configuration    |
| service            |                        | for Image service can be      |
|                    | image.upload           | found in `Configure the Image |
|                    |                        | service for Telemetry         |
|                    | image.delete           | <http://docs.openstack.org    |
|                    |                        | /kilo/install-guide/install   |
|                    | image.send             | /apt/content/ceilometer-      |
|                    |                        | glance.html>`__ section in    |
|                    |                        | the OpenStack Installation    |
|                    |                        | Guide.                        |
+--------------------+------------------------+-------------------------------+
| OpenStack          | floatingip.create.end  |                               |
| Networking         |                        |                               |
|                    | floatingip.update.\*   |                               |
|                    |                        |                               |
|                    | floatingip.exists      |                               |
|                    |                        |                               |
|                    | network.create.end     |                               |
|                    |                        |                               |
|                    | network.update.\*      |                               |
|                    |                        |                               |
|                    | network.exists         |                               |
|                    |                        |                               |
|                    | port.create.end        |                               |
|                    |                        |                               |
|                    | port.update.\*         |                               |
|                    |                        |                               |
|                    | port.exists            |                               |
|                    |                        |                               |
|                    | router.create.end      |                               |
|                    |                        |                               |
|                    | router.update.\*       |                               |
|                    |                        |                               |
|                    | router.exists          |                               |
|                    |                        |                               |
|                    | subnet.create.end      |                               |
|                    |                        |                               |
|                    | subnet.update.\*       |                               |
|                    |                        |                               |
|                    | subnet.exists          |                               |
|                    |                        |                               |
|                    | l3.meter               |                               |
+--------------------+------------------------+-------------------------------+
| Orchestration      | orchestration.stack\   |                               |
| module             | .create.end            |                               |
|                    |                        |                               |
|                    | orchestration.stack\   |                               |
|                    | .update.end            |                               |
|                    |                        |                               |
|                    | orchestration.stack\   |                               |
|                    | .delete.end            |                               |
|                    |                        |                               |
|                    | orchestration.stack\   |                               |
|                    | .resume.end            |                               |
|                    |                        |                               |
|                    | orchestration.stack\   |                               |
|                    | .suspend.end           |                               |
+--------------------+------------------------+-------------------------------+
| OpenStack Block    | volume.exists          | The required configuration    |
| Storage            |                        | for Block Storage service can |
|                    | volume.create.\*       | be found in the `Add the      |
|                    |                        | Block Storage service agent   |
|                    | volume.delete.\*       | for Telemetry <http:          |
|                    |                        | //docs.openstack.org/kilo/    |
|                    | volume.update.\*       | install-guide/install/apt/    |
|                    |                        | content/ceilometer-cinder     |
|                    | volume.resize.\*       | .html>`__ section in the      |
|                    |                        | OpenStack Installation Guide. |
|                    | volume.attach.\*       |                               |
|                    |                        |                               |
|                    | volume.detach.\*       |                               |
|                    |                        |                               |
|                    | snapshot.exists        |                               |
|                    |                        |                               |
|                    | snapshot.create.\*     |                               |
|                    |                        |                               |
|                    | snapshot.delete.\*     |                               |
|                    |                        |                               |
|                    | snapshot.update.\*     |                               |
+--------------------+------------------------+-------------------------------+

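
The wildcard entries in the table, such as ``compute.instance.\*``, can be
read as glob patterns over the event type of a notification. The following
is a minimal Python sketch of that filtering step, assuming shell-style glob
semantics. The function name and the pattern list are illustrative and are
not part of the Telemetry code base.

```python
from fnmatch import fnmatchcase

# Illustrative subset of the event types listed in the table above.
HANDLED_EVENT_TYPES = [
    "compute.instance.*",
    "image.update",
    "volume.create.*",
    "floatingip.create.end",
]


def is_handled(event_type, patterns=HANDLED_EVENT_TYPES):
    """Return True if the event type matches any handled pattern."""
    return any(fnmatchcase(event_type, pattern) for pattern in patterns)


# A Compute instance event matches the wildcard; a host event does not.
print(is_handled("compute.instance.create.end"))
print(is_handled("compute.host.update"))
```

Notifications whose event type matches none of the patterns are simply
ignored in this sketch.
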
.. note::

   Some services require additional configuration to emit the
   notifications using the correct control exchange on the message
   queue and so forth. These configuration needs are referenced in the
   above table for each OpenStack service that needs it.

.. note::

   When the ``store_events`` option is set to True in
   :file:`ceilometer.conf`, the notification agent needs database access in
   order to work properly.

Middleware for the OpenStack Object Storage service
---------------------------------------------------
A subset of Object Store statistics requires additional middleware to
be installed behind the proxy of Object Store. This additional component
emits notifications containing data-flow-oriented meters, namely the
``storage.objects.(incoming|outgoing).bytes`` values. These meters are
listed in :ref:`telemetry-object-storage-meter`, marked with
``notification`` as origin.
The instructions on how to install this middleware can be found in the
`Configure the Object Storage service for Telemetry
<http://docs.openstack.org/kilo/install-guide/install/apt/content/ceilometer-swift.html>`__
section in the OpenStack Installation Guide.
Telemetry middleware
--------------------
Telemetry provides the capability of counting the HTTP requests and
responses for each API endpoint in OpenStack. This is achieved by
storing a sample for each event marked as ``audit.http.request``,
``audit.http.response``, ``http.request`` or ``http.response``.
It is recommended that these notifications be consumed as events rather
than samples to better index the appropriate values and avoid massive
load on the Metering database. If preferred, Telemetry can consume these
events as samples if the services are configured to emit ``http.*``
notifications.
Polling
~~~~~~~
The Telemetry module is intended to store a complex picture of the
infrastructure. This goal requires more information than what is
provided by the events and notifications published by each service. Some
information is not emitted directly, like resource usage of the VM
instances.
Therefore, Telemetry uses another method to gather this data: polling
the infrastructure, including the APIs of the different OpenStack
services and other assets, like hypervisors. The latter case requires
closer interaction with the compute hosts. To solve this issue,
Telemetry uses an agent-based architecture to meet the data collection
requirements.
There are three types of agents supporting the polling mechanism: the
compute agent, the central agent, and the IPMI agent. Under the hood,
all the types of polling agents are the same ``ceilometer-polling`` agent,
except that they load different polling plug-ins (pollsters) from
different namespaces to gather data. The following subsections give
further information regarding the architectural and configuration
details of these components.
Running ceilometer-agent-compute is exactly the same as::

   $ ceilometer-polling --polling-namespaces compute

Running ceilometer-agent-central is exactly the same as::

   $ ceilometer-polling --polling-namespaces central

Running ceilometer-agent-ipmi is exactly the same as::

   $ ceilometer-polling --polling-namespaces ipmi

In addition to loading all the polling plug-ins registered in the
specified namespaces, the ceilometer-polling agent can be told which
polling plug-ins to load by using the ``pollster-list`` option::

   $ ceilometer-polling --polling-namespaces central \
           --pollster-list image image.size storage.*

.. note::

   HA deployment is NOT supported if the ``pollster-list`` option is
   used.

.. note::

   The ceilometer-polling service is available since the Kilo release.

Central agent
-------------
As the name of this agent shows, it is a central component in the
Telemetry architecture. This agent is responsible for polling public
REST APIs to retrieve additional information on OpenStack resources not
already surfaced via notifications, and also for polling hardware
resources over SNMP.
The following services can be polled with this agent:
- OpenStack Networking
- OpenStack Object Storage
- OpenStack Block Storage
- Hardware resources via SNMP
- Energy consumption meters via `Kwapi <https://launchpad.net/kwapi>`__
framework
To install and configure this service, use the `Install the Telemetry module
<http://docs.openstack.org/kilo/install-guide/install/apt/content/ch_ceilometer.html>`__
section in the OpenStack Installation Guide.
The central agent does not need a direct database connection. The samples
collected by this agent are sent via AMQP to the collector service or
any external service, which is responsible for persisting the data into
the configured database back end.
Compute agent
-------------
This agent is responsible for collecting resource usage data of VM
instances on individual compute nodes within an OpenStack deployment.
This mechanism requires a closer interaction with the hypervisor,
therefore a separate agent type fulfills the collection of the related
meters, which is placed on the host machines to locally retrieve this
information.
A compute agent instance has to be installed on each and every compute
node. Installation instructions can be found in the `Install the Compute
agent for Telemetry
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
section in the OpenStack Installation Guide.
Just like the central agent, this component also does not need a direct
database connection. The samples are sent via AMQP to the collector.
The list of supported hypervisors can be found in
:ref:`telemetry-supported-hypervisors`. The compute agent uses the API of the
hypervisor installed on the compute hosts. Therefore, the supported meters may
differ for each virtualization back end, as each inspection tool
provides a different set of meters.
The list of collected meters can be found in :ref:`telemetry-compute-meters`.
The support column shows which meters are available for
each hypervisor supported by the Telemetry module.

.. note::

   Telemetry supports Libvirt, which hides the hypervisor under it.

Support for HA deployment of the central and compute agent services
-------------------------------------------------------------------
Both the central and the compute agent can run in an HA deployment,
which means that multiple instances of these services can run in
parallel with workload partitioning among these running instances.
The `Tooz <https://pypi.python.org/pypi/tooz>`__ library provides the
coordination within the groups of service instances. It provides an API
above several back ends that can be used for building distributed
applications.
Tooz supports `various
drivers <http://docs.openstack.org/developer/tooz/drivers.html>`__
including the following back end solutions:
- `Zookeeper <http://zookeeper.apache.org/>`__. Recommended solution by
the Tooz project.
- `Redis <http://redis.io/>`__. Recommended solution by the Tooz
project.
- `Memcached <http://memcached.org/>`__. Recommended for testing.
You must configure a supported Tooz driver for the HA deployment of the
Telemetry services.
For information about the required configuration options that have to be
set in the :file:`ceilometer.conf` configuration file for both the central
and compute agents, see the `Coordination section
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
in the OpenStack Configuration Reference.

.. note::

   Without the ``backend_url`` option being set, only one instance of
   both the central and compute agent service is able to run and
   function correctly.

The availability check of the instances is provided by heartbeat
messages. When the connection with an instance is lost, the workload
is reassigned among the remaining instances in the next polling
cycle.
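
One way to picture this reassignment is a stable hash of each resource taken
modulo the number of live instances, recomputed whenever group membership
changes. The Python sketch below illustrates that idea under this assumption
only. It is not the algorithm used by Telemetry or Tooz, and every name in it
is hypothetical.

```python
import hashlib


def my_share(resources, members, me):
    """Return the resources assigned to agent `me` among live `members`."""
    ordered = sorted(members)
    index = ordered.index(me)

    def bucket(resource):
        # A stable hash, so that every agent computes the same assignment.
        digest = hashlib.sha256(resource.encode("utf-8")).hexdigest()
        return int(digest, 16) % len(ordered)

    return [r for r in resources if bucket(r) == index]


resources = ["instance-%02d" % i for i in range(12)]

# Three agents answer heartbeats, so each polls a disjoint subset.
members = ["agent-a", "agent-b", "agent-c"]
before = {m: my_share(resources, members, m) for m in members}

# agent-c stops answering; in the next cycle two agents cover everything.
survivors = ["agent-a", "agent-b"]
after = {m: my_share(resources, survivors, m) for m in survivors}
```

In both cases every resource is polled by exactly one agent, which is the
property that workload partitioning is meant to preserve.
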

.. note::

   ``Memcached`` uses a ``timeout`` value, which should always be set
   to a value that is higher than the ``heartbeat`` value set for
   Telemetry.

For backward compatibility and to support existing deployments, the
central agent configuration also supports using different configuration
files for groups of service instances of this type that are running in
parallel. To enable this configuration, set a value for the
``partitioning_group_prefix`` option in the `Central section
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
in the OpenStack Configuration Reference.

.. warning::

   For each sub-group of the central agent pool with the same
   ``partitioning_group_prefix`` a disjoint subset of meters must be
   polled, otherwise samples may be missing or duplicated. The list of
   meters to poll can be set in the :file:`/etc/ceilometer/pipeline.yaml`
   configuration file. For more information about pipelines see
   :ref:`data-collection-and-processing`.

To enable the compute agent to run multiple instances simultaneously
with workload partitioning, the ``workload_partitioning`` option has to
be set to ``True`` under the `Compute section
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
in the :file:`ceilometer.conf` configuration file.
.. _telemetry-ipmi-agent:
IPMI agent
----------
This agent is responsible for collecting IPMI sensor data and Intel Node
Manager data on individual compute nodes within an OpenStack deployment.
This agent requires an IPMI capable node with the ipmitool utility installed,
which is commonly used for IPMI control on various Linux distributions.
An IPMI agent instance can be installed on each and every compute node
with IPMI support, except when the node is managed by the Bare metal
service and the ``conductor.send_sensor_data`` option is set to ``true``
in the Bare metal service. It does no harm to install this agent on a
compute node without IPMI or Intel Node Manager support, as the agent
checks for the hardware and, if none is available, returns empty data.
For performance reasons, it is suggested that you install the IPMI agent
only on IPMI capable nodes.
Just like the central agent, this component also does not need direct
database access. The samples are sent via AMQP to the collector.
The list of collected meters can be found in
:ref:`telemetry-bare-metal-service`.

.. note::

   Do not deploy both the IPMI agent and the Bare metal service on one
   compute node. If ``conductor.send_sensor_data`` is set, this
   misconfiguration causes duplicated IPMI sensor samples.

Send samples to Telemetry
~~~~~~~~~~~~~~~~~~~~~~~~~
While most parts of the data collection in the Telemetry module are
automated, Telemetry also lets users submit custom samples to the
module via the REST API.
This option makes it possible to send any kind of sample without
writing extra code or making configuration changes.
The samples that can be sent to Telemetry are not limited to the
existing meters. You can provide data for any new, customer-defined
counter by filling out all the required fields of the POST request.
If the sample corresponds to an existing meter, then fields like
``meter-type`` and the meter name should match that meter.
The required fields for sending a sample using the command line client
are:

- ID of the corresponding resource. (``--resource-id``)

- Name of meter. (``--meter-name``)

- Type of meter. (``--meter-type``)

  Predefined meter types:

  - Gauge

  - Delta

  - Cumulative

- Unit of meter. (``--meter-unit``)

- Volume of sample. (``--sample-volume``)

To send samples to Telemetry using the command line client, invoke the
following command::

   $ ceilometer sample-create -r 37128ad6-daaa-4d22-9509-b7e1c6b08697 \
     -m memory.usage --meter-type gauge --meter-unit MB --sample-volume 48
   +-------------------+--------------------------------------------+
   | Property          | Value                                      |
   +-------------------+--------------------------------------------+
   | message_id        | 6118820c-2137-11e4-a429-08002715c7fb       |
   | name              | memory.usage                               |
   | project_id        | e34eaa91d52a4402b4cb8bc9bbd308c1           |
   | resource_id       | 37128ad6-daaa-4d22-9509-b7e1c6b08697       |
   | resource_metadata | {}                                         |
   | source            | e34eaa91d52a4402b4cb8bc9bbd308c1:openstack |
   | timestamp         | 2014-08-11T09:10:46.358926                 |
   | type              | gauge                                      |
   | unit              | MB                                         |
   | user_id           | 679b0499e7a34ccb9d90b64208401f8e           |
   | volume            | 48.0                                       |
   +-------------------+--------------------------------------------+

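
The same sample can in principle be submitted to the REST API directly. The
Python sketch below only builds the request and does not send it. It assumes
that the v2 API accepts a JSON list of samples at ``/v2/meters/<meter name>``
with ``counter_*`` field names; check this against the Telemetry API
reference for your release. The endpoint host, port, and token are
placeholders, and the function name is hypothetical.

```python
import json
import urllib.request


def build_sample_request(endpoint, token, resource_id, name,
                         meter_type, unit, volume):
    """Build (but do not send) a POST request carrying one sample."""
    sample = {
        "counter_name": name,        # assumed field names, see lead-in
        "counter_type": meter_type,
        "counter_unit": unit,
        "counter_volume": volume,
        "resource_id": resource_id,
    }
    body = json.dumps([sample]).encode("utf-8")
    return urllib.request.Request(
        url="%s/v2/meters/%s" % (endpoint.rstrip("/"), name),
        data=body,
        headers={"Content-Type": "application/json",
                 "X-Auth-Token": token},
        method="POST",
    )


request = build_sample_request(
    "http://controller:8777",                 # placeholder endpoint
    "TOKEN",                                  # placeholder auth token
    "37128ad6-daaa-4d22-9509-b7e1c6b08697",
    "memory.usage", "gauge", "MB", 48,
)
# urllib.request.urlopen(request) would submit it to a live deployment.
```
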
.. _data-collection-and-processing:
Data collection and processing
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The mechanism by which data is collected and processed is called a
pipeline. Pipelines, at the configuration level, describe a coupling
between sources of data and the corresponding sinks for transformation
and publication of data.
A source is a producer of data: samples or events. In effect, it is a
set of pollsters or notification handlers emitting datapoints for a set
of matching meters and event types.
Each source configuration encapsulates name matching, polling interval
determination, optional resource enumeration or discovery, and mapping
to one or more sinks for publication.
Data gathered can be used for different purposes, which can impact how
frequently it needs to be published. Typically, a meter published for
billing purposes needs to be updated every 30 minutes while the same
meter may be needed for performance tuning every minute.

.. warning::

   Rapid polling cadences should be avoided, as they result in a huge
   amount of data in a short time frame, which may negatively affect
   the performance of both Telemetry and the underlying database back
   end. We therefore strongly recommend you do not use small
   granularity values like 10 seconds.

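
To make the cost of a short interval concrete, the following Python sketch
estimates how many samples a polling source produces per day. The deployment
size (1000 resources with 20 meters each) is a made-up figure used for
illustration only.

```python
SECONDS_PER_DAY = 86400


def samples_per_day(resources, meters_per_resource, interval_seconds):
    """Estimate the number of samples one source emits per day."""
    polls_per_day = SECONDS_PER_DAY // interval_seconds
    return resources * meters_per_resource * polls_per_day


# Hypothetical deployment: 1000 resources, 20 meters per resource.
for interval in (10, 60, 600):
    print(interval, samples_per_day(1000, 20, interval))
```

Under these assumptions, moving from a 600 second interval to a 10 second
interval multiplies the stored data by 60.
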
A sink, on the other hand, is a consumer of data, providing logic for
the transformation and publication of data emitted from related sources.
In effect, a sink describes a chain of handlers. The chain starts with
zero or more transformers and ends with one or more publishers. The
first transformer in the chain receives data from the corresponding
source and takes some action, such as deriving a rate of change,
performing unit conversion, or aggregating, before passing the modified
data to the next step, which is described in :ref:`telemetry-publishers`.
.. _telemetry-pipeline-configuration:
Pipeline configuration
----------------------
By default, pipeline configuration is stored in separate configuration
files, called :file:`pipeline.yaml` and :file:`event_pipeline.yaml`, next to
the :file:`ceilometer.conf` file. The meter pipeline and event pipeline
configuration files can be set by the ``pipeline_cfg_file`` and
``event_pipeline_cfg_file`` options, respectively. These options are
listed in the `Description of configuration options for api table
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
section in the OpenStack Configuration Reference. Multiple
pipelines can be defined in one pipeline configuration file.
The meter pipeline definition looks like the following::

   ---
   sources:
     - name: 'source name'
       interval: 'how often should the samples be injected into the pipeline'
       meters:
         - 'meter filter'
       resources:
         - 'list of resource URLs'
       sinks:
         - 'sink name'
   sinks:
     - name: 'sink name'
       transformers: 'definition of transformers'
       publishers:
         - 'list of publishers'

The interval parameter in the sources section should be defined in
seconds. It determines the polling cadence of sample injection into the
pipeline, where samples are produced under the direct control of an
agent.
There are several ways to define the list of meters for a pipeline
source. The list of valid meters can be found in :ref:`telemetry-measurements`.
You can define all the meters, or just the included or excluded
meters, with which a source should operate:

- To include all meters, use the ``*`` wildcard symbol. It is highly
  advisable to select only the meters that you intend to use, to avoid
  flooding the metering database with unused data.

- To define the list of meters, use any of the following:

  - To define the list of included meters, use the ``meter_name``
    syntax.

  - To define the list of excluded meters, use the ``!meter_name``
    syntax.

  - For meters that have variants identified by a complex name
    field, use the wildcard symbol to select all, e.g. for
    "instance:m1.tiny", use "instance:\*".

.. note::

   Telemetry does not check for duplication between pipelines. If you
   add a meter to multiple pipelines, the duplication is assumed to be
   intentional, and the samples may be stored multiple times according
   to the specified sinks.

The above definition methods can be used in the following combinations:
- Use only the wildcard symbol.
- Use the list of included meters.
- Use the list of excluded meters.
- Use wildcard symbol with the list of excluded meters.

.. note::

   At least one of the above variations should be included in the
   meters section. Included and excluded meters cannot co-exist in the
   same pipeline. Wildcard and included meters cannot co-exist in the
   same pipeline definition section.

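
The combination rules above can be summarized in a few lines of Python. This
is a sketch of the rules as stated in this section, assuming glob-style
matching for the wildcard symbol. The function names are hypothetical and the
code is not taken from Telemetry.

```python
from fnmatch import fnmatchcase


def validate_meters(meters):
    """Reject meter lists that break the combination rules above."""
    if not meters:
        raise ValueError("at least one meter entry is required")
    excluded = [m for m in meters if m.startswith("!")]
    included = [m for m in meters if not m.startswith("!") and m != "*"]
    if included and excluded:
        raise ValueError("included and excluded meters cannot be mixed")
    if "*" in meters and included:
        raise ValueError("wildcard and included meters cannot be mixed")


def source_supports(meters, name):
    """Return True if a source with this meter list handles `name`."""
    validate_meters(meters)
    excluded = [m[1:] for m in meters if m.startswith("!")]
    if excluded:
        # An exclusion list, with or without "*", passes everything else.
        return not any(fnmatchcase(name, pattern) for pattern in excluded)
    return any(fnmatchcase(name, pattern) for pattern in meters)


print(source_supports(["*", "!cpu"], "memory"))
print(source_supports(["instance:*"], "instance:m1.tiny"))
```
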
The optional resources section of a pipeline source allows a static list
of resource URLs to be configured for polling.
The transformers section of a pipeline sink provides the possibility to
add a list of transformer definitions. The available transformers are:

+-----------------------+------------------------------------+
| Name of transformer   | Reference name for configuration   |
+=======================+====================================+
| Accumulator           | accumulator                        |
+-----------------------+------------------------------------+
| Aggregator            | aggregator                         |
+-----------------------+------------------------------------+
| Arithmetic            | arithmetic                         |
+-----------------------+------------------------------------+
| Rate of change        | rate\_of\_change                   |
+-----------------------+------------------------------------+
| Unit conversion       | unit\_conversion                   |
+-----------------------+------------------------------------+

The publishers section contains the list of publishers to which the
sample data should be sent after any transformations.
Similarly, the event pipeline definition looks like the following::

   ---
   sources:
     - name: 'source name'
       events:
         - 'event filter'
       sinks:
         - 'sink name'
   sinks:
     - name: 'sink name'
       publishers:
         - 'list of publishers'

The event filter uses the same filtering logic as the meter pipeline.
.. _telemetry-transformers:
Transformers
^^^^^^^^^^^^
The definition of transformers can contain the following fields:

name
   Name of the transformer.

parameters
   Parameters of the transformer.

The parameters section can contain transformer-specific fields, such as
the source and target fields, with different subfields, used by the rate
of change transformer. The available fields depend on the implementation
of the transformer.
In the case of the transformer that creates the ``cpu_util`` meter, the
definition looks like the following::

   transformers:
       - name: "rate_of_change"
         parameters:
             target:
                 name: "cpu_util"
                 unit: "%"
                 type: "gauge"
                 scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"

The rate of change transformer generates the ``cpu_util`` meter
from the sample values of the ``cpu`` counter, which represents
cumulative CPU time in nanoseconds. The transformer definition above
defines a scale factor (for nanoseconds and multiple CPUs), which is
applied before the transformation derives a sequence of gauge samples
with unit '%' from sequential values of the ``cpu`` meter.
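
As a worked example of this derivation, the Python sketch below computes one
``cpu_util`` value from two consecutive ``cpu`` samples, using the scale
expression from the definition above. The function name and the sample
representation (timestamp in seconds, cumulative CPU time in nanoseconds) are
assumptions made for illustration.

```python
def cpu_util(prev, curr, cpu_number=None):
    """Derive a CPU utilisation percentage from two cumulative samples.

    `prev` and `curr` are (timestamp_seconds, cpu_time_ns) pairs.
    """
    # Same scale as in the pipeline definition: nanoseconds to seconds,
    # divided across the CPUs, expressed as a percentage.
    scale = 100.0 / (10**9 * (cpu_number or 1))
    elapsed = curr[0] - prev[0]
    if elapsed <= 0:
        return None  # a rate cannot be derived without elapsed time
    return (curr[1] - prev[1]) * scale / elapsed


# 30 s of CPU time consumed during 60 s of wall time on a 2-CPU instance
# gives a utilisation of about 25 percent.
print(cpu_util((0, 0), (60, 30 * 10**9), cpu_number=2))
```
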
The definition for the disk I/O rate, which is also generated by the
rate of change transformer::

   transformers:
       - name: "rate_of_change"
         parameters:
             source:
                 map_from:
                     name: "disk\\.(read|write)\\.(bytes|requests)"
                     unit: "(B|request)"
             target:
                 map_to:
                     name: "disk.\\1.\\2.rate"
                     unit: "\\1/s"
                 type: "gauge"

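
The ``map_from`` and ``map_to`` fields behave like a regular expression with
back-references. The Python sketch below applies the patterns from the
definition above to a meter name and unit, assuming that the whole name and
unit must match. The function name is hypothetical.

```python
import re

# Patterns as they read once the YAML double-quoted strings are unescaped.
MAP_FROM_NAME = r"disk\.(read|write)\.(bytes|requests)"
MAP_TO_NAME = r"disk.\1.\2.rate"
MAP_FROM_UNIT = r"(B|request)"
MAP_TO_UNIT = r"\1/s"


def map_meter(name, unit):
    """Return the (name, unit) of the derived meter, or None if no match."""
    name_match = re.fullmatch(MAP_FROM_NAME, name)
    unit_match = re.fullmatch(MAP_FROM_UNIT, unit)
    if not (name_match and unit_match):
        return None
    # expand() substitutes the captured groups into the target template.
    return name_match.expand(MAP_TO_NAME), unit_match.expand(MAP_TO_UNIT)


print(map_meter("disk.read.bytes", "B"))
print(map_meter("disk.write.requests", "request"))
```

A single transformer definition therefore covers all four disk meters, and
meters that do not match, such as ``cpu``, pass through untouched.
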
**Unit conversion transformer**

Transformer to apply a unit conversion. It takes the volume of the meter
and multiplies it by the given ``scale`` expression. It also supports
``map_from`` and ``map_to``, like the rate of change transformer.

Sample configuration::

   transformers:
       - name: "unit_conversion"
         parameters:
             target:
                 name: "disk.kilobytes"
                 unit: "KB"
                 scale: "1.0 / 1024.0"

With ``map_from`` and ``map_to``::

   transformers:
       - name: "unit_conversion"
         parameters:
             source:
                 map_from:
                     name: "disk\\.(read|write)\\.bytes"
             target:
                 map_to:
                     name: "disk.\\1.kilobytes"
                 scale: "1.0 / 1024.0"
                 unit: "KB"

**Aggregator transformer**

A transformer that sums up the incoming samples until enough samples
have come in or a timeout has been reached.

Timeout can be specified with the ``retention_time`` option. To flush
the aggregation after a set number of samples have been aggregated,
specify the ``size`` parameter.

The volume of the created sample is the sum of the volumes of samples
that came into the transformer. Samples can be aggregated by the
attributes ``project_id``, ``user_id`` and ``resource_metadata``. To aggregate
by the chosen attributes, specify them in the configuration and set which
value of the attribute to take for the new sample (``first`` to take the
first sample's attribute, ``last`` to take the last sample's attribute, and
``drop`` to discard the attribute).

To aggregate 60s worth of samples by ``resource_metadata`` and keep the
``resource_metadata`` of the latest received sample::

   transformers:
       - name: "aggregator"
         parameters:
             retention_time: 60
             resource_metadata: last

To aggregate each 15 samples by ``user_id`` and ``resource_metadata`` and keep
the ``user_id`` of the first received sample and drop the
``resource_metadata``::

   transformers:
       - name: "aggregator"
         parameters:
             size: 15
             user_id: first
             resource_metadata: drop

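
The size-based behaviour described above can be sketched in Python as
follows. This is a simplified illustration, not the transformer's
implementation: it ignores grouping by resource and the ``retention_time``
timeout, represents samples as plain dictionaries, and leaves an incomplete
final batch unflushed.

```python
def aggregate(samples, size, user_id="first", resource_metadata="drop"):
    """Sum sample volumes in batches of `size` samples."""
    policies = (("user_id", user_id),
                ("resource_metadata", resource_metadata))
    merged_samples = []
    # Only complete batches are flushed; leftovers wait for more samples.
    for start in range(0, len(samples) - size + 1, size):
        batch = samples[start:start + size]
        merged = dict(batch[0])
        merged["volume"] = sum(s["volume"] for s in batch)
        for attribute, policy in policies:
            if policy == "first":
                merged[attribute] = batch[0].get(attribute)
            elif policy == "last":
                merged[attribute] = batch[-1].get(attribute)
            else:  # "drop"
                merged.pop(attribute, None)
        merged_samples.append(merged)
    return merged_samples


incoming = [{"volume": 1, "user_id": "user-%d" % i,
             "resource_metadata": {"seq": i}} for i in range(31)]
flushed = aggregate(incoming, size=15)
print(len(flushed), flushed[0]["volume"], flushed[0]["user_id"])
```
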
**Accumulator transformer**

This transformer simply caches the samples until enough samples have
arrived and then flushes them all down the pipeline at once::

   transformers:
       - name: "accumulator"
         parameters:
             size: 15

**Multi meter arithmetic transformer**

This transformer enables us to perform arithmetic calculations over one
or more meters and/or their metadata, for example::

   memory_util = 100 * memory.usage / memory

A new sample is created with the properties described in the ``target``
section of the transformer's configuration. The sample's
volume is the result of the provided expression. The calculation is
performed on samples from the same resource.

.. note::

   The calculation is limited to meters with the same interval.

Example configuration::

   transformers:
       - name: "arithmetic"
         parameters:
           target:
             name: "memory_util"
             unit: "%"
             type: "gauge"
             expr: "100 * $(memory.usage) / $(memory)"

To demonstrate the use of metadata, here is the implementation of a
silly meter that shows average CPU time per core::

   transformers:
       - name: "arithmetic"
         parameters:
           target:
             name: "avg_cpu_per_core"
             unit: "ns"
             type: "cumulative"
             expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"

.. note::

   Expression evaluation gracefully handles NaNs and exceptions. In
   such a case it does not create a new sample but only logs a warning.

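
The following Python sketch shows one way such an expression could be
evaluated: each ``$(meter)`` placeholder is replaced by the latest volume of
that meter, and the result is computed with a small arithmetic evaluator that
returns ``None`` instead of raising. It is an illustration under these
assumptions, not the transformer's implementation, and it does not support
metadata references such as ``$(cpu).resource_metadata.cpu_number``.

```python
import ast
import math
import operator
import re

_OPERATORS = {ast.Add: operator.add, ast.Sub: operator.sub,
              ast.Mult: operator.mul, ast.Div: operator.truediv}


def _walk(node):
    """Evaluate a parsed expression limited to + - * / on numbers."""
    if isinstance(node, ast.Expression):
        return _walk(node.body)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_walk(node.operand)
    if isinstance(node, ast.BinOp) and type(node.op) in _OPERATORS:
        return _OPERATORS[type(node.op)](_walk(node.left), _walk(node.right))
    raise ValueError("unsupported expression")


def evaluate(expr, volumes):
    """Return the value of `expr`, or None if it cannot be computed."""
    try:
        substituted = re.sub(r"\$\(([\w.]+)\)",
                             lambda m: repr(float(volumes[m.group(1)])),
                             expr)
        value = _walk(ast.parse(substituted, mode="eval"))
    except (KeyError, ValueError, ArithmeticError, SyntaxError):
        return None  # a real transformer would log a warning here
    return None if math.isnan(value) else value


print(evaluate("100 * $(memory.usage) / $(memory)",
               {"memory.usage": 48, "memory": 512}))
```
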
Block Storage audit script setup to get notifications
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
If you want to collect OpenStack Block Storage notifications on demand,
you can use ``cinder-volume-usage-audit`` from OpenStack Block Storage.
This script becomes available when you install OpenStack Block Storage,
so you can use it without any specific settings and you don't need to
authenticate to access the data. To use it, run the command in
the following format::

   $ cinder-volume-usage-audit \
     --start_time='YYYY-MM-DD HH:MM:SS' --end_time='YYYY-MM-DD HH:MM:SS' --send_actions

This script outputs which volumes or snapshots were created, deleted, or
existed in a given period of time, along with some information about
these volumes or snapshots. Information about the existence and size of
volumes and snapshots is stored in the Telemetry module. This data is
also stored as an event, which is the recommended usage as it provides
better indexing of data.

Using this script via cron, you can get notifications periodically, for
example, every 5 minutes::

   */5 * * * * /path/to/cinder-volume-usage-audit --send_actions

.. _telemetry-storing-samples:
Storing samples
~~~~~~~~~~~~~~~
The Telemetry module has a separate service that is responsible for
persisting the data that comes from the pollsters or is received as
notifications. The data can be stored in a file or a database back end,
for which the list of supported databases can be found in
:ref:`telemetry-supported-databases`. The data can also be sent to an external
data store by using an HTTP dispatcher.
The ``ceilometer-collector`` service receives the data as messages from the
message bus of the configured AMQP service. It sends these datapoints
without any modification to the configured target. The service has to
run on a host machine from which it has access to the configured
dispatcher.

.. note::

   Multiple dispatchers can be configured for Telemetry at one time.

Multiple ``ceilometer-collector`` processes can be run at a time. Starting
multiple worker threads per collector process is also supported. To do
so, modify the ``collector_workers`` configuration option in the
`Collector section
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
of the :file:`ceilometer.conf` configuration file.

.. note::

   Prior to the Juno release, it is not recommended to use multiple
   workers per collector process when using PostgreSQL as the database
   back end.

Database dispatcher
-------------------
When the database dispatcher is configured as the data store, you can
set a ``time_to_live`` (``ttl``) option for samples. By default,
the time to live value for samples is set to -1, which means that they
are kept in the database forever.
The time to live value is specified in seconds. Each sample has a time
stamp, and the ``ttl`` value indicates that a sample will be deleted
from the database when the number of seconds has elapsed since that
sample reading was stamped. For example, if the time to live is set to
600, all samples older than 600 seconds will be purged from the
database.
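
The expiry rule can be written as a one-line comparison. The Python sketch
below restates it with hypothetical names. It treats any negative value as
"keep forever", which matches the default of -1 described above; how other
non-positive values are handled is not covered by this section.

```python
from datetime import datetime, timedelta


def is_expired(sample_timestamp, now, time_to_live):
    """Return True if a sample is older than the configured ttl."""
    if time_to_live < 0:
        return False  # -1 (the default) keeps samples forever
    return now - sample_timestamp > timedelta(seconds=time_to_live)


now = datetime(2015, 7, 2, 8, 0, 0)
old_sample = now - timedelta(seconds=601)
fresh_sample = now - timedelta(seconds=599)

print(is_expired(old_sample, now, 600))
print(is_expired(fresh_sample, now, 600))
print(is_expired(old_sample, now, -1))
```
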
Certain databases support native TTL expiration. Where this is not
possible, you can use the ceilometer-expirer command-line script for
this purpose. You can run it in a cron job, which helps to keep
your database in a consistent state.
The level of support depends on the configured back end:

+--------------------+-------------------+------------------------------------+
| Database           | TTL value support | Note                               |
+====================+===================+====================================+
| MongoDB            | Yes               | MongoDB has native TTL support for |
|                    |                   | deleting samples that are older    |
|                    |                   | than the configured ttl value.     |
+--------------------+-------------------+------------------------------------+
| SQL-based back     | Yes               | ceilometer-expirer has to be used  |
| ends               |                   | for deleting samples and their     |
|                    |                   | related data from the database.    |
+--------------------+-------------------+------------------------------------+
| HBase              | No                | Telemetry's HBase support does not |
|                    |                   | include native TTL nor             |
|                    |                   | ceilometer-expirer support.        |
+--------------------+-------------------+------------------------------------+
| DB2 NoSQL          | No                | DB2 NoSQL does not have native TTL |
|                    |                   | nor ceilometer-expirer support.    |
+--------------------+-------------------+------------------------------------+

HTTP dispatcher
---------------
The Telemetry module supports sending samples to an external HTTP
target. The samples are sent without any modification. To set this
option as the collector's target, the ``dispatcher`` has to be changed
to ``http`` in the :file:`ceilometer.conf` configuration file. For the list
of options that you need to set, see the `dispatcher_http
section <http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
in the OpenStack Configuration Reference.
File dispatcher
---------------
You can store samples in a file by setting the ``dispatcher`` option to
``file`` in the :file:`ceilometer.conf` file. For the list of configuration
options,
see the `dispatcher_file section
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
in the OpenStack Configuration Reference.


@ -0,0 +1,497 @@
==============
Data retrieval
==============
The Telemetry module offers several mechanisms through which the persisted
data can be accessed. As described in :ref:`telemetry-system-architecture` and
in :ref:`telemetry-data-collection`, the collected information can be stored in
one or more database back ends, which are hidden by the Telemetry RESTful API.
.. note::
It is highly recommended not to access the database directly to
read or modify any data in it. The API layer hides all the changes
in the actual database schema and provides a standard interface to
expose the samples, alarms and so forth.
Telemetry v2 API
~~~~~~~~~~~~~~~~
The Telemetry module provides a RESTful API, from which the collected
samples and all the related information can be retrieved, like the list
of meters, alarm definitions and so forth.
The Telemetry API URL can be retrieved from the service catalog provided
by OpenStack Identity, which is populated during the installation
process. The API access needs a valid token and proper permission to
retrieve data, as described in :ref:`telemetry-users-roles-tenants`.
Further information about the available API endpoints can be found in
the `Telemetry API Reference
<http://developer.openstack.org/api-ref-telemetry-v2.html>`__.
Query
-----
The API also supports querying the collected data set. The samples and
alarms API endpoints support both simple and complex query styles,
whereas the other endpoints support only simple queries.
After the query parameters are validated, most database back ends
process the query on the database side in order to achieve better
performance.
**Simple query**
Many of the API endpoints accept a query filter argument, which should
be a list of data structures that consist of the following items:
- ``field``
- ``op``
- ``value``
- ``type``
Regardless of the endpoint to which the filter is applied, it will
always target the fields of the `Sample type
<http://docs.openstack.org/developer/ceilometer/webapi/v2.html#Sample>`__.
The API endpoints accept shorter names for several fields than the ones
defined in the reference. The API does the transformation internally
and returns the output with the fields that are listed in the `API reference
<http://docs.openstack.org/developer/ceilometer/webapi/v2.html>`__.
The fields and their shorter names are the following:
- ``project_id``: project
- ``resource_id``: resource
- ``user_id``: user
When a filter argument contains multiple constraints of the above form,
a logical ``AND`` relation between them is implied.
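As an illustration, the following Python sketch encodes a list of such
constraints as URL parameters. It assumes that the REST API takes each
constraint as repeated ``q.field``, ``q.op``, ``q.value`` and ``q.type``
parameters and that the equality operator is named ``eq``; verify both
against the API reference before relying on them. The helper name
``build_simple_query`` is hypothetical.

```python
from urllib.parse import urlencode


def build_simple_query(constraints):
    """Encode field/op/value/type constraints as URL query parameters.

    Assumption: the API reads repeated q.field, q.op, q.value and q.type
    parameters; all constraints are combined with a logical AND.
    """
    params = []
    for constraint in constraints:
        params.append(("q.field", constraint["field"]))
        params.append(("q.op", constraint.get("op", "eq")))
        params.append(("q.value", constraint["value"]))
        if constraint.get("type"):
            params.append(("q.type", constraint["type"]))
    return urlencode(params)


query = build_simple_query([
    {"field": "resource", "op": "eq",
     "value": "bb52e52b-1e42-4751-b3ac-45c52d83ba07"},
    {"field": "project", "op": "eq",
     "value": "cbfa8e3dfab64a27a87c8e24ecd5c60f", "type": "string"},
])
print(query)
```

The resulting string can be appended after ``?`` to the URL of an endpoint
that accepts a query filter.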
.. _complex-query:
**Complex query**
The filter expressions of the complex query feature operate on the
fields of ``Sample``, ``Alarm`` and ``AlarmChange`` types. The following
comparison operators are supported:
- ``=``
- ``!=``
- ``<``
- ``<=``
- ``>``
- ``>=``
The following logical operators can be used:
- ``and``
- ``or``
- ``not``
.. note::
The ``not`` operator behaves differently in MongoDB and in the
SQLAlchemy-based database engines. If the ``not`` operator is
applied to a nonexistent metadata field, the result depends on
the database engine. MongoDB returns every sample, because the
``not`` operator evaluates to true for every sample where the
given field does not exist. The SQL-based database engines, on
the other hand, return an empty result because of the underlying
``join`` operation.
Complex query supports specifying a list of ``orderby`` expressions.
This means that the result of the query can be ordered based on the
field names provided in this list. When multiple keys are defined for
the ordering, these will be applied sequentially in the order of the
specification. The second expression will be applied on the groups for
which the values of the first expression are the same. The ordering can
be ascending or descending.
The number of returned items can be bounded using the ``limit`` option.
The ``filter``, ``orderby`` and ``limit`` fields are optional.
.. note::
As opposed to the simple query, complex query is available via a
separate API endpoint. For more information see the `Telemetry v2 Web API
Reference <http://docs.openstack.org/developer/ceilometer/webapi/v2.html#v2-web-api>`__.
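The three optional fields can be assembled into a request body as in the
following Python sketch. It assumes that the complex query endpoint expects
``filter`` and ``orderby`` as JSON-encoded strings inside the body and
``limit`` as an integer; check the Web API reference for the exact format.
The helper name ``build_complex_query`` is hypothetical.

```python
import json


def build_complex_query(filter_expr=None, orderby=None, limit=None):
    """Build a complex query body; every field is optional.

    Assumption: filter and orderby travel as JSON strings in the body.
    """
    body = {}
    if filter_expr is not None:
        body["filter"] = json.dumps(filter_expr)
    if orderby is not None:
        body["orderby"] = json.dumps(orderby)
    if limit is not None:
        body["limit"] = limit
    return body


body = build_complex_query(
    filter_expr={"and": [
        {"=": {"resource": "bb52e52b-1e42-4751-b3ac-45c52d83ba07"}},
        {"or": [{"=": {"counter_name": "cpu"}},
                {"=": {"counter_name": "disk.read.bytes"}}]},
    ]},
    orderby=[{"timestamp": "asc"}],
    limit=6,
)
print(json.dumps(body, indent=2))
```

Nesting ``or`` inside ``and`` selects the samples of one resource that
belong to either of the two meters.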
Statistics
----------
The sample data can be used in various ways for several purposes, like
billing or profiling. In external systems the data is often used in the
form of aggregated statistics. The Telemetry API provides several
built-in functions to make some basic calculations available without any
additional coding.
Telemetry supports the following statistics and aggregation functions:
``avg``
Average of the sample volumes over each period.
``cardinality``
Count of distinct values in each period identified by a key
specified as the parameter of this aggregate function. The supported
parameter values are:
- ``project_id``
- ``resource_id``
- ``user_id``
.. note::
The ``aggregate.param`` option is required.
``count``
Number of samples in each period.
``max``
Maximum of the sample volumes in each period.
``min``
Minimum of the sample volumes in each period.
``stddev``
Standard deviation of the sample volumes in each period.
``sum``
Sum of the sample volumes over each period.
The simple query and the statistics functionality can be used together
in a single API request.
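To make the meaning of the aggregates concrete, the following Python sketch
computes them for the samples of a single period. It is not the server-side
implementation. In particular, the use of the population standard deviation
for ``stddev`` is an assumption, and the sample values are made up for
illustration.

```python
from statistics import mean, pstdev


def aggregate(samples, param=None):
    """Compute the aggregates for the samples of one period."""
    volumes = [sample["volume"] for sample in samples]
    result = {
        "avg": mean(volumes),
        "count": len(volumes),
        "max": max(volumes),
        "min": min(volumes),
        "stddev": pstdev(volumes),  # assumption: population std deviation
        "sum": sum(volumes),
    }
    if param is not None:
        # cardinality: number of distinct values of the given key
        result["cardinality"] = len({sample[param] for sample in samples})
    return result


# Illustrative samples, not taken from a real deployment.
samples = [
    {"volume": 2.0, "resource_id": "instance-a"},
    {"volume": 4.0, "resource_id": "instance-a"},
    {"volume": 6.0, "resource_id": "instance-b"},
]
stats = aggregate(samples, param="resource_id")
print(stats)
```

Here ``cardinality`` is 2 because the three samples belong to two distinct
resources.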
Telemetry command line client and SDK
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The Telemetry module provides a command line client, which gives access
to the collected data as well as to the alarm definition and retrieval
options. The client uses the Telemetry RESTful API in order to execute
the requested operations.
To be able to use the ``ceilometer`` command, the
python-ceilometerclient package needs to be installed and configured
properly. For details about the installation process, see the `Telemetry
chapter <http://docs.openstack.org/kilo/install-guide/install/apt/content/ch_ceilometer.html>`__
in the OpenStack Installation Guide.
.. note::
The Telemetry module captures the user-visible resource usage data.
Therefore the database does not contain any data until such
resources exist, for example VM images in the OpenStack Image
service.
Similarly to other OpenStack command line clients, the ``ceilometer``
client uses OpenStack Identity for authentication. The proper
credentials and ``--auth_url`` parameter have to be defined via command
line parameters or environment variables.
This section provides a few examples and does not aim to be complete.
You can use these commands, for instance, to validate an installation
of Telemetry.
To retrieve the list of collected meters, use the following
command::
$ ceilometer meter-list
+------------------------+------------+------+------------------------------------------+----------------------------------+----------------------------------+
| Name | Type | Unit | Resource ID | User ID | Project ID |
+------------------------+------------+------+------------------------------------------+----------------------------------+----------------------------------+
| cpu | cumulative | ns | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| cpu | cumulative | ns | c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| cpu_util | gauge | % | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| cpu_util | gauge | % | c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.device.read.bytes | cumulative | B | bb52e52b-1e42-4751-b3ac-45c52d83ba07-hdd | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.device.read.bytes | cumulative | B | bb52e52b-1e42-4751-b3ac-45c52d83ba07-vda | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.device.read.bytes | cumulative | B | c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b-hdd | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.device.read.bytes | cumulative | B | c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b-vda | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| ... |
+------------------------+------------+------+------------------------------------------+----------------------------------+----------------------------------+
The ``ceilometer`` command was run with ``admin`` rights, which means
that all the data in the database is accessible. For more information
about access rights, see :ref:`telemetry-users-roles-tenants`. As the example
above shows, two VM instances exist in the system, because VM instance
related meters appear at the top of the result list. The existence of these
meters does not indicate that these instances are running at the time of the
request. The result contains the currently collected meters per resource, in
ascending order by meter name.
Samples are collected for each meter that is present in the list of
meters, except for instances that are not running or have been deleted
from the OpenStack Compute database. If an instance no longer exists and
a ``time_to_live`` value is set in the :file:`ceilometer.conf`
configuration file, a group of samples is deleted in each
expiration cycle. When the last sample is deleted for a meter, the
database can be cleaned up by running ceilometer-expirer, and the meter
no longer appears in the list above. For more information
about the expiration procedure, see :ref:`telemetry-storing-samples`.
The Telemetry API supports simple query on the meter endpoint. The query
functionality has the following syntax::
--query <field1><operator1><value1>;...;<field_n><operator_n><value_n>
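The syntax can be read as a semicolon-separated list of constraints, each
made of a field name, an operator and a value. The following Python sketch
splits such a string into its parts. The accepted operator symbols and their
mapping to the names ``lt``, ``le``, ``eq``, ``ne``, ``ge`` and ``gt`` are
assumptions for illustration, and ``parse_query`` is a hypothetical helper
rather than part of the client.

```python
import re

# Two-character operators come first so that "<=" is not read as "<".
_CONSTRAINT = re.compile(
    r"^(?P<field>[^<>=!]+)(?P<op><=|>=|!=|<|>|=)(?P<value>.+)$")

# Assumption: symbolic operators map to these operator names.
_OP_NAMES = {"<": "lt", "<=": "le", "=": "eq",
             "!=": "ne", ">=": "ge", ">": "gt"}


def parse_query(text):
    """Split 'field1<op>value1;field2<op>value2' into constraint dicts."""
    constraints = []
    for part in text.split(";"):
        match = _CONSTRAINT.match(part.strip())
        if match is None:
            raise ValueError("malformed constraint: %r" % part)
        constraints.append({
            "field": match.group("field"),
            "op": _OP_NAMES[match.group("op")],
            "value": match.group("value"),
        })
    return constraints


parsed = parse_query(
    "resource=bb52e52b-1e42-4751-b3ac-45c52d83ba07;"
    "timestamp>=2014-08-31T00:00:00")
print(parsed)
```

Each resulting dictionary corresponds to one constraint of the simple query
filter, and all of them must hold for a record to be returned.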
To request the meters of one VM instance, invoke the following
command::
$ ceilometer meter-list --query resource=bb52e52b-1e42-4751-b3ac-45c52d83ba07
+-------------------------+------------+-----------+--------------------------------------+----------------------------------+----------------------------------+
| Name | Type | Unit | Resource ID | User ID | Project ID |
+-------------------------+------------+-----------+--------------------------------------+----------------------------------+----------------------------------+
| cpu | cumulative | ns | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| cpu_util | gauge | % | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.ephemeral.size | gauge | GB | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.read.bytes | cumulative | B | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.read.bytes.rate | gauge | B/s | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.read.requests | cumulative | request | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.read.requests.rate | gauge | request/s | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.root.size | gauge | GB | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.write.bytes | cumulative | B | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.write.bytes.rate | gauge | B/s | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.write.requests | cumulative | request | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| disk.write.requests.rate| gauge | request/s | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| instance | gauge | instance | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| instance:m1.tiny | gauge | instance | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| memory | gauge | MB | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
| vcpus | gauge | vcpu | bb52e52b-1e42-4751-b3ac-45c52d83ba07 | b6e62aad26174382bc3781c12fe413c8 | cbfa8e3dfab64a27a87c8e24ecd5c60f |
+-------------------------+------------+-----------+--------------------------------------+----------------------------------+----------------------------------+
As described above, you can retrieve the whole set of samples that are
stored for a meter, or filter the result set by using one of the
available query types. The request for all the samples of the
``cpu`` meter without any additional filtering looks like the following::
$ ceilometer sample-list --meter cpu
+--------------------------------------+-------+------------+------------+------+---------------------+
| Resource ID | Meter | Type | Volume | Unit | Timestamp |
+--------------------------------------+-------+------------+------------+------+---------------------+
| c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b | cpu | cumulative | 5.4863e+11 | ns | 2014-08-31T11:17:03 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.7848e+11 | ns | 2014-08-31T11:17:03 |
| c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b | cpu | cumulative | 5.4811e+11 | ns | 2014-08-31T11:07:05 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.7797e+11 | ns | 2014-08-31T11:07:05 |
| c8d2e153-a48f-4cec-9e93-86e7ac6d4b0b | cpu | cumulative | 5.3589e+11 | ns | 2014-08-31T10:27:19 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.6397e+11 | ns | 2014-08-31T10:27:19 |
| ... |
+--------------------------------------+-------+------------+------------+------+---------------------+
The result set of the request contains the samples for both instances
ordered by the timestamp field in the default descending order.
The simple query makes it possible to retrieve only a subset of the
collected samples. The following command can be executed to request the
``cpu`` samples of only one of the VM instances::
$ ceilometer sample-list --meter cpu --query resource=bb52e52b-1e42-4751-b3ac-45c52d83ba07
+--------------------------------------+------+------------+------------+------+---------------------+
| Resource ID | Name | Type | Volume | Unit | Timestamp |
+--------------------------------------+------+------------+------------+------+---------------------+
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.7906e+11 | ns | 2014-08-31T11:27:08 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.7848e+11 | ns | 2014-08-31T11:17:03 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.7797e+11 | ns | 2014-08-31T11:07:05 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.6397e+11 | ns | 2014-08-31T10:27:19 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.6207e+11 | ns | 2014-08-31T10:17:03 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 5.3831e+11 | ns | 2014-08-31T08:41:57 |
| ... |
+--------------------------------------+------+------------+------------+------+---------------------+
As the output above shows, the result set contains samples for only one
of the two instances.
The ``ceilometer query-samples`` command is used to execute rich
queries. This command accepts the following parameters:
``--filter``
Contains the filter expression for the query in the form of:
``{complex_op: [{simple_op: {field_name: value}}]}``.
``--orderby``
Contains the list of ``orderby`` expressions in the form of:
``[{field_name: direction}, {field_name: direction}]``.
``--limit``
Specifies the maximum number of samples to return.
For more information about complex queries see
:ref:`Complex query <complex-query>`.
Because the complex query functionality supports combining operators, it
is possible to retrieve a specific subset of samples for a given VM
instance. To request the first six samples for the ``cpu`` and
``disk.read.bytes`` meters, invoke the following command::
$ ceilometer query-samples --filter '{"and":
[{"=":{"resource":"bb52e52b-1e42-4751-b3ac-45c52d83ba07"}},{"or":[{"=":{"counter_name":"cpu"}},
{"=":{"counter_name":"disk.read.bytes"}}]}]}' --orderby '[{"timestamp":"asc"}]' --limit 6
+--------------------------------------+-----------------+------------+------------+------+---------------------+
| Resource ID | Meter | Type | Volume | Unit | Timestamp |
+--------------------------------------+-----------------+------------+------------+------+---------------------+
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | disk.read.bytes | cumulative | 385334.0 | B | 2014-08-30T13:00:46 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 1.2132e+11 | ns | 2014-08-30T13:00:47 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 1.4295e+11 | ns | 2014-08-30T13:10:51 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | disk.read.bytes | cumulative | 601438.0 | B | 2014-08-30T13:10:51 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | disk.read.bytes | cumulative | 601438.0 | B | 2014-08-30T13:20:33 |
| bb52e52b-1e42-4751-b3ac-45c52d83ba07 | cpu | cumulative | 1.4795e+11 | ns | 2014-08-30T13:20:34 |
+--------------------------------------+-----------------+------------+------------+------+---------------------+
Telemetry Python bindings
-------------------------
The command line client library provides Python bindings in order to use
the Telemetry Python API directly from Python programs.
The first step in setting up the client is to create a client instance
with the proper credentials::
>>> import ceilometerclient.client
>>> cclient = ceilometerclient.client.get_client(VERSION, username=USERNAME, password=PASSWORD, tenant_name=PROJECT_NAME, auth_url=AUTH_URL)
The ``VERSION`` parameter can be ``1`` or ``2``, specifying the API
version to be used.
The method calls look like the following::
>>> cclient.meters.list()
[<Meter ...>, ...]
>>> cclient.samples.list()
[<Sample ...>, ...]
For further details about the python-ceilometerclient package, see the
`Python bindings to the OpenStack Ceilometer
API <http://docs.openstack.org/developer/python-ceilometerclient/>`__
reference.
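A filtered request through the bindings could look like the following
sketch. It assumes that ``samples.list`` of the v2 client accepts the
``meter_name``, ``q`` and ``limit`` keyword arguments; check this against
the installed version of python-ceilometerclient. The helper name
``cpu_samples_for`` is hypothetical.

```python
def cpu_samples_for(cclient, resource_id, limit=None):
    """List the ``cpu`` samples of one resource with an existing client.

    Assumption: cclient.samples.list() takes meter_name, q and limit.
    """
    query = [{"field": "resource_id", "op": "eq", "value": resource_id}]
    return cclient.samples.list(meter_name="cpu", q=query, limit=limit)
```

With a client created as shown above, the call would be
``cpu_samples_for(cclient, "bb52e52b-1e42-4751-b3ac-45c52d83ba07")``.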
.. _telemetry-publishers:
Publishers
~~~~~~~~~~
The Telemetry module provides several transport methods to forward the
data collected to the ceilometer-collector service or to an external
system. The consumers of this data differ widely: monitoring systems,
for example, can accept data loss, whereas billing systems require
reliable data transportation. Telemetry provides methods to
fulfill the requirements of both kinds of systems, as described
below.
The publisher component makes it possible to persist the data into
storage through the message bus or to send it to one or more external
consumers. One chain can contain multiple publishers.
To address these differing requirements, multiple publishers can
be configured for each datapoint within the Telemetry module, allowing
the same technical meter or event to be published multiple times to
multiple destinations, each potentially using a different transport.
Publishers can be specified in the ``publishers`` section for each
pipeline (for further details about pipelines see
:ref:`data-collection-and-processing`) that is defined in
the `pipeline.yaml
<https://git.openstack.org/cgit/openstack/ceilometer/plain/etc/ceilometer/pipeline.yaml>`__
file.
The following publisher types are supported:
notifier
It can be specified in the form of
``notifier://?option1=value1&option2=value2``. It emits data over
AMQP using oslo.messaging. This is the recommended method of
publishing.
rpc
It can be specified in the form of
``rpc://?option1=value1&option2=value2``. It emits metering data
over lossy AMQP. This method is synchronous and may experience
performance issues.
udp
It can be specified in the form of ``udp://<host>:<port>/``. It emits
metering data over UDP.
file
It can be specified in the form of
``file://path?option1=value1&option2=value2``. This publisher
records metering data into a file.
.. note::
If a file name and location are not specified, this publisher
does not log any meters; instead, it logs a warning message in
the configured log file for Telemetry.
kafka
It can be specified in the form of
``kafka://kafka_broker_ip:kafka_broker_port?topic=kafka_topic&option1=value1``.
This publisher sends metering data to a Kafka broker.
.. note::
If the topic parameter is missing, this publisher publishes
metering data under the topic name ``ceilometer``. When the port
number is not specified, this publisher uses 9092 as the
broker's port.
The following options are available for the ``rpc`` and ``notifier``
publishers. The ``policy`` option can also be used by the ``kafka``
publisher:
``per_meter_topic``
Set this option to 1 to publish the samples on an additional
``metering_topic.sample_name`` topic queue besides the
default ``metering_topic`` queue.
``policy``
Configures the behavior for the case when the publisher fails to
send the samples. The possible predefined values are the
following:
default
Used for waiting and blocking until the samples have been sent.
drop
Used for dropping the samples that failed to be sent.
queue
Used for creating an in-memory queue and retrying to send the
queued samples in the next sample publishing period (the
queue length can be configured with ``max_queue_length``, where
1024 is the default value).
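The difference between the three policy values can be sketched in Python as
follows. This is an illustration of the behavior described above, not the
publisher code itself. The class name ``PolicySketch``, the ``send``
callable and the ``retry_delay`` argument are hypothetical, and discarding
the oldest entry when the queue is full is an assumption.

```python
import time
from collections import deque


class PolicySketch:
    """Illustrates the default, drop and queue failure policies."""

    def __init__(self, send, policy="default", max_queue_length=1024,
                 retry_delay=1.0):
        self.send = send  # callable raising ConnectionError on failure
        self.policy = policy
        self.retry_delay = retry_delay
        # Assumption: a full queue discards its oldest batch of samples.
        self.queue = deque(maxlen=max_queue_length)

    def publish(self, samples):
        if self.policy == "queue":
            self.queue.append(samples)
            while self.queue:
                try:
                    self.send(self.queue[0])
                except ConnectionError:
                    return  # keep the backlog for the next period
                self.queue.popleft()
        elif self.policy == "drop":
            try:
                self.send(samples)
            except ConnectionError:
                pass  # the samples are lost
        else:
            while True:  # default: wait and block until sent
                try:
                    self.send(samples)
                    return
                except ConnectionError:
                    time.sleep(self.retry_delay)
```

With ``queue``, samples that could not be sent stay in memory and are sent
ahead of newer samples once the transport works again; with ``drop``,
nothing is retained.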
The following options are available for the ``file`` publisher:
``max_bytes``
When this option is greater than zero, rollover is enabled: when
the file size is about to exceed this value, the file is closed and
a new file is silently opened for output. If its value is zero,
rollover never occurs.
``backup_count``
If this value is non-zero, an extension will be appended to the
filename of the old log, as '.1', '.2', and so forth until the
specified value is reached. The file that is written and contains
the newest data is always the one that is specified without any
extensions.
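The behavior of these two options resembles that of the
``RotatingFileHandler`` class in the Python standard library, so that class
is used below to show their effect. Whether the ``file`` publisher is built
on this handler is an assumption of this sketch; the helper name
``open_rotating_log`` and the file name ``samples.log`` are hypothetical.

```python
import logging
import logging.handlers
import os
import tempfile


def open_rotating_log(path, max_bytes, backup_count):
    """Return a logger that writes to path and rolls it over by size."""
    logger = logging.getLogger("sketch.file_publisher")
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backup_count)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    return logger


workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "samples.log")
log = open_rotating_log(path, max_bytes=50, backup_count=2)
for i in range(10):
    # Each record is 20 characters plus a newline.
    log.info("sample %02d %s", i, "x" * 10)

# The newest data is in samples.log; older data is in the numbered files.
print(sorted(os.listdir(workdir)))
```

Because ``backup_count`` is 2, at most two numbered backup files are kept
next to the file that is currently written.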
The default publisher is ``notifier``, without any additional options
specified. A sample ``publishers`` section in the
:file:`/etc/ceilometer/pipeline.yaml` looks like the following::
publishers:
- udp://10.0.0.2:1234
- rpc://?per_meter_topic=1
- notifier://?policy=drop&max_queue_length=512
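Each entry is a URL whose scheme selects the publisher type and whose query
string carries the options. The following Python sketch takes the entries
above apart with the standard library; it only illustrates the URL format
and is not how Telemetry loads publishers. The helper name
``parse_publisher`` is hypothetical.

```python
from urllib.parse import parse_qsl, urlsplit


def parse_publisher(url):
    """Split a publisher URL into its type, target and options."""
    parts = urlsplit(url)
    return {
        "type": parts.scheme,
        "host": parts.hostname,  # None when no host is given
        "port": parts.port,      # None when no port is given
        "path": parts.path,
        "options": dict(parse_qsl(parts.query)),
    }


for url in ("udp://10.0.0.2:1234",
            "rpc://?per_meter_topic=1",
            "notifier://?policy=drop&max_queue_length=512"):
    print(parse_publisher(url))
```

Option values are returned as strings, so a consumer of this structure
would still have to convert numeric options such as ``max_queue_length``.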


@ -0,0 +1,153 @@
======
Events
======
In addition to meters, the Telemetry module collects events triggered
within an OpenStack environment. This section provides a brief summary
of the events format in the Telemetry module.
While a sample represents a single, numeric datapoint within a
time-series, an event is a broader concept that represents the state of
a resource at a point in time. The state may be described using various
data types including non-numeric data such as an instance's flavor. In
general, events represent any action made in the OpenStack system.
Event configuration
~~~~~~~~~~~~~~~~~~~
To enable the creation and storage of events in the Telemetry module, the
``store_events`` option needs to be set to ``True``. For further configuration
options, see the event section in the `OpenStack Configuration Reference
<http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__.
.. note::
It is advisable to set ``disable_non_metric_meters`` to ``True``
when enabling events in the Telemetry module. The Telemetry module
historically represented events as metering data, which may create
duplication of data if both events and non-metric meters are
enabled.
Event structure
~~~~~~~~~~~~~~~
Events captured by the Telemetry module are represented by five key
attributes:
event\_type
A dotted string defining what event occurred such as
``"compute.instance.resize.start"``.
message\_id
A UUID for the event.
generated
A timestamp of when the event occurred in the system.
traits
A flat mapping of key-value pairs which describe the event. The
event's traits contain most of the details of the event. Traits are
typed, and can be strings, integers, floats, or datetimes.
raw
Mainly for auditing purposes, the full event message can be stored
(unindexed) for future evaluation.
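The five attributes can be pictured as a simple data structure. The
following Python sketch is only a mental model, not the class used by
Telemetry; the name ``EventSketch`` and all example values except the event
type are made up for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Any, Dict, Optional


@dataclass
class EventSketch:
    event_type: str                  # dotted string naming the event
    message_id: str                  # UUID of the event
    generated: datetime              # when the event occurred
    traits: Dict[str, Any] = field(default_factory=dict)  # flat mapping
    raw: Optional[dict] = None       # full message, stored only on request


# Illustrative values; the trait names are hypothetical.
event = EventSketch(
    event_type="compute.instance.resize.start",
    message_id="5e2f0a4c-3c1d-4a7e-9f0b-2d6c8e1a7b90",
    generated=datetime(2014, 8, 31, 11, 17, 3),
    traits={"instance_id": "bb52e52b-1e42-4751-b3ac-45c52d83ba07",
            "memory_mb": 512},
)
print(event.event_type, sorted(event.traits))
```

The traits mapping stays flat: each value is a string, integer, float or
datetime, never a nested structure.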
Event indexing
~~~~~~~~~~~~~~
The general philosophy of notifications in OpenStack is to emit any and
all data someone might need, and let the consumer filter out what they
are not interested in. In order to make processing simpler and more
efficient, the notifications are stored and processed within Ceilometer
as events. The notification payload, which can be an arbitrarily complex
JSON data structure, is converted to a flat set of key-value pairs. This
conversion is specified by a config file.
.. note::
The event format is meant for efficient processing and querying.
Storage of complete notifications for auditing purposes can be
enabled by configuring the ``store_raw`` option.
Event conversion
----------------
The conversion from notifications to events is driven by a configuration
file, whose location is set by the ``definitions_cfg_file`` option in the
:file:`ceilometer.conf` configuration file.
This file describes how to map fields in the notification body
to traits, and lists optional plug-ins for doing any programmatic
translations (splitting a string, forcing case).
The mapping of notifications to events is defined per event\_type, which
can be wildcarded. Traits are added to events if the corresponding
fields in the notification exist and are non-null.
.. note::
The default definition file included with the Telemetry module
contains a list of known notifications and useful traits. The
mappings provided can be modified to include more or less data
according to user requirements.
If the definitions file is not present, a warning will be logged, but an
empty set of definitions will be assumed. By default, any notifications
that do not have a corresponding event definition in the definitions
file will be converted to events with a set of minimal traits. This can
be changed by setting the option ``drop_unmatched_notifications`` in the
:file:`ceilometer.conf` file. If this is set to ``True``, any unmapped
notifications will be dropped.
The basic set of traits (all are TEXT type) that will be added to all
events if the notification has the relevant data are: service
(notification's publisher), tenant\_id, and request\_id. These do not
have to be specified in the event definition; they are added
automatically, but their definitions can be overridden for a given
event\_type.
Event definitions format
------------------------
The event definitions file is in YAML format. It consists of a list of
event definitions, which are mappings. Order is significant: the list of
definitions is scanned in reverse order to find a definition that
matches the notification's event\_type. That definition is used to
generate the event. The reverse ordering is used because it is common to
want a more general wildcarded definition (such as
``compute.instance.*``) with a set of traits common to all of those
events, followed by a few more specific event definitions that have
all of the above traits, plus a few more.
Each event definition is a mapping with two keys:
event\_type
This is a list (or a string, which will be taken as a one-element
list) of event\_types this definition will handle. These can be
wildcarded with unix shell glob syntax. An exclusion listing
(starting with a ``!``) will exclude any types listed from matching.
If only exclusions are listed, the definition will match anything
not matching the exclusions.
traits
This is a mapping: the keys are the trait names, and the values are
trait definitions.
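The matching rules and the reverse scan can be sketched in Python as
follows. This is an illustration of the rules as described in this section,
not the converter used by Telemetry. The helper names and the example
definitions are hypothetical, and ``fnmatchcase`` stands in for the shell
glob matching.

```python
from fnmatch import fnmatchcase


def definition_matches(patterns, event_type):
    """Apply the inclusion and exclusion patterns of one definition."""
    if isinstance(patterns, str):
        patterns = [patterns]  # a string is taken as a one-element list
    includes = [p for p in patterns if not p.startswith("!")]
    excludes = [p[1:] for p in patterns if p.startswith("!")]
    if any(fnmatchcase(event_type, p) for p in excludes):
        return False
    if not includes:
        return True  # only exclusions: match everything not excluded
    return any(fnmatchcase(event_type, p) for p in includes)


def find_definition(definitions, event_type):
    """Scan the list in reverse order and return the first match."""
    for definition in reversed(definitions):
        if definition_matches(definition["event_type"], event_type):
            return definition
    return None


# Hypothetical definitions: a general one first, a specific one after it.
definitions = [
    {"event_type": "compute.instance.*", "traits": {"name": "general"}},
    {"event_type": "compute.instance.resize.*",
     "traits": {"name": "resize"}},
]
match = find_definition(definitions, "compute.instance.resize.start")
print(match["traits"]["name"])
```

Because the scan runs in reverse, the more specific definition listed last
wins over the general wildcard that also matches the event type.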
Each trait definition is a mapping with the following keys:
fields
A path specification for the field(s) in the notification you wish
to extract for this trait. Specifications can be written to match
multiple possible fields. By default the value will be the first
such field. The paths can be specified with a dot syntax
(``payload.host``). Square bracket syntax (``payload[host]``) is
also supported. In either case, if the key for the field you are
looking for contains special characters, like ``.``, it will need to
be quoted (with double or single quotes):
``payload.image_meta.'org.openstack__1__architecture'``. The syntax
used for the field specification is a variant of
`JSONPath <https://github.com/kennknowles/python-jsonpath-rw>`__.
type
(Optional) The data type for this trait. Valid options are:
``text``, ``int``, ``float``, and ``datetime``. Defaults to ``text``
if not specified.
plugin
(Optional) Used to execute simple programmatic conversions on the
value in a notification field.
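The interplay of ``fields`` and ``type`` can be sketched in Python as
follows. The sketch handles only plain dotted paths, square brackets and
quoted keys, which is a small subset of the JSONPath variant mentioned
above, and it ignores plug-ins. The helper names and the example
notification are hypothetical.

```python
import re
from datetime import datetime

# One path segment: 'quoted', "quoted" or a bare key.
_SEGMENT = re.compile(r"""'([^']*)'|"([^"]*)"|([^.\[\]'"]+)""")

_CASTS = {
    "text": str,
    "int": int,
    "float": float,
    "datetime": datetime.fromisoformat,
}


def split_path(path):
    """Split a dotted or bracketed path into its keys."""
    return [single or double or bare
            for single, double, bare in _SEGMENT.findall(path)]


def extract_trait(notification, fields, trait_type="text"):
    """Return the first non-null field value, cast to the trait type."""
    paths = [fields] if isinstance(fields, str) else fields
    for path in paths:
        value = notification
        for key in split_path(path):
            if not isinstance(value, dict) or key not in value:
                value = None
                break
            value = value[key]
        if value is not None:
            return _CASTS[trait_type](value)
    return None  # no trait is added when the field is missing or null


# Hypothetical notification body.
notification = {"payload": {
    "host": "compute-1",
    "memory_mb": "512",
    "image_meta": {"org.openstack__1__architecture": "x86_64"},
}}
print(extract_trait(notification, "payload.host"))
print(extract_trait(
    notification, "payload.image_meta.'org.openstack__1__architecture'"))
print(extract_trait(
    notification, ["payload.missing", "payload.memory_mb"], "int"))
```

Quoting the key keeps its dots from being read as path separators, and a
list of paths falls back to the next path when an earlier one yields
nothing.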

File diff suppressed because it is too large Load Diff


@ -1,5 +1,8 @@
.. _telemetry-system-architecture:
===================
System architecture
~~~~~~~~~~~~~~~~~~~
===================
The Telemetry module uses an agent-based architecture. Several modules
combine their responsibilities to collect data, store samples in a
@ -61,8 +64,10 @@ communication.
|
.. _telemetry-supported-databases:
Supported databases
-------------------
~~~~~~~~~~~~~~~~~~~
The other key external component of Telemetry is the database, where
events, samples, alarm definitions and alarms are stored.
@ -88,8 +93,10 @@ The list of supported database back ends:
|
.. _telemetry-supported-hypervisors:
Supported hypervisors
---------------------
~~~~~~~~~~~~~~~~~~~~~
The Telemetry module collects information about the virtual machines,
which requires close connection to the hypervisor that runs on the
@ -125,7 +132,7 @@ The list of supported hypervisors is:
|
Supported networking services
-----------------------------
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Telemetry is able to retrieve information from OpenStack Networking and
external networking services:
@ -148,8 +155,10 @@ external networking services:
|
.. _telemetry-users-roles-tenants:
Users, roles and tenants
------------------------
~~~~~~~~~~~~~~~~~~~~~~~~
This module of OpenStack uses OpenStack Identity for authenticating and
authorizing users. The required configuration options are listed in the
@ -157,10 +166,9 @@ authorizing users. The required configuration options are listed in the
section <http://docs.openstack.org/kilo/config-reference/content/ch_configuring-openstack-telemetry.html>`__
in the *OpenStack Configuration Reference*.
Two roles are used in the system basically, which are the 'admin' and
'non-admin'. The authorization happens before processing each API
request. The amount of returned data depends on the role the requestor
owns.
The system uses two roles: `admin` and `non-admin`. The authorization happens
before processing each API request. The amount of returned data depends on the
role the requestor owns.
The creation of alarm definitions also depends heavily on the role of the
user who initiated the action. Further details about alarm handling can


@ -4,9 +4,6 @@
Telemetry
=========
Introduction
~~~~~~~~~~~~
Even in the cloud industry, providers must use a multi-step process
for billing. The required steps to bill for usage in a cloud
environment are metering, rating, and billing. Because the provider's
@ -46,22 +43,17 @@ You can retrieve the collected samples in three different ways: with
the REST API, with the command line interface, or with the Metering
tab on an OpenStack dashboard.
.. include:: telemetry-system-architecture.rst
.. include:: telemetry-troubleshooting-guide.rst
.. include:: telemetry-best-practices.rst
.. toctree::
:hidden:
:maxdepth: 2
telemetry-system-architecture.rst
telemetry-data-collection.rst
telemetry-data-retrieval.rst
telemetry-measurements.rst
telemetry-events.rst
telemetry-troubleshooting-guide.rst
telemetry-best-practices.rst
.. TODO (OL) Translate and add the below files with new name
include: telemetry/section_telemetry-data-collection.xml
include: telemetry/section_telemetry-data-retrieval.xml
include: telemetry/section_telemetry-alarms.xml
include: telemetry/section_telemetry-measurements.xml
include: telemetry/section_telemetry-events.xml