<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
         xmlns:xi="http://www.w3.org/2001/XInclude"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0"
         xml:id="section_telemetry-data-collection">
<title>Data collection</title>
<para>The main responsibility of Telemetry in OpenStack is to collect
information about the system that can be used by billing systems or
interpreted by analytic tooling. The original focus of the collected
data was on counters that can be used for billing, but the range of
collected data is continuously widening.</para>
<para>Collected data can be stored in the form of samples or events in the
supported databases, listed in
<xref linkend="section_telemetry-supported-dbs"/>.</para>
<para>Samples can come from various sources, depending on the needs
and configuration of Telemetry, which requires multiple methods to
collect data.</para>
<para>The available data collection mechanisms are:</para>
<para>
<variablelist>
<varlistentry>
<term>Notifications</term>
<listitem>
<para>Processing notifications from other OpenStack services, by
consuming messages from the configured message queue system.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Polling</term>
<listitem>
<para>Retrieving information directly from the hypervisor or from the host
machine using SNMP, or by using the APIs of other OpenStack services.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>RESTful API</term>
<listitem>
<para>Pushing samples via the RESTful API of Telemetry.</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<section xml:id="section_telemetry-notifications">
<title>Notifications</title>
<para>All OpenStack services send notifications about the executed operations
or system state. Several notifications carry information that can be metered,
like the CPU time of a VM instance created by the OpenStack Compute service.</para>
<para>The Telemetry module has a separate agent that is responsible for consuming
notifications, namely the notification agent. This component is responsible for
consuming from the message bus and transforming notifications into events and
measurement samples.
</para>
<para>The different OpenStack services emit several notifications about the various
types of events that happen in the system during normal operation. Not all these
notifications are consumed by the Telemetry module, as the intention is only to
capture the billable events and notifications that can be used for
monitoring or profiling purposes. The notification agent filters by the event
type that is contained in each notification message. The following table
lists the event types from each OpenStack service that are transformed into samples
by Telemetry.</para>
<table rules="all">
<caption>Consumed event types from OpenStack services</caption>
<col width="33%"/>
<col width="33%"/>
<col width="33%"/>
<thead>
<tr>
<td>OpenStack service</td>
<td>Event types</td>
<td>Note</td>
</tr>
</thead>
<tbody>
<tr>
<td>OpenStack Compute</td>
<td><para>scheduler.run_instance.scheduled</para>
<para>scheduler.select_destinations</para>
<para>compute.instance.*</para></td>
<td>For a more detailed list of Compute notifications please check the
<link xlink:href="https://wiki.openstack.org/wiki/SystemUsageData">
System Usage Data wiki page</link>.</td>
</tr>
<tr>
<td>Bare metal module for OpenStack</td>
<td>hardware.ipmi.*</td>
<td></td>
</tr>
<tr>
<td>OpenStack Image Service</td>
<td><para>image.update</para>
<para>image.upload</para>
<para>image.delete</para>
<para>image.send</para></td>
<td>The required configuration for the Image Service can be found in the
<link xlink:href=
"http://docs.openstack.org/juno/install-guide/install/apt/content/ceilometer-agent-glance.html">
Configure the Image Service for Telemetry</link> section
in the <citetitle>OpenStack Installation Guide</citetitle>.</td>
</tr>
<tr>
<td>OpenStack Networking</td>
<td><para>floatingip.create.end</para>
<para>floatingip.update.*</para>
<para>floatingip.exists</para>
<para>network.create.end</para>
<para>network.update.*</para>
<para>network.exists</para>
<para>port.create.end</para>
<para>port.update.*</para>
<para>port.exists</para>
<para>router.create.end</para>
<para>router.update.*</para>
<para>router.exists</para>
<para>subnet.create.end</para>
<para>subnet.update.*</para>
<para>subnet.exists</para>
<para>l3.meter</para></td>
<td></td>
</tr>
<tr>
<td>Orchestration module</td>
<td><para>orchestration.stack.create.end</para>
<para>orchestration.stack.update.end</para>
<para>orchestration.stack.delete.end</para>
<para>orchestration.stack.resume.end</para>
<para>orchestration.stack.suspend.end</para></td>
<td></td>
</tr>
<tr>
<td>OpenStack Block Storage</td>
<td><para>volume.exists</para>
<para>volume.create.*</para>
<para>volume.delete.*</para>
<para>volume.update.*</para>
<para>volume.resize.*</para>
<para>volume.attach.*</para>
<para>volume.detach.*</para>
<para>snapshot.exists</para>
<para>snapshot.create.*</para>
<para>snapshot.delete.*</para>
<para>snapshot.update.*</para></td>
<td>The required configuration for the Block Storage service can be found in the
<link xlink:href=
"http://docs.openstack.org/juno/install-guide/install/apt/content/ceilometer-agent-cinder.html">
Add the Block Storage service agent for Telemetry</link>
section in the <citetitle>OpenStack Installation Guide</citetitle>.</td>
</tr>
</tbody>
</table>
<note>
<para>Some services require additional configuration to emit the notifications
using the correct control exchange on the message queue and so forth. These
configuration needs are referenced in the above table for each OpenStack service
that needs it.</para>
</note>
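<para>As an illustration only (the linked installation guide sections remain the
authoritative reference), a minimal sketch of the Compute-related notification settings
that are typically placed in <filename>nova.conf</filename> on the compute nodes, assuming
the Juno option names, looks similar to the following:</para>
<programlisting># Hypothetical nova.conf excerpt; option names assume the Juno release.
[DEFAULT]
# Emit periodic instance usage audit notifications
instance_usage_audit = True
instance_usage_audit_period = hour
# Notify on VM and task state changes so Telemetry can meter them
notify_on_state_change = vm_and_task_state
# Send notifications to the message bus consumed by the notification agent
notification_driver = messagingv2</programlisting>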
<note>
<para>When the <literal>store_events</literal> option is set to <literal>True</literal> in
<filename>ceilometer.conf</filename>, the notification agent needs database access
in order to work properly.</para>
</note>
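<para>A minimal sketch of enabling event storage in <filename>ceilometer.conf</filename>;
the section name assumes the Juno layout of the file:</para>
<programlisting>[notification]
# Store events in addition to metering samples
store_events = True</programlisting>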
<section xml:id="section_telemetry-objectstore-middleware">
<title>Middleware for OpenStack Object Storage service</title>
<para>A subset of Object Storage statistics requires additional middleware to be installed
behind the proxy of Object Storage. This additional component emits notifications containing
data-flow-oriented meters, namely the storage.objects.(incoming|outgoing).bytes values.
These meters are listed in <xref linkend=
"section_telemetry-object-storage-metrics"/>,
marked with <literal>notification</literal> as origin.</para>
<para>The instructions on how to install this middleware can be found in the <link xlink:href=
"http://docs.openstack.org/juno/install-guide/install/apt/content/ceilometer-agent-swift.html">
Configure the Object Storage service for Telemetry</link>
section in the <citetitle>OpenStack Installation Guide</citetitle>.
</para>
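<para>For orientation only, a sketch of how this middleware typically appears in the Object
Storage <filename>proxy-server.conf</filename> file; the exact proxy pipeline contents depend
on your deployment and the linked guide remains the authoritative reference:</para>
<programlisting># Append the ceilometer filter to your existing proxy pipeline, before proxy-server,
# then define the filter itself:
[filter:ceilometer]
use = egg:ceilometer#swift</programlisting>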
</section>
<section xml:id="section_telemetry-middleware">
<title>Telemetry middleware</title>
<para>Telemetry provides the capability of counting the HTTP requests and responses
for each API endpoint in OpenStack. This is achieved by storing a sample for each
event marked as <literal>audit.http.request</literal>, <literal>audit.http.response</literal>,
<literal>http.request</literal> or <literal>http.response</literal>.</para>
<para>It is recommended that these notifications be consumed as events rather than samples
to better index the appropriate values and avoid massive load on the metering database.
If preferred, Telemetry can consume these events as samples if the services are configured
to emit <literal>http.*</literal> notifications.</para>
</section>
</section>
<section xml:id="section_telemetry-polling">
<title>Polling</title>
<para>The Telemetry module is intended to store a complex picture of the
infrastructure. This goal requires more information than what is
provided by the events and notifications published by each service.
Some information is not emitted directly, like the resource usage of the VM
instances.</para>
<para>Therefore Telemetry uses another method to gather this data by polling
the infrastructure, including the APIs of the different OpenStack services and
other assets, like hypervisors. The latter case requires closer interaction with
the compute hosts. To solve this issue, Telemetry uses an agent-based
architecture to fulfill the data collection requirements.</para>
<para>There are two agents supporting the polling mechanism, namely the compute
agent and the central agent. The following subsections give further information
regarding the architectural and configuration details of these components.
</para>
<section xml:id="section_telemetry-central-agent">
<title>Central agent</title>
<para>As the name of this agent shows, it is a central component in the
Telemetry architecture. This agent is responsible for polling public REST APIs
to retrieve additional information on OpenStack resources not already surfaced
via notifications, and also for polling hardware resources over SNMP.</para>
<para>The following services can be polled with this agent:
</para>
<itemizedlist>
<listitem>
<para>OpenStack Networking</para>
</listitem>
<listitem>
<para>OpenStack Object Storage</para>
</listitem>
<listitem>
<para>OpenStack Block Storage</para>
</listitem>
<listitem>
<para>Hardware resources via SNMP</para>
</listitem>
<listitem>
<para>Energy consumption metrics via the <link xlink:href="https://launchpad.net/kwapi">
Kwapi</link> framework</para>
</listitem>
</itemizedlist>
<para>To install and configure this service use the <link xlink:href=
"http://docs.openstack.org/juno/install-guide/install/apt/content/ch_ceilometer.html">
Install the Telemetry module</link> section in the <citetitle>OpenStack
Installation Guide</citetitle>.</para>
<para>The central agent does not need a direct database connection. The
samples collected by this agent are sent via AMQP to the collector
service or any external service, which is responsible for persisting the data
into the configured database back end.</para>
</section>
<section xml:id="section_telemetry-compute-agent">
<title>Compute agent</title>
<para>This agent is responsible for collecting resource usage data of VM
instances on individual compute nodes within an OpenStack deployment. This
mechanism requires closer interaction with the hypervisor, therefore a
separate agent type, placed on the host machines, collects the related
meters locally.</para>
<para>A compute agent instance has to be installed on each and every compute node;
installation instructions can be found in the <link xlink:href=
"http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
Install the Compute agent for Telemetry</link> section in the
<citetitle>OpenStack Installation Guide</citetitle>.
</para>
<para>Just like the central agent, this component also does not need a direct database
connection. The samples are sent via AMQP to the collector.
</para>
<para>The list of supported hypervisors can be found in
<xref linkend="section_telemetry-supported-hypervisors"/>.
The compute agent uses the API of the hypervisor installed on the compute hosts.
Therefore the supported meters may differ for each virtualization
back end, as each inspection tool provides a different set of metrics.</para>
<para>The list of collected meters can be found in <xref linkend=
"section_telemetry-compute-metrics"/>.
The support column provides information about which meters are available for
each hypervisor supported by the Telemetry module.</para>
<note>
<para>Telemetry supports libvirt, which hides the underlying hypervisor.</para>
</note>
</section>
<section xml:id="section_telemetry-cetral-compute-agent-ha">
<title>Support for HA deployment of the central and compute agent services</title>
<para>Both the central and the compute agent can run in an HA deployment, which
means that multiple instances of these services can run in parallel
with workload partitioning among these running instances.</para>
<para>The <link xlink:href="https://pypi.python.org/pypi/tooz">Tooz</link>
library provides the coordination within the groups of service instances. It
provides an API above several back ends that can be used for building
distributed applications.</para>
<para>Tooz supports <link xlink:href="http://docs.openstack.org/developer/tooz/drivers.html">
various drivers</link> including the following back end solutions:
<itemizedlist>
<listitem>
<para><link xlink:href="http://zookeeper.apache.org/">Zookeeper</link>.
Recommended solution by the Tooz project.</para>
</listitem>
<listitem>
<para><link xlink:href="http://redis.io/">Redis</link>.
Recommended solution by the Tooz project.</para>
</listitem>
<listitem>
<para><link xlink:href="http://memcached.org/">Memcached</link>.
Recommended for testing.</para>
</listitem>
</itemizedlist>
You must configure a supported Tooz driver for the HA deployment
of the Telemetry services.</para>
<para>For information about the required configuration options that have to be set in the
<filename>ceilometer.conf</filename> configuration file for both the central and compute
agents, see the
<link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
<literal>coordination</literal> section</link>
in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
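<para>As a sketch only, coordination is typically enabled by pointing both agents at the
chosen Tooz back end in <filename>ceilometer.conf</filename>; the URL below assumes a Redis
server running on the controller node and is purely illustrative:</para>
<programlisting>[coordination]
# Tooz back end used for group membership and workload partitioning
backend_url = redis://controller:6379</programlisting>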
<note>
<para>Without the <option>backend_url</option> option being set, only one
instance each of the central and compute agent services is able to run
and function correctly.</para>
</note>
<para>The availability check of the instances is provided by heartbeat
messages. When the connection with an instance is lost, the workload will be
reassigned among the remaining instances in the next polling cycle.</para>
<note>
<para><literal>Memcached</literal> uses a <option>timeout</option> value, which
should always be set to a value that is higher than the <option>heartbeat</option>
value set for Telemetry.</para>
</note>
<para>For backward compatibility and to support existing deployments, the central agent
configuration also supports using different configuration files for groups of service
instances of this type that are running in parallel. To enable this configuration,
set a value for the <option>partitioning_group_prefix</option> option in the
<link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
<literal>central</literal> section</link> in the <citetitle>OpenStack Configuration
Reference</citetitle>.</para>
<warning>
<para>For each sub-group of the central agent pool with the same
<option>partitioning_group_prefix</option>, a disjoint subset of meters must be polled,
otherwise samples may be missing or duplicated. The list of meters to poll can be set
in the <filename>/etc/ceilometer/pipeline.yaml</filename> configuration
file. For more information about pipelines see
<xref linkend="section_telemetry-data-collection-processing"/>.</para>
</warning>
<para>To enable the compute agent to run multiple instances simultaneously with
workload partitioning, the
<option>workload_partitioning</option> option has to be set to <literal>True</literal>
under the <link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
compute section</link> in the <filename>ceilometer.conf</filename> configuration
file.</para>
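<para>A minimal sketch of the corresponding <filename>ceilometer.conf</filename> excerpt,
assuming the coordination back end above is already configured:</para>
<programlisting>[compute]
# Allow several compute agents to share the polling workload
workload_partitioning = True</programlisting>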
</section>
<section xml:id="section_telemetry-IPMI-agent">
<title>IPMI agent</title>
<para>This agent is responsible for collecting IPMI sensor data and Intel Node Manager
data on individual compute nodes within an OpenStack deployment. This
agent requires an IPMI-capable node with <application>ipmitool</application> installed, which
is a common utility for IPMI control on various Linux distributions.</para>
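<para>To check whether a node actually exposes IPMI sensor data before deploying the agent,
you can query the sensor data records locally, for example:</para>
<screen><prompt>$</prompt> <userinput>ipmitool sdr list</userinput></screen>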
<para>An IPMI agent instance could be installed on each and every compute node
with IPMI support, except when the node is managed by the Bare metal module for OpenStack
and the <option>conductor.send_sensor_data</option> option is set to <literal>true</literal>
in the Bare metal module for OpenStack.
There is no harm in installing this agent on a compute node without IPMI or Intel Node
Manager support, as the agent checks for the hardware and, if none is available, returns empty data.
However, it is suggested that you install the IPMI agent only on IPMI-capable nodes for performance reasons.
</para>
<para>Just like the central agent, this component also does not need direct database
access. The samples are sent via AMQP to the collector.
</para>
<para>The list of collected meters can be found in <xref linkend=
"section_telemetry-ironic-metrics"/>.</para>
<note>
<para>Do not deploy both the IPMI agent and the Bare metal module for OpenStack on
one compute node. If <option>conductor.send_sensor_data</option> is set, this
misconfiguration causes duplicated IPMI sensor samples.
</para>
</note>
</section>
</section>
<section xml:id="section_telemetry-post-api">
<title>Send samples to Telemetry</title>
<para>While most parts of the data collection in the Telemetry module are automated,
Telemetry provides the possibility to submit samples via the REST API to allow
users to send custom samples into this module.</para>
<para>This option makes it possible to send any kind of samples without the need
to write extra code or make configuration changes.</para>
<para>The samples that can be sent to Telemetry are not limited to the actual
existing meters. There is a possibility to provide data for any new, customer-defined
counter by filling out all the required fields of the POST request.
</para>
<para>If the sample corresponds to an existing meter, then the fields like
<literal>meter-type</literal> and meter name should be matched accordingly.</para>
<para>The required fields for sending a sample using the command line client
are:
<itemizedlist>
<listitem>
<para>ID of the corresponding resource. (<parameter>--resource-id</parameter>)</para>
</listitem>
<listitem>
<para>Name of meter. (<parameter>--meter-name</parameter>)</para>
</listitem>
<listitem>
<para>Type of meter. (<parameter>--meter-type</parameter>)</para>
<para>Predefined meter types:</para>
<itemizedlist>
<listitem>
<para>Gauge</para>
</listitem>
<listitem>
<para>Delta</para>
</listitem>
<listitem>
<para>Cumulative</para>
</listitem>
</itemizedlist>
</listitem>
<listitem>
<para>Unit of meter. (<parameter>--meter-unit</parameter>)</para>
</listitem>
<listitem>
<para>Volume of sample. (<parameter>--sample-volume</parameter>)</para>
</listitem>
</itemizedlist>
</para>
<para>To send samples to Telemetry
using the command line client, the following command should be invoked:
<screen><prompt>$</prompt> <userinput>ceilometer sample-create -r 37128ad6-daaa-4d22-9509-b7e1c6b08697 \
  -m memory.usage --meter-type gauge --meter-unit MB --sample-volume 48</userinput>
<?db-font-size 75%?><computeroutput>+-------------------+--------------------------------------------+
| Property          | Value                                      |
+-------------------+--------------------------------------------+
| message_id        | 6118820c-2137-11e4-a429-08002715c7fb       |
| name              | memory.usage                               |
| project_id        | e34eaa91d52a4402b4cb8bc9bbd308c1           |
| resource_id       | 37128ad6-daaa-4d22-9509-b7e1c6b08697       |
| resource_metadata | {}                                         |
| source            | e34eaa91d52a4402b4cb8bc9bbd308c1:openstack |
| timestamp         | 2014-08-11T09:10:46.358926                 |
| type              | gauge                                      |
| unit              | MB                                         |
| user_id           | 679b0499e7a34ccb9d90b64208401f8e           |
| volume            | 48.0                                       |
+-------------------+--------------------------------------------+</computeroutput></screen>
</para>
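<para>The same sample can also be pushed directly through the REST API. The request below is
a sketch that assumes a valid token in the <literal>OS_TOKEN</literal> environment variable
and the Telemetry API listening on its default port on a host named
<literal>controller</literal>:</para>
<screen><prompt>$</prompt> <userinput>curl -X POST -H "X-Auth-Token: $OS_TOKEN" -H "Content-Type: application/json" \
  http://controller:8777/v2/meters/memory.usage \
  -d '[{"counter_name": "memory.usage", "counter_type": "gauge", "counter_unit": "MB",
        "counter_volume": 48, "resource_id": "37128ad6-daaa-4d22-9509-b7e1c6b08697"}]'</userinput></screen>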
</section>
<section xml:id="section_telemetry-data-collection-processing">
<title>Data collection and processing</title>
<para>The mechanism by which data is collected and processed is called a
pipeline. Pipelines, at the configuration level, describe a coupling between
sources of data and the corresponding sinks for transformation and
publication of data.</para>
<para>A source is a producer of data: samples or events. In effect, it is a set of pollsters
or notification handlers emitting datapoints for a set of matching meters and event types.
</para>
<para>Each source configuration encapsulates name matching, polling
interval determination, optional resource enumeration or discovery, and
mapping to one or more sinks for publication.</para>
<para>A sink, on the other hand, is a consumer of data, providing logic
for the transformation and publication of data emitted from related
sources.</para>
<para>In effect, a sink describes a chain of handlers. The chain starts
with zero or more transformers and ends with one or more publishers.
The first transformer in the chain is passed data from the corresponding
source, takes some action such as deriving rate of change, performing unit
conversion, or aggregating, before passing the modified data to the
next step that is described in
<xref linkend="section_telemetry-publishers"/>.</para>
<section xml:id="section_telemetry-pipeline-configuration">
<title>Pipeline configuration</title>
<para>Pipeline configuration, by default, is stored in a separate configuration
file, called <filename>pipeline.yaml</filename>, next to the
<filename>ceilometer.conf</filename> file. The pipeline
configuration file can be set in the <option>pipeline_cfg_file</option>
parameter listed in the <link xlink:href=
"http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html"
>Description of configuration options for api table</link> section in the
<citetitle>OpenStack Configuration Reference</citetitle>. Multiple chains
can be defined in one pipeline configuration file.</para>
<para>The chain definition looks like the following:</para>
<programlisting>---
sources:
    - name: 'source name'
      interval: 'how often should the samples be injected into the pipeline'
      meters:
          - 'meter filter'
      resources:
          - 'list of resource URLs'
      sinks:
          - 'sink name'
sinks:
    - name: 'sink name'
      transformers: 'definition of transformers'
      publishers:
          - 'list of publishers'</programlisting>
<para>The interval parameter in the sources section should be defined in seconds.
It determines the polling cadence of sample injection into the pipeline, where samples
are produced under the direct control of an agent.</para>
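<para>For illustration, a small but complete pipeline that polls the <literal>cpu</literal>
meter every ten minutes and derives <literal>cpu_util</literal> from it could look like the
following sketch; the publisher choice is deployment specific:</para>
<programlisting>---
sources:
    - name: cpu_source
      interval: 600
      meters:
          - "cpu"
      sinks:
          - cpu_sink
sinks:
    - name: cpu_sink
      transformers:
          - name: "rate_of_change"
            parameters:
                target:
                    name: "cpu_util"
                    unit: "%"
                    type: "gauge"
                    scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
      publishers:
          - notifier://</programlisting>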
<para>There are several ways to define the list of meters for a pipeline source.
The list of valid meters can be found in <xref linkend=
"section_telemetry-measurements"/>. There is a possibility to define all
the meters, or just included or excluded meters, with which a source should
operate:</para>
<itemizedlist>
<listitem>
<para>To include all meters, use the <literal>*</literal> wildcard symbol.
It is highly advisable to select only the meters that you intend to use,
to avoid flooding the metering database with unused data.</para>
</listitem>
<listitem>
<para>To define the list of meters, use either of the following:</para>
<itemizedlist>
<listitem>
<para>To define the list of included meters, use the <literal>meter_name</literal>
syntax.</para>
</listitem>
<listitem>
<para>To define the list of excluded meters, use the <literal>!meter_name</literal>
syntax.</para>
</listitem>
<listitem>
<para>For meters which have variants identified by a complex name field,
use the wildcard symbol to select all, e.g. for "instance:m1.tiny", use
"instance:*".</para>
</listitem>
</itemizedlist>
</listitem>
</itemizedlist>
<note>
<para>Please be aware that there is no duplication check
between pipelines. If you add a meter to multiple pipelines,
the duplication is assumed to be intentional and the samples may be stored
multiple times according to the specified sinks.</para>
</note>
<para>The above definition methods can be used in the following combinations:</para>
<itemizedlist>
<listitem>
<para>Use only the wildcard symbol.</para>
</listitem>
<listitem>
<para>Use the list of included meters.</para>
</listitem>
<listitem>
<para>Use the list of excluded meters.</para>
</listitem>
<listitem>
<para>Use the wildcard symbol with the list of excluded meters.</para>
</listitem>
</itemizedlist>
<note>
<para>At least one of the above variations should be included in the meters section.
Included and excluded meters cannot co-exist in the same pipeline. Wildcard and
included meters cannot co-exist in the same pipeline definition section.</para>
</note>
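<para>As an example of the last combination, the following source sketch polls every meter
except the image-related ones; the source and sink names are illustrative:</para>
<programlisting>sources:
    - name: meter_source
      interval: 600
      meters:
          - "*"
          - "!image"
          - "!image.*"
      sinks:
          - meter_sink</programlisting>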
<para>The optional resources section of a pipeline source allows a static list of
resource URLs to be configured for polling.</para>
<para>The transformers section of a pipeline sink provides the possibility to add a list
of transformer definitions. The available transformers are:</para>
<table rules="all">
<caption>List of available transformers</caption>
<col width="50%"/>
<col width="50%"/>
<thead>
<tr>
<td>Name of transformer</td>
<td>Reference name for configuration</td>
</tr>
</thead>
<tbody>
<tr>
<td>Accumulator</td>
<td>accumulator</td>
</tr>
<tr>
<td>Aggregator</td>
<td>aggregator</td>
</tr>
<tr>
<td>Arithmetic</td>
<td>arithmetic</td>
</tr>
<tr>
<td>Rate of change</td>
<td>rate_of_change</td>
</tr>
<tr>
<td>Unit conversion</td>
<td>unit_conversion</td>
</tr>
</tbody>
</table>
<para>The publishers section contains the list of publishers, where the sample data should
be sent after the possible transformations.</para>
<section xml:id="section_telemetry-pipeline-transformers">
<title>Transformers</title>
<para>The definition of transformers can contain the following fields:</para>
<para>
<variablelist>
<varlistentry>
<term>name</term>
<listitem>
<para>Name of the transformer.</para>
</listitem>
</varlistentry>
<varlistentry>
<term>parameters</term>
<listitem>
<para>Parameters of the transformer.</para>
</listitem>
</varlistentry>
</variablelist>
</para>
<para>The parameters section can contain transformer specific fields, like source and
target fields with different subfields in case of the rate of change, which depends on
the implementation of the transformer.</para>
<simplesect>
<title>Rate of change transformer</title>
<para>In the case of the transformer that creates the
<literal>cpu_util</literal> meter, the definition looks like the following:</para>
<programlisting>transformers:
    - name: "rate_of_change"
      parameters:
          target:
              name: "cpu_util"
              unit: "%"
              type: "gauge"
              scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"</programlisting>
<para>The rate of change transformer generates the <literal>cpu_util</literal> meter
from the sample values of the <literal>cpu</literal> counter, which represents
cumulative CPU time in nanoseconds. The transformer definition above defines a
scale factor (for nanoseconds, multiple CPUs, etc.), which is applied before the
transformation derives a sequence of gauge samples with unit '%', from sequential
values of the <literal>cpu</literal> meter.</para>
<para>The definition for the disk I/O rate, which is also generated by the rate of change
transformer:</para>
<programlisting>transformers:
    - name: "rate_of_change"
      parameters:
          source:
              map_from:
                  name: "disk\\.(read|write)\\.(bytes|requests)"
                  unit: "(B|request)"
          target:
              map_to:
                  name: "disk.\\1.\\2.rate"
                  unit: "\\1/s"
              type: "gauge"</programlisting>
</simplesect>
<simplesect>
<title>Unit conversion transformer</title>
<para>Transformer to apply a unit conversion. It takes the volume of the meter and
multiplies it by the given 'scale' expression. It also supports <literal>map_from
</literal> and <literal>map_to</literal> like the rate of change transformer.</para>
<para>Sample configuration:</para>
<programlisting>transformers:
    - name: "unit_conversion"
      parameters:
          target:
              name: "disk.kilobytes"
              unit: "KB"
              scale: "1.0 / 1024.0"</programlisting>
<para>With <option>map_from</option> and <option>map_to</option>:</para>
<programlisting>transformers:
    - name: "unit_conversion"
      parameters:
          source:
              map_from:
                  name: "disk\\.(read|write)\\.bytes"
          target:
              map_to:
                  name: "disk.\\1.kilobytes"
              scale: "1.0 / 1024.0"
              unit: "KB"</programlisting>
</simplesect>
<simplesect>
<title>Aggregator transformer</title>
<para>A transformer that sums up the incoming samples until enough samples have
come in or a timeout has been reached.</para>
<para>The timeout can be specified with the <option>retention_time</option> parameter.
If we want to flush the aggregation after a set number of samples have been
aggregated, we can specify the size parameter.</para>
<para>The volume of the created sample is the sum of the volumes of samples that came
into the transformer. Samples can be aggregated by the attributes <option>project_id
</option>, <option>user_id</option> and <option>resource_metadata</option>.
To aggregate by the chosen attributes, specify them in the configuration and set
which value of the attribute to take for the new sample (first to take the first
sample's attribute, last to take the last sample's attribute, and drop to discard
the attribute).</para>
<para>To aggregate 60s worth of samples by <option>resource_metadata</option>
and keep the <option>resource_metadata</option> of the latest received
sample:</para>
<programlisting>transformers:
    - name: "aggregator"
      parameters:
          retention_time: 60
          resource_metadata: last</programlisting>
<para>To aggregate each 15 samples by <option>user_id</option> and <option>resource_metadata
</option> and keep the <option>user_id</option> of the first received sample and
drop the <option>resource_metadata</option>:</para>
<programlisting>transformers:
    - name: "aggregator"
      parameters:
          size: 15
          user_id: first
          resource_metadata: drop</programlisting>
</simplesect>
<simplesect>
<title>Accumulator transformer</title>
<para>This transformer simply caches the samples until enough samples have arrived and
then flushes them all down the pipeline at once.</para>
<programlisting>transformers:
    - name: "accumulator"
      parameters:
          size: 15</programlisting>
</simplesect>
<simplesect>
<title>Multi meter arithmetic transformer</title>
<para>This transformer enables us to perform arithmetic calculations over one or more
meters and/or their metadata, for example:</para>
<programlisting>memory_util = 100 * memory.usage / memory</programlisting>
<para>A new sample is created with the properties described in the <literal>target
</literal> section of the transformer's configuration. The sample's volume is the
result of the provided expression. The calculation is performed on samples from
the same resource.</para>
<note>
<para>The calculation is limited to meters with the same interval.</para>
</note>
<para>Example configuration:</para>
<programlisting>transformers:
    - name: "arithmetic"
      parameters:
          target:
              name: "memory_util"
              unit: "%"
              type: "gauge"
              expr: "100 * $(memory.usage) / $(memory)"</programlisting>
<para>To demonstrate the use of metadata, here is the implementation of a silly metric
that shows average CPU time per core:</para>
<programlisting>transformers:
    - name: "arithmetic"
      parameters:
          target:
              name: "avg_cpu_per_core"
              unit: "ns"
              type: "cumulative"
              expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"</programlisting>
<note>
<para>Expression evaluation gracefully handles NaNs and exceptions. In such a case it
does not create a new sample but only logs a warning.</para>
</note>
</simplesect>
</section>
</section>
</section>
<section xml:id="section_telemetry-cinder-audit-script">
<title>Block Storage audit script setup to get notifications</title>
<para>If you want to collect OpenStack Block Storage notifications on demand,
you can use <command>cinder-volume-usage-audit</command> from OpenStack Block Storage.
This script becomes available when you install OpenStack Block Storage, so you can use
it without any specific settings and you do not need to authenticate to
access the data. To use it, you must run this command in the following format:</para>
<screen><prompt>$</prompt> <userinput>cinder-volume-usage-audit \
  --start_time='YYYY-MM-DD HH:MM:SS' --end_time='YYYY-MM-DD HH:MM:SS' --send_actions</userinput></screen>
<para>This script outputs what volumes or snapshots were
created, deleted, or existed in a given period of time and some
information about these volumes or snapshots. Information about the
existence and size of volumes and snapshots is stored in the Telemetry
module. This data is also stored as an event, which is the recommended usage as it
provides better indexing of data.</para>
<para>Using this script via cron you can get notifications periodically,
for example, every 5 minutes:</para>
<programlisting>*/5 * * * * /path/to/cinder-volume-usage-audit --send_actions</programlisting>
</section>
<section xml:id="section_telemetry-storing-data">
<title>Storing samples</title>
<para>The Telemetry module has a separate service that is responsible for persisting the data
that comes from the pollsters or is received as notifications. The data can be stored in
a file or a database back end, for which the list of supported databases can be found in
<xref linkend="section_telemetry-supported-dbs"/>. The data can also be sent to an external
data store by using an HTTP dispatcher.
</para>
<para>The <systemitem class="service">ceilometer-collector</systemitem> service receives the
data as messages from the message bus of the configured AMQP service. It
sends these datapoints without any modification to the configured target.
The service has to run on a host machine from which it has access to the configured dispatcher.</para>
<note>
<para>Multiple dispatchers can be configured for Telemetry at one time.</para>
</note>
<para>Multiple <systemitem class="service">ceilometer-collector</systemitem> processes can be
run at a time. It is also supported to start multiple worker threads per collector process.
The <option>collector_workers</option> configuration option has to be modified in the
<link xlink:href=
"http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
collector section</link> of the <filename>ceilometer.conf</filename>
configuration file.</para>
<note>
<para>Prior to the Juno release, it is not recommended to use multiple workers per collector
process when using PostgreSQL as the database back end.</para>
</note>
<simplesect>
<title>Database dispatcher</title>
<para>When the database dispatcher is configured as data store, you have the option to set
a <option>time_to_live</option> parameter (ttl) for samples. By default the time to
live value for samples is set to -1, which means that they are kept in the database forever.
</para>
<para>The time to live value is specified in seconds. Each sample has a time stamp, and the
<literal>ttl</literal> value indicates that a sample will be deleted from the database when
the number of seconds has elapsed since that sample reading was stamped. For example,
if the time to live is set to 600, all samples older than 600 seconds will be purged from
the database.</para>
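<para>A minimal sketch of the corresponding setting in <filename>ceilometer.conf</filename>;
the section name assumes the Juno layout of the file:</para>
<programlisting>[database]
# Samples older than one hour become eligible for expiry
time_to_live = 3600</programlisting>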
<para>Certain databases support native TTL expiration. In cases where this is not possible,
a command-line script, <systemitem class="service">ceilometer-expirer</systemitem>, can be
used for this purpose. You can run it in a cron job,
which helps to keep your database in a consistent state.</para>
<para>The level of support differs depending on the configured back end:</para>
<table rules="all">
<caption>Time-to-live support for database back ends</caption>
<col width="24%"/>
<col width="18%"/>
<col width="58%"/>
<thead>
<tr>
<td>Database</td>
<td>TTL value support</td>
<td>Note</td>
</tr>
</thead>
<tbody>
<tr>
<td>MongoDB</td>
<td>Yes</td>
<td>MongoDB has native TTL support for deleting samples that are older
than the configured ttl value.</td>
</tr>
<tr>
<td>SQL-based back ends</td>
<td>Yes</td>
<td><systemitem class="service">ceilometer-expirer</systemitem> has to be
used for deleting samples and their related data from the database.</td>
</tr>
<tr>
<td>HBase</td>
<td>No</td>
<td>Telemetry's HBase support does not include native TTL nor
<systemitem class="service">ceilometer-expirer</systemitem> support.</td>
</tr>
<tr>
<td>DB2 NoSQL</td>
<td>No</td>
<td>DB2 NoSQL does not have native TTL nor
<systemitem class="service">ceilometer-expirer</systemitem> support.</td>
</tr>
</tbody>
</table>
</simplesect>
<simplesect>
<title>HTTP dispatcher</title>
<para>The Telemetry module supports sending samples to an external HTTP target. The
samples are sent without any modification. To set this option as the collector's target, the
<option>dispatcher</option> has to be changed to <literal>http</literal> in the
<filename>ceilometer.conf</filename> configuration file. For the list
of options that you need to set, see the
<link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
<literal>dispatcher_http</literal> section</link>
in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
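<para>A sketch of the relevant <filename>ceilometer.conf</filename> settings; the target URL
is purely illustrative and the exact option names should be verified against the linked
reference:</para>
<programlisting>[DEFAULT]
dispatcher = http

[dispatcher_http]
# External endpoint that receives the unmodified samples
target = http://data-store.example.com/collector</programlisting>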
</simplesect>
<simplesect>
<title>File dispatcher</title>
<para>You can store samples in a file by setting the <option>dispatcher</option>
option in <filename>ceilometer.conf</filename> to <literal>file</literal>. For the list
of configuration options, see the
<link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
<literal>dispatcher_file</literal> section</link>
in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
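<para>A minimal sketch, assuming the samples should be appended to a local log file;
the path is illustrative:</para>
<programlisting>[DEFAULT]
dispatcher = file

[dispatcher_file]
file_path = /var/log/ceilometer/samples.log</programlisting>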
</simplesect>
</section>
</section>