
<?xml version="1.0" encoding="UTF-8"?>
<section xmlns="http://docbook.org/ns/docbook"
         xmlns:xi="http://www.w3.org/2001/XInclude"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         version="5.0"
         xml:id="section_telemetry-data-collection">
  <title>Data collection</title>
  <para>The main responsibility of Telemetry in OpenStack is to collect
    information about the system that can be used by billing systems or by
    any kind of analytics tool. The original focus of the collected data
    was on counters that can be used for billing, but its scope is
    continuously widening.</para>
  <para>Collected data can be stored in the form of samples or events in the
    supported databases, listed in
    <xref linkend="section_telemetry-supported-dbs"/>.</para>
  <para>Samples can come from various sources, depending on the needs and
    configuration of Telemetry, which requires multiple methods to
    collect data.</para>
  <para>The available data collection mechanisms are:</para>
  <para>
    <variablelist>
      <varlistentry>
        <term>Notifications</term>
        <listitem>
          <para>Processing notifications from other OpenStack services, by
            consuming messages from the configured message queue system.</para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>Polling</term>
        <listitem>
          <para>Retrieve information directly from the hypervisor or from the host
            machine using SNMP, or by using the APIs of other OpenStack services.
          </para>
        </listitem>
      </varlistentry>
      <varlistentry>
        <term>RESTful API</term>
        <listitem>
          <para>Pushing samples via the RESTful API of Telemetry.</para>
        </listitem>
      </varlistentry>
    </variablelist>
  </para>
  <section xml:id="section_telemetry-notifications">
    <title>Notifications</title>
    <para>All OpenStack services send notifications about the operations they
      execute and about changes in system state. Several notifications carry
      information that can be metered, for example when the OpenStack Compute
      service creates a new VM instance.</para>
    <para>The Telemetry module has a separate agent that is responsible for
      consuming notifications, namely the notification agent. This component
      consumes messages from the message bus and transforms notifications into
      new samples.</para>
    <para>The different OpenStack services emit several notifications about the
      various types of events that happen in the system during normal operation.
      Not all of these notifications are consumed by the Telemetry module, as
      the intention is to capture only the billable events and the notifications
      that can be used for monitoring or profiling purposes. The notification
      agent filters by the event type that is contained in each notification
      message. The following table lists the event types from each OpenStack
      service that are transformed to samples by Telemetry.</para>
    <table rules="all">
      <caption>Consumed event types from OpenStack services</caption>
      <col width="33%"/>
      <col width="33%"/>
      <col width="33%"/>
      <thead>
        <tr>
          <td>OpenStack service</td>
          <td>Event types</td>
          <td>Note</td>
        </tr>
      </thead>
      <tbody>
        <tr>
          <td>OpenStack Compute</td>
          <td><para>scheduler.run_instance.scheduled</para>
            <para>scheduler.select_destinations</para>
            <para>compute.instance.*</para></td>
          <td>For a more detailed list of Compute notifications, see the
            <link xlink:href="https://wiki.openstack.org/wiki/SystemUsageData">
            System Usage Data wiki page</link>.</td>
        </tr>
        <tr>
          <td>Bare metal module for OpenStack</td>
          <td>hardware.ipmi.*</td>
          <td></td>
        </tr>
        <tr>
          <td>OpenStack Image Service</td>
          <td><para>image.update</para>
            <para>image.upload</para>
            <para>image.delete</para>
            <para>image.send</para></td>
          <td>The required configuration for the Image service can be found in the
            <link xlink:href=
            "http://docs.openstack.org/juno/install-guide/install/apt/content/ceilometer-agent-glance.html">
            Configure the Image Service for Telemetry</link> section
            in the <citetitle>OpenStack Installation Guide</citetitle>.</td>
        </tr>
        <tr>
          <td>OpenStack Networking</td>
          <td><para>floatingip.create.end</para>
            <para>floatingip.update.*</para>
            <para>floatingip.exists</para>
            <para>network.create.end</para>
            <para>network.update.*</para>
            <para>network.exists</para>
            <para>port.create.end</para>
            <para>port.update.*</para>
            <para>port.exists</para>
            <para>router.create.end</para>
            <para>router.update.*</para>
            <para>router.exists</para>
            <para>subnet.create.end</para>
            <para>subnet.update.*</para>
            <para>subnet.exists</para>
            <para>l3.meter</para></td>
          <td></td>
        </tr>
        <tr>
          <td>Orchestration module</td>
          <td><para>orchestration.stack.create.end</para>
            <para>orchestration.stack.update.end</para>
            <para>orchestration.stack.delete.end</para>
            <para>orchestration.stack.resume.end</para>
            <para>orchestration.stack.suspend.end</para></td>
          <td></td>
        </tr>
        <tr>
          <td>OpenStack Block Storage</td>
          <td><para>volume.exists</para>
            <para>volume.create.*</para>
            <para>volume.delete.*</para>
            <para>volume.update.*</para>
            <para>volume.resize.*</para>
            <para>volume.attach.*</para>
            <para>volume.detach.*</para>
            <para>snapshot.exists</para>
            <para>snapshot.create.*</para>
            <para>snapshot.delete.*</para>
            <para>snapshot.update.*</para></td>
          <td>The required configuration for the Block Storage service can be found in the
            <link xlink:href=
            "http://docs.openstack.org/juno/install-guide/install/apt/content/ceilometer-agent-cinder.html">
            Add the Block Storage service agent for Telemetry</link>
            section in the <citetitle>OpenStack Installation Guide</citetitle>.</td>
        </tr>
      </tbody>
    </table>
    <note>
      <para>Some services require additional configuration to emit the
        notifications, for example setting the correct control exchange on the
        message queue. These configuration needs are referred to in the above
        table for each affected OpenStack service.</para>
    </note>
    <note>
      <para>When the <literal>store_events</literal> option is set to
        <literal>True</literal> in <filename>ceilometer.conf</filename>, the
        notification agent needs database access in order to work
        properly.</para>
    </note>
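    <para>As an illustrative sketch, enabling event storage can look like the
      following fragment of <filename>ceilometer.conf</filename>; the section
      placement shown here is an assumption, so check the configuration
      reference for your release:</para>
    <programlisting>[notification]
# Enable storing of events; the notification agent then requires
# access to the configured database back end.
store_events = True</programlisting>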
    <section xml:id="section_telemetry-objectstore-middleware">
      <title>Middleware for OpenStack Object Storage service</title>
      <para>A subset of Object Store statistics requires an additional middleware
        to be installed behind the proxy of Object Store. This additional component
        emits notifications containing the data-flow-oriented meters, namely the
        storage.objects.(incoming|outgoing).bytes values. These meters are listed
        in the <link xlink:href=
        "http://docs.openstack.org/developer/ceilometer/measurements.html#object-storage-swift">
        Swift</link> table section in the <citetitle>Telemetry Measurements Reference</citetitle>,
        marked with <literal>notification</literal> as origin.</para>
      <para>The instructions on how to install this middleware can be found in the <link xlink:href=
        "http://docs.openstack.org/juno/install-guide/install/apt/content/ceilometer-agent-swift.html">
        Configure the Object Storage service for Telemetry</link>
        section in the <citetitle>OpenStack Installation Guide</citetitle>.
      </para>
    </section>
    <section xml:id="section_telemetry-middleware">
      <title>Telemetry middleware</title>
      <para>Telemetry provides the capability of counting the HTTP requests and
        responses for each API endpoint in OpenStack. This is achieved by storing
        a sample for each event marked as <literal>http.request</literal> or
        <literal>http.response</literal>.</para>
      <para>Telemetry can consume these events if the services are configured to
        emit notifications with these two event types.</para>
    </section>
  </section>
  <section xml:id="section_telemetry-polling">
    <title>Polling</title>
    <para>The Telemetry module is intended to store a complex picture of the
      infrastructure. This goal requires more information than what is provided
      by the events and notifications published by each service. Some
      information, such as the resource usage of VM instances, is not emitted
      directly.</para>
    <para>Therefore Telemetry uses another method to gather this data: polling
      the infrastructure, including the APIs of the different OpenStack services
      and other assets such as hypervisors. The latter case requires closer
      interaction with the compute hosts, so Telemetry uses an agent-based
      architecture to fulfill the data collection requirements.</para>
    <para>There are two agents supporting the polling mechanism, namely the
      compute agent and the central agent. The following subsections give
      further information about the architecture and configuration of these
      components.
    </para>
    <section xml:id="section_telemetry-central-agent">
      <title>Central agent</title>
      <para>As the name of this agent shows, it is a central component in the
        Telemetry architecture. This agent is responsible for polling public REST APIs
        to retrieve additional information on OpenStack resources not already surfaced
        via notifications, and also for polling hardware resources over SNMP.</para>
      <para>The following services can be polled with this agent:
      </para>
      <itemizedlist>
        <listitem>
          <para>OpenStack Networking</para>
        </listitem>
        <listitem>
          <para>OpenStack Object Storage</para>
        </listitem>
        <listitem>
          <para>OpenStack Block Storage</para>
        </listitem>
        <listitem>
          <para>Hardware resources via SNMP</para>
        </listitem>
        <listitem>
          <para>Energy consumption metrics via the <link xlink:href="https://launchpad.net/kwapi">
            Kwapi</link> framework</para>
        </listitem>
      </itemizedlist>
      <para>To install and configure this service, use the <link xlink:href=
        "http://docs.openstack.org/juno/install-guide/install/apt/content/ch_ceilometer.html">
        Install the Telemetry module</link> section in the <citetitle>OpenStack
        Installation Guide</citetitle>.</para>
      <para>The central agent does not need a direct database connection. The
        samples collected by this agent are sent via the message queue to the
        collector service, which is responsible for persisting the data into the
        configured database back end.</para>
    </section>
    <section xml:id="section_telemetry-compute-agent">
      <title>Compute agent</title>
      <para>This agent is responsible for collecting resource usage data of VM
        instances on individual compute nodes within an OpenStack deployment.
        This mechanism requires closer interaction with the hypervisor, so a
        separate agent type, which is placed on the host machines to retrieve
        this information locally, fulfills the collection of the related
        meters.</para>
      <para>A compute agent instance has to be installed on each and every compute
        node; installation instructions can be found in the <link xlink:href=
        "http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
        Install the Compute agent for Telemetry</link> section in the
        <citetitle>OpenStack Installation Guide</citetitle>.
      </para>
      <para>Just like the central agent, this component does not need direct
        database access either. The samples are sent via AMQP to the collector.
      </para>
      <para>The list of supported hypervisors can be found in
        <xref linkend="section_telemetry-supported-hypervisors"/>.
        The compute agent uses the API of the hypervisor installed on the compute
        hosts. Therefore the supported meters can differ for each virtualization
        back end, as these tools provide different sets of metrics.</para>
      <para>The list of collected meters can be found in the <link xlink:href=
        "http://docs.openstack.org/developer/ceilometer/measurements.html#compute-nova">
        Compute section</link> in the <citetitle>Telemetry Measurements Reference</citetitle>.
        The support column indicates which meters are available for each
        hypervisor supported by the Telemetry module.</para>
      <note>
        <para>Telemetry supports Libvirt, which hides the underlying hypervisor.</para>
      </note>
    </section>
    <section xml:id="section_telemetry-cetral-compute-agent-ha">
      <title>Support for HA deployment of the central and compute agent services</title>
      <para>Both the central and the compute agent can run in an HA deployment,
        which means that multiple instances of these services can run in
        parallel with workload partitioning among these running instances.</para>
      <para>The <link xlink:href="https://pypi.python.org/pypi/tooz">Tooz</link>
        library provides the coordination within the groups of service instances.
        It provides an API above several back ends that can be used for building
        distributed applications.</para>
      <para>Tooz supports the following back-end solutions:
        <itemizedlist>
          <listitem>
            <para><link xlink:href="http://zookeeper.apache.org/">Zookeeper</link>.
              Recommended solution by the Tooz project.</para>
          </listitem>
          <listitem>
            <para><link xlink:href="http://memcached.org/">Memcached</link></para>
          </listitem>
        </itemizedlist>
        You must configure one of these back ends for the HA deployment of the
        Telemetry services.</para>
      <para>For information about the required configuration options that have to be set in the
        <filename>ceilometer.conf</filename> configuration file for both the central and compute
        agents, see the
        <link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
        <literal>coordination</literal> section</link>
        in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
      <note>
        <para>Without the <option>backend_url</option> option being set, only one
          instance of both the central and compute agent service is able to run
          and function correctly.</para>
      </note>
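      <para>A minimal coordination setup could look like the following fragment
        of <filename>ceilometer.conf</filename>. The back-end URL scheme and
        address below are assumed examples; check the Tooz documentation for the
        exact URL format of your chosen back end:</para>
      <programlisting>[coordination]
# Point all central and compute agent instances at the same
# coordination back end, for example a Zookeeper node.
backend_url = zookeeper://127.0.0.1:2181</programlisting>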
      <para>The availability check of the instances is provided by heartbeat
        messages. When the connection with an instance is lost, the workload will
        be reassigned among the remaining instances in the next polling cycle.</para>
      <note>
        <para><literal>Memcached</literal> uses a <option>timeout</option> value, which
          should always be set to a value that is higher than the <option>heartbeat</option>
          value set for Telemetry.</para>
      </note>
      <para>For backward compatibility and to support existing deployments, the
        central agent configuration also supports using different configuration
        files for groups of service instances of this type that are running in
        parallel. To enable this configuration, set a value for the
        <option>partitioning_group_prefix</option> option in the
        <link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
        <literal>central</literal> section</link> in the <citetitle>OpenStack Configuration
        Reference</citetitle>.</para>
      <warning>
        <para>For each sub-group of the central agent pool with the same
          <option>partitioning_group_prefix</option>, a disjoint subset of meters must be polled,
          otherwise samples may be missing or duplicated. The list of meters to poll can be set
          in the <filename>/etc/ceilometer/pipeline.yaml</filename> configuration
          file. For more information about pipelines see
          <xref linkend="section_telemetry-data-collection-processing"/>.</para>
      </warning>
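      <para>For example, one sub-group of central agent instances could share the
        following fragment in its <filename>ceilometer.conf</filename>, while a
        second sub-group uses a different prefix value; the prefix names here are
        illustrative assumptions:</para>
      <programlisting>[central]
# All instances started with this configuration file form one
# coordination sub-group; instances of another sub-group would use
# a different prefix (for example "snmp") and poll a disjoint
# set of meters.
partitioning_group_prefix = openstack-api</programlisting>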
      <para>To enable the compute agent to run multiple instances simultaneously with
        workload partitioning, the
        <option>workload_partitioning</option> option has to be set to <literal>True</literal>
        under the <link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
        compute section</link> in the <filename>ceilometer.conf</filename> configuration
        file.</para>
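      <para>Based on the option and section named above, the resulting fragment
        of <filename>ceilometer.conf</filename> would be:</para>
      <programlisting>[compute]
# Allow multiple compute agent instances to share the polling
# workload via the coordination back end.
workload_partitioning = True</programlisting>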
    </section>
    <section xml:id="section_telemetry-IPMI-agent">
      <title>IPMI agent</title>
      <para>This agent is responsible for collecting IPMI sensor data and Intel
        Node Manager data on individual compute nodes within an OpenStack
        deployment. This agent requires an IPMI-capable node with the
        <application>ipmitool</application> utility installed, which is a common
        utility for IPMI control on various Linux distributions.</para>
      <para>An IPMI agent instance can be installed on each and every compute
        node with IPMI support, except where the node is managed by the Bare
        metal module for OpenStack and the
        <option>conductor.send_sensor_data</option> option is set to
        <literal>true</literal> in the Bare metal module for OpenStack.
        It does no harm to install this agent on a compute node without IPMI or
        Intel Node Manager support, as the agent checks for the hardware and, if
        none is available, returns empty data. However, it is suggested that you
        install the IPMI agent only on IPMI-capable nodes for performance
        reasons.
      </para>
      <para>Just like the central agent, this component does not need direct
        database access either. The samples are sent via AMQP to the collector.
      </para>
      <para>The list of collected meters can be found in the <link xlink:href=
        "http://docs.openstack.org/developer/ceilometer/measurements.html#ironic-hardware-ipmi-sensor-data">
        Ironic Hardware IPMI Sensor Data section</link> in the <citetitle>Telemetry Measurements Reference</citetitle>.</para>
      <note>
        <para>Do not deploy both the IPMI agent and the Bare metal module for
          OpenStack on one compute node. If
          <option>conductor.send_sensor_data</option> is set, this
          misconfiguration causes duplicated IPMI sensor samples.
        </para>
      </note>
    </section>
  </section>
  <section xml:id="section_telemetry-post-api">
    <title>Send samples to Telemetry</title>
    <para>Most of the data collection in the Telemetry module is automated.
      Telemetry also provides the possibility to submit samples via the REST
      API, to allow users to send custom samples into this module.</para>
    <para>This option makes it possible to send any kind of samples without the
      need to write extra code or make configuration changes.</para>
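    <para>As a sketch of what such a REST request might look like, the following
      command posts the same sample shown later with the command line client.
      The endpoint address, port, field names, and token handling here are
      illustrative assumptions; consult the Telemetry API reference for the
      authoritative request format:</para>
    <screen><prompt>$</prompt> <userinput>curl -X POST -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" \
  http://controller:8777/v2/meters/memory.usage \
  -d '[{"counter_name": "memory.usage", "counter_type": "gauge",
        "counter_unit": "MB", "counter_volume": 48,
        "resource_id": "37128ad6-daaa-4d22-9509-b7e1c6b08697"}]'</userinput></screen>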
    <para>The samples that can be sent to Telemetry are not limited to the
      actually existing meters. It is possible to provide data for any new,
      customer-defined counter by filling out all the required fields of the
      POST request.
    </para>
    <para>If the sample corresponds to an existing meter, then fields like
      <literal>meter-type</literal> and meter name should be matched accordingly.</para>
    <para>The required fields for sending a sample using the command line client
      are:
      <itemizedlist>
        <listitem>
          <para>ID of the corresponding resource. (<parameter>--resource-id</parameter>)</para>
        </listitem>
        <listitem>
          <para>Name of the meter. (<parameter>--meter-name</parameter>)</para>
        </listitem>
        <listitem>
          <para>Type of the meter. (<parameter>--meter-type</parameter>)</para>
          <para>Predefined meter types:</para>
          <itemizedlist>
            <listitem>
              <para>Gauge</para>
            </listitem>
            <listitem>
              <para>Delta</para>
            </listitem>
            <listitem>
              <para>Cumulative</para>
            </listitem>
          </itemizedlist>
        </listitem>
        <listitem>
          <para>Unit of the meter. (<parameter>--meter-unit</parameter>)</para>
        </listitem>
        <listitem>
          <para>Volume of the sample. (<parameter>--sample-volume</parameter>)</para>
        </listitem>
      </itemizedlist>
    </para>
    <para>The <literal>memory.usage</literal> meter is not supported when Libvirt
      is used in an OpenStack deployment. There is still a possibility to provide
      samples for this meter based on any custom measurements. To send samples to
      Telemetry using the command line client, invoke the following command:
      <screen><prompt>$</prompt> <userinput>ceilometer sample-create -r 37128ad6-daaa-4d22-9509-b7e1c6b08697 \
  -m memory.usage --meter-type gauge --meter-unit MB --sample-volume 48</userinput>
<?db-font-size 75%?><computeroutput>+-------------------+--------------------------------------------+
| Property          | Value                                      |
+-------------------+--------------------------------------------+
| message_id        | 6118820c-2137-11e4-a429-08002715c7fb       |
| name              | memory.usage                               |
| project_id        | e34eaa91d52a4402b4cb8bc9bbd308c1           |
| resource_id       | 37128ad6-daaa-4d22-9509-b7e1c6b08697       |
| resource_metadata | {}                                         |
| source            | e34eaa91d52a4402b4cb8bc9bbd308c1:openstack |
| timestamp         | 2014-08-11T09:10:46.358926                 |
| type              | gauge                                      |
| unit              | MB                                         |
| user_id           | 679b0499e7a34ccb9d90b64208401f8e           |
| volume            | 48.0                                       |
+-------------------+--------------------------------------------+</computeroutput></screen>
    </para>
  </section>
  <section xml:id="section_telemetry-data-collection-processing">
    <title>Data collection and processing</title>
    <para>The mechanism by which data is collected and processed is called a
      pipeline. Pipelines, at the configuration level, describe a coupling
      between sources of samples and the corresponding sinks for transformation
      and publication of data.</para>
    <para>A source is a producer of samples: in effect, a set of pollsters
      and/or notification handlers emitting samples for a set of matching meters.
    </para>
    <para>Each source configuration encapsulates meter name matching, polling
      interval determination, optional resource enumeration or discovery, and
      mapping to one or more sinks for publication.</para>
    <para>A sink, on the other hand, is a consumer of samples, providing logic
      for the transformation and publication of samples emitted from related
      sources. Each sink configuration is concerned only with the
      transformation rules and publication conduits for samples.</para>
    <para>In effect, a sink describes a chain of handlers. The chain starts
      with zero or more transformers and ends with one or more publishers.
      The first transformer in the chain is passed samples from the corresponding
      source, takes some action, such as deriving a rate of change, performing
      unit conversion, or aggregating, before passing the modified sample to the
      next step, which is described in
      <xref linkend="section_telemetry-publishers"/>.</para>
    <section xml:id="section_telemetry-pipeline-configuration">
      <title>Pipeline configuration</title>
      <para>The pipeline configuration is, by default, stored in a separate
        configuration file, called <filename>pipeline.yaml</filename>, next to
        the <filename>ceilometer.conf</filename> file. The pipeline
        configuration file can be set in the <parameter>pipeline_cfg_file</parameter>
        parameter listed in the <link xlink:href=
        "http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html"
        >Description of configuration options for api table</link> section in the
        <citetitle>OpenStack Configuration Reference</citetitle>. Multiple chains
        can be defined in one pipeline configuration file.</para>
      <para>The chain definition looks like the following:</para>
      <programlisting>---
sources:
    - name: 'source name'
      interval: 'how often should the samples be injected into the pipeline'
      meters:
          - 'meter filter'
      resources:
          - 'list of resource URLs'
      sinks:
          - 'sink name'
sinks:
    - name: 'sink name'
      transformers: 'definition of transformers'
      publishers:
          - 'list of publishers'</programlisting>
      <para>The interval parameter in the sources section should be defined in
        seconds. It determines the cadence of sample injection into the pipeline,
        where samples are produced under the direct control of an agent, for
        instance via a polling cycle, as opposed to incoming notifications.</para>
      <para>There are several ways to define the list of meters for a pipeline
        source. The list of valid meters can be found in the <link xlink:href=
        "http://docs.openstack.org/developer/ceilometer/measurements.html">Telemetry
        Measurements Reference</link> document. A source can operate on all
        meters, or only on included or excluded meters:</para>
      <itemizedlist>
        <listitem>
          <para>To include all meters, use the <literal>*</literal> wildcard symbol.</para>
        </listitem>
        <listitem>
          <para>To define the list of meters, use either of the following:</para>
          <itemizedlist>
            <listitem>
              <para>To define the list of included meters, use the <literal>meter_name</literal>
                syntax.</para>
            </listitem>
            <listitem>
              <para>To define the list of excluded meters, use the <literal>!meter_name</literal>
                syntax.</para>
            </listitem>
            <listitem>
              <para>For meters which have variants identified by a complex name
                field, use the wildcard symbol to select all, for example for
                "instance:m1.tiny", use "instance:*".</para>
            </listitem>
          </itemizedlist>
        </listitem>
      </itemizedlist>
      <note>
        <para>Be aware that there is no duplication check between pipelines: if
          you add a meter to multiple pipelines, it will be polled in each of
          them and will also be stored multiple times according to the specified
          sinks.</para>
      </note>
      <para>The above definition methods can be used in the following combinations:</para>
      <itemizedlist>
        <listitem>
          <para>Use only the wildcard symbol.</para>
        </listitem>
        <listitem>
          <para>Use the list of included meters.</para>
        </listitem>
        <listitem>
          <para>Use the list of excluded meters.</para>
        </listitem>
        <listitem>
          <para>Use the wildcard symbol with the list of excluded meters.</para>
        </listitem>
      </itemizedlist>
      <note>
        <para>At least one of the above variations should be included in the meters section.
          Included and excluded meters cannot co-exist in the same pipeline. Wildcard and
          included meters cannot co-exist in the same pipeline definition section.</para>
      </note>
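      <para>Combining these rules, a source that uses the wildcard together with
        a list of excluded meters could be defined as follows. The source, sink,
        and publisher names as well as the interval value are illustrative
        assumptions, not values mandated by Telemetry:</para>
      <programlisting>---
sources:
    - name: disk_source
      interval: 600
      meters:
          # Wildcard plus exclusions: all disk meters except the
          # per-request counters.
          - "disk.*"
          - "!disk.read.requests"
          - "!disk.write.requests"
      sinks:
          - disk_sink
sinks:
    - name: disk_sink
      transformers:
      publishers:
          - notifier://</programlisting>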
      <para>The optional resources section of a pipeline source allows a static
        list of resource URLs to be configured for polling.</para>
      <para>The transformers section of a pipeline sink provides the possibility
        to add a list of transformer definitions. The available transformers are:</para>
      <table rules="all">
        <caption>List of available transformers</caption>
        <col width="50%"/>
        <col width="50%"/>
        <thead>
          <tr>
            <td>Name of transformer</td>
            <td>Reference name for configuration</td>
          </tr>
        </thead>
        <tbody>
          <tr>
            <td>Accumulator</td>
            <td>accumulator</td>
          </tr>
          <tr>
            <td>Aggregator</td>
            <td>aggregator</td>
          </tr>
          <tr>
            <td>Arithmetic</td>
            <td>arithmetic</td>
          </tr>
          <tr>
            <td>Rate of change</td>
            <td>rate_of_change</td>
          </tr>
          <tr>
            <td>Unit conversion</td>
            <td>unit_conversion</td>
          </tr>
        </tbody>
      </table>
      <para>The publishers section contains the list of publishers, to which the
        sample data should be sent after the possible transformations.</para>
      <section xml:id="section_telemetry-pipeline-transformers">
        <title>Transformers</title>
        <para>The definition of transformers can contain the following fields:</para>
        <para>
          <variablelist>
            <varlistentry>
              <term>name</term>
              <listitem>
                <para>Name of the transformer.</para>
              </listitem>
            </varlistentry>
            <varlistentry>
              <term>parameters</term>
              <listitem>
                <para>Parameters of the transformer.</para>
              </listitem>
            </varlistentry>
          </variablelist>
        </para>
        <para>The parameters section can contain transformer-specific fields,
          like the source and target fields with different subfields in the case
          of the rate of change transformer, depending on the implementation of
          the transformer.</para>
        <simplesect>
          <title>Rate of change transformer</title>
          <para>In the case of the transformer that creates the
            <literal>cpu_util</literal> meter, the definition looks like the following:</para>
          <programlisting>transformers:
    - name: "rate_of_change"
      parameters:
          target:
              name: "cpu_util"
              unit: "%"
              type: "gauge"
              scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"</programlisting>
          <para>The rate of change transformer generates the
            <literal>cpu_util</literal> meter from the sample values of the
            <literal>cpu</literal> counter, which represents cumulative CPU time
            in nanoseconds. The transformer definition above defines a scale
            factor (for nanoseconds, multiple CPUs, and so forth), which is
            applied before the transformation derives a sequence of gauge samples
            with unit '%', from sequential values of the
            <literal>cpu</literal> meter.</para>
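          <para>As a minimal standalone illustration of this derivation (not
            Telemetry code), the following Python sketch computes
            <literal>cpu_util</literal> from two consecutive cumulative
            <literal>cpu</literal> samples:</para>
          <programlisting language="python">def cpu_util(prev_ns, curr_ns, prev_t, curr_t, cpu_number=1):
    # Rate of change of cumulative CPU time: nanoseconds of CPU
    # consumed per second of wall-clock time.
    rate = (curr_ns - prev_ns) / (curr_t - prev_t)
    # Apply the scale factor from the transformer definition:
    # 100.0 / (10**9 * (cpu_number or 1))
    return rate * 100.0 / (10**9 * (cpu_number or 1))

# Two vCPUs consuming 2 s of CPU time over a 10 s window -> 10.0 (%)
print(cpu_util(0, 2 * 10**9, 0.0, 10.0, cpu_number=2))</programlisting>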
          <para>The definition for the disk I/O rate, which is also generated by
            the rate of change transformer:</para>
          <programlisting>transformers:
    - name: "rate_of_change"
      parameters:
          source:
              map_from:
                  name: "disk\\.(read|write)\\.(bytes|requests)"
                  unit: "(B|request)"
          target:
              map_to:
                  name: "disk.\\1.\\2.rate"
                  unit: "\\1/s"
              type: "gauge"</programlisting>
        </simplesect>
        <simplesect>
          <title>Unit conversion transformer</title>
          <para>This transformer applies a unit conversion. It takes the volume
            of the meter and multiplies it by the given 'scale' expression. It
            also supports <literal>map_from</literal> and
            <literal>map_to</literal>, like the rate of change transformer.</para>
          <para>Sample configuration:</para>
          <programlisting>transformers:
    - name: "unit_conversion"
      parameters:
          target:
              name: "disk.kilobytes"
              unit: "KB"
              scale: "1.0 / 1024.0"</programlisting>
          <para>With the <parameter>map_from</parameter> and
            <parameter>map_to</parameter>:</para>
          <programlisting>transformers:
    - name: "unit_conversion"
      parameters:
          source:
              map_from:
                  name: "disk\\.(read|write)\\.bytes"
          target:
              map_to:
                  name: "disk.\\1.kilobytes"
              scale: "1.0 / 1024.0"
              unit: "KB"</programlisting>
        </simplesect>
        <simplesect>
          <title>Aggregator transformer</title>
          <para>A transformer that sums up the incoming samples until enough
            samples have come in or a timeout has been reached.</para>
          <para>A timeout can be specified with the
            <parameter>retention_time</parameter> parameter. To flush the
            aggregation after a set number of samples have been aggregated,
            specify the <parameter>size</parameter> parameter.</para>
          <para>The volume of the created sample is the sum of the volumes of the
            samples that came into the transformer. Samples can be aggregated by
            the attributes <parameter>project_id</parameter>,
            <parameter>user_id</parameter> and
            <parameter>resource_metadata</parameter>. To aggregate by the chosen
            attributes, specify them in the configuration and set which value of
            the attribute to take for the new sample
            (<literal>first</literal> to take the first sample's attribute,
            <literal>last</literal> to take the last sample's attribute, and
            <literal>drop</literal> to discard the attribute).</para>
          <para>To aggregate 60s worth of samples by
            <parameter>resource_metadata</parameter> and keep the
            <parameter>resource_metadata</parameter> of the latest received
            sample:</para>
          <programlisting>transformers:
    - name: "aggregator"
      parameters:
          retention_time: 60
          resource_metadata: last</programlisting>
          <para>To aggregate each 15 samples by <parameter>user_id</parameter>
            and <parameter>resource_metadata</parameter> and keep the
            <parameter>user_id</parameter> of the first received sample and drop
            the <parameter>resource_metadata</parameter>:</para>
          <programlisting>transformers:
    - name: "aggregator"
      parameters:
          size: 15
          user_id: first
          resource_metadata: drop</programlisting>
|
|
</simplesect>
<simplesect>
<title>Accumulator transformer</title>
<para>This transformer simply caches the samples until enough samples have arrived and
then flushes them all down the pipeline at once.</para>
<programlisting>transformers:
    - name: "accumulator"
      parameters:
          size: 15</programlisting>
</simplesect>
<simplesect>
<title>Multi meter arithmetic transformer</title>
<para>This transformer enables you to perform arithmetic calculations over one or more
meters and/or their metadata, for example:</para>
<programlisting>memory_util = 100 * memory.usage / memory</programlisting>
<para>A new sample is created with the properties described in the <literal>target
</literal> section of the transformer's configuration. The sample's volume is the
result of the provided expression. The calculation is performed on samples from
the same resource.</para>
<note>
<para>The calculation is limited to meters with the same interval.</para>
</note>
<para>Example configuration:</para>
<programlisting>transformers:
    - name: "arithmetic"
      parameters:
          target:
              name: "memory_util"
              unit: "%"
              type: "gauge"
              expr: "100 * $(memory.usage) / $(memory)"</programlisting>
<para>To demonstrate the use of metadata, here is the implementation of a silly meter
that shows average CPU time per core:</para>
<programlisting>transformers:
    - name: "arithmetic"
      parameters:
          target:
              name: "avg_cpu_per_core"
              unit: "ns"
              type: "cumulative"
              expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"</programlisting>
<note>
<para>Expression evaluation gracefully handles NaNs and exceptions. In such a case,
it does not create a new sample but only logs a warning.</para>
</note>
</simplesect>
</section>
</section>
</section>
<section xml:id="section_telemetry-cinder-audit-script">
<title>Block Storage audit script setup to get notifications</title>
<para>If you want to collect OpenStack Block Storage notifications on demand,
you can use <command>cinder-volume-usage-audit</command> from OpenStack Block Storage.
This script becomes available when you install OpenStack Block Storage, so you can
use it without any specific settings, and you do not need to authenticate to
access the data. To use it, you must run this command in the following format:</para>
<screen><prompt>$</prompt> <userinput>cinder-volume-usage-audit \
--start_time='YYYY-MM-DD HH:MM:SS' --end_time='YYYY-MM-DD HH:MM:SS' --send_actions</userinput></screen>
<para>This script outputs which volumes or snapshots were created, deleted, or
existed in a given period of time, together with some information about these
volumes or snapshots. Information about the existence and size of volumes and
snapshots is stored in the Telemetry module.</para>
<para>Using this script via cron, you can get notifications periodically, for
example, every 5 minutes:</para>
<programlisting>*/5 * * * * /path/to/cinder-volume-usage-audit --send_actions</programlisting>
</section>
<section xml:id="section_telemetry-storing-data">
<title>Storing samples</title>
<para>The Telemetry module has a separate service that is responsible for persisting the
data that is coming from the pollsters or received as notifications. The data can be
stored in a file or a database back end, for which the list of supported databases can
be found in <xref linkend="section_telemetry-supported-dbs"/>. The data can also be sent
to an external data store by using an HTTP dispatcher.</para>
<para>The <systemitem class="service">ceilometer-collector</systemitem> service receives the
samples as metering messages from the message bus of the configured AMQP service. It stores
these samples without any modification in the configured file or database back end, or in
the external data store as dispatched by the HTTP dispatcher. The service has to run on a
host machine from which it has access to the configured dispatcher.</para>
<note>
<para>Multiple dispatchers can be configured for Telemetry at one time.</para>
</note>
<para>Multiple <systemitem class="service">ceilometer-collector</systemitem> processes can
be run at a time. Starting multiple worker threads per collector process is also supported.
The <parameter>collector_workers</parameter> configuration option has to be modified in the
<link xlink:href=
"http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
collector section</link> of the <filename>ceilometer.conf</filename>
configuration file.</para>
<note>
<para>Using multiple workers per collector process is not recommended with
PostgreSQL as the database back end.</para>
</note>
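<para>For example, the following <filename>ceilometer.conf</filename> fragment starts
three worker threads per collector process; the option name and section are taken from
the description above, so verify them against the configuration reference for your
version:</para>
<programlisting># Number of worker threads started by each
# ceilometer-collector process (assumed option name)
[collector]
collector_workers = 3</programlisting>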
<simplesect>
<title>Database dispatcher</title>
<para>When the database dispatcher is configured as the data store, you have the option
to set a <parameter>time_to_live</parameter> (ttl) parameter for samples. By default, the
time to live value for samples is set to -1, which means that they are kept in the
database forever.</para>
<para>The time to live value is specified in seconds. Each sample has a time stamp, and the
<literal>ttl</literal> value indicates that a sample is deleted from the database when
the number of seconds has elapsed since that sample reading was stamped. For example,
if the sampling occurs every 60 seconds, and the time to live is set to 600, only ten
samples are stored in the database.</para>
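<para>As an illustration, a <filename>ceilometer.conf</filename> fragment for the
600-second example above might look like the following; the section and option name are
assumptions based on the description here, so check them against the configuration
reference:</para>
<programlisting># Keep samples for 600 seconds; -1 (the default)
# keeps samples forever (assumed option name)
[database]
time_to_live = 600</programlisting>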
<para>When the time to live has expired, <literal>resources</literal> and
<literal>users</literal> can remain in the database without a corresponding sample,
so you may need to delete the entries related to the expired samples. The
command-line script that you can use for this purpose is
<systemitem class="service">ceilometer-expirer</systemitem>. You can run it in a
cron job, which helps to keep your database in a consistent state.</para>
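<para>For example, a crontab entry of the following form runs the script once a day at
02:00; the installation path is a placeholder and may differ on your system:</para>
<programlisting>0 2 * * * /path/to/ceilometer-expirer</programlisting>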
<para>The level of support differs depending on the configured back end:</para>
<table rules="all">
<caption>Time-to-live support for database back ends</caption>
<col width="24%"/>
<col width="38%"/>
<col width="38%"/>
<thead>
<tr>
<td>Database</td>
<td>ttl value support</td>
<td><systemitem class="service">ceilometer-expirer</systemitem>
capabilities</td>
</tr>
</thead>
<tbody>
<tr>
<td>MongoDB</td>
<td>MongoDB has a built-in mechanism for deleting samples that are older
than the configured ttl value.</td>
<td>In case of this database,
<systemitem class="service">ceilometer-expirer</systemitem> deletes only the
lingering dead resource, user, and project entries.</td>
</tr>
<tr>
<td>SQL-based back ends</td>
<td>The library (SQLAlchemy) that is used for accessing SQL-based back ends does
not support using the ttl value.</td>
<td><systemitem class="service">ceilometer-expirer</systemitem> has to be
used for deleting both the samples and the remaining entries in other
database tables. The script deletes samples based on the
<parameter>time_to_live</parameter> value that is set in the
configuration file.</td>
</tr>
<tr>
<td>HBase</td>
<td>HBase currently does not support this functionality, therefore the ttl value
in the configuration file is ignored.</td>
<td>The samples are not deleted by
<systemitem class="service">ceilometer-expirer</systemitem>, as
this functionality is not supported.</td>
</tr>
<tr>
<td>DB2</td>
<td>Same case as MongoDB.</td>
<td>Same case as MongoDB.</td>
</tr>
</tbody>
</table>
</simplesect>
<simplesect>
<title>HTTP dispatcher</title>
<para>The Telemetry module supports sending samples to an external HTTP target. The
samples are sent without any modification. To set this option as the data store, the
<option>dispatcher</option> has to be changed to <literal>http</literal> in the
<filename>ceilometer.conf</filename> configuration file. For the list
of options that you need to set, see the
<link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
<literal>dispatcher_http</literal> section</link>
in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
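<para>A minimal <filename>ceilometer.conf</filename> fragment might look like the
following sketch; the target URL is a placeholder, and the exact option names should be
verified in the referenced section:</para>
<programlisting>[DEFAULT]
dispatcher = http

[dispatcher_http]
# Placeholder URL of the external data store
target = http://hostname:8080/
# Seconds to wait for the target to respond (assumed option name)
timeout = 5</programlisting>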
</simplesect>
<simplesect>
<title>File dispatcher</title>
<para>You can store samples in a file by setting the <option>dispatcher</option>
option in <filename>ceilometer.conf</filename> to <literal>file</literal>. For the list
of configuration options, see the
<link xlink:href="http://docs.openstack.org/juno/config-reference/content/ch_configuring-openstack-telemetry.html">
<literal>dispatcher_file</literal> section</link>
in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
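<para>For example, the following fragment writes samples to a local file; the file path
is a placeholder, and the option names should be verified in the referenced
section:</para>
<programlisting>[DEFAULT]
dispatcher = file

[dispatcher_file]
# Placeholder name and location of the file to store samples
file_path = /var/log/ceilometer/samples.log</programlisting>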
</simplesect>
</section>
</section>