Add content and ToC to Telemetry Admin guide
Add content to the section System architecture and Data collection and add table of contents to Retrieving data and Alarming. Implements: blueprint add-ceilometer-admin-guide-to-openstack-manuals Change-Id: I56065c21d62e3477900981a4f37af0cc95fe64b9
parent 2509c15c50
commit 95a077a0ff
@ -8,11 +8,41 @@
|
||||
<para>The Telemetry module is the metering service in OpenStack.</para>
|
||||
<section xml:id="section_telemetry-introduction">
|
||||
<title>Introduction</title>
|
||||
|
||||
<para>Even in the cloud industry, providers must use a multi-step
|
||||
process for billing. The required steps to bill for usage in a
|
||||
cloud environment are metering, rating, and billing. Because the
|
||||
provider's requirements may be far too specific for a shared
|
||||
solution, rating and billing solutions cannot be designed as a
|
||||
common module that satisfies all. Providing users with measurements
|
||||
on cloud services is required to meet the "measured service"
|
||||
definition of cloud computing.</para>
|
||||
<para>The Telemetry module was originally designed to support billing
|
||||
systems for OpenStack cloud resources. This project only covers the
|
||||
metering portion of the required processing for billing. This module
|
||||
collects information about the system and stores it in the form of
|
||||
samples in order to provide data about anything that can be billed.
|
||||
</para>
|
||||
<para>The list of meters is continuously growing, which makes it
|
||||
possible to use the data collected by Telemetry for different
|
||||
purposes, other than billing. For example, the autoscaling feature
|
||||
in the Orchestration module can be triggered by alarms that the module
sets in Telemetry and is then notified about.</para>
|
||||
<para>The sections in this document contain information about the
|
||||
architecture and usage of Telemetry. The first section contains a
|
||||
brief summary of the system architecture used in a typical
|
||||
OpenStack deployment. The second section describes the data collection
|
||||
mechanisms. You can also read about alarming to understand how alarm
|
||||
definitions can be posted to Telemetry and what actions can happen if
|
||||
an alarm is raised. The last section contains a troubleshooting
|
||||
guide, which mentions error situations and possible solutions for the
|
||||
problems.</para>
|
||||
<para>You can retrieve the collected samples in three different ways: with
|
||||
the REST API, with the command line interface, or with the Metering tab
|
||||
on an OpenStack dashboard.</para>
|
||||
</section>
|
||||
<xi:include href="telemetry/section_telemetry-system-architecture.xml"/>
|
||||
<xi:include href="telemetry/section_telemetry-data-collection.xml"/>
|
||||
<xi:include href="telemetry/section_telemetry-data-retrieval.xml"/>
|
||||
<xi:include href="telemetry/section_telemetry-alarms.xml"/>
|
||||
<xi:include href="telemetry/section_telemetry-troubleshooting-guide.xml"/>
|
||||
</chapter>
|
||||
</chapter>
|
||||
|
@ -5,5 +5,713 @@
|
||||
version="5.0"
|
||||
xml:id="section_telemetry-data-collection">
|
||||
<title>Data collection</title>
|
||||
|
||||
<para>The main responsibility of Telemetry in OpenStack is to collect
|
||||
information about the system that can be used by billing systems or by
analytic tools, for instance. The original focus of the collected data
was on the counters that can be used for billing,
but the range is continuously widening.</para>
|
||||
<para>Collected data can be stored in the form of samples or events in the
|
||||
supported databases, listed in
|
||||
<xref linkend="section_telemetry-supported-dbs"/>.</para>
|
||||
<para>Samples can come from various sources, depending on the needs
and configuration of Telemetry, so multiple methods are required to
collect data.</para>
|
||||
<para>The available data collection mechanisms are:</para>
|
||||
<para>
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>Notifications</term>
|
||||
<listitem>
|
||||
<para>Processing notifications from other OpenStack services, by
|
||||
consuming messages from the configured message queue system.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term>Polling</term>
|
||||
<listitem>
|
||||
<para>Retrieving information directly from the hypervisor or from the host
|
||||
machine using SNMP, or by using the APIs of other OpenStack services.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term>RESTful API</term>
|
||||
<listitem>
|
||||
<para>Pushing samples via the RESTful API of Telemetry.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
<section xml:id="section_telemetry-notifications">
|
||||
<title>Notifications</title>
|
||||
<para>All the services send notifications about the executed operations or system
|
||||
state in OpenStack. Several notifications carry information that can be metered,
|
||||
such as the creation of a new VM instance by the OpenStack Compute service.</para>
|
||||
<para>The Telemetry module has a separate agent that is responsible for consuming
|
||||
notifications, namely the notification agent. This component is responsible for
|
||||
consuming from the message bus and transforming notifications into new samples.
|
||||
</para>
|
||||
<para>The different OpenStack services emit several notifications about the various
|
||||
types of events that happen in the system during normal operation. Not all these
|
||||
notifications are consumed by the Telemetry module, as the intention is only to
|
||||
capture the billable events and all those notifications that can be used for
|
||||
monitoring or profiling purposes. The notification agent filters by the event
type that is contained in each notification message. The following table
lists, for each OpenStack service, the event types that are transformed into samples
by Telemetry.</para>
|
||||
<table rules="all">
|
||||
<caption>Consumed event types from OpenStack services</caption>
|
||||
<col width="33%"/>
|
||||
<col width="33%"/>
|
||||
<col width="33%"/>
|
||||
<thead>
|
||||
<tr>
|
||||
<td>OpenStack service</td>
|
||||
<td>Event types</td>
|
||||
<td>Note</td>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>OpenStack Compute</td>
|
||||
<td><para>scheduler.run_instance.scheduled</para>
|
||||
<para>compute.instance.*</para></td>
|
||||
<td>For a more detailed list of Compute notifications please check the
|
||||
<link xlink:href="https://wiki.openstack.org/wiki/SystemUsageData">
|
||||
System Usage Data wiki page</link>.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Bare metal module for OpenStack</td>
|
||||
<td>hardware.ipmi.*</td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>OpenStack Image Service</td>
|
||||
<td><para>image.update</para>
|
||||
<para>image.upload</para>
|
||||
<para>image.delete</para>
|
||||
<para>image.send</para></td>
|
||||
<td>The required configuration for Image service can be found in the
|
||||
<link xlink:href=
|
||||
"http://docs.openstack.org/trunk/install-guide/install/apt/content/ceilometer-install-glance.html">
|
||||
Configure the Image Service for Telemetry</link> section
|
||||
in the <citetitle>OpenStack Installation Guide</citetitle>.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>OpenStack Networking</td>
|
||||
<td><para>floatingip.create.end</para>
|
||||
<para>floatingip.update.*</para>
|
||||
<para>floatingip.exists</para>
|
||||
<para>network.create.end</para>
|
||||
<para>network.update.*</para>
|
||||
<para>network.exists</para>
|
||||
<para>port.create.end</para>
|
||||
<para>port.update.*</para>
|
||||
<para>port.exists</para>
|
||||
<para>router.create.end</para>
|
||||
<para>router.update.*</para>
|
||||
<para>router.exists</para>
|
||||
<para>subnet.create.end</para>
|
||||
<para>subnet.update.*</para>
|
||||
<para>subnet.exists</para>
|
||||
<para>l3.meter</para></td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Orchestration module</td>
|
||||
<td><para>orchestration.stack.create.end</para>
|
||||
<para>orchestration.stack.update.end</para>
|
||||
<para>orchestration.stack.delete.end</para>
|
||||
<para>orchestration.stack.resume.end</para>
|
||||
<para>orchestration.stack.suspend.end</para></td>
|
||||
<td></td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>OpenStack Block Storage</td>
|
||||
<td><para>volume.exists</para>
|
||||
<para>volume.create.*</para>
|
||||
<para>volume.delete.*</para>
|
||||
<para>volume.resize.*</para>
|
||||
<para>snapshot.exists</para>
|
||||
<para>snapshot.create.*</para>
|
||||
<para>snapshot.delete.*</para>
|
||||
<para>snapshot.resize.*</para></td>
|
||||
<td>The required configuration for Block Storage service can be found in the
|
||||
<link xlink:href=
|
||||
"http://docs.openstack.org/trunk/install-guide/install/apt/content/ceilometer-install-cinder.html">
|
||||
Add the Block Storage service agent for Telemetry</link>
|
||||
section in the <citetitle>OpenStack Installation Guide</citetitle>.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
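<para>The filtering by event type can be illustrated with a short Python sketch. It is not the notification agent's actual code: the <literal>is_consumed</literal> helper and the use of shell-style glob matching are assumptions made for illustration, and only the Compute and Image Service patterns from the table above are included.</para>

```python
from fnmatch import fnmatchcase

# Patterns copied from the Compute and Image Service rows of the table above.
CONSUMED_EVENT_TYPES = [
    "scheduler.run_instance.scheduled",
    "compute.instance.*",
    "image.update",
    "image.upload",
    "image.delete",
    "image.send",
]


def is_consumed(event_type, patterns=CONSUMED_EVENT_TYPES):
    """Return True if the event type matches any of the patterns.

    Hypothetical helper: glob matching is an assumption, not the
    documented behavior of the notification agent.
    """
    return any(fnmatchcase(event_type, pattern) for pattern in patterns)
```

<para>In this sketch an event type such as <literal>compute.instance.create.end</literal> is accepted through the wildcard pattern, while an event type that matches no pattern is ignored.</para>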
|
||||
<note>
|
||||
<para>Some services require additional configuration to emit the notifications
|
||||
using the correct control exchange on the message queue and so forth. These
|
||||
configuration needs are referred to in the table above for each OpenStack service
|
||||
that needs it.</para>
|
||||
</note>
|
||||
<note>
|
||||
<para>When the <literal>store_events</literal> option is set to True in
|
||||
<filename>ceilometer.conf</filename>, the notification agent needs database access
|
||||
in order to work properly.</para>
|
||||
</note>
|
||||
<section xml:id="section_telemetry-objectstore-middleware">
|
||||
<title>Middleware for OpenStack Object Storage service</title>
|
||||
<para>A subset of Object Store statistics requires an additional middleware to be installed
|
||||
behind the proxy of Object Store. This additional component emits notifications containing
|
||||
the data-flow-oriented meters, namely the storage.objects.(incoming|outgoing).bytes values.
|
||||
These meters are listed in the <link xlink:href=
|
||||
"http://docs.openstack.org/developer/ceilometer/measurements.html#object-storage-swift">
|
||||
Swift</link> table section in the <citetitle>Telemetry Measurements Reference</citetitle>,
|
||||
marked with <literal>notification</literal> as origin.</para>
|
||||
<para>The instructions on how to install this middleware can be found in <link xlink:href=
|
||||
"http://docs.openstack.org/trunk/install-guide/install/apt/content/ceilometer-install-swift.html">
|
||||
Configure the Object Storage service for Telemetry</link>
|
||||
section in the <citetitle>OpenStack Installation Guide</citetitle>.
|
||||
</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-middleware">
|
||||
<title>Telemetry middleware</title>
|
||||
<para>Telemetry provides the capability of counting the HTTP requests and responses
|
||||
for each API endpoint in OpenStack. This is achieved by storing a sample for each
|
||||
event marked as <literal>http.request</literal> or <literal>http.response</literal>.</para>
|
||||
<para>Telemetry can consume these events if the services are configured to emit
|
||||
notifications with these two event types.</para>
|
||||
</section>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-polling">
|
||||
<title>Polling</title>
|
||||
<para>The Telemetry module is intended to store a complex picture of the
|
||||
infrastructure. This goal requires more information than what is
|
||||
provided by the events and notifications published by each service.
|
||||
Some information is not emitted directly, like resource usage of the VM
|
||||
instances.</para>
|
||||
<para>Therefore Telemetry uses another method to gather this data by polling
|
||||
the infrastructure including the APIs of the different OpenStack services and
|
||||
other assets, like hypervisors. The latter case requires closer interaction with
|
||||
the compute hosts. To solve this issue, Telemetry uses an agent-based
architecture to fulfill the data collection requirements.</para>
|
||||
<para>There are two agents supporting the polling mechanism, namely the compute
|
||||
agent and the central agent. The following subsections give further information
|
||||
regarding the architectural and configuration details of these components.
|
||||
</para>
|
||||
<section xml:id="section_telemetry-central-agent">
|
||||
<title>Central agent</title>
|
||||
<para>As the name of this agent shows, it is a central component in the
|
||||
Telemetry architecture. This agent is responsible for polling public REST APIs
|
||||
to retrieve additional information on OpenStack resources not already surfaced
|
||||
via notifications, and also for polling hardware resources over SNMP.</para>
|
||||
<para>The following services can be polled with this agent:
|
||||
</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>OpenStack Networking</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>OpenStack Object Storage</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>OpenStack Block Storage</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Hardware resources via SNMP</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Energy consumption metrics via the <link xlink:href="https://launchpad.net/kwapi">
|
||||
Kwapi</link> framework</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>To install and configure this service use the <link xlink:href=
|
||||
"http://docs.openstack.org/trunk/install-guide/install/apt/content/ceilometer-install.html">
|
||||
Install the Telemetry module</link> section in the <citetitle>OpenStack
|
||||
Installation Guide</citetitle>.</para>
|
||||
<para>Currently, the central agent can run only as a single instance. It does not need
|
||||
any database connection directly. The samples collected by this agent are sent via
|
||||
message queue to the collector service, which is responsible for persisting the
|
||||
data into the configured database backend.</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-compute-agent">
|
||||
<title>Compute agent</title>
|
||||
<para>This agent is responsible for collecting resource usage data of VM
|
||||
instances on individual compute nodes within an OpenStack deployment. This
|
||||
mechanism requires a closer interaction with the hypervisor, therefore a
separate agent type, which is placed on the host machines, collects the
related meters and retrieves this information locally.</para>
|
||||
<para>A compute agent instance has to be installed on every compute node.
Installation instructions can be found in the <link xlink:href=
|
||||
"http://docs.openstack.org/trunk/install-guide/install/apt/content/ceilometer-install-nova.html">
|
||||
Install the Compute agent for Telemetry</link> section in the
|
||||
<citetitle>OpenStack Installation Guide</citetitle>.
|
||||
</para>
|
||||
<para>Just like the central agent, this component does not need direct database
access. The samples are sent via AMQP to the collector.
|
||||
</para>
|
||||
<para>The list of supported hypervisors can be found in
|
||||
<xref linkend="section_telemetry-supported-hypervisors"/>.
|
||||
The compute agent uses the API of the hypervisor installed on the compute hosts.
|
||||
Therefore the supported meters can differ for each virtualization
backend, as these tools provide different sets of metrics.</para>
|
||||
<para>The list of collected meters can be found in the <link xlink:href=
|
||||
"http://docs.openstack.org/developer/ceilometer/measurements.html#compute-nova">
|
||||
Compute section</link> in the <citetitle>Telemetry Measurements Reference</citetitle>.
|
||||
The support column shows which meters are available for
|
||||
each hypervisor supported by the Telemetry module.</para>
|
||||
<note>
|
||||
<para>Telemetry supports Libvirt, which hides the hypervisor under it.</para>
|
||||
</note>
|
||||
</section>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-post-api">
|
||||
<title>Send samples to Telemetry</title>
|
||||
<para>Most of the data collection in the Telemetry module is automated.
|
||||
Telemetry provides the possibility to submit samples via the REST API to allow
|
||||
users to send custom samples into this module.</para>
|
||||
<para>This option makes it possible to send any kind of sample without
writing extra code or making configuration changes.</para>
|
||||
<para>The samples that can be sent to Telemetry are not limited to the
existing meters. You can provide data for any new, customer-defined
counter by filling out all the required fields of the POST request.
|
||||
</para>
|
||||
<para>If the sample corresponds to an existing meter, then fields such as
<literal>meter-type</literal> and the meter name must match that meter.</para>
|
||||
<para>The required fields for sending a sample using the command line client
|
||||
are:
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>ID of the corresponding resource. (<parameter>--resource-id</parameter>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Name of meter. (<parameter>--meter-name</parameter>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Type of meter. (<parameter>--meter-type</parameter>)</para>
|
||||
<para>Predefined meter types:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Gauge</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Delta</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Cumulative</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Unit of meter. (<parameter>--meter-unit</parameter>)</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Volume of sample. (<parameter>--sample-volume</parameter>)</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
<para>The <literal>memory.usage</literal> meter is not supported when Libvirt is used in an
|
||||
OpenStack deployment. There is still a possibility to provide samples for
|
||||
this meter based on any custom measurements. To send samples to Telemetry
|
||||
using the command line client, the following command should be invoked:
|
||||
<screen><prompt>$</prompt> <userinput>ceilometer sample-create -r 37128ad6-daaa-4d22-9509-b7e1c6b08697 \
|
||||
-m memory.usage --meter-type gauge --meter-unit MB --sample-volume 48</userinput>
|
||||
<?db-font-size 75%?><computeroutput>+-------------------+--------------------------------------------+
|
||||
| Property | Value |
|
||||
+-------------------+--------------------------------------------+
|
||||
| message_id | 6118820c-2137-11e4-a429-08002715c7fb |
|
||||
| name | memory.usage |
|
||||
| project_id | e34eaa91d52a4402b4cb8bc9bbd308c1 |
|
||||
| resource_id | 37128ad6-daaa-4d22-9509-b7e1c6b08697 |
|
||||
| resource_metadata | {} |
|
||||
| source | e34eaa91d52a4402b4cb8bc9bbd308c1:openstack |
|
||||
| timestamp | 2014-08-11T09:10:46.358926 |
|
||||
| type | gauge |
|
||||
| unit | MB |
|
||||
| user_id | 679b0499e7a34ccb9d90b64208401f8e |
|
||||
| volume | 48.0 |
|
||||
+-------------------+--------------------------------------------+</computeroutput></screen>
|
||||
</para>
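<para>The same sample could also be prepared for the REST API. The following Python sketch only builds the request body and sends nothing. The <literal>build_sample</literal> helper is hypothetical, and the <literal>counter_*</literal> key names and the list-shaped body are assumptions that should be checked against the Telemetry API reference before use.</para>

```python
import json

# Meter types named in the list of required fields above.
METER_TYPES = ("gauge", "delta", "cumulative")


def build_sample(resource_id, meter_name, meter_type, meter_unit, volume):
    """Build one sample as a dict; the key names are assumed, not confirmed."""
    if meter_type not in METER_TYPES:
        raise ValueError("unknown meter type: %s" % meter_type)
    return {
        "resource_id": resource_id,
        "counter_name": meter_name,
        "counter_type": meter_type,
        "counter_unit": meter_unit,
        "counter_volume": float(volume),
    }


sample = build_sample(
    "37128ad6-daaa-4d22-9509-b7e1c6b08697", "memory.usage", "gauge", "MB", 48
)
# Assumption: the request body is a JSON list of samples.
body = json.dumps([sample])
```

<para>The values mirror the command line example above, so the two approaches can be compared field by field.</para>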
|
||||
</section>
|
||||
<section xml:id="section_telemetry-data-collection-processing">
|
||||
<title>Data collection and processing</title>
|
||||
<para>The mechanism by which data is collected and processed is called a
pipeline. At the configuration level, pipelines describe a coupling between
sources of samples and the corresponding sinks for transformation and
publication of the data.</para>
|
||||
<para>A source is a producer of samples, in effect a set of pollsters
|
||||
and/or notification handlers emitting samples for a set of matching meters.
|
||||
</para>
|
||||
<para>Each source configuration encapsulates meter name matching, polling
|
||||
interval determination, optional resource enumeration or discovery, and
|
||||
mapping to one or more sinks for publication.</para>
|
||||
<para>A sink on the other hand is a consumer of samples, providing logic
|
||||
for the transformation and publication of samples emitted from related
|
||||
sources. Each sink configuration is concerned only with the
|
||||
transformation rules and publication conduits for samples.</para>
|
||||
<para>In effect, a sink describes a chain of handlers. The chain starts
|
||||
with zero or more transformers and ends with one or more publishers.
|
||||
The first transformer in the chain is passed samples from the corresponding
|
||||
source, takes some action such as deriving rate of change, performing unit
|
||||
conversion, or aggregating, before passing the modified sample to the
|
||||
next step that is described in
|
||||
<xref linkend="section_telemetry-publishers"/>.</para>
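<para>The chain of handlers can be pictured with a minimal Python sketch. It is a conceptual model only, not the module's implementation: the <literal>Sink</literal> class, the <literal>scale_to_kilobytes</literal> transformer, and the dictionary-shaped samples are all hypothetical.</para>

```python
def scale_to_kilobytes(sample):
    """Hypothetical transformer: convert a volume from bytes to kilobytes."""
    converted = dict(sample)
    converted["volume"] = sample["volume"] / 1024.0
    converted["unit"] = "KB"
    return converted


class Sink:
    """A chain of zero or more transformers ending in one or more publishers."""

    def __init__(self, transformers, publishers):
        if not publishers:
            raise ValueError("a sink needs at least one publisher")
        self.transformers = list(transformers)
        self.publishers = list(publishers)

    def handle(self, sample):
        # Each transformer receives the output of the previous one.
        for transform in self.transformers:
            sample = transform(sample)
        # Every publisher gets the final, transformed sample.
        for publish in self.publishers:
            publish(sample)


published = []
sink = Sink([scale_to_kilobytes], [published.append])
sink.handle({"name": "disk.read.bytes", "volume": 2048, "unit": "B"})
```

<para>Here a plain list stands in for a publisher, which keeps the sketch free of any messaging dependency.</para>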
|
||||
<section xml:id="section_telemetry-pipeline-configuration">
|
||||
<title>Pipeline configuration</title>
|
||||
<para>By default, the pipeline configuration is stored in a separate configuration
|
||||
file, called <filename>pipeline.yaml</filename>, next to the
|
||||
<filename>ceilometer.conf</filename> file. The pipeline
|
||||
configuration file can be set in the <parameter>pipeline_cfg_file</parameter>
|
||||
parameter listed in the <link xlink:href=
|
||||
"http://docs.openstack.org/trunk/config-reference/content/ch_configuring-openstack-telemetry.html"
|
||||
>Description of configuration options for api table</link> section in the
|
||||
<citetitle>OpenStack Configuration Reference</citetitle>. Multiple chains
|
||||
can be defined in one pipeline configuration file.</para>
|
||||
<para>The chain definition looks like the following:</para>
|
||||
<programlisting>---
sources:
  - name: 'source name'
    interval: 'how often should the samples be injected into the pipeline'
    meters:
      - 'meter filter'
    resources:
      - 'list of resource URLs'
    sinks:
      - 'sink name'
sinks:
  - name: 'sink name'
    transformers: 'definition of transformers'
    publishers:
      - 'list of publishers'</programlisting>
|
||||
<para>The interval parameter in the sources section should be defined in seconds.
|
||||
It determines the cadence of sample injection into the pipeline, where samples
|
||||
are produced under the direct control of an agent, for instance via a polling
|
||||
cycle as opposed to incoming notifications.</para>
|
||||
<para>There are several ways to define the list of meters for a pipeline source.
|
||||
The list of valid meters can be found in the <link xlink:href=
|
||||
"http://docs.openstack.org/developer/ceilometer/measurements.html"> Telemetry
|
||||
Measurements Reference</link> document. You can define all
the meters, or just included or excluded meters, with which a source should
operate:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>To include all meters, use the <literal>*</literal> wildcard symbol.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>To define the list of meters, use either of the following:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>To define the list of included meters, use the <literal>meter_name</literal>
|
||||
syntax.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>To define the list of excluded meters, use the <literal>!meter_name</literal>
|
||||
syntax.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>For meters that have variants identified by a complex name field,
|
||||
use the wildcard symbol to select all, e.g. for “instance:m1.tiny”, use
|
||||
“instance:*”.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<para>The above definition methods can be used in the following combinations:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Use only the wildcard symbol.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Use the list of included meters.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Use the list of excluded meters.</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Use wildcard symbol with the list of excluded meters.</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<note>
|
||||
<para>At least one of the above variations should be included in the meters section.
|
||||
Included and excluded meters cannot co-exist in the same pipeline. Wildcard and
|
||||
included meters cannot co-exist in the same pipeline definition section.</para>
|
||||
</note>
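<para>The combination rules from the note above can be expressed as a small Python sketch. The function names are hypothetical and glob matching through <literal>fnmatch</literal> is an assumption, so this illustrates the rules rather than reproducing how Telemetry evaluates a meters section.</para>

```python
from fnmatch import fnmatchcase


def validate_meters(meters):
    """Reject the combinations that the note above rules out."""
    included = [m for m in meters if not m.startswith("!") and m != "*"]
    excluded = [m for m in meters if m.startswith("!")]
    if not meters:
        raise ValueError("the meters section must not be empty")
    if included and excluded:
        raise ValueError("included and excluded meters cannot co-exist")
    if "*" in meters and included:
        raise ValueError("wildcard and included meters cannot co-exist")


def source_supports(meters, meter_name):
    """Return True if a source with this meters list handles meter_name."""
    validate_meters(meters)
    excluded = [m[1:] for m in meters if m.startswith("!")]
    if any(fnmatchcase(meter_name, pattern) for pattern in excluded):
        return False
    if excluded:
        # An exclusion list accepts every meter that is not excluded.
        return True
    return any(fnmatchcase(meter_name, pattern) for pattern in meters)
```

<para>For example, a meters list that contains only <literal>!cpu</literal> accepts every meter except <literal>cpu</literal>, and <literal>instance:*</literal> accepts <literal>instance:m1.tiny</literal>.</para>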
|
||||
<para>The optional resources section of a pipeline source allows a static list of
|
||||
resource URLs to be configured for polling.</para>
|
||||
<para>The transformers section of a pipeline sink provides the possibility to add a list
|
||||
of transformer definitions. The available transformers are:</para>
|
||||
<table rules="all">
|
||||
<caption>List of available transformers</caption>
|
||||
<col width="50%"/>
|
||||
<col width="50%"/>
|
||||
<thead>
|
||||
<tr>
|
||||
<td>Name of transformer</td>
|
||||
<td>Reference name for configuration</td>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>Accumulator</td>
|
||||
<td>accumulator</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Aggregator</td>
|
||||
<td>aggregator</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Arithmetic</td>
|
||||
<td>arithmetic</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Rate of change</td>
|
||||
<td>rate_of_change</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>Unit conversion</td>
|
||||
<td>unit_conversion</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
<para>The publishers section contains the list of publishers to which the sample data should
be sent after any transformations.</para>
|
||||
<section xml:id="section_telemetry-pipeline-transformers">
|
||||
<title>Transformers</title>
|
||||
<para>The definition of transformers can contain the following fields:</para>
|
||||
<para>
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term>name</term>
|
||||
<listitem>
|
||||
<para>Name of the transformer.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term>parameters</term>
|
||||
<listitem>
|
||||
<para>Parameters of the transformer.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
<para>The parameters section can contain transformer-specific fields, such as the source and
target fields with different subfields in the case of the rate of change transformer. These
fields depend on the implementation of the transformer.</para>
|
||||
<simplesect>
|
||||
<title>Rate of change transformer</title>
|
||||
<para>In the case of the transformer that creates the
|
||||
<literal>cpu_util</literal> meter, the definition looks like the following:</para>
|
||||
<programlisting>transformers:
    - name: "rate_of_change"
      parameters:
          target:
              name: "cpu_util"
              unit: "%"
              type: "gauge"
              scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"</programlisting>
|
||||
<para>The rate of change transformer generates the <literal>cpu_util</literal> meter
|
||||
from the sample values of the <literal>cpu</literal> counter, which represents
|
||||
cumulative CPU time in nanoseconds. The transformer definition above defines a
|
||||
scale factor (for nanoseconds, multiple CPUs, etc.), which is applied before the
|
||||
transformation derives a sequence of gauge samples with unit ‘%’, from sequential
|
||||
values of the <literal>cpu</literal> meter.</para>
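<para>The arithmetic behind this derivation can be sketched in Python. The <literal>cpu_util</literal> function below is a hypothetical stand-alone helper, not the transformer's code; it assumes two consecutive readings of the cumulative <literal>cpu</literal> meter and the elapsed time between them in seconds.</para>

```python
def cpu_util(prev_cpu_ns, cur_cpu_ns, elapsed_s, cpu_number=None):
    """Turn two cumulative cpu readings into a utilisation percentage."""
    if elapsed_s <= 0:
        raise ValueError("samples must be taken at increasing times")
    # Same scale expression as in the transformer definition above.
    scale = 100.0 / (10**9 * (cpu_number or 1))
    return (cur_cpu_ns - prev_cpu_ns) * scale / elapsed_s


# Two cores, 30 s of CPU time consumed over a 60 s interval.
utilisation = cpu_util(0, 30 * 10**9, 60.0, cpu_number=2)
```

<para>With these inputs the result is about 25 %, because 30 seconds of CPU time were spread over 60 seconds on two cores.</para>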
|
||||
<para>The definition for the disk I/O rate, which is also generated by the rate of change
|
||||
transformer:</para>
|
||||
<programlisting>transformers:
    - name: "rate_of_change"
      parameters:
          source:
              map_from:
                  name: "disk\\.(read|write)\\.(bytes|requests)"
                  unit: "(B|request)"
          target:
              map_to:
                  name: "disk.\\1.\\2.rate"
                  unit: "\\1/s"
              type: "gauge"</programlisting>
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>Unit conversion transformer</title>
|
||||
<para>This transformer applies a unit conversion. It takes the volume of the meter and
multiplies it by the given <literal>scale</literal> expression. It also supports <literal>map_from</literal>
and <literal>map_to</literal>, like the rate of change transformer.</para>
|
||||
<para>Sample configuration:</para>
|
||||
<programlisting>transformers:
    - name: "unit_conversion"
      parameters:
          target:
              name: "disk.kilobytes"
              unit: "KB"
              scale: "1.0 / 1024.0"</programlisting>
|
||||
<para>With <parameter>map_from</parameter> and <parameter>map_to</parameter>:</para>
|
||||
<programlisting>transformers:
    - name: "unit_conversion"
      parameters:
          source:
              map_from:
                  name: "disk\\.(read|write)\\.bytes"
          target:
              map_to:
                  name: "disk.\\1.kilobytes"
              scale: "1.0 / 1024.0"
              unit: "KB"</programlisting>
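<para>The effect of <parameter>map_from</parameter> and <parameter>map_to</parameter> resembles a regular expression substitution with backreferences. The Python sketch below illustrates that idea under stated assumptions: the <literal>convert</literal> helper is hypothetical, and requiring the pattern to match the whole meter name is a choice made here, not confirmed transformer behavior.</para>

```python
import re

# After YAML unescapes the double-quoted strings, "\\." becomes "\." and
# "\\1" becomes "\1", so these are the patterns a transformer would see.
MAP_FROM_NAME = r"disk\.(read|write)\.bytes"
MAP_TO_NAME = r"disk.\1.kilobytes"
SCALE = 1.0 / 1024.0


def convert(name, volume):
    """Return the (name, volume) pair of the converted sample, or None."""
    match = re.fullmatch(MAP_FROM_NAME, name)
    if match is None:
        # The meter name is not covered by map_from.
        return None
    return match.expand(MAP_TO_NAME), volume * SCALE
```

<para>A sample named <literal>disk.read.bytes</literal> is therefore renamed to <literal>disk.read.kilobytes</literal>, while meters that do not match the source pattern are left alone.</para>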
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>Aggregator transformer</title>
|
||||
<para>A transformer that sums up the incoming samples until enough samples have
|
||||
come in or a timeout has been reached.</para>
|
||||
<para>A timeout can be specified with the <parameter>retention_time</parameter> parameter.
To flush the aggregation after a set number of samples has been
aggregated, specify the <parameter>size</parameter> parameter.</para>
|
||||
<para>The volume of the created sample is the sum of the volumes of samples that came
|
||||
into the transformer. Samples can be aggregated by the attributes <parameter>project_id
|
||||
</parameter>, <parameter>user_id</parameter> and <parameter>resource_metadata</parameter>.
|
||||
To aggregate by the chosen attributes, specify them in the configuration and set
|
||||
which value of the attribute to take for the new sample (first to take the first
|
||||
sample’s attribute, last to take the last sample’s attribute, and drop to discard
|
||||
the attribute).</para>
|
||||
<para>To aggregate 60s worth of samples by <parameter>resource_metadata</parameter>
|
||||
and keep the <parameter>resource_metadata</parameter> of the latest received
|
||||
sample:</para>
|
||||
<programlisting>transformers:
    - name: "aggregator"
      parameters:
          retention_time: 60
          resource_metadata: last</programlisting>
|
||||
<para>To aggregate every 15 samples by <parameter>user_id</parameter> and
<parameter>resource_metadata</parameter>, keep the <parameter>user_id</parameter> of the first received sample, and
drop the <parameter>resource_metadata</parameter>:</para>
|
||||
<programlisting>transformers:
    - name: "aggregator"
      parameters:
          size: 15
          user_id: first
          resource_metadata: drop</programlisting>
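<para>The size-based flush and the first, last, and drop rules can be sketched in Python as follows. This is a deliberately simplified model: the <literal>SizeAggregator</literal> class is hypothetical, it handles a single stream of dictionary-shaped samples, and it leaves out the grouping of samples and the <parameter>retention_time</parameter> timeout.</para>

```python
class SizeAggregator:
    """Sum sample volumes and flush one aggregate after `size` samples."""

    def __init__(self, size, user_id="first", resource_metadata="drop"):
        self.size = size
        self.policy = {"user_id": user_id, "resource_metadata": resource_metadata}
        self.count = 0
        self.aggregated = None

    def handle(self, sample):
        if self.aggregated is None:
            # The first sample seeds the aggregate ("first" keeps its values).
            self.aggregated = dict(sample)
        else:
            self.aggregated["volume"] += sample["volume"]
            for attribute, rule in self.policy.items():
                if rule == "last":
                    self.aggregated[attribute] = sample[attribute]
        self.count += 1
        if self.count == self.size:
            return self.flush()
        return None

    def flush(self):
        result = self.aggregated
        for attribute, rule in self.policy.items():
            if rule == "drop":
                result.pop(attribute, None)
        self.aggregated, self.count = None, 0
        return result
```

<para>In this model <literal>handle</literal> returns nothing until the configured number of samples has arrived, and then returns one sample whose volume is the sum of the inputs.</para>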
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>Accumulator transformer</title>
|
||||
<para>This transformer simply caches the samples until enough samples have arrived and
|
||||
then flushes them all down the pipeline at once.</para>
|
||||
<programlisting>transformers:
    - name: "accumulator"
      parameters:
          size: 15</programlisting>
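<para>The caching behavior can be modeled in a few lines of Python. The <literal>Accumulator</literal> class below is a hypothetical illustration of the idea, not the transformer's implementation.</para>

```python
class Accumulator:
    """Cache samples and release them in one batch of `size`."""

    def __init__(self, size):
        self.size = size
        self.cache = []

    def handle(self, sample):
        self.cache.append(sample)
        if len(self.cache) >= self.size:
            # Hand over the whole batch and start with an empty cache.
            batch, self.cache = self.cache, []
            return batch
        return []
```

<para>Unlike the aggregator, this model passes the samples on unchanged; it only delays them until the batch is full.</para>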
|
||||
</simplesect>
|
||||
<simplesect>
|
||||
<title>Multi meter arithmetic transformer</title>
|
||||
<para>This transformer enables us to perform arithmetic calculations over one or more
|
||||
meters and/or their metadata, for example:</para>
|
||||
<programlisting>memory_util = 100 * memory.usage / memory</programlisting>
|
||||
<para>A new sample is created with the properties described in the
<literal>target</literal> section of the transformer’s configuration. The sample’s
volume is the result of the provided expression. The calculation is performed on
samples from the same resource.</para>
|
||||
<note>
|
||||
<para>The calculation is limited to meters with the same interval.</para>
|
||||
</note>
|
||||
<para>Example configuration:</para>
|
||||
<programlisting>transformers:
    - name: "arithmetic"
      parameters:
          target:
              name: "memory_util"
              unit: "%"
              type: "gauge"
              expr: "100 * $(memory.usage) / $(memory)"</programlisting>
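<para>As a worked illustration with hypothetical sample volumes (the numbers below are
assumptions, not measured data): if the latest <literal>memory.usage</literal> sample of
a resource has a volume of 512 and its <literal>memory</literal> sample has a volume of
2048, the expression above yields a <literal>memory_util</literal> sample with a volume
of 25:</para>
<programlisting>memory_util = 100 * 512 / 2048 = 25</programlisting>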
|
||||
<para>To demonstrate the use of metadata, here is the implementation of a silly metric
|
||||
that shows average CPU time per core:</para>
|
||||
<programlisting>transformers:
    - name: "arithmetic"
      parameters:
          target:
              name: "avg_cpu_per_core"
              unit: "ns"
              type: "cumulative"
              expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"</programlisting>
|
||||
<note>
|
||||
<para>Expression evaluation gracefully handles NaNs and exceptions. In such a case it
|
||||
does not create a new sample but only logs a warning.</para>
|
||||
</note>
|
||||
</simplesect>
|
||||
</section>
|
||||
</section>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-storing-data">
|
||||
<title>Storing samples</title>
|
||||
<para>The Telemetry module has a separate service that is responsible for persisting the data
that comes from the pollsters or is received as notifications. The data is stored in
a database backend. The list of supported databases can be found in
<xref linkend="section_telemetry-supported-dbs"/>.
</para>
|
||||
<para>The <systemitem class="service">ceilometer-collector</systemitem> service receives the
|
||||
samples as metering messages from the message bus of the configured AMQP service. It stores
|
||||
these samples without any modification in the configured backend. The service has to run on
|
||||
a host machine from which it has access to the database.</para>
|
||||
<para>Multiple <systemitem class="service">ceilometer-collector</systemitem> processes can be
run at a time. Starting multiple worker threads per collector process is also supported.
To do so, modify the <parameter>collector_workers</parameter> configuration option in the
<link xlink:href=
"http://docs.openstack.org/trunk/config-reference/content/ch_configuring-openstack-telemetry.html">
collector section</link> of the <filename>ceilometer.conf</filename> configuration file.</para>
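<para>A minimal sketch of such a setting is shown below. The value of 4 is only an
illustrative assumption, and the enclosing section header is deliberately omitted because
it can differ between releases; check the Configuration Reference for the section that
applies to your installation:</para>
<programlisting># ceilometer.conf (fragment)
collector_workers = 4</programlisting>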
|
||||
<note>
|
||||
<para>Using multiple workers per collector process is not recommended with
PostgreSQL as the database backend.</para>
|
||||
</note>
|
||||
<para>By default, the time to live (ttl) value for samples is set to -1, which means that they
are kept in the database forever. This can be changed by modifying the
<parameter>time_to_live</parameter> parameter in <filename>ceilometer.conf</filename>.
The value is specified in seconds: every sample whose timestamp is older than the
specified value is deleted from the database.</para>
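<para>For example, to keep samples for seven days (7 * 24 * 3600 = 604800 seconds), a
setting along the following lines could be used. The <literal>[database]</literal> section
name is an assumption here; verify it against the Configuration Reference for your
release:</para>
<programlisting>[database]
# Illustrative value: delete samples older than seven days
time_to_live = 604800</programlisting>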
|
||||
<para>When samples are deleted, users and resources can remain in the
database without any corresponding sample. A command-line script called
<systemitem class="service">ceilometer-expirer</systemitem> deletes these orphaned entries.
Run this script periodically, for instance in a cron job, to ensure that the
database is cleaned up properly.</para>
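<para>A minimal crontab sketch is shown below. The schedule (daily at 02:00) and the
assumption that the <command>ceilometer-expirer</command> executable is on the
<literal>PATH</literal> of the user owning the crontab and reads the default
configuration file are illustrative only; adjust both to your deployment:</para>
<programlisting># m h dom mon dow command
0 2 * * * ceilometer-expirer</programlisting>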
|
||||
<para>The level of support depends on the configured backend:</para>
|
||||
<table rules="all">
|
||||
<caption>Time-to-live support for database backends</caption>
|
||||
<col width="24%"/>
|
||||
<col width="38%"/>
|
||||
<col width="38%"/>
|
||||
<thead>
|
||||
<tr>
|
||||
<td>Database</td>
|
||||
<td>ttl value support</td>
|
||||
<td><systemitem class="service">ceilometer-expirer</systemitem>
|
||||
capabilities</td>
|
||||
</tr>
|
||||
</thead>
|
||||
<tbody>
|
||||
<tr>
|
||||
<td>MongoDB</td>
|
||||
<td>MongoDB has a built-in mechanism for deleting samples that are older
|
||||
than the configured ttl value.</td>
|
||||
<td>For this database, only the lingering dead resource,
user and project entries are deleted by
<systemitem class="service">ceilometer-expirer</systemitem>.
</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>SQL-based backends</td>
|
||||
<td>The library (SQLAlchemy) that is used for accessing SQL-based backends does
|
||||
not support using the ttl value.</td>
|
||||
<td><systemitem class="service">ceilometer-expirer</systemitem> has to be
used for deleting both the samples and the remaining entries in other
database tables. The script deletes samples based on the
<parameter>time_to_live</parameter> value that is set in the
configuration file.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>HBase</td>
|
||||
<td>HBase does not currently support this functionality, therefore the ttl value
in the configuration file is ignored.</td>
<td>Not supported: <systemitem class="service">ceilometer-expirer</systemitem>
does not delete samples from HBase.</td>
|
||||
</tr>
|
||||
<tr>
|
||||
<td>DB2</td>
|
||||
<td>Same as for MongoDB.</td>
<td>Same as for MongoDB.</td>
|
||||
</tr>
|
||||
</tbody>
|
||||
</table>
|
||||
</section>
|
||||
</section>
|
@ -6,4 +6,16 @@
|
||||
xml:id="section_telemetry-data-retrieval">
|
||||
<title>Data retrieval</title>
|
||||
<para>TBD</para>
|
||||
<section xml:id="section_telemetry-api-sdk">
|
||||
<title>Telemetry v2 API and SDK</title>
|
||||
<para>TBD</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-publishers">
|
||||
<title>Publishers</title>
|
||||
<para>TBD</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-api-events">
|
||||
<title>Events</title>
|
||||
<para>TBD</para>
|
||||
</section>
|
||||
</section>
|
@ -5,5 +5,216 @@
|
||||
version="5.0"
|
||||
xml:id="section_telemetry-system-architecture">
|
||||
<title>System architecture</title>
|
||||
<para>TBD</para>
|
||||
</section>
|
||||
<para>The Telemetry module uses an agent-based architecture.
|
||||
Several modules combine their responsibilities to collect data,
|
||||
store samples in a database, or provide an API service for handling
|
||||
incoming requests.</para>
|
||||
<para>The Telemetry module is built from the following agents and
|
||||
services:</para>
|
||||
<para>
|
||||
<variablelist>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-api</systemitem></term>
|
||||
<listitem>
|
||||
<para>Presents aggregated metering data to consumers
|
||||
(such as billing engines, analytics tools and so forth).</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-agent-central</systemitem></term>
|
||||
<listitem>
|
||||
<para>Polls the public RESTful APIs of other OpenStack
|
||||
services such as Compute service and Image service, in order to
|
||||
keep tabs on resource existence.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-agent-compute</systemitem></term>
|
||||
<listitem>
|
||||
<para>Polls the local hypervisor or libvirt daemon to acquire
performance data for the local instances, and emits these
data as AMQP messages.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-agent-notification</systemitem></term>
|
||||
<listitem>
|
||||
<para>Consumes AMQP messages from other OpenStack services.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-collector</systemitem></term>
|
||||
<listitem>
|
||||
<para>Consumes AMQP notifications from the agents, then dispatches
|
||||
these data to the metering store.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-alarm-evaluator</systemitem></term>
|
||||
<listitem>
|
||||
<para>Determines when alarms fire due to the associated statistic
|
||||
trend crossing a threshold over a sliding time window.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
<varlistentry>
|
||||
<term><systemitem class="service">ceilometer-alarm-notifier</systemitem></term>
|
||||
<listitem>
|
||||
<para>Initiates alarm actions, for example calling out to a webhook
|
||||
with a description of the alarm state transition.</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
</variablelist>
|
||||
</para>
|
||||
<para>Except for the <systemitem class="service">ceilometer-agent-compute</systemitem> service,
all services are placed on one or more controller nodes.</para>
|
||||
<note>
|
||||
<para>The <systemitem class="service">ceilometer-agent-central</systemitem> service does
not support multiple running instances at a time; only one instance can run.</para>
|
||||
</note>
|
||||
<para>The Telemetry architecture depends heavily on the AMQP service, both for consuming
notifications coming from OpenStack services and for internal communication.</para>
|
||||
<section xml:id="section_telemetry-supported-dbs">
|
||||
<title>Supported databases</title>
|
||||
<para>The other key external component of Telemetry is the database, where the samples, alarm
|
||||
definitions and alarms are stored.</para>
|
||||
<note>
|
||||
<para>Multiple database backends can be configured in order to store samples and alarms
|
||||
separately.</para>
|
||||
</note>
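<para>A hedged sketch of such a setup follows. The option names
<literal>metering_connection</literal> and <literal>alarm_connection</literal>, the
<literal>[database]</literal> section, and all host names and credentials are assumptions
made for illustration; confirm the exact option names in the Configuration Reference
for your release:</para>
<programlisting>[database]
# Hypothetical connection URLs
metering_connection = mongodb://ceilometer:PASSWORD@metering-db:27017/ceilometer
alarm_connection = mysql://ceilometer:PASSWORD@alarm-db/ceilometer</programlisting>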
|
||||
<para>The list of supported database backends:</para>
|
||||
<para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://www.mongodb.org/">MongoDB</link></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://www.mysql.com/">MySQL</link></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://www.postgresql.org/">PostgreSQL</link></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://hbase.apache.org/">HBase</link></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://www-01.ibm.com/software/data/db2/">DB2</link>
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-supported-hypervisors">
|
||||
<title>Supported hypervisors</title>
|
||||
<para>The Telemetry module collects information about the virtual machines, which requires
a close connection to the hypervisor that runs on the compute hosts.</para>
|
||||
<para>The list of supported hypervisors is:</para>
|
||||
<para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>The following hypervisors are supported via
|
||||
<link xlink:href="http://libvirt.org/">Libvirt</link>:</para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>
|
||||
<link xlink:href="http://www.linux-kvm.org/page/Main_Page">Kernel-based
|
||||
Virtual Machine (KVM)</link>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<link xlink:href="http://wiki.qemu.org/Main_Page">Quick Emulator (QEMU)</link>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<link xlink:href="https://linuxcontainers.org/">Linux Containers (LXC)</link>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<link xlink:href="http://www.xenproject.org/help/documentation.html">Xen</link>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>
|
||||
<link xlink:href="http://user-mode-linux.sourceforge.net/">
|
||||
User-mode Linux (UML)</link>
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
<note>
|
||||
<para>For details about hypervisor support in Libvirt please check the
|
||||
<link xlink:href="http://libvirt.org/hvsupport.html">Libvirt API
|
||||
support matrix</link>.
|
||||
</para>
|
||||
</note>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link
|
||||
xlink:href="http://www.microsoft.com/en-us/server-cloud/hyper-v-server/default.aspx"
|
||||
>Hyper-V</link>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link
|
||||
xlink:href="http://www.vmware.com/products/vsphere-hypervisor/support.html"
|
||||
>VMware vSphere</link>
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-supported-networking-services">
|
||||
<title>Supported networking services</title>
|
||||
<para>Telemetry is able to retrieve information from OpenStack Networking
|
||||
and external networking services:</para>
|
||||
<para>
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>OpenStack Networking:
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para>Basic network metrics</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Firewall-as-a-Service (FWaaS) metrics</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>Loadbalancer-as-a-Service (LBaaS) metrics</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>VPN-as-a-Service (VPNaaS) metrics</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para>SDN controller metrics:
|
||||
<itemizedlist>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://www.opendaylight.org/software">
|
||||
OpenDaylight</link></para>
|
||||
</listitem>
|
||||
<listitem>
|
||||
<para><link xlink:href="http://opencontrail.org/">OpenContrail</link></para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</listitem>
|
||||
</itemizedlist>
|
||||
</para>
|
||||
</section>
|
||||
<section xml:id="section_telemetry-users-roles">
|
||||
<title>Users, roles and tenants</title>
|
||||
<para>This module of OpenStack uses OpenStack Identity for authenticating and authorizing
|
||||
users. The required configuration options are listed in the <link xlink:href=
|
||||
"http://docs.openstack.org/trunk/config-reference/content/ch_configuring-openstack-telemetry.html">
|
||||
Telemetry section</link> in the <citetitle>OpenStack Configuration Reference</citetitle>.</para>
|
||||
<para>The system uses two roles: 'admin' and 'non-admin'. The
authorization happens before each API request is processed. The amount of returned data depends
on the role of the requestor.</para>
|
||||
<para>The creation of alarm definitions also depends heavily on the role of the user who
initiated the action. Further details about alarm handling can be found in
<xref linkend="section_telemetry-alarms"/> in this guide.</para>
|
||||
</section>
|
||||
</section>
|