[User Guides] IA and edits to Telemetry sections
* Editing sentence structure, word choice, and some typos.
* Moving data processing and pipeline content
* Converting tables to list table format

Change-Id: I6136d8c370ebd1c09e340a3b060a56ae54bb5588
Implements: blueprint user-guide-editing
parent 2c803db0da
commit b260fa8b65
@@ -6,16 +6,17 @@ Data collection
|
||||
|
||||
The main responsibility of Telemetry in OpenStack is to collect
|
||||
information about the system that can be used by billing systems or
|
||||
interpreted by analytic tooling. The original focus, regarding to the
|
||||
collected data, was on the counters that can be used for billing, but
|
||||
the range is getting wider continuously.
|
||||
interpreted by analytic tooling. Telemetry in OpenStack originally focused
|
||||
on the counters used for billing, and the recorded range is
|
||||
continuously growing wider.
|
||||
|
||||
Collected data can be stored in the form of samples or events in the
|
||||
supported databases, listed in :ref:`telemetry-supported-databases`.
|
||||
supported databases, which are listed
|
||||
in :ref:`telemetry-supported-databases`.
|
||||
|
||||
Samples can have various sources regarding to the needs and
|
||||
configuration of Telemetry, which requires multiple methods to collect
|
||||
data.
|
||||
Samples can have various sources. Sample sources depend on, and adapt to,
|
||||
the needs and configuration of Telemetry. The Telemetry service requires
|
||||
multiple methods to collect data samples.
|
||||
|
||||
The available data collection mechanisms are:
|
||||
|
||||
@@ -33,131 +34,145 @@ RESTful API
|
||||
|
||||
Notifications
|
||||
~~~~~~~~~~~~~
|
||||
All the services send notifications about the executed operations or
|
||||
system state in OpenStack. Several notifications carry information that
|
||||
can be metered, like the CPU time of a VM instance created by OpenStack
|
||||
All OpenStack services send notifications about the executed operations
|
||||
or system state. Several notifications carry information that can be
|
||||
metered. For example, CPU time of a VM instance created by OpenStack
|
||||
Compute service.
|
||||
|
||||
The Telemetry service has a separate agent that is responsible for
|
||||
consuming notifications, namely the notification agent. This component
|
||||
is responsible for consuming from the message bus and transforming
|
||||
notifications into events and measurement samples. Beginning in the Liberty
|
||||
release, the notification agent is responsible for all data processing such as
|
||||
transformations and publishing. After processing, the data is sent via AMQP to
|
||||
the collector service or any external service, which is responsible for
|
||||
persisting the data into the configured database back end.
|
||||
The notification agent works alongside, but separately, from the
|
||||
Telemetry service. The agent is responsible for consuming notifications.
|
||||
This component is responsible for consuming from the message bus and
|
||||
transforming notifications into events and measurement samples.
|
||||
|
||||
Since the Liberty release, the notification agent is responsible
|
||||
for all data processing such as transformations and publishing. After
|
||||
processing, the data is sent via AMQP to the collector service or any
|
||||
external service. These external services persist the data in
|
||||
configured databases.
|
||||
|
||||
The different OpenStack services emit several notifications about the
|
||||
various types of events that happen in the system during normal
|
||||
operation. Not all these notifications are consumed by the Telemetry
|
||||
service, as the intention is only to capture the billable events and
|
||||
notifications that can be used for monitoring or profiling purposes. The
|
||||
notification agent filters by the event type, that is contained by each
|
||||
notification message. The following table contains the event types by
|
||||
each OpenStack service that are transformed to samples by Telemetry.
|
||||
notification agent filters by the event type. Each notification
|
||||
message contains the event type. The following table contains the event
|
||||
types by each OpenStack service that Telemetry transforms into samples.
|
||||
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
| OpenStack service | Event types | Note |
|
||||
+====================+========================+===============================+
|
||||
| OpenStack Compute | scheduler.run\_insta\ | For a more detailed list of |
|
||||
| | nce.scheduled | Compute notifications please |
|
||||
| | | check the `System Usage Data |
|
||||
| | scheduler.select\_\ | Data wiki page <https://wiki |
|
||||
| | destinations | .openstack.org/wiki/ |
|
||||
| | | SystemUsageData>`__. |
|
||||
| | compute.instance.\* | |
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
| Bare metal service | hardware.ipmi.\* | |
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
| OpenStack Image | image.update | The required configuration |
|
||||
| service | | for Image service can be |
|
||||
| | image.upload | found in `Configure the Image |
|
||||
| | | service for Telemetry section |
|
||||
| | image.delete | <http://docs.openstack.org |
|
||||
| | | /mitaka/install-guide-ubuntu |
|
||||
| | image.send | /ceilometer-glance.html>`__ |
|
||||
| | | section in the OpenStack |
|
||||
| | | Installation Guide |
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
| OpenStack | floatingip.create.end | |
|
||||
| Networking | | |
|
||||
| | floatingip.update.\* | |
|
||||
| | | |
|
||||
| | floatingip.exists | |
|
||||
| | | |
|
||||
| | network.create.end | |
|
||||
| | | |
|
||||
| | network.update.\* | |
|
||||
| | | |
|
||||
| | network.exists | |
|
||||
| | | |
|
||||
| | port.create.end | |
|
||||
| | | |
|
||||
| | port.update.\* | |
|
||||
| | | |
|
||||
| | port.exists | |
|
||||
| | | |
|
||||
| | router.create.end | |
|
||||
| | | |
|
||||
| | router.update.\* | |
|
||||
| | | |
|
||||
| | router.exists | |
|
||||
| | | |
|
||||
| | subnet.create.end | |
|
||||
| | | |
|
||||
| | subnet.update.\* | |
|
||||
| | | |
|
||||
| | subnet.exists | |
|
||||
| | | |
|
||||
| | l3.meter | |
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
| Orchestration | orchestration.stack\ | |
|
||||
| service | .create.end | |
|
||||
| | | |
|
||||
| | orchestration.stack\ | |
|
||||
| | .update.end | |
|
||||
| | | |
|
||||
| | orchestration.stack\ | |
|
||||
| | .delete.end | |
|
||||
| | | |
|
||||
| | orchestration.stack\ | |
|
||||
| | .resume.end | |
|
||||
| | | |
|
||||
| | orchestration.stack\ | |
|
||||
| | .suspend.end | |
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
| OpenStack Block | volume.exists | The required configuration |
|
||||
| Storage | | for Block Storage service can |
|
||||
| | volume.create.\* | be found in the `Add the |
|
||||
| | | Block Storage service agent |
|
||||
| | volume.delete.\* | for Telemetry section <http: |
|
||||
| | | //docs.openstack.org/mitaka/ |
|
||||
| | volume.update.\* | install-guide-ubuntu/ |
|
||||
| | | /ceilometer-cinder.html>`__ |
|
||||
| | volume.resize.\* | section in the |
|
||||
| | | OpenStack Installation Guide. |
|
||||
| | volume.attach.\* | |
|
||||
| | | |
|
||||
| | volume.detach.\* | |
|
||||
| | | |
|
||||
| | snapshot.exists | |
|
||||
| | | |
|
||||
| | snapshot.create.\* | |
|
||||
| | | |
|
||||
| | snapshot.delete.\* | |
|
||||
| | | |
|
||||
| | snapshot.update.\* | |
|
||||
| | | |
|
||||
| | volume.backup.create.\ | |
|
||||
| | \* | |
|
||||
| | | |
|
||||
| | volume.backup.delete.\ | |
|
||||
| | \* | |
|
||||
| | | |
|
||||
| | volume.backup.restore.\| |
|
||||
| | \* | |
|
||||
+--------------------+------------------------+-------------------------------+
|
||||
.. list-table::
|
||||
:widths: 10 15 30
|
||||
:header-rows: 1
|
||||
|
||||
* - OpenStack service
|
||||
- Event types
|
||||
- Note
|
||||
* - OpenStack Compute
|
||||
- scheduler.run\_instance.scheduled
|
||||
|
||||
scheduler.select\_\
|
||||
destinations
|
||||
|
||||
compute.instance.\*
|
||||
- For a more detailed list of Compute notifications please
|
||||
check the `System Usage Data wiki page <https://wiki.openstack.org/wiki/
|
||||
SystemUsageData>`__.
|
||||
* - Bare metal service
|
||||
- hardware.ipmi.\*
|
||||
-
|
||||
* - OpenStack Image
|
||||
- image.update
|
||||
|
||||
image.upload
|
||||
|
||||
image.delete
|
||||
|
||||
image.send
|
||||
- The required configuration for Image service can be found in the
|
||||
`Configure the Image service for Telemetry section <http://docs.openstack.org/mitaka/install-guide-ubuntu/ceilometer-glance.html>`__
|
||||
in the OpenStack Installation Guide.
|
||||
* - OpenStack Networking
|
||||
- floatingip.create.end
|
||||
|
||||
floatingip.update.\*
|
||||
|
||||
floatingip.exists
|
||||
|
||||
network.create.end
|
||||
|
||||
network.update.\*
|
||||
|
||||
network.exists
|
||||
|
||||
port.create.end
|
||||
|
||||
port.update.\*
|
||||
|
||||
port.exists
|
||||
|
||||
router.create.end
|
||||
|
||||
router.update.\*
|
||||
|
||||
router.exists
|
||||
|
||||
subnet.create.end
|
||||
|
||||
subnet.update.\*
|
||||
|
||||
subnet.exists
|
||||
|
||||
l3.meter
|
||||
-
|
||||
* - Orchestration service
|
||||
- orchestration.stack\
|
||||
.create.end
|
||||
|
||||
orchestration.stack\
|
||||
.update.end
|
||||
|
||||
orchestration.stack\
|
||||
.delete.end
|
||||
|
||||
orchestration.stack\
|
||||
.resume.end
|
||||
|
||||
orchestration.stack\
|
||||
.suspend.end
|
||||
-
|
||||
* - OpenStack Block Storage
|
||||
- volume.exists
|
||||
|
||||
volume.create.\*
|
||||
|
||||
volume.delete.\*
|
||||
|
||||
volume.update.\*
|
||||
|
||||
volume.resize.\*
|
||||
|
||||
volume.attach.\*
|
||||
|
||||
volume.detach.\*
|
||||
|
||||
snapshot.exists
|
||||
|
||||
snapshot.create.\*
|
||||
|
||||
snapshot.delete.\*
|
||||
|
||||
snapshot.update.\*
|
||||
|
||||
volume.backup.create.\
|
||||
\*
|
||||
|
||||
volume.backup.delete.\
|
||||
\*
|
||||
|
||||
volume.backup.restore.\
|
||||
\*
|
||||
- The required configuration for Block Storage service can be found in the
|
||||
`Add the Block Storage service agent for Telemetry section <http://docs.openstack.org/mitaka/install-guide-ubuntu/ceilometer-cinder.html>`__
|
||||
in the OpenStack Installation Guide.
|
||||
|
||||
.. note::
|
||||
|
||||
@@ -181,6 +196,39 @@ OpenStack Installation Guide.
|
||||
``ceilometer.conf``. Prior to the Kilo release, the notification agent
|
||||
needed database access in order to work properly.
|
||||
|
||||
Compute agent
|
||||
-------------
|
||||
|
||||
This agent is responsible for collecting resource usage data of VM
|
||||
instances on individual Compute nodes within an OpenStack deployment.
|
||||
This mechanism requires a closer interaction with the hypervisor,
|
||||
therefore a separate agent type fulfills the collection of the related
|
||||
meters, which is placed on the host machines to retrieve this
|
||||
information locally.
|
||||
|
||||
A Compute agent instance has to be installed on each and every compute
|
||||
node. Installation instructions can be found in the `Install the Compute
|
||||
agent for Telemetry
|
||||
<http://docs.openstack.org/mitaka/install-guide-ubuntu/ceilometer-nova.html>`__
|
||||
section in the OpenStack Installation Guide.
|
||||
|
||||
Just like the central agent, this component also does not need a direct
|
||||
database connection. The samples are sent via AMQP to the notification agent.
|
||||
|
||||
The list of supported hypervisors can be found in
|
||||
:ref:`telemetry-supported-hypervisors`. The Compute agent uses the API of the
|
||||
hypervisor installed on the Compute hosts. Therefore, the supported meters may
|
||||
be different in case of each virtualization back end, as each inspection tool
|
||||
provides a different set of meters.
|
||||
|
||||
The list of collected meters can be found in :ref:`telemetry-compute-meters`.
|
||||
The support column provides the information about which meter is available for
|
||||
each hypervisor supported by the Telemetry service.
|
||||
|
||||
.. note::
|
||||
|
||||
Telemetry supports Libvirt, which hides the hypervisor under it.
|
||||
|
||||
Middleware for the OpenStack Object Storage service
|
||||
---------------------------------------------------
|
||||
|
||||
@@ -199,8 +247,8 @@ section in the OpenStack Installation Guide.
|
||||
Telemetry middleware
|
||||
--------------------
|
||||
|
||||
Telemetry provides the capability of counting the HTTP requests and
|
||||
responses for each API endpoint in OpenStack. This is achieved by
|
||||
Telemetry provides HTTP request and API endpoint counting
|
||||
capability in OpenStack. This is achieved by
|
||||
storing a sample for each event marked as ``audit.http.request``,
|
||||
``audit.http.response``, ``http.request`` or ``http.response``.
|
||||
|
||||
@@ -222,7 +270,7 @@ instances.
|
||||
Therefore Telemetry uses another method to gather this data by polling
|
||||
the infrastructure including the APIs of the different OpenStack
|
||||
services and other assets, like hypervisors. The latter case requires
|
||||
closer interaction with the compute hosts. To solve this issue,
|
||||
closer interaction with the Compute hosts. To solve this issue,
|
||||
Telemetry uses an agent based architecture to fulfill the requirements
|
||||
against the data collection.
|
||||
|
||||
@@ -303,54 +351,21 @@ processed.
|
||||
Prior to the Liberty release, data from the polling agents was processed
|
||||
locally and published accordingly rather than by the notification agent.
|
||||
|
||||
Compute agent
|
||||
-------------
|
||||
|
||||
This agent is responsible for collecting resource usage data of VM
|
||||
instances on individual compute nodes within an OpenStack deployment.
|
||||
This mechanism requires a closer interaction with the hypervisor,
|
||||
therefore a separate agent type fulfills the collection of the related
|
||||
meters, which is placed on the host machines to locally retrieve this
|
||||
information.
|
||||
|
||||
A compute agent instance has to be installed on each and every compute
|
||||
node, installation instructions can be found in the `Install the Compute
|
||||
agent for Telemetry
|
||||
<http://docs.openstack.org/mitaka/install-guide-ubuntu/ceilometer-nova.html>`__
|
||||
section in the OpenStack Installation Guide.
|
||||
|
||||
Just like the central agent, this component also does not need a direct
|
||||
database connection. The samples are sent via AMQP to the notification agent.
|
||||
|
||||
The list of supported hypervisors can be found in
|
||||
:ref:`telemetry-supported-hypervisors`. The compute agent uses the API of the
|
||||
hypervisor installed on the compute hosts. Therefore the supported meters may
|
||||
be different in case of each virtualization back end, as each inspection tool
|
||||
provides a different set of meters.
|
||||
|
||||
The list of collected meters can be found in :ref:`telemetry-compute-meters`.
|
||||
The support column provides the information that which meter is available for
|
||||
each hypervisor supported by the Telemetry service.
|
||||
|
||||
.. note::
|
||||
|
||||
Telemetry supports Libvirt, which hides the hypervisor under it.
|
||||
|
||||
.. _telemetry-ipmi-agent:
|
||||
|
||||
IPMI agent
|
||||
----------
|
||||
|
||||
This agent is responsible for collecting IPMI sensor data and Intel Node
|
||||
Manager data on individual compute nodes within an OpenStack deployment.
|
||||
Manager data on individual Compute nodes within an OpenStack deployment.
|
||||
This agent requires an IPMI capable node with the ipmitool utility installed,
|
||||
which is commonly used for IPMI control on various Linux distributions.
|
||||
|
||||
An IPMI agent instance could be installed on each and every compute node
|
||||
An IPMI agent instance could be installed on each and every Compute node
|
||||
with IPMI support, except when the node is managed by the Bare metal
|
||||
service and the ``conductor.send_sensor_data`` option is set to ``true``
|
||||
in the Bare metal service. It is no harm to install this agent on a
|
||||
compute node without IPMI or Intel Node Manager support, as the agent
|
||||
Compute node without IPMI or Intel Node Manager support, as the agent
|
||||
checks for the hardware and if none is available, returns empty data. It
|
||||
is suggested that you install the IPMI agent only on an IPMI capable
|
||||
node for performance reasons.
|
||||
@@ -398,7 +413,7 @@ Telemetry services.
|
||||
|
||||
For information about the required configuration options that have to be
|
||||
set in the ``ceilometer.conf`` configuration file for both the central
|
||||
and compute agents, see the `Coordination section
|
||||
and Compute agents, see the `Coordination section
|
||||
<http://docs.openstack.org/mitaka/config-reference/telemetry/telemetry_service_config_opts.html>`__
|
||||
in the OpenStack Configuration Reference.
|
||||
|
||||
@@ -432,7 +447,7 @@ Polling agent HA deployment
|
||||
.. note::
|
||||
|
||||
Without the ``backend_url`` option being set only one instance of
|
||||
both the central and compute agent service is able to run and
|
||||
both the central and Compute agent service is able to run and
|
||||
function correctly.
|
||||
|
||||
The availability check of the instances is provided by heartbeat
|
||||
@@ -463,7 +478,7 @@ in the OpenStack Configuration Reference.
|
||||
configuration file. For more information about pipelines see
|
||||
:ref:`data-collection-and-processing`.
|
||||
|
||||
To enable the compute agent to run multiple instances simultaneously
|
||||
To enable the Compute agent to run multiple instances simultaneously
|
||||
with workload partitioning, the ``workload_partitioning`` option has to
|
||||
be set to ``True`` under the `Compute section
|
||||
<http://docs.openstack.org/mitaka/config-reference/telemetry/telemetry_service_config_opts.html>`__
|
||||
@@ -532,383 +547,6 @@ following command should be invoked:
|
||||
| volume | 48.0 |
|
||||
+-------------------+--------------------------------------------+
|
||||
|
||||
.. _data-collection-and-processing:
|
||||
|
||||
Data collection and processing
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The mechanism by which data is collected and processed is called a
|
||||
pipeline. Pipelines, at the configuration level, describe a coupling
|
||||
between sources of data and the corresponding sinks for transformation
|
||||
and publication of data.
|
||||
|
||||
A source is a producer of data: ``samples`` or ``events``. In effect, it is a
|
||||
set of pollsters or notification handlers emitting datapoints for a set
|
||||
of matching meters and event types.
|
||||
|
||||
Each source configuration encapsulates name matching, polling interval
|
||||
determination, optional resource enumeration or discovery, and mapping
|
||||
to one or more sinks for publication.
|
||||
|
||||
Data gathered can be used for different purposes, which can impact how
|
||||
frequently it needs to be published. Typically, a meter published for
|
||||
billing purposes needs to be updated every 30 minutes while the same
|
||||
meter may be needed for performance tuning every minute.
|
||||
|
||||
.. warning::
|
||||
|
||||
Rapid polling cadences should be avoided, as they result in a huge
|
||||
amount of data in a short time frame, which may negatively affect
|
||||
the performance of both Telemetry and the underlying database back
|
||||
end. We therefore strongly recommend you do not use small
|
||||
granularity values like 10 seconds.
|
||||
|
||||
A sink, on the other hand, is a consumer of data, providing logic for
|
||||
the transformation and publication of data emitted from related sources.
|
||||
|
||||
In effect, a sink describes a chain of handlers. The chain starts with
|
||||
zero or more transformers and ends with one or more publishers. The
|
||||
first transformer in the chain is passed data from the corresponding
|
||||
source, takes some action such as deriving rate of change, performing
|
||||
unit conversion, or aggregating, before passing the modified data to the
|
||||
next step that is described in :ref:`telemetry-publishers`.
|
||||
|
||||
.. _telemetry-pipeline-configuration:
|
||||
|
||||
Pipeline configuration
|
||||
----------------------
|
||||
By default, pipeline configuration is stored in separate configuration
|
||||
files, called ``pipeline.yaml`` and ``event_pipeline.yaml``, next to
|
||||
the ``ceilometer.conf`` file. The meter pipeline and event pipeline
|
||||
configuration files can be set by the ``pipeline_cfg_file`` and
|
||||
``event_pipeline_cfg_file`` options listed in the `Description of
|
||||
configuration options for api table
|
||||
<http://docs.openstack.org/mitaka/config-reference/telemetry/telemetry_service_config_opts.html>`__
|
||||
section in the OpenStack Configuration Reference respectively. Multiple
|
||||
pipelines can be defined in one pipeline configuration file.
|
||||
|
||||
The meter pipeline definition looks like:
|
||||
|
||||
.. code-block:: yaml

   ---
   sources:
     - name: 'source name'
       interval: 'how often should the samples be injected into the pipeline'
       meters:
         - 'meter filter'
       resources:
         - 'list of resource URLs'
       sinks:
         - 'sink name'
   sinks:
     - name: 'sink name'
       transformers: 'definition of transformers'
       publishers:
         - 'list of publishers'
|
||||
The interval parameter in the sources section should be defined in
|
||||
seconds. It determines the polling cadence of sample injection into the
|
||||
pipeline, where samples are produced under the direct control of an
|
||||
agent.
|
||||
|
||||
There are several ways to define the list of meters for a pipeline
|
||||
source. The list of valid meters can be found in :ref:`telemetry-measurements`.
|
||||
There is a possibility to define all the meters, or just included or excluded
|
||||
meters, with which a source should operate:
|
||||
|
||||
- To include all meters, use the ``*`` wildcard symbol. It is highly
|
||||
advisable to select only the meters that you intend on using to avoid
|
||||
flooding the metering database with unused data.
|
||||
|
||||
- To define the list of meters, use either of the following:
|
||||
|
||||
- To define the list of included meters, use the ``meter_name``
|
||||
syntax.
|
||||
|
||||
- To define the list of excluded meters, use the ``!meter_name``
|
||||
syntax.
|
||||
|
||||
- For meters, which have variants identified by a complex name
|
||||
field, use the wildcard symbol to select all, for example,
|
||||
for ``instance:m1.tiny``, use ``instance:*``.
|
||||
|
||||
.. note::
|
||||
|
||||
Please be aware that we do not have any duplication check between
|
||||
pipelines and if you add a meter to multiple pipelines then it is
|
||||
assumed the duplication is intentional and may be stored multiple
|
||||
times according to the specified sinks.
|
||||
|
||||
The above definition methods can be used in the following combinations:
|
||||
|
||||
- Use only the wildcard symbol.
|
||||
|
||||
- Use the list of included meters.
|
||||
|
||||
- Use the list of excluded meters.
|
||||
|
||||
- Use wildcard symbol with the list of excluded meters.
|
||||
|
||||
.. note::
|
||||
|
||||
At least one of the above variations should be included in the
|
||||
meters section. Included and excluded meters cannot co-exist in the
|
||||
same pipeline. Wildcard and included meters cannot co-exist in the
|
||||
same pipeline definition section.
|
||||
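The filtering rules above can be sketched in a few lines of Python. This is
an illustration only, not the Telemetry implementation: the function names
``validate_meters`` and ``meter_selected`` are hypothetical, and shell-style
matching through ``fnmatch`` is an assumption about how wildcards behave.

.. code-block:: python

   from fnmatch import fnmatchcase


   def validate_meters(meters):
       """Reject meter lists that break the combination rules."""
       if not meters:
           raise ValueError("at least one meter filter is required")
       excluded = [m for m in meters if m.startswith("!")]
       included = [m for m in meters if m != "*" and not m.startswith("!")]
       if included and excluded:
           raise ValueError("included and excluded meters cannot co-exist")
       if "*" in meters and included:
           raise ValueError("wildcard and included meters cannot co-exist")


   def meter_selected(name, meters):
       """Return True if the meter ``name`` passes the filter list."""
       validate_meters(meters)
       excluded = [m[1:] for m in meters if m.startswith("!")]
       if excluded:
           # Exclusion lists (with or without "*") pass everything else.
           return not any(fnmatchcase(name, p) for p in excluded)
       return any(fnmatchcase(name, p) for p in meters)

With this sketch, ``meter_selected("instance:m1.tiny", ["instance:*"])``
is true, and ``meter_selected("cpu", ["*", "!cpu"])`` is false.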
|
||||
The optional resources section of a pipeline source allows a static list
|
||||
of resource URLs to be configured for polling.
|
||||
|
||||
The transformers section of a pipeline sink provides the possibility to
|
||||
add a list of transformer definitions. The available transformers are:
|
||||
|
||||
+-----------------------+------------------------------------+
| Name of transformer   | Reference name for configuration   |
+=======================+====================================+
| Accumulator           | accumulator                        |
+-----------------------+------------------------------------+
| Aggregator            | aggregator                         |
+-----------------------+------------------------------------+
| Arithmetic            | arithmetic                         |
+-----------------------+------------------------------------+
| Rate of change        | rate\_of\_change                   |
+-----------------------+------------------------------------+
| Unit conversion       | unit\_conversion                   |
+-----------------------+------------------------------------+
| Delta                 | delta                              |
+-----------------------+------------------------------------+
|
||||
The publishers section contains the list of publishers, where the
|
||||
sample data should be sent after the possible transformations.
|
||||
|
||||
Similarly, the event pipeline definition looks like:
|
||||
|
||||
.. code-block:: yaml

   ---
   sources:
     - name: 'source name'
       events:
         - 'event filter'
       sinks:
         - 'sink name'
   sinks:
     - name: 'sink name'
       publishers:
         - 'list of publishers'
|
||||
The event filter uses the same filtering logic as the meter pipeline.
|
||||
|
||||
.. _telemetry-transformers:
|
||||
|
||||
Transformers
|
||||
^^^^^^^^^^^^
|
||||
|
||||
The definition of transformers can contain the following fields:
|
||||
|
||||
name
|
||||
Name of the transformer.
|
||||
|
||||
parameters
|
||||
Parameters of the transformer.
|
||||
|
||||
The parameters section can contain transformer specific fields, like
|
||||
source and target fields with different subfields in case of the rate of
|
||||
change, which depends on the implementation of the transformer.
|
||||
|
||||
In the case of the transformer that creates the ``cpu_util`` meter, the
|
||||
definition looks like:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "rate_of_change"
         parameters:
             target:
                 name: "cpu_util"
                 unit: "%"
                 type: "gauge"
                 scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"
|
||||
The rate of change transformer generates the ``cpu_util`` meter
|
||||
from the sample values of the ``cpu`` counter, which represents
|
||||
cumulative CPU time in nanoseconds. The transformer definition above
|
||||
defines a scale factor (for nanoseconds and multiple CPUs), which is
|
||||
applied before the transformation derives a sequence of gauge samples
|
||||
with unit ``%``, from sequential values of the ``cpu`` meter.
|
||||
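As a rough illustration of the arithmetic described above, the following
Python sketch derives a utilization percentage from two cumulative ``cpu``
samples. The function name ``cpu_util`` and its arguments are hypothetical,
and timestamps are assumed to be in seconds; the sketch is not the
transformer's actual code.

.. code-block:: python

   def cpu_util(prev_ns, prev_time, cur_ns, cur_time, cpu_number=1):
       """Derive a CPU utilization gauge (%) from two cumulative samples.

       prev_ns and cur_ns are cumulative CPU times in nanoseconds;
       prev_time and cur_time are sample timestamps in seconds.
       """
       elapsed = cur_time - prev_time
       if elapsed <= 0:
           raise ValueError("samples must be in increasing time order")
       # Same scale factor as in the transformer definition above.
       scale = 100.0 / (10**9 * (cpu_number or 1))
       # Scale first, then take the rate of change over the interval.
       return (cur_ns * scale - prev_ns * scale) / elapsed

For example, 30 seconds of CPU time consumed over a 60 second interval on a
single CPU comes out at roughly 50 percent, and at roughly 25 percent when
``cpu_number`` is 2.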
|
||||
The definition for the disk I/O rate, which is also generated by the
|
||||
rate of change transformer:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "rate_of_change"
         parameters:
             source:
                 map_from:
                     name: "disk\\.(read|write)\\.(bytes|requests)"
                     unit: "(B|request)"
             target:
                 map_to:
                     name: "disk.\\1.\\2.rate"
                     unit: "\\1/s"
                 type: "gauge"
|
||||
**Unit conversion transformer**
|
||||
|
||||
Transformer to apply a unit conversion. It takes the volume of the meter
|
||||
and multiplies it with the given ``scale`` expression. Also supports
|
||||
``map_from`` and ``map_to`` like the rate of change transformer.
|
||||
|
||||
Sample configuration:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "unit_conversion"
         parameters:
             target:
                 name: "disk.kilobytes"
                 unit: "KB"
                 scale: "volume * 1.0 / 1024.0"
|
||||
With ``map_from`` and ``map_to``:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "unit_conversion"
         parameters:
             source:
                 map_from:
                     name: "disk\\.(read|write)\\.bytes"
             target:
                 map_to:
                     name: "disk.\\1.kilobytes"
                 scale: "volume * 1.0 / 1024.0"
                 unit: "KB"
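The effect of ``map_from`` and ``map_to`` can be pictured as a regular
expression substitution on the meter name, combined with the ``scale``
expression on the volume. The Python sketch below assumes that the whole
meter name must match the pattern; the function name ``convert_disk_bytes``
is hypothetical and the code is not taken from the Telemetry source.

.. code-block:: python

   import re

   MAP_FROM = r"disk\.(read|write)\.bytes"
   MAP_TO = r"disk.\1.kilobytes"


   def convert_disk_bytes(name, volume):
       """Return (new_name, new_volume, unit), or None if no match."""
       match = re.fullmatch(MAP_FROM, name)
       if match is None:
           # Meters that do not match map_from are left alone.
           return None
       # \1 in map_to is replaced by the first captured group.
       new_name = match.expand(MAP_TO)
       return new_name, volume * 1.0 / 1024.0, "KB"

Under these assumptions, a ``disk.read.bytes`` sample with a volume of 2048
becomes a ``disk.read.kilobytes`` sample with a volume of 2.0.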
|
||||
**Aggregator transformer**
|
||||
|
||||
A transformer that sums up the incoming samples until enough samples
|
||||
have come in or a timeout has been reached.
|
||||
|
||||
Timeout can be specified with the ``retention_time`` option. If we want
|
||||
to flush the aggregation after a set number of samples have been
|
||||
aggregated, we can specify the size parameter.
|
||||
|
||||
The volume of the created sample is the sum of the volumes of samples
|
||||
that came into the transformer. Samples can be aggregated by the
|
||||
attributes ``project_id``, ``user_id`` and ``resource_metadata``. To aggregate
|
||||
by the chosen attributes, specify them in the configuration and set which
|
||||
value of the attribute to take for the new sample (first to take the
|
||||
first sample's attribute, last to take the last sample's attribute, and
|
||||
drop to discard the attribute).
|
||||
|
||||
To aggregate 60s worth of samples by ``resource_metadata`` and keep the
|
||||
``resource_metadata`` of the latest received sample:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "aggregator"
         parameters:
             retention_time: 60
             resource_metadata: last
|
||||
To aggregate each 15 samples by ``user_id`` and ``resource_metadata`` and keep
|
||||
the ``user_id`` of the first received sample and drop the
|
||||
``resource_metadata``:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "aggregator"
         parameters:
             size: 15
             user_id: first
             resource_metadata: drop
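To make the ``first``, ``last``, and ``drop`` policies concrete, here is a
simplified Python sketch of size-based aggregation. It models samples as
plain dictionaries and assumes they all belong to the same meter and
resource, so the grouping step is left out. The names ``merge_samples`` and
``aggregate_by_size`` are hypothetical.

.. code-block:: python

   def merge_samples(samples, policies):
       """Sum the volumes and resolve each attribute by its policy."""
       if not samples:
           raise ValueError("nothing to aggregate")
       merged = dict(samples[0])
       merged["volume"] = sum(s["volume"] for s in samples)
       for attribute, policy in policies.items():
           if policy == "first":
               merged[attribute] = samples[0][attribute]
           elif policy == "last":
               merged[attribute] = samples[-1][attribute]
           elif policy == "drop":
               merged.pop(attribute, None)
           else:
               raise ValueError(f"unknown policy: {policy}")
       return merged


   def aggregate_by_size(samples, size, policies):
       """Yield one merged sample for every full batch of ``size``."""
       for start in range(0, len(samples) - size + 1, size):
           yield merge_samples(samples[start:start + size], policies)

Calling ``aggregate_by_size(samples, 15, {"user_id": "first",
"resource_metadata": "drop"})`` mirrors the second configuration above.
Samples left over in an incomplete batch are simply not emitted in this
sketch.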
|
||||
**Accumulator transformer**
|
||||
|
||||
This transformer simply caches the samples until enough samples have
|
||||
arrived and then flushes them all down the pipeline at once:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "accumulator"
         parameters:
             size: 15
|
||||
**Multi meter arithmetic transformer**
|
||||
|
||||
This transformer enables us to perform arithmetic calculations over one
|
||||
or more meters and/or their metadata, for example:
|
||||
|
||||
.. code-block:: none

   memory_util = 100 * memory.usage / memory
|
||||
A new sample is created with the properties described in the ``target``
|
||||
section of the transformer's configuration. The sample's
|
||||
volume is the result of the provided expression. The calculation is
|
||||
performed on samples from the same resource.
|
||||
|
||||
.. note::
|
||||
|
||||
The calculation is limited to meters with the same interval.
|
||||
|
||||
Example configuration:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "arithmetic"
         parameters:
             target:
                 name: "memory_util"
                 unit: "%"
                 type: "gauge"
                 expr: "100 * $(memory.usage) / $(memory)"
|
||||
To demonstrate the use of metadata, here is the implementation of a
|
||||
silly meter that shows average CPU time per core:
|
||||
|
||||
.. code-block:: yaml

   transformers:
       - name: "arithmetic"
         parameters:
             target:
                 name: "avg_cpu_per_core"
                 unit: "ns"
                 type: "cumulative"
                 expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"
|
||||
.. note::
|
||||
|
||||
Expression evaluation gracefully handles NaNs and exceptions. In
|
||||
such a case it does not create a new sample but only logs a warning.
|
||||
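The per-resource evaluation and the handling of failed expressions can be
sketched as follows. This is a hedged illustration of the ``memory_util``
expression only. The function name ``memory_util_samples`` and the input
layout (a mapping from resource ID to the latest meter volumes) are
assumptions made for the example.

.. code-block:: python

   import math


   def memory_util_samples(latest):
       """Compute memory_util for each resource that has usable data.

       ``latest`` maps a resource ID to a dict holding the most recent
       ``memory.usage`` and ``memory`` volumes for that resource.
       """
       results = {}
       for resource_id, meters in latest.items():
           try:
               value = 100 * meters["memory.usage"] / meters["memory"]
           except (KeyError, ZeroDivisionError):
               # A missing meter or a failed expression yields no sample.
               continue
           if math.isnan(value):
               continue
           results[resource_id] = value
       return results

Resources with a missing meter, a zero ``memory`` volume, or a NaN result
are skipped rather than producing a sample.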
|
||||
**Delta transformer**
|
||||
|
||||
This transformer calculates the change between two sample datapoints of a
|
||||
resource. It can be configured to capture only the positive growth deltas.
|
||||
|
||||
Example configuration:
|
||||
|
||||
.. code-block:: yaml
|
||||
|
||||
transformers:
|
||||
- name: "delta"
|
||||
parameters:
|
||||
target:
|
||||
name: "cpu.delta"
|
||||
growth_only: True

.. _telemetry-meter-definitions:

Meter definitions

@@ -1110,25 +748,29 @@ your database in a consistent state.

The level of support differs depending on the configured back end:

+--------------------+-------------------+------------------------------------+
| Database           | TTL value support | Note                               |
+====================+===================+====================================+
| MongoDB            | Yes               | MongoDB has native TTL support for |
|                    |                   | deleting samples that are older    |
|                    |                   | than the configured ttl value.     |
+--------------------+-------------------+------------------------------------+
| SQL-based back     | Yes               | ``ceilometer-expirer`` has to be   |
| ends               |                   | used for deleting samples and its  |
|                    |                   | related data from the database.    |
+--------------------+-------------------+------------------------------------+
| HBase              | No                | Telemetry's HBase support does not |
|                    |                   | include native TTL nor             |
|                    |                   | ``ceilometer-expirer`` support.    |
+--------------------+-------------------+------------------------------------+
| DB2 NoSQL          | No                | DB2 NoSQL does not have native TTL |
|                    |                   | nor ``ceilometer-expirer``         |
|                    |                   | support.                           |
+--------------------+-------------------+------------------------------------+

.. list-table::
   :widths: 33 33 33
   :header-rows: 1

   * - Database
     - TTL value support
     - Note
   * - MongoDB
     - Yes
     - MongoDB has native TTL support for deleting samples
       that are older than the configured TTL value.
   * - SQL-based back ends
     - Yes
     - ``ceilometer-expirer`` has to be used for deleting
       samples and their related data from the database.
   * - HBase
     - No
     - Telemetry's HBase support does not include native TTL
       nor ``ceilometer-expirer`` support.
   * - DB2 NoSQL
     - No
     - DB2 NoSQL does not have native TTL
       nor ``ceilometer-expirer`` support.

HTTP dispatcher
---------------

386
doc/admin-guide/source/telemetry-data-pipelines.rst
Normal file
@@ -0,0 +1,386 @@

.. _data-collection-and-processing:

==========================================
Data collection, processing, and pipelines
==========================================

The mechanism by which data is collected and processed is called a
pipeline. Pipelines, at the configuration level, describe a coupling
between sources of data and the corresponding sinks for transformation
and publication of data.

A source is a producer of data: ``samples`` or ``events``. In effect, it is a
set of pollsters or notification handlers emitting datapoints for a set
of matching meters and event types.

Each source configuration encapsulates name matching, polling interval
determination, optional resource enumeration or discovery, and mapping
to one or more sinks for publication.

Data gathered can be used for different purposes, which can impact how
frequently it needs to be published. Typically, a meter published for
billing purposes needs to be updated every 30 minutes, while the same
meter may be needed for performance tuning every minute.

.. warning::

   Avoid rapid polling cadences. They produce a huge amount of data in
   a short time frame, which may negatively affect the performance of
   both Telemetry and the underlying database back end. We strongly
   recommend that you do not use small granularity values, such as
   10 seconds.

A sink, on the other hand, is a consumer of data, providing logic for
the transformation and publication of data emitted from related sources.

In effect, a sink describes a chain of handlers. The chain starts with
zero or more transformers and ends with one or more publishers. The
first transformer in the chain receives data from the corresponding
source and takes some action, such as deriving rate of change, performing
unit conversion, or aggregating, before passing the modified data to the
next step, which is described in :ref:`telemetry-publishers`.

.. _telemetry-pipeline-configuration:

Pipeline configuration
~~~~~~~~~~~~~~~~~~~~~~

The pipeline configuration is, by default, stored in separate configuration
files called ``pipeline.yaml`` and ``event_pipeline.yaml`` next to
the ``ceilometer.conf`` file. The meter pipeline and event pipeline
configuration files can be set by the ``pipeline_cfg_file`` and
``event_pipeline_cfg_file`` options, respectively. These options are
listed in the `Description of configuration options for api table
<http://docs.openstack.org/mitaka/config-reference/telemetry/telemetry_service_config_opts.html>`__
section in the OpenStack Configuration Reference. Multiple
pipelines can be defined in one pipeline configuration file.

The meter pipeline definition looks like:

.. code-block:: yaml

   ---
   sources:
     - name: 'source name'
       interval: 'how often should the samples be injected into the pipeline'
       meters:
         - 'meter filter'
       resources:
         - 'list of resource URLs'
       sinks:
         - 'sink name'
   sinks:
     - name: 'sink name'
       transformers: 'definition of transformers'
       publishers:
         - 'list of publishers'

The interval parameter in the sources section should be defined in
seconds. It determines the polling cadence of sample injection into the
pipeline, where samples are produced under the direct control of an
agent.
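
As an illustration of the billing and performance-tuning cadences
mentioned earlier, the following sketch defines two sources that collect
the same meter at different intervals. The source and sink names are
hypothetical, and the ``notifier://`` publisher is an assumption about the
deployment; see :ref:`telemetry-publishers` for the publishers that are
actually available.

.. code-block:: yaml

   ---
   sources:
       # Hypothetical source for billing: poll every 30 minutes.
       - name: billing_source
         interval: 1800
         meters:
             - "cpu"
         sinks:
             - billing_sink
       # Hypothetical source for performance tuning: poll every minute.
       - name: tuning_source
         interval: 60
         meters:
             - "cpu"
         sinks:
             - tuning_sink
   sinks:
       - name: billing_sink
         transformers:
         publishers:
             - notifier://
       - name: tuning_sink
         transformers:
         publishers:
             - notifier://

Because the same meter appears in two pipelines, each sink publishes its
own copy of the samples.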

There are several ways to define the list of meters for a pipeline
source. The list of valid meters can be found in :ref:`telemetry-measurements`.
You can define all the meters, or only the included or excluded
meters, with which a source should operate:

- To include all meters, use the ``*`` wildcard symbol. It is highly
  advisable to select only the meters that you intend to use, to avoid
  flooding the metering database with unused data.

- To define the list of meters, use any of the following:

  - To define the list of included meters, use the ``meter_name``
    syntax.

  - To define the list of excluded meters, use the ``!meter_name``
    syntax.

  - For meters that have variants identified by a complex name
    field, use the wildcard symbol to select all, for example,
    for ``instance:m1.tiny``, use ``instance:\*``.

.. note::

   The OpenStack Telemetry service does not check for duplication
   between pipelines. If you add a meter to multiple pipelines, the
   duplication is assumed to be intentional, and the meter may be stored
   multiple times according to the specified sinks.

The above definition methods can be used in the following combinations:

- Use only the wildcard symbol.

- Use the list of included meters.

- Use the list of excluded meters.

- Use the wildcard symbol with the list of excluded meters.

.. note::

   At least one of the above variations should be included in the
   meters section. Included and excluded meters cannot co-exist in the
   same pipeline. Wildcard and included meters cannot co-exist in the
   same pipeline definition section.
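
The four combinations can be sketched as the following alternative
``meters`` sections, separated here as individual YAML documents; only
one of them belongs in any given source. The meter names are taken from
examples elsewhere in this guide and are used purely for illustration,
and the comments describe the expected behavior rather than a verified
run.

.. code-block:: yaml

   ---
   # Wildcard only: collect every meter.
   meters:
       - "*"
   ---
   # Included meters only.
   meters:
       - "cpu"
       - "memory.usage"
   ---
   # Excluded meters only: everything except the listed meter.
   meters:
       - "!cpu"
   ---
   # Wildcard with excluded meters.
   meters:
       - "*"
       - "!memory.usage"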

The optional resources section of a pipeline source allows a static list
of resource URLs to be configured for polling.

The transformers section of a pipeline sink lets you add a list of
transformer definitions. The available transformers are:

.. list-table::
   :widths: 50 50
   :header-rows: 1

   * - Name of transformer
     - Reference name for configuration
   * - Accumulator
     - accumulator
   * - Aggregator
     - aggregator
   * - Arithmetic
     - arithmetic
   * - Rate of change
     - rate\_of\_change
   * - Unit conversion
     - unit\_conversion
   * - Delta
     - delta

The publishers section contains the list of publishers to which the
sample data is sent after any transformations.

Similarly, the event pipeline definition looks like:

.. code-block:: yaml

   ---
   sources:
     - name: 'source name'
       events:
         - 'event filter'
       sinks:
         - 'sink name'
   sinks:
     - name: 'sink name'
       publishers:
         - 'list of publishers'

The event filter uses the same filtering logic as the meter pipeline.
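
A minimal sketch of a filled-in event pipeline follows. The source and
sink names, the excluded event type, and the ``notifier://`` publisher are
all assumptions chosen for illustration; substitute the event types and
publishers used in your deployment.

.. code-block:: yaml

   ---
   sources:
       - name: event_source
         events:
             # Wildcard with one excluded (hypothetical) event type.
             - "*"
             - "!example.event.type"
         sinks:
             - event_sink
   sinks:
       - name: event_sink
         publishers:
             - notifier://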

.. _telemetry-transformers:

Transformers
------------

The definition of transformers can contain the following fields:

name
    Name of the transformer.

parameters
    Parameters of the transformer.

The parameters section can contain transformer-specific fields, such as
the source and target fields (with different subfields) used by the rate
of change transformer. The available fields depend on the implementation
of the transformer.

In the case of the transformer that creates the ``cpu_util`` meter, the
definition looks like:

.. code-block:: yaml

   transformers:
       - name: "rate_of_change"
         parameters:
             target:
                 name: "cpu_util"
                 unit: "%"
                 type: "gauge"
                 scale: "100.0 / (10**9 * (resource_metadata.cpu_number or 1))"

The rate of change transformer generates the ``cpu_util`` meter
from the sample values of the ``cpu`` counter, which represents
cumulative CPU time in nanoseconds. The transformer definition above
defines a scale factor (for nanoseconds and multiple CPUs), which is
applied before the transformation derives a sequence of gauge samples
with unit ``%``, from sequential values of the ``cpu`` meter.
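
As a worked illustration with hypothetical numbers (not taken from a real
deployment), suppose two consecutive ``cpu`` samples for an instance with
two vCPUs are taken 10 seconds apart, and the cumulative CPU time grows by
:math:`5 \times 10^{9}` ns between them:

.. math::

   \text{rate} = \frac{5 \times 10^{9}\ \text{ns}}{10\ \text{s}}
               = 5 \times 10^{8}\ \text{ns/s},
   \qquad
   \text{cpu\_util} = 5 \times 10^{8} \times \frac{100.0}{10^{9} \times 2}
                    = 25\ \%

Because the scale factor is a constant multiplier in this example,
applying it before or after taking the rate gives the same result.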

The following definition generates the disk I/O rate, which is also
produced by the rate of change transformer:

.. code-block:: yaml

   transformers:
       - name: "rate_of_change"
         parameters:
             source:
                 map_from:
                     name: "disk\\.(read|write)\\.(bytes|requests)"
                     unit: "(B|request)"
             target:
                 map_to:
                     name: "disk.\\1.\\2.rate"
                     unit: "\\1/s"
                 type: "gauge"

Unit conversion transformer
---------------------------

This transformer applies a unit conversion. It takes the volume of the
meter and multiplies it by the given ``scale`` expression. It also supports
``map_from`` and ``map_to``, like the rate of change transformer.

Sample configuration:

.. code-block:: yaml

   transformers:
       - name: "unit_conversion"
         parameters:
             target:
                 name: "disk.kilobytes"
                 unit: "KB"
                 scale: "volume * 1.0 / 1024.0"

With ``map_from`` and ``map_to``:

.. code-block:: yaml

   transformers:
       - name: "unit_conversion"
         parameters:
             source:
                 map_from:
                     name: "disk\\.(read|write)\\.bytes"
             target:
                 map_to:
                     name: "disk.\\1.kilobytes"
                 scale: "volume * 1.0 / 1024.0"
                 unit: "KB"

Aggregator transformer
----------------------

This transformer sums up the incoming samples until enough samples
have come in or a timeout has been reached.

The timeout can be specified with the ``retention_time`` option. To
flush the aggregation after a set number of samples has been
aggregated, specify the ``size`` parameter.

The volume of the created sample is the sum of the volumes of samples
that came into the transformer. Samples can be aggregated by the
attributes ``project_id``, ``user_id`` and ``resource_metadata``. To aggregate
by the chosen attributes, specify them in the configuration and set which
value of the attribute to take for the new sample: ``first`` to take the
first sample's attribute, ``last`` to take the last sample's attribute, and
``drop`` to discard the attribute.

To aggregate 60s worth of samples by ``resource_metadata`` and keep the
``resource_metadata`` of the latest received sample:

.. code-block:: yaml

   transformers:
       - name: "aggregator"
         parameters:
             retention_time: 60
             resource_metadata: last

To aggregate every 15 samples by ``user_id`` and ``resource_metadata`` and keep
the ``user_id`` of the first received sample and drop the
``resource_metadata``:

.. code-block:: yaml

   transformers:
       - name: "aggregator"
         parameters:
             size: 15
             user_id: first
             resource_metadata: drop
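
Since the transformer is described as flushing when enough samples have
arrived or a timeout has been reached, the two conditions can presumably
be combined. The following sketch assumes that ``size`` and
``retention_time`` may be set together and aggregates by ``project_id``;
verify the behavior against your release before relying on it.

.. code-block:: yaml

   transformers:
       - name: "aggregator"
         parameters:
             # Assumed: flush after 15 samples or 60 seconds,
             # whichever comes first.
             size: 15
             retention_time: 60
             project_id: first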

Accumulator transformer
-----------------------

This transformer simply caches the samples until enough samples have
arrived and then flushes them all down the pipeline at once:

.. code-block:: yaml

   transformers:
       - name: "accumulator"
         parameters:
             size: 15

Multi meter arithmetic transformer
----------------------------------

This transformer enables us to perform arithmetic calculations over one
or more meters and/or their metadata, for example:

.. code-block:: none

   memory_util = 100 * memory.usage / memory

A new sample is created with the properties described in the ``target``
section of the transformer's configuration. The sample's
volume is the result of the provided expression. The calculation is
performed on samples from the same resource.

.. note::

   The calculation is limited to meters with the same interval.

Example configuration:

.. code-block:: yaml

   transformers:
       - name: "arithmetic"
         parameters:
             target:
                 name: "memory_util"
                 unit: "%"
                 type: "gauge"
                 expr: "100 * $(memory.usage) / $(memory)"
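
As a worked illustration with hypothetical values, if a resource reports
a ``memory.usage`` sample of 512 and a ``memory`` sample of 2048 in the
same interval, the expression above yields a new ``memory_util`` sample
with the following volume:

.. math::

   \text{memory\_util} = 100 \times \frac{512}{2048} = 25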

To demonstrate the use of metadata, the following implementation of a
novel meter shows average CPU time per core:

.. code-block:: yaml

   transformers:
       - name: "arithmetic"
         parameters:
             target:
                 name: "avg_cpu_per_core"
                 unit: "ns"
                 type: "cumulative"
                 expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"

.. note::

   Expression evaluation gracefully handles NaNs and exceptions. In
   such a case it does not create a new sample but only logs a warning.

Delta transformer
-----------------

This transformer calculates the change between two sample datapoints of a
resource. It can be configured to capture only the positive growth deltas.

Example configuration:

.. code-block:: yaml

   transformers:
       - name: "delta"
         parameters:
             target:
                 name: "cpu.delta"
             growth_only: True
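
To make the effect of ``growth_only`` concrete, the following annotated
copy of the configuration traces a hypothetical sequence of sample
volumes. The numbers are illustrative only, and the treatment of the
negative delta follows the description above rather than a verified run.

.. code-block:: yaml

   transformers:
       - name: "delta"
         parameters:
             target:
                 name: "cpu.delta"
             # Hypothetical volumes for one resource: 100, 130, 120
             # Deltas between consecutive samples:    +30, -10
             # With growth_only set to True, only the +30 delta is
             # expected to be emitted; the negative delta is dropped.
             growth_only: True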

@@ -49,6 +49,7 @@ tab on an OpenStack dashboard.

   telemetry-system-architecture.rst
   telemetry-data-collection.rst
   telemetry-data-pipelines.rst
   telemetry-data-retrieval.rst
   telemetry-alarms.rst
   telemetry-measurements.rst