=========
Collector
=========

Data format
===========

Internally, CloudKitty's data format is slightly more detailed than the one
described in the `architecture documentation`_.

The internal data format is the following:

.. code-block:: json

   {
       "bananas": [
           {
               "vol": {
                   "unit": "banana",
                   "qty": 1
               },
               "rating": {
                   "price": 1
               },
               "groupby": {
                   "xxx_id": "hello",
                   "yyy_id": "bye"
               },
               "metadata": {
                   "flavor": "chocolate",
                   "eaten_by": "gorilla"
               }
           }
       ]
   }

However, developers implementing a collector don't need to format the data
themselves, as helper functions take care of this.

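
To make the layout concrete, the following sketch builds a structure of the
same shape in plain Python. ``make_point`` is a hypothetical helper written
for this illustration only; it is not one of CloudKitty's helper functions.

.. code-block:: python

   import json


   def make_point(unit, qty, price, groupby, metadata):
       """Return one data point laid out like the JSON document above."""
       return {
           "vol": {"unit": unit, "qty": qty},
           "rating": {"price": price},
           "groupby": dict(groupby),
           "metadata": dict(metadata),
       }


   # Data points are grouped in a list under their metric name.
   frame = {
       "bananas": [
           make_point(
               "banana", 1, 1,
               groupby={"xxx_id": "hello", "yyy_id": "bye"},
               metadata={"flavor": "chocolate", "eaten_by": "gorilla"},
           ),
       ],
   }

   print(json.dumps(frame, indent=2))
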
Implementation
==============

Each collector must inherit from the following class:

.. autoclass:: cloudkitty.collector.BaseCollector
   :members: fetch_all, check_configuration

The ``retrieve`` method of the ``BaseCollector`` class is called by the
orchestrator. This method calls the ``fetch_all`` method of the child class.

To create a collector, you need to implement at least the ``fetch_all`` method.

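
The relationship between the two methods can be sketched with a simplified
stand-in for the base class. ``SketchBaseCollector`` and ``StaticCollector``
are hypothetical names used only for illustration; the real ``BaseCollector``
does more than this sketch shows.

.. code-block:: python

   import abc


   class SketchBaseCollector(abc.ABC):
       """Simplified stand-in for BaseCollector (illustration only)."""

       def retrieve(self, metric_name, start, end,
                    project_id=None, q_filter=None):
           # Entry point used by the caller: delegates to the child class.
           return self.fetch_all(metric_name, start, end,
                                 project_id=project_id, q_filter=q_filter)

       @abc.abstractmethod
       def fetch_all(self, metric_name, start, end,
                     project_id=None, q_filter=None):
           """Return the data collected for one metric, scope and period."""


   class StaticCollector(SketchBaseCollector):
       """Child class that only implements fetch_all."""

       def fetch_all(self, metric_name, start, end,
                     project_id=None, q_filter=None):
           return [(metric_name, project_id, start, end)]


   collector = StaticCollector()
   result = collector.retrieve('bananas', 0, 3600, project_id='scope-1')
   print(result)

Because ``fetch_all`` is abstract in this sketch, instantiating the stand-in
base class directly raises a ``TypeError``, which mirrors the requirement that
a collector implements at least that method.
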
Data collection
+++++++++++++++

Collectors must implement a ``fetch_all`` method. This method is called for
each metric type, for each scope, for each collect period. It has the
following prototype:

.. autoclass:: cloudkitty.collector.BaseCollector
   :members: fetch_all

This method must return a list of ``cloudkitty.dataframe.DataPoint`` objects.

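
The call pattern described above can be pictured as nested loops. This is only
a sketch: the loop order, the ``collect_all`` function and the
``RecordingCollector`` class are assumptions made for illustration, not the
orchestrator's actual code.

.. code-block:: python

   import itertools


   class RecordingCollector:
       """Hypothetical collector that records every fetch_all call."""

       def __init__(self):
           self.calls = []

       def fetch_all(self, metric_name, start, end,
                     project_id=None, q_filter=None):
           self.calls.append((metric_name, project_id, start, end))
           return []  # a real collector returns DataPoint objects


   def collect_all(collector, metrics, scopes, periods):
       """Call fetch_all once per (period, scope, metric) combination."""
       for (start, end), scope, metric in itertools.product(
               periods, scopes, metrics):
           collector.fetch_all(metric, start, end, project_id=scope)


   collector = RecordingCollector()
   collect_all(collector,
               metrics=['cpu', 'volume.size'],
               scopes=['scope-a', 'scope-b'],
               periods=[(0, 3600), (3600, 7200)])
   print(len(collector.calls))  # 2 metrics x 2 scopes x 2 periods = 8
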
Example code of a basic collector:

.. code-block:: python

   from cloudkitty import dataframe
   from cloudkitty.collector import BaseCollector


   class MyCollector(BaseCollector):

       def __init__(self, **kwargs):
           super(MyCollector, self).__init__(**kwargs)

       def fetch_all(self, metric_name, start, end,
                     project_id=None, q_filter=None):
           data = []
           # _query_backend() is a placeholder for your own backend query;
           # it is not provided by BaseCollector.
           for item in self._query_backend(metric_name, start, end,
                                           project_id):
               data.append(dataframe.DataPoint(
                   item['unit'],
                   item['qty'],       # int, float, decimal.Decimal or str
                   0,                 # price
                   item['groupby'],   # dict
                   item['metadata'],  # dict
               ))
           return data

The ``project_id`` parameter name is a legacy one and can be misleading: it
actually contains the ID of the current scope. The attribute corresponding to
the scope is specified in the configuration, under ``[collect]/scope_key``.
Thus, all queries should filter based on this attribute. Example:

.. code-block:: python

   from oslo_config import cfg

   from cloudkitty.collector import BaseCollector

   CONF = cfg.CONF


   class MyCollector(BaseCollector):

       def __init__(self, **kwargs):
           super(MyCollector, self).__init__(**kwargs)

       def fetch_all(self, metric_name, start, end,
                     project_id=None, q_filter=None):
           scope_key = CONF.collect.scope_key
           # Restrict the query to the current scope and collect period.
           filters = {'start': start, 'stop': end, scope_key: project_id}
           # self.client stands for your own backend client.
           data = self.client.query(
               filters=filters,
               groupby=self.conf[metric_name]['groupby'])
           output = []  # build the DataPoint objects from ``data`` here
           return output

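
The effect of ``scope_key`` on the query can be shown without any CloudKitty
or oslo.config dependency. In this sketch, ``build_filters`` is a hypothetical
helper and the two scope keys are example settings; the ``start`` and ``stop``
filter names simply mirror the example above.

.. code-block:: python

   def build_filters(scope_key, scope_id, start, end):
       """Build backend filters restricted to one scope and one period."""
       return {'start': start, 'stop': end, scope_key: scope_id}


   # The same scope ID is matched against a different attribute depending
   # on the configured scope key.
   by_project = build_filters('project_id', 'abc123', 0, 3600)
   by_namespace = build_filters('namespace', 'abc123', 0, 3600)
   print(by_project)
   print(by_namespace)
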
Additional configuration
++++++++++++++++++++++++

If you need to extend the metric configuration (add parameters to the
``extra_args`` section of ``metrics.yml``), you can override the
``check_configuration`` method of the base collector:

.. autoclass:: cloudkitty.collector.BaseCollector
   :members: check_configuration

This method uses `voluptuous`_ for data validation. The base schema for each
metric can be found in ``cloudkitty.collector.METRIC_BASE_SCHEMA``. This schema
is meant to be extended by other collectors. Example taken from the gnocchi
collector code:

.. code-block:: python

   from voluptuous import All, In, Length, Required, Schema

   from cloudkitty import collector

   GNOCCHI_EXTRA_SCHEMA = {
       Required('extra_args'): {
           Required('resource_type'): All(str, Length(min=1)),
           # Due to the Gnocchi model, metrics are grouped by resource.
           # This parameter allows adapting the key of the resource
           # identifier.
           Required('resource_key', default='id'): All(str, Length(min=1)),
           Required('aggregation_method', default='max'):
               In(['max', 'mean', 'min']),
       },
   }


   class GnocchiCollector(collector.BaseCollector):

       collector_name = 'gnocchi'

       @staticmethod
       def check_configuration(conf):
           conf = collector.BaseCollector.check_configuration(conf)
           metric_schema = Schema(collector.METRIC_BASE_SCHEMA).extend(
               GNOCCHI_EXTRA_SCHEMA)

           output = {}
           for metric_name, metric in conf.items():
               met = output[metric_name] = metric_schema(metric)
               # Make sure the resource key is always part of the groupby.
               if met['extra_args']['resource_key'] not in met['groupby']:
                   met['groupby'].append(met['extra_args']['resource_key'])
           return output

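
To see what such a schema does at validation time, here is a small
self-contained sketch that uses ``voluptuous`` directly. ``BASE_SCHEMA`` and
``EXTRA_SCHEMA`` are reduced, hypothetical schemas written for this example;
they are not CloudKitty's ``METRIC_BASE_SCHEMA`` or the full Gnocchi schema.

.. code-block:: python

   from voluptuous import All, In, Length, MultipleInvalid, Required, Schema

   # Reduced stand-in for a base metric schema (illustration only).
   BASE_SCHEMA = {
       Required('unit'): All(str, Length(min=1)),
       Required('groupby', default=list): [All(str, Length(min=1))],
   }

   # Collector-specific additions, in the spirit of the Gnocchi example.
   EXTRA_SCHEMA = {
       Required('extra_args'): {
           Required('resource_type'): All(str, Length(min=1)),
           Required('aggregation_method', default='max'):
               In(['max', 'mean', 'min']),
       },
   }

   metric_schema = Schema(BASE_SCHEMA).extend(EXTRA_SCHEMA)

   # Missing keys that declare a default are filled in during validation.
   validated = metric_schema({
       'unit': 'instance',
       'extra_args': {'resource_type': 'instance'},
   })
   print(validated['extra_args']['aggregation_method'])
   print(validated['groupby'])

   # Values outside the allowed set are rejected.
   try:
       metric_schema({
           'unit': 'instance',
           'extra_args': {'resource_type': 'instance',
                          'aggregation_method': 'median'},
       })
       rejected = False
   except MultipleInvalid:
       rejected = True
   print(rejected)

Returning the validated output rather than the raw input, as the Gnocchi
example does, means the rest of the collector can rely on the defaults being
present.
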
If your collector does not need any ``extra_args``, you do not need to
override the ``check_configuration`` method.

.. _architecture documentation: ../admin/architecture.html
.. _voluptuous: https://github.com/alecthomas/voluptuous