From 0d06a3a014360f694c3b996bc67e8d6aaab31ca8 Mon Sep 17 00:00:00 2001
From: Nejc Saje
Date: Tue, 8 Jul 2014 15:51:50 +0200
Subject: [PATCH] Arithmetic transformer spec

Spec for a transformer that can do arithmetic operations
on one or multiple meters.

Change-Id: If9843250b3a5a020b3f39d45d9a6268bb15a2412
---
 specs/juno/arithmetic-transformer.rst | 266 ++++++++++++++++++++++++++
 1 file changed, 266 insertions(+)
 create mode 100644 specs/juno/arithmetic-transformer.rst

diff --git a/specs/juno/arithmetic-transformer.rst b/specs/juno/arithmetic-transformer.rst
new file mode 100644
index 0000000..4ef5fc2
--- /dev/null
+++ b/specs/juno/arithmetic-transformer.rst
@@ -0,0 +1,266 @@

..
 This work is licensed under a Creative Commons Attribution 3.0 Unported
 License.

 http://creativecommons.org/licenses/by/3.0/legalcode

==========================================
Multi-meter arithmetic transformer
==========================================

https://blueprints.launchpad.net/ceilometer/+spec/arithmetic-transformer

Adding the ability to take more than one meter into account when performing
a transformation. An example of this is memory utilization, where::

    memory_util = 100 * memory.usage / memory

Problem description
===================

Pipeline transformers currently perform calculations only on one or more
values of the same meter. Some calculations need to take multiple meters
into account, as in the case of memory utilization, where::

    memory_util = 100 * memory.usage / memory

Proposed change
===============

We suggest an implementation of a flexible transformer capable of performing
arithmetic operations on multiple meters and/or their metadata. To prevent
an irregular cadence of the produced meter, the calculation is limited to
meters with the same interval.

The configuration of the new transformer would be::

    - name: "arithmetic"
      parameters:
        target:
          name: "memory_util"
          unit: "%"
          type: "gauge"
          expr: "100 * $(memory.usage) / $(memory)"

To demonstrate the use of metadata, here is the configuration of a contrived
metric that shows the average CPU time per core::

    - name: "arithmetic"
      parameters:
        target:
          name: "avg_cpu_per_core"
          unit: "ns"
          type: "cumulative"
          expr: "$(cpu) / ($(cpu).resource_metadata.cpu_number or 1)"

Expression evaluation would gracefully handle NaNs and exceptions. In such
a case the transformer would not create a new sample but would only log
a warning.

**Parsing the meter names**

Since meter names are not necessarily valid Python identifiers, we must
escape them. The expression string is parsed only when the pipeline is
initialized, so parsing does not add any per-sample performance overhead.

The escaping is done as follows::

    $(cpu)      -> cpu
    $(cpu_util) -> cpu_util
    $(cpu.util) -> _cpu_util_ESCAPED (in order to avoid clashes with legal
                                      meter names)
    $(class)    -> _class_ESCAPED (to avoid clashes with Python keywords)

Also, if the meter reference isn't followed by an attribute access (a dot,
'.'), we default to the sample's volume.

So, the expression::

    $(memory.usage) / $(memory)

will be translated to::

    _memory_usage_ESCAPED.volume / memory.volume

and eval'd with the following Namespace_ object

.. _Namespace: https://github.com/openstack/ceilometer/blob/master/ceilometer/transformer/conversions.py#L30

::

    Namespace({
        "_memory_usage_ESCAPED": { ... memory.usage sample dict ... },
        "memory": { ... memory sample dict ... }
    })

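As a rough illustration of these parsing rules (not the proposed
implementation: the helper names, the regular expressions and the use of a
plain ``eval()`` later on are assumptions made for this sketch), the escaping
and volume-defaulting could look roughly like this::

    import keyword
    import re

    # Matches the '$(meter.name)' references used in the expression.
    METER_REF = re.compile(r'\$\(([\w.\-]+)\)')
    # A name we can keep as-is: a plain identifier not starting with '_'.
    PLAIN_NAME = re.compile(r'^[A-Za-z][A-Za-z0-9_]*$')


    def escape_name(name):
        """Map a meter name onto an identifier that cannot clash."""
        if PLAIN_NAME.match(name) and not keyword.iskeyword(name):
            return name
        # e.g. 'cpu.util' -> '_cpu_util_ESCAPED', 'class' -> '_class_ESCAPED'
        return '_' + re.sub(r'\W', '_', name) + '_ESCAPED'


    def transform_expr(expr):
        """Rewrite '$(...)' references, defaulting to the sample volume."""
        def _replace(match):
            name = escape_name(match.group(1))
            followed_by_dot = expr[match.end():match.end() + 1] == '.'
            # '$(cpu).resource_metadata...' keeps its attribute access,
            # while a bare '$(cpu)' becomes 'cpu.volume'.
            return name if followed_by_dot else name + '.volume'
        return METER_REF.sub(_replace, expr)

    # transform_expr("100 * $(memory.usage) / $(memory)")
    # -> "100 * _memory_usage_ESCAPED.volume / memory.volume"

The leading underscore and the ``_ESCAPED`` suffix mean a translated name can
never collide with a meter name that is already a legal identifier.
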
**Gathering the values**

Not all the needed samples will arrive at the same time. Multiple approaches
are possible here.

1. *[CHOSEN APPROACH]* We can cache samples of the necessary meters (keyed by
resource ID) as they arrive. When the pipeline is flushed, we check whether we
have all the necessary samples and, if so, do the transformation (a sketch of
this approach follows the list below).

* *pro*: no need to specify a TTL for cached samples
* *con*: the arithmetic operation is limited to samples that originate
  from the same polling task (and therefore have the same interval)

2. We cache the samples the same way as above, but do the transformation
as soon as we have all the necessary samples (the samples used in the
calculation expression). After the transformation we clear the cache. The
user provides the maximum age (TTL) a sample may have to be considered for
the calculation. For example, we get a sample of 'memory.usage' and already
have a sample of 'memory' in the cache that is less than TTL=10s old, so we
can perform the calculation.

* *pro*: the meters used are not limited to the same polling task
* *pro*: the greatest flexibility of all the options
* *con*: the user needs to specify the maximum age of samples used in the
  calculation
* *con*: might produce samples with an irregular cadence, as in the case of
  a calculation involving meter A with a 60s cadence and meter B with a 45s
  cadence

3. Similar to #2, but limited to meters from sources with the same interval.

* *pro*: the produced samples can't have an irregular cadence
* *pro*: since all the meters involved have the same interval, we can use a
  static TTL for samples, with no need for user configuration. If we know
  that all the samples arrive at approximately the same time (because they
  are on the same interval and therefore in the same polling task), the
  samples only need to stay valid for a short period of time (say, 5
  seconds).
* *con*: how do we set the TTL? With a 600s cadence, 5 seconds is OK, but
  with a 10s cadence it probably isn't.
* *con*: it is unclear how to technically enforce this restriction

4. Same as #2, but provide a configuration option to not clear the cache
after the calculation.

* *pro*: enables use cases such as: we get 'memory.usage' every 10s and
  'memory' every 600s; if we don't clear the cache, we can calculate
  'memory_util' every 10s using the cached value of 'memory'
* *con*: an extra configuration option might be confusing for the user,
  although the benefit might outweigh this

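To make the chosen approach concrete, here is a rough sketch of the
cache-and-flush behaviour. It is only an illustration under assumptions: the
``Sample`` namedtuple stands in for Ceilometer's real sample object, the
constructor arguments are invented for the sketch, and it reuses the
hypothetical ``escape_name()`` helper from the parsing sketch above::

    import collections
    import logging
    import math

    LOG = logging.getLogger(__name__)

    # Simplified stand-in for the real sample object, for illustration only.
    Sample = collections.namedtuple(
        'Sample', 'name type unit volume resource_id resource_metadata')


    class ArithmeticTransformer(object):
        """Sketch of approach #1: cache per resource, compute on flush."""

        def __init__(self, target, required_meters, expr):
            self.target = target                  # e.g. {'name': 'memory_util', ...}
            self.required = set(required_meters)  # e.g. {'memory.usage', 'memory'}
            self.expr = expr                      # already escaped (see above)
            # resource_id -> {meter name -> latest sample}
            self.cache = collections.defaultdict(dict)

        def handle_sample(self, sample):
            """Remember the newest sample of each required meter per resource."""
            if sample.name in self.required:
                self.cache[sample.resource_id][sample.name] = sample

        def flush(self):
            """Emit one target sample per resource with all required meters."""
            new_samples = []
            for resource_id, by_meter in self.cache.items():
                if not self.required.issubset(by_meter):
                    continue  # not enough data arrived during this interval
                # escape_name() is the helper from the parsing sketch above.
                namespace = {escape_name(n): s for n, s in by_meter.items()}
                try:
                    value = eval(self.expr, {}, namespace)
                    if math.isnan(value):
                        raise ValueError('expression evaluated to NaN')
                except Exception as err:
                    # Unsuccessful calculation: warn, produce no sample.
                    LOG.warning('unable to evaluate %s: %s', self.expr, err)
                    continue
                new_samples.append(Sample(
                    name=self.target['name'], type=self.target['type'],
                    unit=self.target['unit'], volume=value,
                    resource_id=resource_id,
                    resource_metadata=next(
                        iter(by_meter.values())).resource_metadata))
            self.cache.clear()
            return new_samples

Because the cache is cleared on every flush, only samples gathered within a
single polling task can be combined, which is exactly the restriction the
chosen approach accepts in exchange for not needing a TTL setting.
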
Alternatives
------------

* require the user to assign the meters to "variables", e.g.::

    - name: "arithmetic"
      parameters:
        target:
          name: "memory_util"
          unit: "%"
          type: "gauge"
          a: "memory.usage"
          b: "memory"
          expr: "100 * a / b"

  This is unnecessary, since the parsing is done only at pipeline
  initialization and so adds no overhead during processing.

* somehow change the pipeline so that it only sends samples to the
  transformer once all the required samples are available

  Our approach is simpler; since other transformers already use caching, it
  wouldn't make sense to push this functionality into the pipeline.

Data model impact
-----------------

None

REST API impact
---------------

None

Security impact
---------------

None

Pipeline impact
---------------

It changes the format of pipeline.yaml, as described in the Proposed change
section.

Other end user impact
---------------------

None

Performance/Scalability Impacts
-------------------------------

None. Expression parsing is performed only at pipeline initialization.

With regard to caching, other transformers already cache values between
intervals. The only difference is that we would store values for more than
one meter.


Other deployer impact
---------------------

None

Developer impact
----------------

None


Implementation
==============

Assignee(s)
-----------

Primary assignee:
  nejc-saje

Ongoing maintainer:
  nejc-saje

Work Items
----------

* create the new transformer
* write unit tests for the new transformer


Future lifecycle
================

None


Dependencies
============

None


Testing
=======

With the current capabilities of Tempest testing, we think unit tests
should suffice.

Documentation Impact
====================

Documentation should be added for the new type of transformer.

Documentation must make it clear that the meters included in the expression
must have the same cadence. If the expression contains meters with different
cadences or nonexistent meters, the user should expect a warning to be logged
for each unsuccessful calculation.


References
==========

None