407bc49f13
This reverts commit 6eb7f3b72b
.
Reason for revert: Kubernetes service patch broke ovn-* relations
Upgrade to kubernetes_service_patch changed the way the service loadbalancer is setup. Before, it updated the application service, now it creates a new one. This broke ovn-* relations to outside the cluster because the `ingress-bound-address` is now filled with a k8s svc ip address, not reachable from the outside.
Change-Id: I38819cfcf647215e1eec595c2457300cb9058f90
2014 lines
79 KiB
Python
2014 lines
79 KiB
Python
# Copyright 2021 Canonical Ltd.
|
|
# See LICENSE file for licensing details.
|
|
|
|
"""## Overview.
|
|
|
|
This document explains how to integrate with the Grafana charm
|
|
for the purpose of providing a dashboard which can be used by
|
|
end users. It also explains the structure of the data
|
|
expected by the `grafana-dashboard` interface, and may provide a
|
|
mechanism or reference point for providing a compatible interface
|
|
or library by providing a definitive reference guide to the
|
|
structure of relation data which is shared between the Grafana
|
|
charm and any charm providing datasource information.
|
|
|
|
## Provider Library Usage
|
|
|
|
The Grafana charm interacts with its dashboards using its charm
|
|
library. The goal of this library is to be as simple to use as
|
|
possible, and instantiation of the class with or without changing
|
|
the default arguments provides a complete use case. For the simplest
|
|
use case of a charm which bundles dashboards and provides a
|
|
`provides: grafana-dashboard` interface,
|
|
|
|
requires:
|
|
grafana-dashboard:
|
|
interface: grafana_dashboard
|
|
|
|
creation of a `GrafanaDashboardProvider` object with the default arguments is
|
|
sufficient.
|
|
|
|
:class:`GrafanaDashboardProvider` expects that bundled dashboards should
|
|
be included in your charm with a default path of:
|
|
|
|
path/to/charm.py
|
|
path/to/src/grafana_dashboards/*.{json|json.tmpl|.tmpl}
|
|
|
|
Where the files are Grafana dashboard JSON data either from the
|
|
Grafana marketplace, or directly exported from a Grafana instance.
|
|
Refer to the [official docs](https://grafana.com/tutorials/provision-dashboards-and-data-sources/)
|
|
for more information.
|
|
|
|
When constructing a dashboard that is intended to be consumed by COS, make sure to use variables
|
|
for your datasources, and name them "prometheusds" and "lokids". You can also use the following
|
|
juju topology variables in your dashboards: $juju_model, $juju_model_uuid, $juju_application
|
|
and $juju_unit. Note, however, that if metrics are coming via peripheral charms (scrape-config
|
|
or cos-config) then topology labels would not exist.
|
|
|
|
The default constructor arguments are:
|
|
|
|
`charm`: `self` from the charm instantiating this library
|
|
`relation_name`: grafana-dashboard
|
|
`dashboards_path`: "/src/grafana_dashboards"
|
|
|
|
If your configuration requires any changes from these defaults, they
|
|
may be set from the class constructor. It may be instantiated as
|
|
follows:
|
|
|
|
from charms.grafana_k8s.v0.grafana_dashboard import GrafanaDashboardProvider
|
|
|
|
class FooCharm:
|
|
def __init__(self, *args):
|
|
super().__init__(*args, **kwargs)
|
|
...
|
|
self.grafana_dashboard_provider = GrafanaDashboardProvider(self)
|
|
...
|
|
|
|
The first argument (`self`) should be a reference to the parent (providing
|
|
dashboards), as this charm's lifecycle events will be used to re-submit
|
|
dashboard information if a charm is upgraded, the pod is restarted, or other.
|
|
|
|
An instantiated `GrafanaDashboardProvider` validates that the path specified
|
|
in the constructor (or the default) exists, reads the file contents, then
|
|
compresses them with LZMA and adds them to the application relation data
|
|
when a relation is established with Grafana.
|
|
|
|
Provided dashboards will be checked by Grafana, and a series of dropdown menus
|
|
providing the ability to select query targets by Juju Model, application instance,
|
|
and unit will be added if they do not exist.
|
|
|
|
To avoid requiring `jinja` in `GrafanaDashboardProvider` users, template validation
|
|
and rendering occurs on the other side of the relation, and relation data in
|
|
the form of:
|
|
|
|
{
|
|
"event": {
|
|
"valid": `true|false`,
|
|
"errors": [],
|
|
}
|
|
}
|
|
|
|
Will be returned if rendering or validation fails. In this case, the
|
|
`GrafanaDashboardProvider` object will emit a `dashboard_status_changed` event
|
|
of the type :class:`GrafanaDashboardEvent`, which will contain information
|
|
about the validation error.
|
|
|
|
This information is added to the relation data for the charms as serialized JSON
|
|
from a dict, with a structure of:
|
|
```
|
|
{
|
|
"application": {
|
|
"dashboards": {
|
|
"uuid": a uuid generated to ensure a relation event triggers,
|
|
"templates": {
|
|
"file:{hash}": {
|
|
"content": `{compressed_template_data}`,
|
|
"charm": `charm.meta.name`,
|
|
"juju_topology": {
|
|
"model": `charm.model.name`,
|
|
"model_uuid": `charm.model.uuid`,
|
|
"application": `charm.app.name`,
|
|
"unit": `charm.unit.name`,
|
|
}
|
|
},
|
|
"file:{other_file_hash}": {
|
|
...
|
|
},
|
|
},
|
|
},
|
|
},
|
|
}
|
|
```
|
|
|
|
This is ingested by :class:`GrafanaDashboardConsumer`, and is sufficient for configuration.
|
|
|
|
The [COS Configuration Charm](https://charmhub.io/cos-configuration-k8s) can be used to
|
|
add dashboards which are not bundled with charms.
|
|
|
|
## Consumer Library Usage
|
|
|
|
The `GrafanaDashboardConsumer` object may be used by Grafana
|
|
charms to manage relations with available dashboards. For this
|
|
purpose, a charm consuming Grafana dashboard information should do
|
|
the following things:
|
|
|
|
1. Instantiate the `GrafanaDashboardConsumer` object by providing it a
|
|
reference to the parent (Grafana) charm and, optionally, the name of
|
|
the relation that the Grafana charm uses to interact with dashboards.
|
|
This relation must confirm to the `grafana-dashboard` interface.
|
|
|
|
For example a Grafana charm may instantiate the
|
|
`GrafanaDashboardConsumer` in its constructor as follows
|
|
|
|
from charms.grafana_k8s.v0.grafana_dashboard import GrafanaDashboardConsumer
|
|
|
|
def __init__(self, *args):
|
|
super().__init__(*args)
|
|
...
|
|
self.grafana_dashboard_consumer = GrafanaDashboardConsumer(self)
|
|
...
|
|
|
|
2. A Grafana charm also needs to listen to the
|
|
`GrafanaDashboardConsumer` events emitted by the `GrafanaDashboardConsumer`
|
|
by adding itself as an observer for these events:
|
|
|
|
self.framework.observe(
|
|
self.grafana_source_consumer.on.sources_changed,
|
|
self._on_dashboards_changed,
|
|
)
|
|
|
|
Dashboards can be retrieved the :meth:`dashboards`:
|
|
|
|
It will be returned in the format of:
|
|
|
|
```
|
|
[
|
|
{
|
|
"id": unique_id,
|
|
"relation_id": relation_id,
|
|
"charm": the name of the charm which provided the dashboard,
|
|
"content": compressed_template_data
|
|
},
|
|
]
|
|
```
|
|
|
|
The consuming charm should decompress the dashboard.
|
|
"""
|
|
|
|
import base64
|
|
import hashlib
|
|
import json
|
|
import logging
|
|
import lzma
|
|
import os
|
|
import platform
|
|
import re
|
|
import subprocess
|
|
import tempfile
|
|
import uuid
|
|
from pathlib import Path
|
|
from typing import Any, Dict, List, Optional, Tuple, Union
|
|
|
|
import yaml
|
|
from ops.charm import (
|
|
CharmBase,
|
|
HookEvent,
|
|
RelationBrokenEvent,
|
|
RelationChangedEvent,
|
|
RelationCreatedEvent,
|
|
RelationEvent,
|
|
RelationRole,
|
|
)
|
|
from ops.framework import (
|
|
EventBase,
|
|
EventSource,
|
|
Object,
|
|
ObjectEvents,
|
|
StoredDict,
|
|
StoredList,
|
|
StoredState,
|
|
)
|
|
from ops.model import Relation
|
|
|
|
# The unique Charmhub library identifier, never change it
|
|
LIBID = "c49eb9c7dfef40c7b6235ebd67010a3f"
|
|
|
|
# Increment this major API version when introducing breaking changes
|
|
LIBAPI = 0
|
|
|
|
# Increment this PATCH version before using `charmcraft publish-lib` or reset
|
|
# to 0 if you are raising the major API version
|
|
|
|
LIBPATCH = 35
|
|
|
|
logger = logging.getLogger(__name__)
|
|
|
|
|
|
DEFAULT_RELATION_NAME = "grafana-dashboard"
|
|
DEFAULT_PEER_NAME = "grafana"
|
|
RELATION_INTERFACE_NAME = "grafana_dashboard"
|
|
|
|
TOPOLOGY_TEMPLATE_DROPDOWNS = [ # type: ignore
|
|
{
|
|
"allValue": ".*",
|
|
"datasource": "${prometheusds}",
|
|
"definition": "label_values(up,juju_model)",
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "Juju model",
|
|
"multi": True,
|
|
"name": "juju_model",
|
|
"query": {
|
|
"query": "label_values(up,juju_model)",
|
|
"refId": "StandardVariableQuery",
|
|
},
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"sort": 0,
|
|
"tagValuesQuery": "",
|
|
"tags": [],
|
|
"tagsQuery": "",
|
|
"type": "query",
|
|
"useTags": False,
|
|
},
|
|
{
|
|
"allValue": ".*",
|
|
"datasource": "${prometheusds}",
|
|
"definition": 'label_values(up{juju_model=~"$juju_model"},juju_model_uuid)',
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "Juju model uuid",
|
|
"multi": True,
|
|
"name": "juju_model_uuid",
|
|
"query": {
|
|
"query": 'label_values(up{juju_model=~"$juju_model"},juju_model_uuid)',
|
|
"refId": "StandardVariableQuery",
|
|
},
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"sort": 0,
|
|
"tagValuesQuery": "",
|
|
"tags": [],
|
|
"tagsQuery": "",
|
|
"type": "query",
|
|
"useTags": False,
|
|
},
|
|
{
|
|
"allValue": ".*",
|
|
"datasource": "${prometheusds}",
|
|
"definition": 'label_values(up{juju_model=~"$juju_model",juju_model_uuid=~"$juju_model_uuid"},juju_application)',
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "Juju application",
|
|
"multi": True,
|
|
"name": "juju_application",
|
|
"query": {
|
|
"query": 'label_values(up{juju_model=~"$juju_model",juju_model_uuid=~"$juju_model_uuid"},juju_application)',
|
|
"refId": "StandardVariableQuery",
|
|
},
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"sort": 0,
|
|
"tagValuesQuery": "",
|
|
"tags": [],
|
|
"tagsQuery": "",
|
|
"type": "query",
|
|
"useTags": False,
|
|
},
|
|
{
|
|
"allValue": ".*",
|
|
"datasource": "${prometheusds}",
|
|
"definition": 'label_values(up{juju_model=~"$juju_model",juju_model_uuid=~"$juju_model_uuid",juju_application=~"$juju_application"},juju_unit)',
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "Juju unit",
|
|
"multi": True,
|
|
"name": "juju_unit",
|
|
"query": {
|
|
"query": 'label_values(up{juju_model=~"$juju_model",juju_model_uuid=~"$juju_model_uuid",juju_application=~"$juju_application"},juju_unit)',
|
|
"refId": "StandardVariableQuery",
|
|
},
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"sort": 0,
|
|
"tagValuesQuery": "",
|
|
"tags": [],
|
|
"tagsQuery": "",
|
|
"type": "query",
|
|
"useTags": False,
|
|
},
|
|
]
|
|
|
|
DATASOURCE_TEMPLATE_DROPDOWNS = [ # type: ignore
|
|
{
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "Prometheus datasource",
|
|
"multi": True,
|
|
"name": "prometheusds",
|
|
"options": [],
|
|
"query": "prometheus",
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"type": "datasource",
|
|
},
|
|
{
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "Loki datasource",
|
|
"multi": True,
|
|
"name": "lokids",
|
|
"options": [],
|
|
"query": "loki",
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"type": "datasource",
|
|
},
|
|
]
|
|
|
|
REACTIVE_CONVERTER = { # type: ignore
|
|
"allValue": None,
|
|
"datasource": "${prometheusds}",
|
|
"definition": 'label_values(up{juju_model=~"$juju_model",juju_model_uuid=~"$juju_model_uuid",juju_application=~"$juju_application"},host)',
|
|
"description": None,
|
|
"error": None,
|
|
"hide": 0,
|
|
"includeAll": True,
|
|
"label": "hosts",
|
|
"multi": True,
|
|
"name": "host",
|
|
"options": [],
|
|
"query": {
|
|
"query": 'label_values(up{juju_model=~"$juju_model",juju_model_uuid=~"$juju_model_uuid",juju_application=~"$juju_application"},host)',
|
|
"refId": "StandardVariableQuery",
|
|
},
|
|
"refresh": 1,
|
|
"regex": "",
|
|
"skipUrlSync": False,
|
|
"sort": 1,
|
|
"tagValuesQuery": "",
|
|
"tags": [],
|
|
"tagsQuery": "",
|
|
"type": "query",
|
|
"useTags": False,
|
|
}
|
|
|
|
|
|
class RelationNotFoundError(Exception):
|
|
"""Raised if there is no relation with the given name."""
|
|
|
|
def __init__(self, relation_name: str):
|
|
self.relation_name = relation_name
|
|
self.message = "No relation named '{}' found".format(relation_name)
|
|
|
|
super().__init__(self.message)
|
|
|
|
|
|
class RelationInterfaceMismatchError(Exception):
|
|
"""Raised if the relation with the given name has a different interface."""
|
|
|
|
def __init__(
|
|
self,
|
|
relation_name: str,
|
|
expected_relation_interface: str,
|
|
actual_relation_interface: str,
|
|
):
|
|
self.relation_name = relation_name
|
|
self.expected_relation_interface = expected_relation_interface
|
|
self.actual_relation_interface = actual_relation_interface
|
|
self.message = (
|
|
"The '{}' relation has '{}' as "
|
|
"interface rather than the expected '{}'".format(
|
|
relation_name, actual_relation_interface, expected_relation_interface
|
|
)
|
|
)
|
|
|
|
super().__init__(self.message)
|
|
|
|
|
|
class RelationRoleMismatchError(Exception):
|
|
"""Raised if the relation with the given name has a different direction."""
|
|
|
|
def __init__(
|
|
self,
|
|
relation_name: str,
|
|
expected_relation_role: RelationRole,
|
|
actual_relation_role: RelationRole,
|
|
):
|
|
self.relation_name = relation_name
|
|
self.expected_relation_interface = expected_relation_role
|
|
self.actual_relation_role = actual_relation_role
|
|
self.message = "The '{}' relation has role '{}' rather than the expected '{}'".format(
|
|
relation_name, repr(actual_relation_role), repr(expected_relation_role)
|
|
)
|
|
|
|
super().__init__(self.message)
|
|
|
|
|
|
class InvalidDirectoryPathError(Exception):
|
|
"""Raised if the grafana dashboards folder cannot be found or is otherwise invalid."""
|
|
|
|
def __init__(
|
|
self,
|
|
grafana_dashboards_absolute_path: str,
|
|
message: str,
|
|
):
|
|
self.grafana_dashboards_absolute_path = grafana_dashboards_absolute_path
|
|
self.message = message
|
|
|
|
super().__init__(self.message)
|
|
|
|
|
|
def _resolve_dir_against_charm_path(charm: CharmBase, *path_elements: str) -> str:
|
|
"""Resolve the provided path items against the directory of the main file.
|
|
|
|
Look up the directory of the charmed operator file being executed. This is normally
|
|
going to be the charm.py file of the charm including this library. Then, resolve
|
|
the provided path elements and return its absolute path.
|
|
|
|
Raises:
|
|
InvalidDirectoryPathError if the resolved path does not exist or it is not a directory
|
|
|
|
"""
|
|
charm_dir = Path(str(charm.charm_dir))
|
|
if not charm_dir.exists() or not charm_dir.is_dir():
|
|
# Operator Framework does not currently expose a robust
|
|
# way to determine the top level charm source directory
|
|
# that is consistent across deployed charms and unit tests
|
|
# Hence for unit tests the current working directory is used
|
|
# TODO: updated this logic when the following ticket is resolved
|
|
# https://github.com/canonical/operator/issues/643
|
|
charm_dir = Path(os.getcwd())
|
|
|
|
dir_path = charm_dir.absolute().joinpath(*path_elements)
|
|
|
|
if not dir_path.exists():
|
|
raise InvalidDirectoryPathError(str(dir_path), "directory does not exist")
|
|
if not dir_path.is_dir():
|
|
raise InvalidDirectoryPathError(str(dir_path), "is not a directory")
|
|
|
|
return str(dir_path)
|
|
|
|
|
|
def _validate_relation_by_interface_and_direction(
|
|
charm: CharmBase,
|
|
relation_name: str,
|
|
expected_relation_interface: str,
|
|
expected_relation_role: RelationRole,
|
|
) -> None:
|
|
"""Verifies that a relation has the necessary characteristics.
|
|
|
|
Verifies that the `relation_name` provided: (1) exists in metadata.yaml,
|
|
(2) declares as interface the interface name passed as `relation_interface`
|
|
and (3) has the right "direction", i.e., it is a relation that `charm`
|
|
provides or requires.
|
|
|
|
Args:
|
|
charm: a `CharmBase` object to scan for the matching relation.
|
|
relation_name: the name of the relation to be verified.
|
|
expected_relation_interface: the interface name to be matched by the
|
|
relation named `relation_name`.
|
|
expected_relation_role: whether the `relation_name` must be either
|
|
provided or required by `charm`.
|
|
|
|
Raises:
|
|
RelationNotFoundError: If there is no relation in the charm's metadata.yaml
|
|
named like the value of the `relation_name` argument.
|
|
RelationInterfaceMismatchError: If the relation interface of the
|
|
relation named as the provided `relation_name` argument does not
|
|
match the `expected_relation_interface` argument.
|
|
RelationRoleMismatchError: If the relation named as the provided `relation_name`
|
|
argument has a different role than what is specified by the
|
|
`expected_relation_role` argument.
|
|
"""
|
|
if relation_name not in charm.meta.relations:
|
|
raise RelationNotFoundError(relation_name)
|
|
|
|
relation = charm.meta.relations[relation_name]
|
|
|
|
actual_relation_interface = relation.interface_name
|
|
if actual_relation_interface and actual_relation_interface != expected_relation_interface:
|
|
raise RelationInterfaceMismatchError(
|
|
relation_name, expected_relation_interface, actual_relation_interface
|
|
)
|
|
|
|
if expected_relation_role == RelationRole.provides:
|
|
if relation_name not in charm.meta.provides:
|
|
raise RelationRoleMismatchError(
|
|
relation_name, RelationRole.provides, RelationRole.requires
|
|
)
|
|
elif expected_relation_role == RelationRole.requires:
|
|
if relation_name not in charm.meta.requires:
|
|
raise RelationRoleMismatchError(
|
|
relation_name, RelationRole.requires, RelationRole.provides
|
|
)
|
|
else:
|
|
raise Exception("Unexpected RelationDirection: {}".format(expected_relation_role))
|
|
|
|
|
|
def _encode_dashboard_content(content: Union[str, bytes]) -> str:
|
|
if isinstance(content, str):
|
|
content = bytes(content, "utf-8")
|
|
|
|
return base64.b64encode(lzma.compress(content)).decode("utf-8")
|
|
|
|
|
|
def _decode_dashboard_content(encoded_content: str) -> str:
|
|
return lzma.decompress(base64.b64decode(encoded_content.encode("utf-8"))).decode()
|
|
|
|
|
|
def _convert_dashboard_fields(content: str, inject_dropdowns: bool = True) -> str:
|
|
"""Make sure values are present for Juju topology.
|
|
|
|
Inserts Juju topology variables and selectors into the template, as well as
|
|
a variable for Prometheus.
|
|
"""
|
|
dict_content = json.loads(content)
|
|
datasources = {}
|
|
existing_templates = False
|
|
|
|
template_dropdowns = (
|
|
TOPOLOGY_TEMPLATE_DROPDOWNS + DATASOURCE_TEMPLATE_DROPDOWNS # type: ignore
|
|
if inject_dropdowns
|
|
else DATASOURCE_TEMPLATE_DROPDOWNS
|
|
)
|
|
|
|
# If the dashboard has __inputs, get the names to replace them. These are stripped
|
|
# from reactive dashboards in GrafanaDashboardAggregator, but charm authors in
|
|
# newer charms may import them directly from the marketplace
|
|
if "__inputs" in dict_content:
|
|
for field in dict_content["__inputs"]:
|
|
if "type" in field and field["type"] == "datasource":
|
|
datasources[field["name"]] = field["pluginName"].lower()
|
|
del dict_content["__inputs"]
|
|
|
|
# If no existing template variables exist, just insert our own
|
|
if "templating" not in dict_content:
|
|
dict_content["templating"] = {"list": list(template_dropdowns)} # type: ignore
|
|
else:
|
|
# Otherwise, set a flag so we can go back later
|
|
existing_templates = True
|
|
for template_value in dict_content["templating"]["list"]:
|
|
# Build a list of `datasource_name`: `datasource_type` mappings
|
|
# The "query" field is actually "prometheus", "loki", "influxdb", etc
|
|
if "type" in template_value and template_value["type"] == "datasource":
|
|
datasources[template_value["name"]] = template_value["query"].lower()
|
|
|
|
# Put our own variables in the template
|
|
for d in template_dropdowns: # type: ignore
|
|
if d not in dict_content["templating"]["list"]:
|
|
dict_content["templating"]["list"].insert(0, d)
|
|
|
|
dict_content = _replace_template_fields(dict_content, datasources, existing_templates)
|
|
return json.dumps(dict_content)
|
|
|
|
|
|
def _replace_template_fields( # noqa: C901
|
|
dict_content: dict, datasources: dict, existing_templates: bool
|
|
) -> dict:
|
|
"""Make templated fields get cleaned up afterwards.
|
|
|
|
If existing datasource variables are present, try to substitute them.
|
|
"""
|
|
replacements = {"loki": "${lokids}", "prometheus": "${prometheusds}"}
|
|
used_replacements = [] # type: List[str]
|
|
|
|
# If any existing datasources match types we know, or we didn't find
|
|
# any templating variables at all, template them.
|
|
if datasources or not existing_templates:
|
|
panels = dict_content.get("panels", {})
|
|
if panels:
|
|
dict_content["panels"] = _template_panels(
|
|
panels, replacements, used_replacements, existing_templates, datasources
|
|
)
|
|
|
|
# Find panels nested under rows
|
|
rows = dict_content.get("rows", {})
|
|
if rows:
|
|
for row_idx, row in enumerate(rows):
|
|
if "panels" in row.keys():
|
|
rows[row_idx]["panels"] = _template_panels(
|
|
row["panels"],
|
|
replacements,
|
|
used_replacements,
|
|
existing_templates,
|
|
datasources,
|
|
)
|
|
|
|
dict_content["rows"] = rows
|
|
|
|
# Finally, go back and pop off the templates we stubbed out
|
|
deletions = []
|
|
for tmpl in dict_content["templating"]["list"]:
|
|
if tmpl["name"] and tmpl["name"] in used_replacements:
|
|
deletions.append(tmpl)
|
|
|
|
for d in deletions:
|
|
dict_content["templating"]["list"].remove(d)
|
|
|
|
return dict_content
|
|
|
|
|
|
def _template_panels(
|
|
panels: dict,
|
|
replacements: dict,
|
|
used_replacements: list,
|
|
existing_templates: bool,
|
|
datasources: dict,
|
|
) -> dict:
|
|
"""Iterate through a `panels` object and template it appropriately."""
|
|
# Go through all the panels. If they have a datasource set, AND it's one
|
|
# that we can convert to ${lokids} or ${prometheusds}, by stripping off the
|
|
# ${} templating and comparing the name to the list we built, replace it,
|
|
# otherwise, leave it alone.
|
|
#
|
|
for panel in panels:
|
|
if "datasource" not in panel or not panel.get("datasource"):
|
|
continue
|
|
if not existing_templates:
|
|
datasource = panel.get("datasource")
|
|
if isinstance(datasource, str):
|
|
if "loki" in datasource:
|
|
panel["datasource"] = "${lokids}"
|
|
elif "grafana" in datasource:
|
|
continue
|
|
else:
|
|
panel["datasource"] = "${prometheusds}"
|
|
elif isinstance(datasource, dict):
|
|
# In dashboards exported by Grafana 9, datasource type is dict
|
|
dstype = datasource.get("type", "")
|
|
if dstype == "loki":
|
|
panel["datasource"]["uid"] = "${lokids}"
|
|
elif dstype == "prometheus":
|
|
panel["datasource"]["uid"] = "${prometheusds}"
|
|
else:
|
|
logger.debug("Unrecognized datasource type '%s'; skipping", dstype)
|
|
continue
|
|
else:
|
|
logger.error("Unknown datasource format: skipping")
|
|
continue
|
|
else:
|
|
if isinstance(panel["datasource"], str):
|
|
if panel["datasource"].lower() in replacements.values():
|
|
# Already a known template variable
|
|
continue
|
|
# Strip out variable characters and maybe braces
|
|
ds = re.sub(r"(\$|\{|\})", "", panel["datasource"])
|
|
|
|
if ds not in datasources.keys():
|
|
# Unknown, non-templated datasource, potentially a Grafana builtin
|
|
continue
|
|
|
|
replacement = replacements.get(datasources[ds], "")
|
|
if replacement:
|
|
used_replacements.append(ds)
|
|
panel["datasource"] = replacement or panel["datasource"]
|
|
elif isinstance(panel["datasource"], dict):
|
|
dstype = panel["datasource"].get("type", "")
|
|
if panel["datasource"].get("uid", "").lower() in replacements.values():
|
|
# Already a known template variable
|
|
continue
|
|
# Strip out variable characters and maybe braces
|
|
ds = re.sub(r"(\$|\{|\})", "", panel["datasource"].get("uid", ""))
|
|
|
|
if ds not in datasources.keys():
|
|
# Unknown, non-templated datasource, potentially a Grafana builtin
|
|
continue
|
|
|
|
replacement = replacements.get(datasources[ds], "")
|
|
if replacement:
|
|
used_replacements.append(ds)
|
|
panel["datasource"]["uid"] = replacement
|
|
else:
|
|
logger.error("Unknown datasource format: skipping")
|
|
continue
|
|
return panels
|
|
|
|
|
|
def _inject_labels(content: str, topology: dict, transformer: "CosTool") -> str:
|
|
"""Inject Juju topology into panel expressions via CosTool.
|
|
|
|
A dashboard will have a structure approximating:
|
|
{
|
|
"__inputs": [],
|
|
"templating": {
|
|
"list": [
|
|
{
|
|
"name": "prometheusds",
|
|
"type": "prometheus"
|
|
}
|
|
]
|
|
},
|
|
"panels": [
|
|
{
|
|
"foo": "bar",
|
|
"targets": [
|
|
{
|
|
"some": "field",
|
|
"expr": "up{job="foo"}"
|
|
},
|
|
{
|
|
"some_other": "field",
|
|
"expr": "sum(http_requests_total{instance="$foo"}[5m])}
|
|
}
|
|
],
|
|
"datasource": "${someds}"
|
|
}
|
|
]
|
|
}
|
|
|
|
`templating` is used elsewhere in this library, but the structure is not rigid. It is
|
|
not guaranteed that a panel will actually have any targets (it could be a "spacer" with
|
|
no datasource, hence no expression). It could have only one target. It could have multiple
|
|
targets. It could have multiple targets of which only one has an `expr` to evaluate. We need
|
|
to try to handle all of these concisely.
|
|
|
|
`cos-tool` (`github.com/canonical/cos-tool` as a Go module in general)
|
|
does not know "Grafana-isms", such as using `[$_variable]` to modify the query from the user
|
|
interface, so we add placeholders (as `5y`, since it must parse, but a dashboard looking for
|
|
five years for a panel query would be unusual).
|
|
|
|
Args:
|
|
content: dashboard content as a string
|
|
topology: a dict containing topology values
|
|
transformer: a 'CosTool' instance
|
|
Returns:
|
|
dashboard content with replaced values.
|
|
"""
|
|
dict_content = json.loads(content)
|
|
|
|
if "panels" not in dict_content.keys():
|
|
return json.dumps(dict_content)
|
|
|
|
# Go through all the panels and inject topology labels
|
|
# Panels may have more than one 'target' where the expressions live, so that must be
|
|
# accounted for. Additionally, `promql-transform` does not necessarily gracefully handle
|
|
# expressions with range queries including variables. Exclude these.
|
|
#
|
|
# It is not a certainty that the `datasource` field will necessarily reflect the type, so
|
|
# operate on all fields.
|
|
panels = dict_content["panels"]
|
|
topology_with_prefix = {"juju_{}".format(k): v for k, v in topology.items()}
|
|
|
|
# We need to use an index so we can insert the changed element back later
|
|
for panel_idx, panel in enumerate(panels):
|
|
if not isinstance(panel, dict):
|
|
continue
|
|
|
|
# Use the index to insert it back in the same location
|
|
panels[panel_idx] = _modify_panel(panel, topology_with_prefix, transformer)
|
|
|
|
return json.dumps(dict_content)
|
|
|
|
|
|
def _modify_panel(panel: dict, topology: dict, transformer: "CosTool") -> dict:
|
|
"""Inject Juju topology into panel expressions via CosTool.
|
|
|
|
Args:
|
|
panel: a dashboard panel as a dict
|
|
topology: a dict containing topology values
|
|
transformer: a 'CosTool' instance
|
|
Returns:
|
|
the panel with injected values
|
|
"""
|
|
if "targets" not in panel.keys():
|
|
return panel
|
|
|
|
# Pre-compile a regular expression to grab values from inside of []
|
|
range_re = re.compile(r"\[(?P<value>.*?)\]")
|
|
# Do the same for any offsets
|
|
offset_re = re.compile(r"offset\s+(?P<value>-?\s*[$\w]+)")
|
|
|
|
known_datasources = {"${prometheusds}": "promql", "${lokids}": "logql"}
|
|
|
|
targets = panel["targets"]
|
|
|
|
# We need to use an index so we can insert the changed element back later
|
|
for idx, target in enumerate(targets):
|
|
# If there's no expression, we don't need to do anything
|
|
if "expr" not in target.keys():
|
|
continue
|
|
expr = target["expr"]
|
|
|
|
if "datasource" not in panel.keys():
|
|
continue
|
|
|
|
if isinstance(panel["datasource"], str):
|
|
if panel["datasource"] not in known_datasources:
|
|
continue
|
|
querytype = known_datasources[panel["datasource"]]
|
|
elif isinstance(panel["datasource"], dict):
|
|
if panel["datasource"]["uid"] not in known_datasources:
|
|
continue
|
|
querytype = known_datasources[panel["datasource"]["uid"]]
|
|
else:
|
|
logger.error("Unknown datasource format: skipping")
|
|
continue
|
|
|
|
# Capture all values inside `[]` into a list which we'll iterate over later to
|
|
# put them back in-order. Then apply the regex again and replace everything with
|
|
# `[5y]` so promql/parser will take it.
|
|
#
|
|
# Then do it again for offsets
|
|
range_values = [m.group("value") for m in range_re.finditer(expr)]
|
|
expr = range_re.sub(r"[5y]", expr)
|
|
|
|
offset_values = [m.group("value") for m in offset_re.finditer(expr)]
|
|
expr = offset_re.sub(r"offset 5y", expr)
|
|
# Retrieve the new expression (which may be unchanged if there were no label
|
|
# matchers in the expression, or if tt was unable to be parsed like logql. It's
|
|
# virtually impossible to tell from any datasource "name" in a panel what the
|
|
# actual type is without re-implementing a complete dashboard parser, but no
|
|
# harm will some from passing invalid promql -- we'll just get the original back.
|
|
#
|
|
replacement = transformer.inject_label_matchers(expr, topology, querytype)
|
|
|
|
if replacement == target["expr"]:
|
|
# promql-tranform caught an error. Move on
|
|
continue
|
|
|
|
# Go back and substitute values in [] which were pulled out
|
|
# Enumerate with an index... again. The same regex is ok, since it will still match
|
|
# `[(.*?)]`, which includes `[5y]`, our placeholder
|
|
for i, match in enumerate(range_re.finditer(replacement)):
|
|
# Replace one-by-one, starting from the left. We build the string back with
|
|
# `str.replace(string_to_replace, replacement_value, count)`. Limit the count
|
|
# to one, since we are going through one-by-one through the list we saved earlier
|
|
# in `range_values`.
|
|
replacement = replacement.replace(
|
|
"[{}]".format(match.group("value")),
|
|
"[{}]".format(range_values[i]),
|
|
1,
|
|
)
|
|
|
|
for i, match in enumerate(offset_re.finditer(replacement)):
|
|
# Replace one-by-one, starting from the left. We build the string back with
|
|
# `str.replace(string_to_replace, replacement_value, count)`. Limit the count
|
|
# to one, since we are going through one-by-one through the list we saved earlier
|
|
# in `range_values`.
|
|
replacement = replacement.replace(
|
|
"offset {}".format(match.group("value")),
|
|
"offset {}".format(offset_values[i]),
|
|
1,
|
|
)
|
|
|
|
# Use the index to insert it back in the same location
|
|
targets[idx]["expr"] = replacement
|
|
|
|
panel["targets"] = targets
|
|
return panel
|
|
|
|
|
|
def _type_convert_stored(obj):
|
|
"""Convert Stored* to their appropriate types, recursively."""
|
|
if isinstance(obj, StoredList):
|
|
return list(map(_type_convert_stored, obj))
|
|
if isinstance(obj, StoredDict):
|
|
rdict = {} # type: Dict[Any, Any]
|
|
for k in obj.keys():
|
|
rdict[k] = _type_convert_stored(obj[k])
|
|
return rdict
|
|
return obj
|
|
|
|
|
|
class GrafanaDashboardsChanged(EventBase):
|
|
"""Event emitted when Grafana dashboards change."""
|
|
|
|
def __init__(self, handle, data=None):
|
|
super().__init__(handle)
|
|
self.data = data
|
|
|
|
def snapshot(self) -> Dict:
|
|
"""Save grafana source information."""
|
|
return {"data": self.data}
|
|
|
|
def restore(self, snapshot):
|
|
"""Restore grafana source information."""
|
|
self.data = snapshot["data"]
|
|
|
|
|
|
class GrafanaDashboardEvents(ObjectEvents):
|
|
"""Events raised by :class:`GrafanaSourceEvents`."""
|
|
|
|
dashboards_changed = EventSource(GrafanaDashboardsChanged)
|
|
|
|
|
|
class GrafanaDashboardEvent(EventBase):
|
|
"""Event emitted when Grafana dashboards cannot be resolved.
|
|
|
|
Enables us to set a clear status on the provider.
|
|
"""
|
|
|
|
def __init__(self, handle, errors: List[Dict[str, str]] = [], valid: bool = False):
|
|
super().__init__(handle)
|
|
self.errors = errors
|
|
self.error_message = "; ".join([error["error"] for error in errors if "error" in error])
|
|
self.valid = valid
|
|
|
|
def snapshot(self) -> Dict:
|
|
"""Save grafana source information."""
|
|
return {
|
|
"error_message": self.error_message,
|
|
"valid": self.valid,
|
|
"errors": json.dumps(self.errors),
|
|
}
|
|
|
|
def restore(self, snapshot):
|
|
"""Restore grafana source information."""
|
|
self.error_message = snapshot["error_message"]
|
|
self.valid = snapshot["valid"]
|
|
self.errors = json.loads(str(snapshot["errors"]))
|
|
|
|
|
|
class GrafanaProviderEvents(ObjectEvents):
|
|
"""Events raised by :class:`GrafanaSourceEvents`."""
|
|
|
|
dashboard_status_changed = EventSource(GrafanaDashboardEvent)
|
|
|
|
|
|
class GrafanaDashboardProvider(Object):
|
|
"""An API to provide Grafana dashboards to a Grafana charm."""
|
|
|
|
_stored = StoredState()
|
|
on = GrafanaProviderEvents() # pyright: ignore
|
|
|
|
def __init__(
|
|
self,
|
|
charm: CharmBase,
|
|
relation_name: str = DEFAULT_RELATION_NAME,
|
|
dashboards_path: str = "src/grafana_dashboards",
|
|
) -> None:
|
|
"""API to provide Grafana dashboard to a Grafana charmed operator.
|
|
|
|
The :class:`GrafanaDashboardProvider` object provides an API
|
|
to upload dashboards to a Grafana charm. In its most streamlined
|
|
usage, the :class:`GrafanaDashboardProvider` is integrated in a
|
|
charmed operator as follows:
|
|
|
|
self.grafana = GrafanaDashboardProvider(self)
|
|
|
|
The :class:`GrafanaDashboardProvider` will look for dashboard
|
|
templates in the `<charm-py-directory>/grafana_dashboards` folder.
|
|
Additionally, dashboard templates can be uploaded programmatically
|
|
via the :method:`GrafanaDashboardProvider.add_dashboard` method.
|
|
|
|
To use the :class:`GrafanaDashboardProvider` API, you need a relation
|
|
defined in your charm operator's metadata.yaml as follows:
|
|
|
|
provides:
|
|
grafana-dashboard:
|
|
interface: grafana_dashboard
|
|
|
|
If you would like to use relation name other than `grafana-dashboard`,
|
|
you will need to specify the relation name via the `relation_name`
|
|
argument when instantiating the :class:`GrafanaDashboardProvider` object.
|
|
However, it is strongly advised to keep the default relation name,
|
|
so that people deploying your charm will have a consistent experience
|
|
with all other charms that provide Grafana dashboards.
|
|
|
|
It is possible to provide a different file path for the Grafana dashboards
|
|
to be automatically managed by the :class:`GrafanaDashboardProvider` object
|
|
via the `dashboards_path` argument. This may be necessary when the directory
|
|
structure of your charmed operator repository is not the "usual" one as
|
|
generated by `charmcraft init`, for example when adding the charmed operator
|
|
in a Java repository managed by Maven or Gradle. However, unless there are
|
|
such constraints with other tooling, it is strongly advised to store the
|
|
Grafana dashboards in the default `<charm-py-directory>/grafana_dashboards`
|
|
folder, in order to provide a consistent experience for other charmed operator
|
|
authors.
|
|
|
|
Args:
|
|
charm: a :class:`CharmBase` object which manages this
|
|
:class:`GrafanaProvider` object. Generally this is
|
|
`self` in the instantiating class.
|
|
relation_name: a :string: name of the relation managed by this
|
|
:class:`GrafanaDashboardProvider`; it defaults to "grafana-dashboard".
|
|
dashboards_path: a filesystem path relative to the charm root
|
|
where dashboard templates can be located. By default, the library
|
|
expects dashboard files to be in the `<charm-py-directory>/grafana_dashboards`
|
|
directory.
|
|
"""
|
|
_validate_relation_by_interface_and_direction(
|
|
charm, relation_name, RELATION_INTERFACE_NAME, RelationRole.provides
|
|
)
|
|
|
|
try:
|
|
dashboards_path = _resolve_dir_against_charm_path(charm, dashboards_path)
|
|
except InvalidDirectoryPathError as e:
|
|
logger.warning(
|
|
"Invalid Grafana dashboards folder at %s: %s",
|
|
e.grafana_dashboards_absolute_path,
|
|
e.message,
|
|
)
|
|
|
|
super().__init__(charm, relation_name)
|
|
|
|
self._charm = charm
|
|
self._relation_name = relation_name
|
|
self._dashboards_path = dashboards_path
|
|
|
|
# No peer relation bucket we can rely on providers, keep StoredState here, too
|
|
self._stored.set_default(dashboard_templates={}) # type: ignore
|
|
|
|
self.framework.observe(self._charm.on.leader_elected, self._update_all_dashboards_from_dir)
|
|
self.framework.observe(self._charm.on.upgrade_charm, self._update_all_dashboards_from_dir)
|
|
|
|
self.framework.observe(
|
|
self._charm.on[self._relation_name].relation_created,
|
|
self._on_grafana_dashboard_relation_created,
|
|
)
|
|
self.framework.observe(
|
|
self._charm.on[self._relation_name].relation_changed,
|
|
self._on_grafana_dashboard_relation_changed,
|
|
)
|
|
|
|
def add_dashboard(self, content: str, inject_dropdowns: bool = True) -> None:
|
|
"""Add a dashboard to the relation managed by this :class:`GrafanaDashboardProvider`.
|
|
|
|
Args:
|
|
content: a string representing a Jinja template. Currently, no
|
|
global variables are added to the Jinja template evaluation
|
|
context.
|
|
inject_dropdowns: a :boolean: indicating whether topology dropdowns should be
|
|
added to the dashboard
|
|
"""
|
|
# Update of storage must be done irrespective of leadership, so
|
|
# that the stored state is there when this unit becomes leader.
|
|
stored_dashboard_templates: Any = self._stored.dashboard_templates # pyright: ignore
|
|
|
|
encoded_dashboard = _encode_dashboard_content(content)
|
|
|
|
# Use as id the first chars of the encoded dashboard, so that
|
|
# it is predictable across units.
|
|
id = "prog:{}".format(encoded_dashboard[-24:-16])
|
|
|
|
stored_dashboard_templates[id] = self._content_to_dashboard_object(
|
|
encoded_dashboard, inject_dropdowns
|
|
)
|
|
stored_dashboard_templates[id]["dashboard_alt_uid"] = self._generate_alt_uid(id)
|
|
|
|
if self._charm.unit.is_leader():
|
|
for dashboard_relation in self._charm.model.relations[self._relation_name]:
|
|
self._upset_dashboards_on_relation(dashboard_relation)
|
|
|
|
def remove_non_builtin_dashboards(self) -> None:
|
|
"""Remove all dashboards to the relation added via :method:`add_dashboard`."""
|
|
# Update of storage must be done irrespective of leadership, so
|
|
# that the stored state is there when this unit becomes leader.
|
|
stored_dashboard_templates: Any = self._stored.dashboard_templates # pyright: ignore
|
|
|
|
for dashboard_id in list(stored_dashboard_templates.keys()):
|
|
if dashboard_id.startswith("prog:"):
|
|
del stored_dashboard_templates[dashboard_id]
|
|
self._stored.dashboard_templates = stored_dashboard_templates
|
|
|
|
if self._charm.unit.is_leader():
|
|
for dashboard_relation in self._charm.model.relations[self._relation_name]:
|
|
self._upset_dashboards_on_relation(dashboard_relation)
|
|
|
|
def update_dashboards(self) -> None:
|
|
"""Trigger the re-evaluation of the data on all relations."""
|
|
if self._charm.unit.is_leader():
|
|
for dashboard_relation in self._charm.model.relations[self._relation_name]:
|
|
self._upset_dashboards_on_relation(dashboard_relation)
|
|
|
|
def _update_all_dashboards_from_dir(
|
|
self, _: Optional[HookEvent] = None, inject_dropdowns: bool = True
|
|
) -> None:
|
|
"""Scans the built-in dashboards and updates relations with changes."""
|
|
# Update of storage must be done irrespective of leadership, so
|
|
# that the stored state is there when this unit becomes leader.
|
|
|
|
# Ensure we do not leave outdated dashboards by removing from stored all
|
|
# the encoded dashboards that start with "file/".
|
|
if self._dashboards_path:
|
|
stored_dashboard_templates: Any = self._stored.dashboard_templates # pyright: ignore
|
|
|
|
for dashboard_id in list(stored_dashboard_templates.keys()):
|
|
if dashboard_id.startswith("file:"):
|
|
del stored_dashboard_templates[dashboard_id]
|
|
|
|
# Path.glob uses fnmatch on the backend, which is pretty limited, so use a
|
|
# custom function for the filter
|
|
def _is_dashboard(p: Path) -> bool:
|
|
return p.is_file() and p.name.endswith((".json", ".json.tmpl", ".tmpl"))
|
|
|
|
for path in filter(_is_dashboard, Path(self._dashboards_path).glob("*")):
|
|
# path = Path(path)
|
|
id = "file:{}".format(path.stem)
|
|
stored_dashboard_templates[id] = self._content_to_dashboard_object(
|
|
_encode_dashboard_content(path.read_bytes()), inject_dropdowns
|
|
)
|
|
stored_dashboard_templates[id]["dashboard_alt_uid"] = self._generate_alt_uid(id)
|
|
|
|
self._stored.dashboard_templates = stored_dashboard_templates
|
|
|
|
if self._charm.unit.is_leader():
|
|
for dashboard_relation in self._charm.model.relations[self._relation_name]:
|
|
self._upset_dashboards_on_relation(dashboard_relation)
|
|
|
|
def _generate_alt_uid(self, key: str) -> str:
|
|
"""Generate alternative uid for dashboards.
|
|
|
|
Args:
|
|
key: A string used (along with charm.meta.name) to build the hash uid.
|
|
|
|
Returns: A hash string.
|
|
"""
|
|
raw_dashboard_alt_uid = "{}-{}".format(self._charm.meta.name, key)
|
|
return hashlib.shake_256(raw_dashboard_alt_uid.encode("utf-8")).hexdigest(8)
|
|
|
|
def _reinitialize_dashboard_data(self, inject_dropdowns: bool = True) -> None:
|
|
"""Triggers a reload of dashboard outside of an eventing workflow.
|
|
|
|
Args:
|
|
inject_dropdowns: a :bool: used to indicate whether topology dropdowns should be added
|
|
|
|
This will destroy any existing relation data.
|
|
"""
|
|
try:
|
|
_resolve_dir_against_charm_path(self._charm, self._dashboards_path)
|
|
self._update_all_dashboards_from_dir(inject_dropdowns=inject_dropdowns)
|
|
|
|
except InvalidDirectoryPathError as e:
|
|
logger.warning(
|
|
"Invalid Grafana dashboards folder at %s: %s",
|
|
e.grafana_dashboards_absolute_path,
|
|
e.message,
|
|
)
|
|
stored_dashboard_templates: Any = self._stored.dashboard_templates # pyright: ignore
|
|
|
|
for dashboard_id in list(stored_dashboard_templates.keys()):
|
|
if dashboard_id.startswith("file:"):
|
|
del stored_dashboard_templates[dashboard_id]
|
|
self._stored.dashboard_templates = stored_dashboard_templates
|
|
|
|
# With all the file-based dashboards cleared out, force a refresh
|
|
# of relation data
|
|
if self._charm.unit.is_leader():
|
|
for dashboard_relation in self._charm.model.relations[self._relation_name]:
|
|
self._upset_dashboards_on_relation(dashboard_relation)
|
|
|
|
def _on_grafana_dashboard_relation_created(self, event: RelationCreatedEvent) -> None:
|
|
"""Watch for a relation being created and automatically send dashboards.
|
|
|
|
Args:
|
|
event: The :class:`RelationJoinedEvent` sent when a
|
|
`grafana_dashboaard` relationship is joined
|
|
"""
|
|
if self._charm.unit.is_leader():
|
|
self._update_all_dashboards_from_dir()
|
|
self._upset_dashboards_on_relation(event.relation)
|
|
|
|
def _on_grafana_dashboard_relation_changed(self, event: RelationChangedEvent) -> None:
|
|
"""Watch for changes so we know if there's an error to signal back to the parent charm.
|
|
|
|
Args:
|
|
event: The `RelationChangedEvent` that triggered this handler.
|
|
"""
|
|
if self._charm.unit.is_leader():
|
|
data = json.loads(event.relation.data[event.app].get("event", "{}")) # type: ignore
|
|
|
|
if not data:
|
|
return
|
|
|
|
valid = bool(data.get("valid", True))
|
|
errors = data.get("errors", [])
|
|
if valid and not errors:
|
|
self.on.dashboard_status_changed.emit(valid=valid) # pyright: ignore
|
|
else:
|
|
self.on.dashboard_status_changed.emit( # pyright: ignore
|
|
valid=valid, errors=errors
|
|
)
|
|
|
|
def _upset_dashboards_on_relation(self, relation: Relation) -> None:
|
|
"""Update the dashboards in the relation data bucket."""
|
|
# It's completely ridiculous to add a UUID, but if we don't have some
|
|
# pseudo-random value, this never makes it across 'juju set-state'
|
|
stored_data = {
|
|
"templates": _type_convert_stored(self._stored.dashboard_templates), # pyright: ignore
|
|
"uuid": str(uuid.uuid4()),
|
|
}
|
|
|
|
relation.data[self._charm.app]["dashboards"] = json.dumps(stored_data)
|
|
|
|
def _content_to_dashboard_object(self, content: str, inject_dropdowns: bool = True) -> Dict:
|
|
return {
|
|
"charm": self._charm.meta.name,
|
|
"content": content,
|
|
"juju_topology": self._juju_topology if inject_dropdowns else {},
|
|
"inject_dropdowns": inject_dropdowns,
|
|
}
|
|
|
|
# This is not actually used in the dashboards, but is present to provide a secondary
|
|
# salt to ensure uniqueness in the dict keys in case individual charm units provide
|
|
# dashboards
|
|
@property
|
|
def _juju_topology(self) -> Dict:
|
|
return {
|
|
"model": self._charm.model.name,
|
|
"model_uuid": self._charm.model.uuid,
|
|
"application": self._charm.app.name,
|
|
"unit": self._charm.unit.name,
|
|
}
|
|
|
|
@property
|
|
def dashboard_templates(self) -> List:
|
|
"""Return a list of the known dashboard templates."""
|
|
return list(self._stored.dashboard_templates.values()) # type: ignore
|
|
|
|
|
|
class GrafanaDashboardConsumer(Object):
|
|
"""A consumer object for working with Grafana Dashboards."""
|
|
|
|
on = GrafanaDashboardEvents() # pyright: ignore
|
|
_stored = StoredState()
|
|
|
|
def __init__(
|
|
self,
|
|
charm: CharmBase,
|
|
relation_name: str = DEFAULT_RELATION_NAME,
|
|
) -> None:
|
|
"""API to receive Grafana dashboards from charmed operators.
|
|
|
|
The :class:`GrafanaDashboardConsumer` object provides an API
|
|
to consume dashboards provided by a charmed operator using the
|
|
:class:`GrafanaDashboardProvider` library. The
|
|
:class:`GrafanaDashboardConsumer` is integrated in a
|
|
charmed operator as follows:
|
|
|
|
self.grafana = GrafanaDashboardConsumer(self)
|
|
|
|
To use this library, you need a relation defined as follows in
|
|
your charm operator's metadata.yaml:
|
|
|
|
requires:
|
|
grafana-dashboard:
|
|
interface: grafana_dashboard
|
|
|
|
If you would like to use a different relation name than
|
|
`grafana-dashboard`, you need to specify the relation name via the
|
|
`relation_name` argument. However, it is strongly advised not to
|
|
change the default, so that people deploying your charm will have
|
|
a consistent experience with all other charms that consume Grafana
|
|
dashboards.
|
|
|
|
Args:
|
|
charm: a :class:`CharmBase` object which manages this
|
|
:class:`GrafanaProvider` object. Generally this is
|
|
`self` in the instantiating class.
|
|
relation_name: a :string: name of the relation managed by this
|
|
:class:`GrafanaDashboardConsumer`; it defaults to "grafana-dashboard".
|
|
"""
|
|
_validate_relation_by_interface_and_direction(
|
|
charm, relation_name, RELATION_INTERFACE_NAME, RelationRole.requires
|
|
)
|
|
|
|
super().__init__(charm, relation_name)
|
|
self._charm = charm
|
|
self._relation_name = relation_name
|
|
self._tranformer = CosTool(self._charm)
|
|
|
|
self._stored.set_default(dashboards={}) # type: ignore
|
|
|
|
self.framework.observe(
|
|
self._charm.on[self._relation_name].relation_changed,
|
|
self._on_grafana_dashboard_relation_changed,
|
|
)
|
|
self.framework.observe(
|
|
self._charm.on[self._relation_name].relation_broken,
|
|
self._on_grafana_dashboard_relation_broken,
|
|
)
|
|
self.framework.observe(
|
|
self._charm.on[DEFAULT_PEER_NAME].relation_changed,
|
|
self._on_grafana_peer_changed,
|
|
)
|
|
|
|
def get_dashboards_from_relation(self, relation_id: int) -> List:
|
|
"""Get a list of known dashboards for one instance of the monitored relation.
|
|
|
|
Args:
|
|
relation_id: the identifier of the relation instance, as returned by
|
|
:method:`ops.model.Relation.id`.
|
|
|
|
Returns: a list of known dashboards coming from the provided relation instance.
|
|
"""
|
|
return [
|
|
self._to_external_object(relation_id, dashboard)
|
|
for dashboard in self._get_stored_dashboards(relation_id)
|
|
]
|
|
|
|
def _on_grafana_dashboard_relation_changed(self, event: RelationChangedEvent) -> None:
|
|
"""Handle relation changes in related providers.
|
|
|
|
If there are changes in relations between Grafana dashboard consumers
|
|
and providers, this event handler (if the unit is the leader) will
|
|
get data for an incoming grafana-dashboard relation through a
|
|
:class:`GrafanaDashboardsChanged` event, and make the relation data
|
|
available in the app's datastore object. The Grafana charm can
|
|
then respond to the event to update its configuration.
|
|
"""
|
|
changes = False
|
|
if self._charm.unit.is_leader():
|
|
changes = self._render_dashboards_and_signal_changed(event.relation)
|
|
|
|
if changes:
|
|
self.on.dashboards_changed.emit() # pyright: ignore
|
|
|
|
def _on_grafana_peer_changed(self, _: RelationChangedEvent) -> None:
|
|
"""Emit dashboard events on peer events so secondary charm data updates."""
|
|
if self._charm.unit.is_leader():
|
|
return
|
|
self.on.dashboards_changed.emit() # pyright: ignore
|
|
|
|
def update_dashboards(self, relation: Optional[Relation] = None) -> None:
|
|
"""Re-establish dashboards on one or more relations.
|
|
|
|
If something changes between this library and a datasource, try to re-establish
|
|
invalid dashboards and invalidate active ones.
|
|
|
|
Args:
|
|
relation: a specific relation for which the dashboards have to be
|
|
updated. If not specified, all relations managed by this
|
|
:class:`GrafanaDashboardConsumer` will be updated.
|
|
"""
|
|
if self._charm.unit.is_leader():
|
|
relations = (
|
|
[relation] if relation else self._charm.model.relations[self._relation_name]
|
|
)
|
|
|
|
for relation in relations:
|
|
self._render_dashboards_and_signal_changed(relation)
|
|
|
|
def _on_grafana_dashboard_relation_broken(self, event: RelationBrokenEvent) -> None:
|
|
"""Update job config when providers depart.
|
|
|
|
When a Grafana dashboard provider departs, the configuration
|
|
for that provider is removed from the list of dashboards
|
|
"""
|
|
if not self._charm.unit.is_leader():
|
|
return
|
|
|
|
self._remove_all_dashboards_for_relation(event.relation)
|
|
|
|
def _render_dashboards_and_signal_changed(self, relation: Relation) -> bool: # type: ignore
|
|
"""Validate a given dashboard.
|
|
|
|
Verify that the passed dashboard data is able to be found in our list
|
|
of datasources and will render. If they do, let the charm know by
|
|
emitting an event.
|
|
|
|
Args:
|
|
relation: Relation; The relation the dashboard is associated with.
|
|
|
|
Returns:
|
|
a boolean indicating whether an event should be emitted
|
|
"""
|
|
other_app = relation.app
|
|
|
|
raw_data = relation.data[other_app].get("dashboards", "") # pyright: ignore
|
|
|
|
if not raw_data:
|
|
logger.warning(
|
|
"No dashboard data found in the %s:%s relation",
|
|
self._relation_name,
|
|
str(relation.id),
|
|
)
|
|
return False
|
|
|
|
data = json.loads(raw_data)
|
|
|
|
# The only piece of data needed on this side of the relations is "templates"
|
|
templates = data.pop("templates")
|
|
|
|
# The dashboards are WAY too big since this ultimately calls out to Juju to
|
|
# set the relation data, and it overflows the maximum argument length for
|
|
# subprocess, so we have to use b64, annoyingly.
|
|
# Worse, Python3 expects absolutely everything to be a byte, and a plain
|
|
# `base64.b64encode()` is still too large, so we have to go through hoops
|
|
# of encoding to byte, compressing with lzma, converting to base64 so it
|
|
# can be converted to JSON, then all the way back.
|
|
|
|
rendered_dashboards = []
|
|
relation_has_invalid_dashboards = False
|
|
|
|
for _, (fname, template) in enumerate(templates.items()):
|
|
content = None
|
|
error = None
|
|
topology = template.get("juju_topology", {})
|
|
try:
|
|
content = _decode_dashboard_content(template["content"])
|
|
inject_dropdowns = template.get("inject_dropdowns", True)
|
|
content = self._manage_dashboard_uid(content, template)
|
|
content = _convert_dashboard_fields(content, inject_dropdowns)
|
|
|
|
if topology:
|
|
content = _inject_labels(content, topology, self._tranformer)
|
|
|
|
content = _encode_dashboard_content(content)
|
|
except lzma.LZMAError as e:
|
|
error = str(e)
|
|
relation_has_invalid_dashboards = True
|
|
except json.JSONDecodeError as e:
|
|
error = str(e.msg)
|
|
logger.warning("Invalid JSON in Grafana dashboard: {}".format(fname))
|
|
continue
|
|
|
|
# Prepend the relation name and ID to the dashboard ID to avoid clashes with
|
|
# multiple relations with apps from the same charm, or having dashboards with
|
|
# the same ids inside their charm operators
|
|
rendered_dashboards.append(
|
|
{
|
|
"id": "{}:{}/{}".format(relation.name, relation.id, fname),
|
|
"original_id": fname,
|
|
"content": content if content else None,
|
|
"template": template,
|
|
"valid": (error is None),
|
|
"error": error,
|
|
}
|
|
)
|
|
|
|
if relation_has_invalid_dashboards:
|
|
self._remove_all_dashboards_for_relation(relation)
|
|
|
|
invalid_templates = [
|
|
data["original_id"] for data in rendered_dashboards if not data["valid"]
|
|
]
|
|
|
|
logger.warning(
|
|
"Cannot add one or more Grafana dashboards from relation '{}:{}': the following "
|
|
"templates are invalid: {}".format(
|
|
relation.name,
|
|
relation.id,
|
|
invalid_templates,
|
|
)
|
|
)
|
|
|
|
relation.data[self._charm.app]["event"] = json.dumps(
|
|
{
|
|
"errors": [
|
|
{
|
|
"dashboard_id": rendered_dashboard["original_id"],
|
|
"error": rendered_dashboard["error"],
|
|
}
|
|
for rendered_dashboard in rendered_dashboards
|
|
if rendered_dashboard["error"]
|
|
]
|
|
}
|
|
)
|
|
|
|
# Dropping dashboards for a relation needs to be signalled
|
|
return True
|
|
|
|
stored_data = rendered_dashboards
|
|
currently_stored_data = self._get_stored_dashboards(relation.id)
|
|
|
|
coerced_data = _type_convert_stored(currently_stored_data) if currently_stored_data else {}
|
|
|
|
if not coerced_data == stored_data:
|
|
stored_dashboards = self.get_peer_data("dashboards")
|
|
stored_dashboards[relation.id] = stored_data
|
|
self.set_peer_data("dashboards", stored_dashboards)
|
|
return True
|
|
return None # type: ignore
|
|
|
|
def _manage_dashboard_uid(self, dashboard: str, template: dict) -> str:
|
|
"""Add an uid to the dashboard if it is not present."""
|
|
dashboard_dict = json.loads(dashboard)
|
|
|
|
if not dashboard_dict.get("uid", None) and "dashboard_alt_uid" in template:
|
|
dashboard_dict["uid"] = template["dashboard_alt_uid"]
|
|
|
|
return json.dumps(dashboard_dict)
|
|
|
|
def _remove_all_dashboards_for_relation(self, relation: Relation) -> None:
|
|
"""If an errored dashboard is in stored data, remove it and trigger a deletion."""
|
|
if self._get_stored_dashboards(relation.id):
|
|
stored_dashboards = self.get_peer_data("dashboards")
|
|
stored_dashboards.pop(str(relation.id))
|
|
self.set_peer_data("dashboards", stored_dashboards)
|
|
self.on.dashboards_changed.emit() # pyright: ignore
|
|
|
|
def _to_external_object(self, relation_id, dashboard):
|
|
return {
|
|
"id": dashboard["original_id"],
|
|
"relation_id": relation_id,
|
|
"charm": dashboard["template"]["charm"],
|
|
"content": _decode_dashboard_content(dashboard["content"]),
|
|
}
|
|
|
|
@property
|
|
def dashboards(self) -> List[Dict]:
|
|
"""Get a list of known dashboards across all instances of the monitored relation.
|
|
|
|
Returns: a list of known dashboards. The JSON of each of the dashboards is available
|
|
in the `content` field of the corresponding `dict`.
|
|
"""
|
|
dashboards = []
|
|
|
|
for _, (relation_id, dashboards_for_relation) in enumerate(
|
|
self.get_peer_data("dashboards").items()
|
|
):
|
|
for dashboard in dashboards_for_relation:
|
|
dashboards.append(self._to_external_object(relation_id, dashboard))
|
|
|
|
return dashboards
|
|
|
|
def _get_stored_dashboards(self, relation_id: int) -> list:
|
|
"""Pull stored dashboards out of the peer data bucket."""
|
|
return self.get_peer_data("dashboards").get(str(relation_id), {})
|
|
|
|
def _set_default_data(self) -> None:
|
|
"""Set defaults if they are not in peer relation data."""
|
|
data = {"dashboards": {}} # type: ignore
|
|
for k, v in data.items():
|
|
if not self.get_peer_data(k):
|
|
self.set_peer_data(k, v)
|
|
|
|
def set_peer_data(self, key: str, data: Any) -> None:
|
|
"""Put information into the peer data bucket instead of `StoredState`."""
|
|
self._charm.peers.data[self._charm.app][key] = json.dumps(data) # type: ignore[attr-defined]
|
|
|
|
def get_peer_data(self, key: str) -> Any:
|
|
"""Retrieve information from the peer data bucket instead of `StoredState`."""
|
|
data = self._charm.peers.data[self._charm.app].get(key, "") # type: ignore[attr-defined]
|
|
return json.loads(data) if data else {}
|
|
|
|
|
|
class GrafanaDashboardAggregator(Object):
|
|
"""API to retrieve Grafana dashboards from machine dashboards.
|
|
|
|
The :class:`GrafanaDashboardAggregator` object provides a way to
|
|
collate and aggregate Grafana dashboards from reactive/machine charms
|
|
and transport them into Charmed Operators, using Juju topology.
|
|
For detailed usage instructions, see the documentation for
|
|
:module:`cos-proxy-operator`, as this class is intended for use as a
|
|
single point of intersection rather than use in individual charms.
|
|
|
|
Since :class:`GrafanaDashboardAggregator` serves as a bridge between
|
|
Canonical Observability Stack Charmed Operators and Reactive Charms,
|
|
deployed in a Reactive Juju model, both a target relation which is
|
|
used to collect events from Reactive charms and a `grafana_relation`
|
|
which is used to send the collected data back to the Canonical
|
|
Observability Stack are required.
|
|
|
|
In its most streamlined usage, :class:`GrafanaDashboardAggregator` is
|
|
integrated in a charmed operator as follows:
|
|
self.grafana = GrafanaDashboardAggregator(self)
|
|
|
|
Args:
|
|
charm: a :class:`CharmBase` object which manages this
|
|
:class:`GrafanaProvider` object. Generally this is
|
|
`self` in the instantiating class.
|
|
target_relation: a :string: name of a relation managed by this
|
|
:class:`GrafanaDashboardAggregator`, which is used to communicate
|
|
with reactive/machine charms it defaults to "dashboards".
|
|
grafana_relation: a :string: name of a relation used by this
|
|
:class:`GrafanaDashboardAggregator`, which is used to communicate
|
|
with charmed grafana. It defaults to "downstream-grafana-dashboard"
|
|
"""
|
|
|
|
_stored = StoredState()
|
|
on = GrafanaProviderEvents() # pyright: ignore
|
|
|
|
def __init__(
|
|
self,
|
|
charm: CharmBase,
|
|
target_relation: str = "dashboards",
|
|
grafana_relation: str = "downstream-grafana-dashboard",
|
|
):
|
|
super().__init__(charm, grafana_relation)
|
|
|
|
# Reactive charms may be RPC-ish and not leave reliable data around. Keep
|
|
# StoredState here
|
|
self._stored.set_default( # type: ignore
|
|
dashboard_templates={},
|
|
id_mappings={},
|
|
)
|
|
|
|
self._charm = charm
|
|
self._target_relation = target_relation
|
|
self._grafana_relation = grafana_relation
|
|
|
|
self.framework.observe(
|
|
self._charm.on[self._grafana_relation].relation_joined,
|
|
self._update_remote_grafana,
|
|
)
|
|
self.framework.observe(
|
|
self._charm.on[self._grafana_relation].relation_changed,
|
|
self._update_remote_grafana,
|
|
)
|
|
self.framework.observe(
|
|
self._charm.on[self._target_relation].relation_changed,
|
|
self.update_dashboards,
|
|
)
|
|
self.framework.observe(
|
|
self._charm.on[self._target_relation].relation_broken,
|
|
self.remove_dashboards,
|
|
)
|
|
|
|
def update_dashboards(self, event: RelationEvent) -> None:
|
|
"""If we get a dashboard from a reactive charm, parse it out and update."""
|
|
if self._charm.unit.is_leader():
|
|
self._upset_dashboards_on_event(event)
|
|
|
|
def _upset_dashboards_on_event(self, event: RelationEvent) -> None:
|
|
"""Update the dashboards in the relation data bucket."""
|
|
dashboards = self._handle_reactive_dashboards(event)
|
|
|
|
if not dashboards:
|
|
logger.warning(
|
|
"Could not find dashboard data after a relation change for {}".format(event.app)
|
|
)
|
|
return
|
|
|
|
for id in dashboards:
|
|
self._stored.dashboard_templates[id] = self._content_to_dashboard_object( # type: ignore
|
|
dashboards[id], event
|
|
)
|
|
|
|
self._stored.id_mappings[event.app.name] = dashboards # type: ignore
|
|
self._update_remote_grafana(event)
|
|
|
|
def _update_remote_grafana(self, _: Optional[RelationEvent] = None) -> None:
|
|
"""Push dashboards to the downstream Grafana relation."""
|
|
# It's still ridiculous to add a UUID here, but needed
|
|
stored_data = {
|
|
"templates": _type_convert_stored(self._stored.dashboard_templates), # pyright: ignore
|
|
"uuid": str(uuid.uuid4()),
|
|
}
|
|
|
|
if self._charm.unit.is_leader():
|
|
for grafana_relation in self.model.relations[self._grafana_relation]:
|
|
grafana_relation.data[self._charm.app]["dashboards"] = json.dumps(stored_data)
|
|
|
|
def remove_dashboards(self, event: RelationBrokenEvent) -> None:
|
|
"""Remove a dashboard if the relation is broken."""
|
|
app_ids = _type_convert_stored(self._stored.id_mappings.get(event.app.name, "")) # type: ignore
|
|
|
|
if not app_ids:
|
|
logger.info("Could not look up stored dashboards for %s", event.app.name) # type: ignore
|
|
return
|
|
|
|
del self._stored.id_mappings[event.app.name] # type: ignore
|
|
for id in app_ids:
|
|
del self._stored.dashboard_templates[id] # type: ignore
|
|
|
|
stored_data = {
|
|
"templates": _type_convert_stored(self._stored.dashboard_templates), # pyright: ignore
|
|
"uuid": str(uuid.uuid4()),
|
|
}
|
|
|
|
if self._charm.unit.is_leader():
|
|
for grafana_relation in self.model.relations[self._grafana_relation]:
|
|
grafana_relation.data[self._charm.app]["dashboards"] = json.dumps(stored_data)
|
|
|
|
# Yes, this has a fair amount of branching. It's not that complex, though
|
|
def _strip_existing_datasources(self, dash: dict) -> dict: # noqa: C901
|
|
"""Remove existing reactive charm datasource templating out.
|
|
|
|
This method iterates through *known* places where reactive charms may set
|
|
data in contributed dashboards and removes them.
|
|
|
|
`dashboard["__inputs"]` is a property sometimes set when exporting dashboards from
|
|
the Grafana UI. It is not present in earlier Grafana versions, and can be disabled
|
|
in 5.3.4 and above (optionally). If set, any values present will be substituted on
|
|
import. Some reactive charms use this for Prometheus. COS uses dropdown selectors
|
|
for datasources, and leaving this present results in "default" datasource values
|
|
which are broken.
|
|
|
|
Similarly, `dashboard["templating"]["list"][N]["name"] == "host"` can be used to
|
|
set a `host` variable for use in dashboards which is not meaningful in the context
|
|
of Juju topology and will yield broken dashboards.
|
|
|
|
Further properties may be discovered.
|
|
"""
|
|
try:
|
|
if "list" in dash["templating"]:
|
|
for i in range(len(dash["templating"]["list"])):
|
|
if (
|
|
"datasource" in dash["templating"]["list"][i]
|
|
and dash["templating"]["list"][i]["datasource"] is not None
|
|
):
|
|
if "Juju" in dash["templating"]["list"][i].get("datasource", ""):
|
|
dash["templating"]["list"][i]["datasource"] = r"${prometheusds}"
|
|
|
|
# Strip out newly-added 'juju_application' template variables which
|
|
# don't line up with our drop-downs
|
|
dash_mutable = dash
|
|
for i in range(len(dash["templating"]["list"])):
|
|
if (
|
|
"name" in dash["templating"]["list"][i]
|
|
and dash["templating"]["list"][i].get("name", "") == "app"
|
|
):
|
|
del dash_mutable["templating"]["list"][i]
|
|
|
|
if dash_mutable:
|
|
dash = dash_mutable
|
|
except KeyError:
|
|
logger.debug("No existing templating data in dashboard")
|
|
|
|
if "__inputs" in dash:
|
|
inputs = dash
|
|
for i in range(len(dash["__inputs"])):
|
|
if dash["__inputs"][i].get("pluginName", "") == "Prometheus":
|
|
del inputs["__inputs"][i]
|
|
if inputs:
|
|
dash["__inputs"] = inputs["__inputs"]
|
|
else:
|
|
del dash["__inputs"]
|
|
|
|
return dash
|
|
|
|
def _handle_reactive_dashboards(self, event: RelationEvent) -> Optional[Dict]:
|
|
"""Look for a dashboard in relation data (during a reactive hook) or builtin by name."""
|
|
if not self._charm.unit.is_leader():
|
|
return {}
|
|
|
|
templates = []
|
|
id = ""
|
|
|
|
# Reactive data can reliably be pulled out of events. In theory, if we got an event,
|
|
# it's on the bucket, but using event explicitly keeps the mental model in
|
|
# place for reactive
|
|
for k in event.relation.data[event.unit].keys(): # type: ignore
|
|
if k.startswith("request_"):
|
|
templates.append(json.loads(event.relation.data[event.unit][k])["dashboard"]) # type: ignore
|
|
|
|
for k in event.relation.data[event.app].keys(): # type: ignore
|
|
if k.startswith("request_"):
|
|
templates.append(json.loads(event.relation.data[event.app][k])["dashboard"]) # type: ignore
|
|
|
|
builtins = self._maybe_get_builtin_dashboards(event)
|
|
|
|
if not templates and not builtins:
|
|
logger.warning("NOTHING!")
|
|
return {}
|
|
|
|
dashboards = {}
|
|
for t in templates:
|
|
# This seems ridiculous, too, but to get it from a "dashboards" key in serialized JSON
|
|
# in the bucket back out to the actual "dashboard" we _need_, this is the way
|
|
# This is not a mistake -- there's a double nesting in reactive charms, and
|
|
# Grafana won't load it. We have to unbox:
|
|
# event.relation.data[event.<type>]["request_*"]["dashboard"]["dashboard"],
|
|
# and the final unboxing is below.
|
|
#
|
|
# Apparently SOME newer dashboards (such as Ceph) do not have this double nesting, so
|
|
# now we get to account for both :toot:
|
|
dash = t.get("dashboard", {}) or t
|
|
|
|
# Replace values with LMA-style templating
|
|
dash = self._strip_existing_datasources(dash)
|
|
dash = json.dumps(dash)
|
|
|
|
# Replace the old-style datasource templates
|
|
dash = re.sub(r"<< datasource >>", r"${prometheusds}", dash)
|
|
dash = re.sub(r'"datasource": "prom.*?"', r'"datasource": "${prometheusds}"', dash)
|
|
dash = re.sub(
|
|
r'"datasource": "\$datasource"', r'"datasource": "${prometheusds}"', dash
|
|
)
|
|
dash = re.sub(r'"uid": "\$datasource"', r'"uid": "${prometheusds}"', dash)
|
|
dash = re.sub(
|
|
r'"datasource": "(!?\w)[\w|\s|-]+?Juju generated.*?"',
|
|
r'"datasource": "${prometheusds}"',
|
|
dash,
|
|
)
|
|
|
|
# Yank out "new"+old LMA topology
|
|
dash = re.sub(
|
|
r'(,?\s?juju_application=~)\\"\$app\\"', r'\1\\"$juju_application\\"', dash
|
|
)
|
|
|
|
# Replace old piechart panels
|
|
dash = re.sub(r'"type": "grafana-piechart-panel"', '"type": "piechart"', dash)
|
|
|
|
from jinja2 import DebugUndefined, Template
|
|
|
|
content = _encode_dashboard_content(
|
|
Template(dash, undefined=DebugUndefined).render(datasource=r"${prometheusds}") # type: ignore
|
|
)
|
|
id = "prog:{}".format(content[-24:-16])
|
|
|
|
dashboards[id] = content
|
|
return {**builtins, **dashboards}
|
|
|
|
def _maybe_get_builtin_dashboards(self, event: RelationEvent) -> Dict:
|
|
"""Tries to match the event with an included dashboard.
|
|
|
|
Scans dashboards packed with the charm instantiating this class, and tries to match
|
|
one with the event. There is no guarantee that any given event will match a builtin,
|
|
since each charm instantiating this class may include a different set of dashboards,
|
|
or none.
|
|
"""
|
|
builtins = {}
|
|
dashboards_path = None
|
|
|
|
try:
|
|
dashboards_path = _resolve_dir_against_charm_path(
|
|
self._charm, "src/grafana_dashboards"
|
|
)
|
|
except InvalidDirectoryPathError as e:
|
|
logger.warning(
|
|
"Invalid Grafana dashboards folder at %s: %s",
|
|
e.grafana_dashboards_absolute_path,
|
|
e.message,
|
|
)
|
|
|
|
if dashboards_path:
|
|
|
|
def is_dashboard(p: Path) -> bool:
|
|
return p.is_file() and p.name.endswith((".json", ".json.tmpl", ".tmpl"))
|
|
|
|
for path in filter(is_dashboard, Path(dashboards_path).glob("*")):
|
|
# path = Path(path)
|
|
if event.app.name in path.name: # type: ignore
|
|
id = "file:{}".format(path.stem)
|
|
builtins[id] = self._content_to_dashboard_object(
|
|
_encode_dashboard_content(path.read_bytes()), event
|
|
)
|
|
|
|
return builtins
|
|
|
|
def _content_to_dashboard_object(self, content: str, event: RelationEvent) -> Dict:
|
|
return {
|
|
"charm": event.app.name, # type: ignore
|
|
"content": content,
|
|
"juju_topology": self._juju_topology(event),
|
|
"inject_dropdowns": True,
|
|
}
|
|
|
|
# This is not actually used in the dashboards, but is present to provide a secondary
|
|
# salt to ensure uniqueness in the dict keys in case individual charm units provide
|
|
# dashboards
|
|
def _juju_topology(self, event: RelationEvent) -> Dict:
|
|
return {
|
|
"model": self._charm.model.name,
|
|
"model_uuid": self._charm.model.uuid,
|
|
"application": event.app.name, # type: ignore
|
|
"unit": event.unit.name, # type: ignore
|
|
}
|
|
|
|
|
|
class CosTool:
|
|
"""Uses cos-tool to inject label matchers into alert rule expressions and validate rules."""
|
|
|
|
_path = None
|
|
_disabled = False
|
|
|
|
def __init__(self, charm):
|
|
self._charm = charm
|
|
|
|
@property
|
|
def path(self):
|
|
"""Lazy lookup of the path of cos-tool."""
|
|
if self._disabled:
|
|
return None
|
|
if not self._path:
|
|
self._path = self._get_tool_path()
|
|
if not self._path:
|
|
logger.debug("Skipping injection of juju topology as label matchers")
|
|
self._disabled = True
|
|
return self._path
|
|
|
|
def apply_label_matchers(self, rules: dict, type: str) -> dict:
|
|
"""Will apply label matchers to the expression of all alerts in all supplied groups."""
|
|
if not self.path:
|
|
return rules
|
|
for group in rules["groups"]:
|
|
rules_in_group = group.get("rules", [])
|
|
for rule in rules_in_group:
|
|
topology = {}
|
|
# if the user for some reason has provided juju_unit, we'll need to honor it
|
|
# in most cases, however, this will be empty
|
|
for label in [
|
|
"juju_model",
|
|
"juju_model_uuid",
|
|
"juju_application",
|
|
"juju_charm",
|
|
"juju_unit",
|
|
]:
|
|
if label in rule["labels"]:
|
|
topology[label] = rule["labels"][label]
|
|
|
|
rule["expr"] = self.inject_label_matchers(rule["expr"], topology, type)
|
|
return rules
|
|
|
|
def validate_alert_rules(self, rules: dict) -> Tuple[bool, str]:
|
|
"""Will validate correctness of alert rules, returning a boolean and any errors."""
|
|
if not self.path:
|
|
logger.debug("`cos-tool` unavailable. Not validating alert correctness.")
|
|
return True, ""
|
|
|
|
with tempfile.TemporaryDirectory() as tmpdir:
|
|
rule_path = Path(tmpdir + "/validate_rule.yaml")
|
|
|
|
# Smash "our" rules format into what upstream actually uses, which is more like:
|
|
#
|
|
# groups:
|
|
# - name: foo
|
|
# rules:
|
|
# - alert: SomeAlert
|
|
# expr: up
|
|
# - alert: OtherAlert
|
|
# expr: up
|
|
transformed_rules = {"groups": []} # type: ignore
|
|
for rule in rules["groups"]:
|
|
transformed = {"name": str(uuid.uuid4()), "rules": [rule]}
|
|
transformed_rules["groups"].append(transformed)
|
|
|
|
rule_path.write_text(yaml.dump(transformed_rules))
|
|
|
|
args = [str(self.path), "validate", str(rule_path)]
|
|
# noinspection PyBroadException
|
|
try:
|
|
self._exec(args)
|
|
return True, ""
|
|
except subprocess.CalledProcessError as e:
|
|
logger.debug("Validating the rules failed: %s", e.output)
|
|
return False, ", ".join([line for line in e.output if "error validating" in line])
|
|
|
|
def inject_label_matchers(self, expression: str, topology: dict, type: str) -> str:
|
|
"""Add label matchers to an expression."""
|
|
if not topology:
|
|
return expression
|
|
if not self.path:
|
|
logger.debug("`cos-tool` unavailable. Leaving expression unchanged: %s", expression)
|
|
return expression
|
|
args = [str(self.path), "--format", type, "transform"]
|
|
|
|
variable_topology = {k: "${}".format(k) for k in topology.keys()}
|
|
args.extend(
|
|
[
|
|
"--label-matcher={}={}".format(key, value)
|
|
for key, value in variable_topology.items()
|
|
]
|
|
)
|
|
|
|
# Pass a leading "--" so expressions with a negation or subtraction aren't interpreted as
|
|
# flags
|
|
args.extend(["--", "{}".format(expression)])
|
|
# noinspection PyBroadException
|
|
try:
|
|
return re.sub(r'="\$juju', r'=~"$juju', self._exec(args))
|
|
except subprocess.CalledProcessError as e:
|
|
logger.debug('Applying the expression failed: "%s", falling back to the original', e)
|
|
return expression
|
|
|
|
def _get_tool_path(self) -> Optional[Path]:
|
|
arch = platform.machine()
|
|
arch = "amd64" if arch == "x86_64" else arch
|
|
res = "cos-tool-{}".format(arch)
|
|
try:
|
|
path = Path(res).resolve()
|
|
path.chmod(0o777)
|
|
return path
|
|
except NotImplementedError:
|
|
logger.debug("System lacks support for chmod")
|
|
except FileNotFoundError:
|
|
logger.debug('Could not locate cos-tool at: "{}"'.format(res))
|
|
return None
|
|
|
|
def _exec(self, cmd) -> str:
|
|
result = subprocess.run(cmd, check=True, stdout=subprocess.PIPE)
|
|
output = result.stdout.decode("utf-8").strip()
|
|
return output
|