Add support for prometheus-k8s

Add support for the metrics-endpoint relation. This allows relating
ceph-mon to prometheus-k8s which is being used in the COS Lite
observability stack. Upon relation, the ceph prometheus module will
be enabled and a corresponding scrape job configured for
prometheus-k8s.

Drive-by test improvement for the utils module.

Change-Id: Iaeee57aaa6f3678fdaef35f2582b4b4c974acb2a

parent 1ee3d04fda
commit 24dfc7440d

@@ -140,6 +140,9 @@ The charm supports Ceph metric monitoring with Prometheus. Add relations to the
 > **Note**: Prometheus support is available starting with Ceph Luminous
 (xenial-queens UCA pocket).
 
+Alternatively, integration with the [COS Lite][cos-lite] observability
+stack is available via the metrics-endpoint relation.
+
 ## Actions
 
 This section lists Juju [actions][juju-docs-actions] supported by the charm.
@@ -224,3 +227,4 @@ For general charm questions refer to the OpenStack [Charm Guide][cg].
 [cloud-archive-ceph]: https://wiki.ubuntu.com/OpenStack/CloudArchive#Ceph_and_the_UCA
 [upstream-ceph-buckets]: https://docs.ceph.com/docs/master/rados/operations/crush-map/#types-and-buckets
 [jq]: https://stedolan.github.io/jq/
+[cos-lite]: https://charmhub.io/cos-lite

@@ -0,0 +1,306 @@
+# Copyright 2022 Canonical Ltd.
+# See LICENSE file for licensing details.
+"""## Overview.
+
+This document explains how to use the `JujuTopology` class to
+create and consume topology information from Juju in a consistent manner.
+
+The goal of the Juju topology is to uniquely identify a piece
+of software running across any of your Juju-managed deployments.
+This is achieved by combining the following four elements:
+
+- Model name
+- Model UUID
+- Application name
+- Unit identifier
+
+
+For a more in-depth description of the concept, as well as a
+walk-through of its use case in observability, see
+[this blog post](https://juju.is/blog/model-driven-observability-part-2-juju-topology-metrics)
+on the Juju blog.
+
+## Library Usage
+
+This library may be used to create and consume `JujuTopology` objects.
+The `JujuTopology` class provides three ways to create instances:
+
+### Using the `from_charm` method
+
+Enables instantiation by supplying the charm as an argument. When
+creating topology objects for the current charm, this is the recommended
+approach.
+
+```python
+topology = JujuTopology.from_charm(self)
+```
+
+### Using the `from_dict` method
+
+Allows for instantiation using a dictionary of relation data, like the
+`scrape_metadata` from Prometheus or the labels of an alert rule. When
+creating topology objects for remote charms, this is the recommended
+approach.
+
+```python
+scrape_metadata = json.loads(relation.data[relation.app].get("scrape_metadata", "{}"))
+topology = JujuTopology.from_dict(scrape_metadata)
+```
+
+### Using the class constructor
+
+Enables instantiation using whatever values you want. While this
+is useful in some very specific cases, this is almost certainly not
+what you are looking for as setting these values manually may
+result in observability metrics which do not uniquely identify a
+charm in order to provide accurate usage reporting, alerting,
+horizontal scaling, or other use cases.
+
+```python
+topology = JujuTopology(
+    model="some-juju-model",
+    model_uuid="00000000-0000-4000-8000-000000000001",
+    application="fancy-juju-application",
+    unit="fancy-juju-application/0",
+    charm_name="fancy-juju-application-k8s",
+)
+```
+
+"""
+
+import re
+from collections import OrderedDict
+from typing import Dict, List, Optional
+
+# The unique Charmhub library identifier, never change it
+LIBID = "bced1658f20f49d28b88f61f83c2d232"
+
+LIBAPI = 0
+LIBPATCH = 2
+
+
+class InvalidUUIDError(Exception):
+    """Invalid UUID was provided."""
+
+    def __init__(self, uuid: str):
+        self.message = "'{}' is not a valid UUID.".format(uuid)
+        super().__init__(self.message)
+
+
+class JujuTopology:
+    """JujuTopology is used for storing, generating and formatting juju topology information."""
+
+    def __init__(
+        self,
+        model: str,
+        model_uuid: str,
+        application: str,
+        unit: str = None,
+        charm_name: str = None,
+    ):
+        """Build a JujuTopology object.
+
+        A `JujuTopology` object is used for storing and transforming
+        Juju topology information. This information is used to
+        annotate Prometheus scrape jobs and alert rules. Such
+        annotation when applied to scrape jobs helps in identifying
+        the source of the scraped metrics. On the other hand when
+        applied to alert rules topology information ensures that
+        evaluation of alert expressions is restricted to the source
+        (charm) from which the alert rules were obtained.
+
+        Args:
+            model: a string name of the Juju model
+            model_uuid: a globally unique string identifier for the Juju model
+            application: an application name as a string
+            unit: a unit name as a string
+            charm_name: name of charm as a string
+        """
+        if not self.is_valid_uuid(model_uuid):
+            raise InvalidUUIDError(model_uuid)
+
+        self._model = model
+        self._model_uuid = model_uuid
+        self._application = application
+        self._charm_name = charm_name
+        self._unit = unit
+
+    def is_valid_uuid(self, uuid):
+        """Validate the supplied UUID against the Juju Model UUID pattern."""
+        # TODO:
+        # Harness is hardcoding a UUID that is v1 not v4: f2c1b2a6-e006-11eb-ba80-0242ac130004
+        # See: https://github.com/canonical/operator/issues/779
+        #
+        # >>> uuid.UUID("f2c1b2a6-e006-11eb-ba80-0242ac130004").version
+        # 1
+        #
+        # we changed the validation of the 3rd UUID block: 4[a-f0-9]{3} -> [a-f0-9]{4}
+        # See: https://github.com/canonical/operator/blob/main/ops/testing.py#L1094
+        #
+        # Juju in fact generates a UUID v4: https://github.com/juju/utils/blob/master/uuid.go#L62
+        # but does not validate it is actually v4:
+        # See:
+        # - https://github.com/juju/utils/blob/master/uuid.go#L22
+        # - https://github.com/juju/schema/blob/master/strings.go#L79
+        #
+        # Once Harness fixes this, we should remove this comment and refactor the regex or
+        # the entire method using the uuid module to validate UUIDs
+        regex = re.compile(
+            "^[a-f0-9]{8}-?[a-f0-9]{4}-?[a-f0-9]{4}-?[89ab][a-f0-9]{3}-?[a-f0-9]{12}$"
+        )
+        return bool(regex.match(uuid))
+
+    @classmethod
+    def from_charm(cls, charm):
+        """Creates a JujuTopology instance by using the model data available on a charm object.
+
+        Args:
+            charm: a `CharmBase` object for which the `JujuTopology` will be constructed
+        Returns:
+            a `JujuTopology` object.
+        """
+        return cls(
+            model=charm.model.name,
+            model_uuid=charm.model.uuid,
+            application=charm.model.app.name,
+            unit=charm.model.unit.name,
+            charm_name=charm.meta.name,
+        )
+
+    @classmethod
+    def from_dict(cls, data: dict):
+        """Factory method for creating `JujuTopology` children from a dictionary.
+
+        Args:
+            data: a dictionary with five keys providing topology information. The keys are
+                - "model"
+                - "model_uuid"
+                - "application"
+                - "unit"
+                - "charm_name"
+                `unit` and `charm_name` may be empty, but will result in more limited
+                labels. However, this allows us to support charms without workloads.
+
+        Returns:
+            a `JujuTopology` object.
+        """
+        return cls(
+            model=data["model"],
+            model_uuid=data["model_uuid"],
+            application=data["application"],
+            unit=data.get("unit", ""),
+            charm_name=data.get("charm_name", ""),
+        )
+
+    def as_dict(
+        self, *, remapped_keys: Dict[str, str] = None, excluded_keys: List[str] = None
+    ) -> OrderedDict:
+        """Format the topology information into an ordered dict.
+
+        Keeping the dictionary ordered is important to be able to
+        compare dicts without having to resort to deep comparisons.
+
+        Args:
+            remapped_keys: A dictionary mapping old key names to new key names,
+                which will be substituted when invoked.
+            excluded_keys: A list of key names to exclude from the returned dict.
+            uuid_length: The length to crop the UUID to.
+        """
+        ret = OrderedDict(
+            [
+                ("model", self.model),
+                ("model_uuid", self.model_uuid),
+                ("application", self.application),
+                ("unit", self.unit),
+                ("charm_name", self.charm_name),
+            ]
+        )
+        if excluded_keys:
+            ret = OrderedDict({k: v for k, v in ret.items() if k not in excluded_keys})
+
+        if remapped_keys:
+            ret = OrderedDict(
+                (remapped_keys.get(k), v) if remapped_keys.get(k) else (k, v) for k, v in ret.items()  # type: ignore
+            )
+
+        return ret
+
+    @property
+    def identifier(self) -> str:
+        """Format the topology information into a terse string.
+
+        This crops the model UUID, making it unsuitable for comparisons against
+        anything but other identifiers. Mainly to be used as a display name or file
+        name where long strings might become an issue.
+
+        >>> JujuTopology( \
+                model = "a-model", \
+                model_uuid = "00000000-0000-4000-8000-000000000000", \
+                application = "some-app", \
+                unit = "some-app/1" \
+            ).identifier
+        'a-model_00000000_some-app'
+        """
+        parts = self.as_dict(
+            excluded_keys=["unit", "charm_name"],
+        )
+
+        parts["model_uuid"] = self.model_uuid_short
+        values = parts.values()
+
+        return "_".join([str(val) for val in values]).replace("/", "_")
+
+    @property
+    def label_matcher_dict(self) -> Dict[str, str]:
+        """Format the topology information into a dict with keys having 'juju_' as prefix.
+
+        Relabelled topology never includes the unit as it would then only match
+        the leader unit (ie. the unit that produced the dict).
+        """
+        items = self.as_dict(
+            remapped_keys={"charm_name": "charm"},
+            excluded_keys=["unit"],
+        ).items()
+
+        return {"juju_{}".format(key): value for key, value in items if value}
+
+    @property
+    def label_matchers(self) -> str:
+        """Format the topology information into a promql/logql label matcher string.
+
+        Topology label matchers should never include the unit as it
+        would then only match the leader unit (ie. the unit that
+        produced the matchers).
+        """
+        items = self.label_matcher_dict.items()
+        return ", ".join(['{}="{}"'.format(key, value) for key, value in items if value])
+
+    @property
+    def model(self) -> str:
+        """Getter for the juju model value."""
+        return self._model
+
+    @property
+    def model_uuid(self) -> str:
+        """Getter for the juju model uuid value."""
+        return self._model_uuid
+
+    @property
+    def model_uuid_short(self) -> str:
+        """Getter for the juju model uuid value, truncated to the first eight characters."""
+        return self._model_uuid[:8]
+
+    @property
+    def application(self) -> str:
+        """Getter for the juju application value."""
+        return self._application
+
+    @property
+    def charm_name(self) -> Optional[str]:
+        """Getter for the juju charm name value."""
+        return self._charm_name
+
+    @property
+    def unit(self) -> Optional[str]:
+        """Getter for the juju unit value."""
+        return self._unit
|
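
As a rough illustration of how the `label_matcher_dict` and `label_matchers` properties in the file above assemble their output, here is a standalone sketch. It does not import the library: `build_label_matchers` is a hypothetical helper that mirrors the steps visible in the diff (leave out `unit`, rename `charm_name` to `charm`, prefix every key with `juju_`, skip empty values), and the sample values are made up.

```python
from collections import OrderedDict
from typing import Optional


def build_label_matchers(
    model: str,
    model_uuid: str,
    application: str,
    charm_name: Optional[str] = None,
) -> str:
    """Hypothetical helper mirroring what JujuTopology.label_matchers does."""
    topology = OrderedDict(
        [
            ("model", model),
            ("model_uuid", model_uuid),
            ("application", application),
            # "unit" is deliberately left out, as in label_matcher_dict.
            ("charm", charm_name),  # remapped from "charm_name"
        ]
    )
    # Prefix each key with "juju_" and drop entries with empty values.
    labels = {
        "juju_{}".format(key): value for key, value in topology.items() if value
    }
    return ", ".join('{}="{}"'.format(key, value) for key, value in labels.items())


matchers = build_label_matchers(
    model="ceph",
    model_uuid="00000000-0000-4000-8000-000000000000",
    application="ceph-mon",
    charm_name="ceph-mon",
)
print(matchers)
```

The resulting string is meant to be dropped between the braces of a PromQL or LogQL selector; omitting `charm_name` simply yields one label fewer.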
File diff suppressed because it is too large
@@ -36,6 +36,8 @@ provides:
     interface: ceph-rbd-mirror
   prometheus:
     interface: http
+  metrics-endpoint:
+    interface: prometheus_scrape
   dashboard:
     interface: ceph-dashboard
 requires:

@@ -0,0 +1,51 @@
+# Copyright 2022 Canonical Ltd.
+# See LICENSE file for licensing details.
+
+"""Provide ceph metrics to prometheus
+
+Configure prometheus scrape jobs via the metrics-endpoint relation.
+"""
+import logging
+from typing import Optional, Union, List
+
+from charms.prometheus_k8s.v0 import prometheus_scrape
+from charms_ceph import utils as ceph_utils
+from ops.framework import BoundEvent
+
+
+logger = logging.getLogger(__name__)
+
+DEFAULT_CEPH_JOB = {
+    "metrics_path": "/metrics",
+    "static_configs": [{"targets": ["*:9283"]}],
+}
+
+
+class CephMetricsEndpointProvider(prometheus_scrape.MetricsEndpointProvider):
+    def __init__(
+        self,
+        charm,
+        relation_name: str = prometheus_scrape.DEFAULT_RELATION_NAME,
+        jobs=None,
+        alert_rules_path: str = prometheus_scrape.DEFAULT_ALERT_RULES_RELATIVE_PATH,  # noqa
+        refresh_event: Optional[Union[BoundEvent, List[BoundEvent]]] = None,
+    ):
+        if jobs is None:
+            jobs = [DEFAULT_CEPH_JOB]
+        super().__init__(
+            charm,
+            relation_name=relation_name,
+            jobs=jobs,
+            alert_rules_path=alert_rules_path,
+            refresh_event=refresh_event,
+        )
+
+    def _on_relation_changed(self, event):
+        """Enable prometheus on relation change"""
+        if self._charm.unit.is_leader() and ceph_utils.is_bootstrapped():
+            logger.debug(
+                "is_leader and is_bootstrapped, running rel changed: %s", event
+            )
+            ceph_utils.mgr_enable_module("prometheus")
+            logger.debug("module_enabled")
+            super()._on_relation_changed(event)
|
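
The default scrape job defined in this module can be checked by hand. The following is a minimal sketch, assuming the scrape job list ends up JSON-encoded in the `scrape_jobs` key of the application relation data (the unit test added in this commit expects exactly this string). `DEFAULT_CEPH_JOB` is copied from the diff; the other names are illustrative only. Port 9283 is the default port of the Ceph manager's prometheus module, and the `*` host is understood here to be a wildcard that the scrape library expands to the addresses advertised by the units.

```python
import json

# Copied from the ceph_metrics module in this commit.
DEFAULT_CEPH_JOB = {
    "metrics_path": "/metrics",
    "static_configs": [{"targets": ["*:9283"]}],
}

# Assumption: the job list is serialised with plain json.dumps.
scrape_jobs = json.dumps([DEFAULT_CEPH_JOB])
print(scrape_jobs)

# Round-trip to pull the target list back out of the encoded value.
decoded = json.loads(scrape_jobs)
targets = decoded[0]["static_configs"][0]["targets"]
print(targets)
```

Passing a custom `jobs` list to `CephMetricsEndpointProvider` replaces this default, since the default is only used when `jobs` is `None`.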
@@ -5,6 +5,7 @@ from ops.main import main
 import ops_openstack.core
 
 import ceph_hooks as hooks
+import ceph_metrics
 
 
 class CephMonCharm(ops_openstack.core.OSBaseCharm):
@@ -70,6 +71,8 @@ class CephMonCharm(ops_openstack.core.OSBaseCharm):
         self._stored.is_started = True
         fw = self.framework
 
+        self.metrics_endpoint = ceph_metrics.CephMetricsEndpointProvider(self)
+
         fw.observe(self.on.install, self.on_install)
         fw.observe(self.on.config_changed, self.on_config)
         fw.observe(self.on.pre_series_upgrade, self.on_pre_series_upgrade)

tox.ini
@@ -84,7 +84,7 @@ deps = -r{toxinidir}/requirements.txt
 [testenv:pep8]
 basepython = python3
 deps = flake8==3.9.2
-       charm-tools==2.8.3
+       charm-tools==2.8.4
 commands = flake8 {posargs} unit_tests tests actions files src
            charm-proof
 

@@ -0,0 +1,91 @@
+#!/usr/bin/env python3
+
+# Copyright 2022 Canonical Ltd.
+# See LICENSE file for licensing details.
+
+from unittest.mock import patch
+import unittest
+
+from ops import storage, model, framework
+from ops.testing import Harness, _TestingModelBackend
+
+import charm
+
+
+class TestCephMetrics(unittest.TestCase):
+    def setUp(self):
+        super().setUp()
+        self.harness = Harness(charm.CephMonCharm)
+
+        # BEGIN: Workaround until network_get is implemented
+        class _TestingOPSModelBackend(_TestingModelBackend):
+            def network_get(self, endpoint_name, relation_id=None):
+                network_data = {
+                    "bind-addresses": [
+                        {
+                            "addresses": [{"value": "10.0.0.10"}],
+                        }
+                    ],
+                }
+                return network_data
+
+        self.harness._backend = _TestingOPSModelBackend(
+            self.harness._unit_name, self.harness._meta
+        )
+        self.harness._model = model.Model(
+            self.harness._meta, self.harness._backend
+        )
+        self.harness._framework = framework.Framework(
+            storage.SQLiteStorage(":memory:"),
+            self.harness._charm_dir,
+            self.harness._meta,
+            self.harness._model,
+        )
+        # END Workaround
+        self.addCleanup(self.harness.cleanup)
+        self.harness.begin()
+        self.harness.set_leader(True)
+
+    def test_init(self):
+        self.assertEqual(
+            self.harness.charm.metrics_endpoint._relation_name,
+            "metrics-endpoint",
+        )
+
+    @patch("ceph_metrics.ceph_utils.is_bootstrapped", return_value=True)
+    @patch("ceph_metrics.ceph_utils.is_mgr_module_enabled", return_value=False)
+    @patch("ceph_metrics.ceph_utils.mgr_enable_module")
+    def test_add_rel(
+        self,
+        mgr_enable_module,
+        _is_mgr_module_enable,
+        _is_bootstrapped,
+    ):
+        rel_id = self.harness.add_relation("metrics-endpoint", "prometheus")
+        self.harness.add_relation_unit(rel_id, "prometheus/0")
+
+        unit_rel_data = self.harness.get_relation_data(
+            rel_id, self.harness.model.unit
+        )
+        self.assertEqual(
+            unit_rel_data["prometheus_scrape_unit_address"], "10.0.0.10"
+        )
+
+        # Trigger relation change event as a side effect
+        self.harness.update_relation_data(
+            rel_id, "prometheus/0", {"foo": "bar"}
+        )
+
+        mgr_enable_module.assert_called_once()
+
+        app_rel_data = self.harness.get_relation_data(
+            rel_id, self.harness.model.app
+        )
+        jobs = app_rel_data["scrape_jobs"]
+        self.assertEqual(
+            jobs,
+            (
+                '[{"metrics_path": "/metrics", '
+                '"static_configs": [{"targets": ["*:9283"]}]}]'
+            ),
+        )
|
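
`test_add_rel` above stacks three `@patch` decorators, and the order of the mock arguments can look reversed at first glance. The sketch below shows the rule with two standard-library functions instead of the charm code: decorators are applied bottom-up, so the decorator closest to the function supplies the first mock argument. `call_both` and the return values are made up for illustration.

```python
import os
from unittest.mock import patch


@patch("os.getcwd", return_value="/top")  # outermost: becomes the last argument
@patch("os.getpid", return_value=1234)    # innermost: becomes the first argument
def call_both(getpid_mock, getcwd_mock):
    """Call the patched functions and hand back the mocks for inspection."""
    return os.getpid(), os.getcwd(), getpid_mock, getcwd_mock


pid, cwd, getpid_mock, getcwd_mock = call_both()
print(pid, cwd)
```

This matches the test's signature, where `mgr_enable_module` (the bottom decorator) comes first and `_is_bootstrapped` (the top decorator) comes last.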
@@ -297,8 +297,7 @@ class CephUtilsTestCase(test_utils.CharmTestCase):
         releases = utils.get_ceph_osd_releases()
 
         self.assertEqual(len(releases), 2)
-        self.assertEqual(releases[0], ceph_release_1)
-        self.assertEqual(releases[1], ceph_release_2)
+        self.assertEqual(sorted(releases), [ceph_release_1, ceph_release_2])
 
     @mock.patch.object(utils.subprocess, 'check_output')
     @mock.patch.object(utils.json, 'loads')
|
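
The drive-by change to the utils test swaps two index-based assertions for one comparison against a sorted list. A small sketch of why that helps, assuming the order of the returned releases is not guaranteed (the diff itself does not say so); `same_releases` and the release names are hypothetical.

```python
from typing import List


def same_releases(actual: List[str], expected_sorted: List[str]) -> bool:
    """Hypothetical helper: compare contents while ignoring the order of `actual`."""
    return sorted(actual) == expected_sorted


expected = ["nautilus", "octopus"]

# Both orderings pass the sorted comparison.
in_order = same_releases(["nautilus", "octopus"], expected)
reversed_order = same_releases(["octopus", "nautilus"], expected)

# An index-based check would pass for one ordering only.
index_check = ["octopus", "nautilus"][0] == expected[0]
print(in_order, reversed_order, index_check)
```

Note that the expected list has to be in sorted order itself for this comparison to hold.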