Juju Charm - Ceph MON

Peter Sabaini 24fccea832 Add alerting rules for RGW multisite deployments Add default prometheus alerting rules for RadosGW multisite deployments based on the built-in Ceph RGW multisite metrics. Note that the included prometheus_alerts.yml.default rule file is included for reference only. The ceph-mon charm will utilize the resource file from https://charmhub.io/ceph-mon/resources/alert-rules for deployment so that operators can easily customize these rules. Change-Id: I5a12162d73686963132a952bddd85ec205964de4		2024-01-17 16:50:37 +01:00
actions	Revert "Create NRPE check to verify ceph daemons versions"	2023-02-01 12:31:16 +09:00
files	Add alerting rules for RGW multisite deployments	2024-01-17 16:50:37 +01:00
lib/charms	Implement prometheus alert rules	2022-09-23 14:22:06 +02:00
src	Merge "Retry setting rbd_stats_pools prometheus config"	2024-01-10 07:45:07 +00:00
templates	Add configuration options for disk usage alerting thresholds	2021-05-12 17:25:50 +03:00
tests	Fix version retrieval	2023-09-29 21:04:23 +02:00
unit_tests	Merge "Add nagios check for radosgw-admin sync status"	2024-01-10 07:40:46 +00:00
.gitignore	Update to classic charms to build using charmcraft in CI	2022-02-16 10:45:25 -05:00
.gitreview	OpenDev Migration Patch	2019-04-19 19:28:22 +00:00
.project	Add support for Juju network spaces	2016-04-07 16:22:52 +01:00
.pydevproject	Add support for Juju network spaces	2016-04-07 16:22:52 +01:00
.stestr.conf	Move from .testr.conf to .stestr.conf	2017-11-30 10:44:40 +00:00
.zuul.yaml	Add support for interim Ubuntu releases	2023-03-17 08:56:03 -04:00
actions.yaml	Revert "Create NRPE check to verify ceph daemons versions"	2023-02-01 12:31:16 +09:00
bindep.txt	Fix: init alert rules on rel change	2022-11-30 14:59:32 +01:00
build-requirements.txt	Update to classic charms to build using charmcraft in CI	2022-02-16 10:45:25 -05:00
charmcraft.yaml	Add 2023.2 Bobcat support	2023-08-03 13:52:35 -04:00
config.yaml	Merge "Add nagios check for radosgw-admin sync status"	2024-01-10 07:40:46 +00:00
copyright	Re-license charm as Apache-2.0	2016-07-01 13:55:54 +01:00
hardening.yaml	Add hardening support	2016-03-29 20:26:58 +01:00
icon.svg	Update charm icon	2017-07-31 14:15:49 -05:00
LICENSE	Re-license charm as Apache-2.0	2016-07-01 13:55:54 +01:00
metadata.yaml	Merge "Add 2023.2 Bobcat support"	2023-08-04 15:12:34 +00:00
osci.yaml	Prune CI test jobs and test bundles	2023-09-04 16:26:25 +02:00
README.md	Implement prometheus alert rules	2022-09-23 14:22:06 +02:00
rename.sh	Update to classic charms to build using charmcraft in CI	2022-02-16 10:45:25 -05:00
requirements.txt	Unpin tox version	2023-01-19 11:11:53 +09:00
setup.cfg	[dosaboy,r=james-page] Add broker functionality	2014-11-19 16:12:04 -06:00
test-requirements.txt	Rewrite the get-erasure-profile action with the ops framework	2022-10-26 11:28:42 +02:00
tox.ini	Tox: add Python 3.11 section to tox.ini	2023-11-10 14:12:29 +02:00

README.md

Overview

Ceph is a unified, distributed storage system designed for excellent performance, reliability, and scalability.

The ceph-mon charm deploys Ceph monitor nodes, allowing one to create a monitor cluster. It is used in conjunction with the ceph-osd charm. Together, these charms can scale out the amount of storage available in a Ceph cluster.

Usage

Configuration

This section covers common and/or important configuration options. See the config.yaml file for the full list of options, along with their descriptions and default values. See the Juju documentation for details on configuring applications.

`customize-failure-domain`

The customize-failure-domain option determines how a Ceph CRUSH map is configured.

With a value of 'false' (the default), the map replicates data across hosts (implemented as Ceph bucket type 'host'). With a value of 'true', all MAAS-defined zones are used to generate a map that replicates data across Ceph availability zones (implemented as bucket type 'rack').

This option is also supported by the ceph-osd charm. Its value must be the same for both charms.

`monitor-count`

The monitor-count option gives the number of ceph-mon units in the monitor sub-cluster (where one ceph-mon unit represents one MON). The default value of '3' is generally a good choice, but it is good practice to set the option explicitly to avoid a possible race condition during the formation of the sub-cluster. To establish quorum and enable partition tolerance, an odd number of ceph-mon units is required.

Important

: A monitor count of fewer than three is not recommended for production environments. Test environments can use a single ceph-mon unit by setting this option to '1'.
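
The preference for an odd monitor count follows from majority arithmetic: Ceph monitors need a strict majority of the configured monitors to form a quorum. The sketch below is only an illustration of that arithmetic and is not part of the charm; the function names are hypothetical.

```python
def quorum_size(monitor_count: int) -> int:
    """Return the smallest strict majority of monitor_count monitors."""
    if monitor_count < 1:
        raise ValueError("monitor_count must be at least 1")
    return monitor_count // 2 + 1


def tolerated_failures(monitor_count: int) -> int:
    """Return how many monitors can be lost while a majority remains."""
    return monitor_count - quorum_size(monitor_count)


# Print monitor count, quorum size, and tolerated failures for a few sizes.
for count in (1, 3, 4, 5):
    print(count, quorum_size(count), tolerated_failures(count))
```

Under this arithmetic, four monitors tolerate the same single failure as three, so the additional even-numbered unit adds no resilience.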

`expected-osd-count`

The expected-osd-count option states the number of OSDs expected to be deployed in the cluster. This value can influence the number of placement groups (PGs) to use per pool. The PG calculation is based either on the actual number of OSDs or this option's value, whichever is greater. The default value is '0', which tells the charm to only consider the actual number of OSDs. If the actual number of OSDs is less than three then this option must explicitly state that number. The charm can create Ceph pools only once a sufficient (or prescribed) number of OSDs has been attained.

Note

: The inability to create a pool due to an insufficient number of OSDs will cause any consuming application (characterised by a relation involving the ceph-mon:client endpoint) to remain in the 'waiting' state.
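
The rules described for expected-osd-count can be summarised in a short sketch. This is one reading of the paragraph above, not the charm's actual code; the function names and the MINIMUM_OSDS constant are hypothetical, and the threshold of three is taken from the "less than three" wording.

```python
# Assumed threshold, inferred from the text; not read from the charm source.
MINIMUM_OSDS = 3


def osd_count_for_pg_calculation(actual_osds: int, expected_osd_count: int = 0) -> int:
    """Use the actual or the expected OSD count, whichever is greater."""
    return max(actual_osds, expected_osd_count)


def can_create_pools(actual_osds: int, expected_osd_count: int = 0) -> bool:
    """Check whether enough OSDs exist for pool creation.

    With the default of 0 only the actual count is considered, so the
    assumed minimum applies; otherwise the prescribed count must be met.
    """
    required = expected_osd_count if expected_osd_count > 0 else MINIMUM_OSDS
    return actual_osds >= required
```

For instance, under these assumptions a two-OSD test cluster only becomes eligible for pool creation when expected-osd-count is set to '2'.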

`source`

The source option states the software sources. A common value is an OpenStack UCA release (e.g. 'cloud:xenial-queens' or 'cloud:bionic-ussuri'). See Ceph and the UCA. The underlying host's existing apt sources will be used if this option is not specified (this behaviour can be explicitly chosen by using the value of 'distro').

Deployment

A cloud with three MON nodes is a typical design whereas three OSDs are considered the minimum. For example, to deploy a Ceph cluster consisting of three OSDs (one per ceph-osd unit) and three MONs:

juju deploy -n 3 --config ceph-osd.yaml ceph-osd
juju deploy -n 3 --to lxd:0,lxd:1,lxd:2 ceph-mon
juju add-relation ceph-osd:mon ceph-mon:osd

Here, a containerised MON is running alongside each storage node. We've assumed that the machines spawned in the first command are assigned IDs of 0, 1, and 2.

By default, the monitor cluster will not be complete until three ceph-mon units have been deployed. This is to ensure that a quorum is achieved prior to the addition of storage devices.

See the Ceph documentation for notes on monitor cluster deployment strategies.

Note

: Refer to the Install OpenStack page in the OpenStack Charms Deployment Guide for instructions on installing a monitor cluster for use with OpenStack.

Network spaces

This charm supports the use of Juju network spaces (available as of Juju 2.0). This optional feature allows specific types of the application's network traffic to be bound to subnets that the underlying hardware is connected to.

Note

: Spaces must be configured in the backing cloud prior to deployment.

The ceph-mon charm exposes the following Ceph traffic types (bindings):

'public' (front-side)
'cluster' (back-side)

For example, provided that spaces 'data-space' and 'cluster-space' exist, the deploy command above could look like this:

juju deploy -n 3 --config ceph-mon.yaml ceph-mon \
   --bind "public=data-space cluster=cluster-space"

Alternatively, configuration can be provided as part of a bundle:

    ceph-mon:
      charm: cs:ceph-mon
      num_units: 1
      bindings:
        public: data-space
        cluster: cluster-space
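
When deployments are scripted, the same binding data can drive the command-line form. The following is a minimal sketch that assembles the --bind value as space-separated endpoint=space pairs, matching the deploy command shown earlier; the helper names are hypothetical and the sketch only builds the command string without running Juju.

```python
import shlex


def bind_argument(bindings: dict) -> str:
    """Join endpoint-to-space pairs into the value passed to --bind."""
    return " ".join(f"{endpoint}={space}" for endpoint, space in bindings.items())


def deploy_command(bindings: dict, units: int = 3, config_file: str = "ceph-mon.yaml") -> str:
    """Build a shell-quoted 'juju deploy' command for ceph-mon."""
    args = [
        "juju", "deploy", "-n", str(units),
        "--config", config_file, "ceph-mon",
        "--bind", bind_argument(bindings),
    ]
    # shlex.join quotes the --bind value because it contains a space.
    return shlex.join(args)


print(deploy_command({"public": "data-space", "cluster": "cluster-space"}))
```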

Refer to the Ceph Network Reference to learn about the implications of segregating Ceph network traffic.

Note

: Existing ceph-mon units configured with the ceph-public-network or ceph-cluster-network options will continue to honour them. Furthermore, these options override any space bindings, if set.

Monitoring

The charm supports Ceph metric monitoring with Prometheus. Add relations to the prometheus application in this way:

juju deploy prometheus2
juju add-relation ceph-mon prometheus2

Note

: Prometheus support is available starting with Ceph Luminous (xenial-queens UCA pocket).

Alternatively, integration with the COS Lite observability stack is available via the metrics-endpoint relation.

Relating ceph-mon to prometheus-k8s via the metrics-endpoint interface (as found in the COS Lite bundle) sends metrics to Prometheus and also configures alerting rules for it. The alerting rules are supplied through the alert-rules resource; the default rules are taken from the upstream Ceph rules. The defaults can be replaced with customized rules by attaching a resource:

juju attach ceph-mon alert-rules=./my-prom-alerts.yaml.rules

Actions

This section lists Juju actions supported by the charm. Actions allow specific operations to be performed on a per-unit basis. To display action descriptions, run juju actions ceph-mon. If the charm is not deployed, see the actions.yaml file.

change-osd-weight
copy-pool
create-cache-tier
create-crush-rule
create-erasure-profile
create-pool
crushmap-update
delete-erasure-profile
delete-pool
get-erasure-profile
get-health
list-erasure-profiles
list-inconsistent-objs
list-pools
pause-health
pool-get
pool-set
pool-statistics
purge-osd
remove-cache-tier
remove-pool-snapshot
rename-pool
resume-health
security-checklist
set-noout
set-pool-max-bytes
show-disk-free
snapshot-pool
unset-noout

Presenting the list of Ceph pools with details

The following example returns the list of pools with details: id, name, size and min_size. The jq utility is used to parse the action's JSON output.

juju run-action --wait ceph-mon/leader list-pools detail=true \
  --format json | jq '.[].results.pools | fromjson | .[]
  | {pool:.pool, name:.pool_name, size:.size, min_size:.min_size}'

Sample output:

{
  "pool": 1,
  "name": "test",
  "size": 3,
  "min_size": 2
}
{
  "pool": 2,
  "name": "test2",
  "size": 3,
  "min_size": 2
}
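
Where jq is not available, the same filter can be approximated in Python. This is a sketch under the assumption, taken from the jq expression above, that the action output is a JSON object whose values carry a results.pools field holding a JSON-encoded string. The function name and the sample payload (including the "unit-ceph-mon-0" key) are hypothetical.

```python
import json


def summarise_pools(action_output: str) -> list:
    """Mirror the jq filter: decode results.pools and keep four fields."""
    summaries = []
    for unit_result in json.loads(action_output).values():
        # The pools field is itself a JSON string (jq's 'fromjson' step).
        pools = json.loads(unit_result["results"]["pools"])
        for pool in pools:
            summaries.append({
                "pool": pool["pool"],
                "name": pool["pool_name"],
                "size": pool["size"],
                "min_size": pool["min_size"],
            })
    return summaries


# Hypothetical payload shaped to match the sample output above.
sample_pools = [
    {"pool": 1, "pool_name": "test", "size": 3, "min_size": 2},
    {"pool": 2, "pool_name": "test2", "size": 3, "min_size": 2},
]
sample_output = json.dumps(
    {"unit-ceph-mon-0": {"results": {"pools": json.dumps(sample_pools)}}}
)

for summary in summarise_pools(sample_output):
    print(json.dumps(summary, indent=2))
```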

Bugs

Please report bugs on Launchpad.

For general charm questions refer to the OpenStack Charm Guide.