Ceph plugin updates for Luminous

Since the Luminous release of Ceph, the plugin no longer exports metrics
such as object storage daemon (OSD) stats, placement group stats and pool
stats. Check the installed version of the Ceph command and parse its
results according to that version.

Include test data for Jewel and Luminous Ceph clusters.

Story: 2005032
Task: 29515

Change-Id: I0aef0db25f49545c715b07880edd57135e3beafe
Co-Authored-By: Bharat Kunwar <bharat@stackhpc.com>
Co-Authored-By: Doug Szumski <doug@stackhpc.com>
Author: Stig Telfer, 2019-03-08 18:41:12 +01:00 (committed by Doug Szumski)
parent 57dd39a6a4
commit 3406f8243c
19 changed files with 4068 additions and 308 deletions


@@ -743,8 +743,10 @@ The `custom` section of `init_config` is optional and may be blank or removed en
Because `check_mk_agent` can only return all local metrics at once, the `check_mk_local` plugin requires no instances to be defined in the configuration. It runs `check_mk_agent` once and processes all the results. This way, new `check_mk` local scripts can be added without having to modify the plugin configuration.
## Ceph (deprecated after Luminous)
This section describes the Ceph check that can be performed by the Agent. It supports Ceph releases from Jewel through to Luminous. After the Luminous release, it is recommended to enable the Ceph Prometheus endpoint and use the Monasca Agent Prometheus plugin instead.
The Ceph check gathers metrics from multiple Ceph clusters. It requires a configuration file called `ceph.yaml` to be available in the agent conf.d configuration directory. The configuration file must contain the name of the cluster to be monitored (defaults to `ceph`). It is also possible to configure the agent to collect only specific groups of metrics for the cluster (usage, stats, monitors, osds or pools); a sketch of such a configuration follows the requirements list below.
Requirements:
* ceph-common
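For illustration, a minimal sketch of loading the `ceph.yaml` described above with PyYAML; the `collect_*` flag names are assumptions modelled on the metric groups named above (usage, stats, monitors, osds, pools), not a verified schema:

```python
import yaml  # PyYAML

# Hypothetical ceph.yaml; the collect_* keys are assumptions based on the
# metric groups described above, and cluster_name defaults to 'ceph'.
EXAMPLE_CEPH_YAML = """
init_config: null
instances:
  - cluster_name: ceph
    collect_usage_metrics: true
    collect_stats_metrics: true
    collect_mon_metrics: true
    collect_osd_metrics: true
    collect_pool_metrics: true
"""

config = yaml.safe_load(EXAMPLE_CEPH_YAML)
for instance in config['instances']:
    # Fall back to the documented default cluster name.
    print(instance.get('cluster_name', 'ceph'))
```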
@@ -776,97 +778,97 @@ instances:
The Ceph checks return the following metrics:
| Metric Name | Dimensions | Semantics | Notes |
| ----------- | ---------- | --------- | ----- |
| ceph.cluster.total_bytes | hostname, ceph_cluster, service=ceph | Total capacity of the cluster in bytes | |
| ceph.cluster.total_used_bytes | hostname, ceph_cluster, service=ceph | Capacity of the cluster currently in use in bytes | |
| ceph.cluster.total_avail_bytes | hostname, ceph_cluster, service=ceph | Available space within the cluster in bytes | |
| ceph.cluster.objects.total_count | hostname, ceph_cluster, service=ceph | Number of RADOS objects within the cluster | |
| ceph.cluster.utilization_perc | hostname, ceph_cluster, service=ceph | Percentage of the cluster's storage currently in use | |
| ceph.cluster.health_status | hostname, ceph_cluster, service=ceph | Health status of the cluster, one of three states (err: 2, warn: 1, ok: 0) | |
| ceph.cluster.osds.down_count | hostname, ceph_cluster, service=ceph | Number of OSDs that are in DOWN state | |
| ceph.cluster.osds.out_count | hostname, ceph_cluster, service=ceph | Number of OSDs that are in OUT state | |
| ceph.cluster.osds.up_count | hostname, ceph_cluster, service=ceph | Number of OSDs that are in UP state | |
| ceph.cluster.osds.in_count | hostname, ceph_cluster, service=ceph | Number of OSDs that are in IN state | |
| ceph.cluster.osds.total_count | hostname, ceph_cluster, service=ceph | Total number of OSDs in the cluster | |
| ceph.cluster.objects.degraded_count | hostname, ceph_cluster, service=ceph | Number of degraded objects across all PGs, includes replicas | |
| ceph.cluster.objects.misplaced_count | hostname, ceph_cluster, service=ceph | Number of misplaced objects across all PGs, includes replicas | |
| ceph.cluster.pgs.avg_per_osd | hostname, ceph_cluster, service=ceph | Average number of PGs per OSD in the cluster | |
| ceph.cluster.pgs.total_count | hostname, ceph_cluster, service=ceph | Total number of PGs in the cluster | |
| ceph.cluster.pgs.scrubbing_count | hostname, ceph_cluster, service=ceph | Number of scrubbing PGs in the cluster | |
| ceph.cluster.pgs.deep_scrubbing_count | hostname, ceph_cluster, service=ceph | Number of deep scrubbing PGs in the cluster | |
| ceph.cluster.pgs.degraded_count | hostname, ceph_cluster, service=ceph | Number of PGs in a degraded state | |
| ceph.cluster.pgs.stuck_degraded_count | hostname, ceph_cluster, service=ceph | Number of PGs stuck in a degraded state | |
| ceph.cluster.pgs.unclean_count | hostname, ceph_cluster, service=ceph | Number of PGs in an unclean state | |
| ceph.cluster.pgs.stuck_unclean_count | hostname, ceph_cluster, service=ceph | Number of PGs stuck in an unclean state | |
| ceph.cluster.pgs.undersized_count | hostname, ceph_cluster, service=ceph | Number of undersized PGs in the cluster | |
| ceph.cluster.pgs.stuck_undersized_count | hostname, ceph_cluster, service=ceph | Number of stuck undersized PGs in the cluster | |
| ceph.cluster.pgs.stale_count | hostname, ceph_cluster, service=ceph | Number of stale PGs in the cluster | |
| ceph.cluster.pgs.stuck_stale_count | hostname, ceph_cluster, service=ceph | Number of stuck stale PGs in the cluster | |
| ceph.cluster.pgs.remapped_count | hostname, ceph_cluster, service=ceph | Number of PGs that are remapped and incurring cluster-wide movement | |
| ceph.cluster.recovery.bytes_per_sec | hostname, ceph_cluster, service=ceph | Rate of bytes being recovered in cluster per second | |
| ceph.cluster.recovery.keys_per_sec | hostname, ceph_cluster, service=ceph | Rate of keys being recovered in cluster per second | |
| ceph.cluster.recovery.objects_per_sec | hostname, ceph_cluster, service=ceph | Rate of objects being recovered in cluster per second | |
| ceph.cluster.client.read_bytes_per_sec | hostname, ceph_cluster, service=ceph | Rate of bytes being read by all clients per second | |
| ceph.cluster.client.write_bytes_per_sec | hostname, ceph_cluster, service=ceph | Rate of bytes being written by all clients per second | |
| ceph.cluster.client.read_ops | hostname, ceph_cluster, service=ceph | Total client read I/O ops on the cluster measured per second | |
| ceph.cluster.client.write_ops | hostname, ceph_cluster, service=ceph | Total client write I/O ops on the cluster measured per second | |
| ceph.cluster.cache.flush_bytes_per_sec | hostname, ceph_cluster, service=ceph | Rate of bytes being flushed from the cache pool per second | |
| ceph.cluster.cache.evict_bytes_per_sec | hostname, ceph_cluster, service=ceph | Rate of bytes being evicted from the cache pool per second | |
| ceph.cluster.cache.promote_ops | hostname, ceph_cluster, service=ceph | Total cache promote operations measured per second | |
| ceph.cluster.slow_requests_count | hostname, ceph_cluster, service=ceph | Number of slow requests | |
| ceph.cluster.quorum_size | hostname, ceph_cluster, service=ceph | Number of monitors in quorum | |
| ceph.monitor.total_bytes | hostname, ceph_cluster, monitor, service=ceph | Total storage capacity of the monitor node | Removed after Jewel |
| ceph.monitor.used_bytes | hostname, ceph_cluster, monitor, service=ceph | Storage of the monitor node that is currently allocated for use | Removed after Jewel |
| ceph.monitor.avail_bytes | hostname, ceph_cluster, monitor, service=ceph | Total unused storage capacity that the monitor node has left | Removed after Jewel |
| ceph.monitor.avail_perc | hostname, ceph_cluster, monitor, service=ceph | Percentage of total unused storage capacity that the monitor node has left | Removed after Jewel |
| ceph.monitor.store.total_bytes | hostname, ceph_cluster, monitor, service=ceph | Total capacity of the FileStore backing the monitor daemon | Removed after Jewel |
| ceph.monitor.store.sst_bytes | hostname, ceph_cluster, monitor, service=ceph | Capacity of the FileStore used only for raw SSTs | Removed after Jewel |
| ceph.monitor.store.log_bytes | hostname, ceph_cluster, monitor, service=ceph | Capacity of the FileStore used only for logging | Removed after Jewel |
| ceph.monitor.store.misc_bytes | hostname, ceph_cluster, monitor, service=ceph | Capacity of the FileStore used only for storing miscellaneous information | Removed after Jewel |
| ceph.monitor.skew | hostname, ceph_cluster, monitor, service=ceph | Monitor clock skew | Removed after Jewel |
| ceph.monitor.latency | hostname, ceph_cluster, monitor, service=ceph | Monitor's latency | Removed after Jewel |
| ceph.osd.crush_weight | hostname, ceph_cluster, osd, service=ceph | OSD crush weight | |
| ceph.osd.depth | hostname, ceph_cluster, osd, service=ceph | OSD depth | |
| ceph.osd.reweight | hostname, ceph_cluster, osd, service=ceph | OSD reweight | |
| ceph.osd.total_bytes | hostname, ceph_cluster, osd, service=ceph | OSD total bytes | |
| ceph.osd.used_bytes | hostname, ceph_cluster, osd, service=ceph | OSD used storage in bytes | |
| ceph.osd.avail_bytes | hostname, ceph_cluster, osd, service=ceph | OSD available storage in bytes | |
| ceph.osd.utilization_perc | hostname, ceph_cluster, osd, service=ceph | OSD utilization | |
| ceph.osd.variance | hostname, ceph_cluster, osd, service=ceph | OSD variance | |
| ceph.osd.pgs_count | hostname, ceph_cluster, osd, service=ceph | OSD placement group count | |
| ceph.osd.perf.commit_latency_seconds | hostname, ceph_cluster, osd, service=ceph | OSD commit latency in seconds | |
| ceph.osd.perf.apply_latency_seconds | hostname, ceph_cluster, osd, service=ceph | OSD apply latency in seconds | |
| ceph.osd.up | hostname, ceph_cluster, osd, service=ceph | OSD up status (up: 1, down: 0) | |
| ceph.osd.in | hostname, ceph_cluster, osd, service=ceph | OSD in status (in: 1, out: 0) | |
| ceph.osds.total_bytes | hostname, ceph_cluster, service=ceph | OSDs total storage in bytes | |
| ceph.osds.total_used_bytes | hostname, ceph_cluster, service=ceph | OSDs total used storage in bytes | |
| ceph.osds.total_avail_bytes | hostname, ceph_cluster, service=ceph | OSDs total available storage in bytes | |
| ceph.osds.avg_utilization_perc | hostname, ceph_cluster, osd, service=ceph | OSDs average utilization in percent | |
| ceph.pool.used_bytes | hostname, ceph_cluster, pool, service=ceph | Capacity of the pool that is currently in use | |
| ceph.pool.used_raw_bytes | hostname, ceph_cluster, pool, service=ceph | Raw capacity of the pool that is currently in use; this factors in the pool's replication size | |
| ceph.pool.max_avail_bytes | hostname, ceph_cluster, pool, service=ceph | Free space for this ceph pool | |
| ceph.pool.objects_count | hostname, ceph_cluster, pool, service=ceph | Total number of objects allocated within the pool | |
| ceph.pool.dirty_objects_count | hostname, ceph_cluster, pool, service=ceph | Total number of dirty objects in a cache-tier pool | |
| ceph.pool.read_io | hostname, ceph_cluster, pool, service=ceph | Total read I/O calls for the pool | |
| ceph.pool.read_bytes | hostname, ceph_cluster, pool, service=ceph | Total read throughput for the pool | |
| ceph.pool.write_io | hostname, ceph_cluster, pool, service=ceph | Total write I/O calls for the pool | |
| ceph.pool.write_bytes | hostname, ceph_cluster, pool, service=ceph | Total write throughput for the pool | |
| ceph.pool.quota_max_bytes | hostname, ceph_cluster, pool, service=ceph | Quota maximum bytes for the pool | |
| ceph.pool.quota_max_objects | hostname, ceph_cluster, pool, service=ceph | Quota maximum objects for the pool | |
| ceph.pool.total_bytes | hostname, ceph_cluster, pool, service=ceph | Total capacity of the pool in bytes | |
| ceph.pool.utilization_perc | hostname, ceph_cluster, pool, service=ceph | Percentage of used storage for the pool | |
| ceph.pool.client.read_bytes_sec | hostname, ceph_cluster, pool, service=ceph | Read bytes per second on the pool | |
| ceph.pool.client.write_bytes_sec | hostname, ceph_cluster, pool, service=ceph | Write bytes per second on the pool | |
| ceph.pool.client.read_ops | hostname, ceph_cluster, pool, service=ceph | Read operations per second on the pool | |
| ceph.pool.client.write_ops | hostname, ceph_cluster, pool, service=ceph | Write operations per second on the pool | |
| ceph.pool.recovery.objects_per_sec | hostname, ceph_cluster, pool, service=ceph | Objects recovered per second on the pool | |
| ceph.pool.recovery.bytes_per_sec | hostname, ceph_cluster, pool, service=ceph | Bytes recovered per second on the pool | |
| ceph.pool.recovery.keys_per_sec | hostname, ceph_cluster, pool, service=ceph | Keys recovered per second on the pool | |
| ceph.pool.recovery.objects | hostname, ceph_cluster, pool, service=ceph | Objects recovered on the pool | |
| ceph.pool.recovery.bytes | hostname, ceph_cluster, pool, service=ceph | Bytes recovered on the pool | |
| ceph.pool.recovery.keys | hostname, ceph_cluster, pool, service=ceph | Keys recovered on the pool | |
| ceph.pools.count | hostname, ceph_cluster, service=ceph | Number of pools on the cluster | |
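As an illustration of the `ceph.cluster.health_status` encoding in the table above, a minimal sketch of the string-to-number mapping (treating unrecognised strings as errors is an assumption of this sketch, not documented plugin behaviour):

```python
# ceph.cluster.health_status encodes health as ok: 0, warn: 1, err: 2.
HEALTH_STATUS = {'HEALTH_OK': 0, 'HEALTH_WARN': 1, 'HEALTH_ERR': 2}

def parse_ceph_status(status):
    # Assumption: anything unrecognised is treated as an error.
    return HEALTH_STATUS.get(status, 2)

assert parse_ceph_status('HEALTH_OK') == 0
assert parse_ceph_status('HEALTH_WARN') == 1
assert parse_ceph_status('HEALTH_ERR') == 2
```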
## Certificate Expiration (HTTPS)
An extension to the Agent provides the ability to determine the expiration date of the certificate for a given URL. The metric reported is the number of days until the certificate expires.


@@ -362,10 +362,15 @@ class Ceph(checks.AgentCheck):
    """
    metrics = {}
    ceph_status_health = ceph_status['health']
    # The Ceph tools are deprecating 'overall_status' in favour of 'status'.
    # As part of the deprecation, 'overall_status' is fixed to HEALTH_WARN
    # to raise awareness, so prefer 'status' when it is present.
    ceph_health = ceph_status_health.get('status',
                                         ceph_status_health.get(
                                             'overall_status',
                                             'HEALTH_UNKNOWN'))
    metrics['ceph.cluster.health_status'] = self._parse_ceph_status(ceph_health)
    for s in ceph_status_health.get('summary', []):
        metrics.update(self._get_summary_metrics(s['summary']))
    osds = ceph_status['osdmap']['osdmap']
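To see how this fallback behaves across releases, a small sketch run against trimmed stand-ins for the `health` section of the Jewel and Luminous test data (the dicts below are illustrative, not the full fixtures):

```python
# Trimmed stand-ins for ceph_status['health'] on each release.
jewel_health = {'overall_status': 'HEALTH_OK', 'summary': []}
luminous_health = {'status': 'HEALTH_WARN', 'overall_status': 'HEALTH_WARN'}

def pick_health(health):
    # Prefer 'status' (Luminous onwards), fall back to the deprecated
    # 'overall_status' (Jewel), then to a sentinel value.
    return health.get('status', health.get('overall_status',
                                           'HEALTH_UNKNOWN'))

assert pick_health(jewel_health) == 'HEALTH_OK'
assert pick_health(luminous_health) == 'HEALTH_WARN'
assert pick_health({}) == 'HEALTH_UNKNOWN'
```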
@@ -399,14 +404,16 @@ class Ceph(checks.AgentCheck):
        'ceph.cluster.pgs.total_count'] / metrics[
        'ceph.cluster.osds.total_count']
    # In Luminous the format of output parsed here changed slightly.
    # Check for both known variations.
    ceph_status_plain = ceph_status_plain.split('\n')
    for l in ceph_status_plain:
        line = l.strip(' ')
        if line.startswith('recovery io') or line.startswith('recovery:'):
            metrics.update(self._get_recovery_io(line))
        elif line.startswith('client io') or line.startswith('client:'):
            metrics.update(self._get_client_io(line))
        elif line.startswith('cache io') or line.startswith('cache:'):
            metrics.update(self._get_cache_io(line))
    metrics['ceph.cluster.quorum_size'] = len(ceph_status['quorum'])
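The two variations can be exercised with lines shaped like the Jewel (`client io ...`) and Luminous (`client: ...`) plain-text outputs; a sketch of the dispatch, with the sample lines as illustrative stand-ins for the fixtures:

```python
JEWEL_LINES = ['    client io 273 kB/s rd, 847 B/s wr, 23 op/s']
LUMINOUS_LINES = ['    client:   4246 B/s rd, 374 kB/s wr, '
                  '2 op/s rd, 16 op/s wr']

def classify(lines):
    # Mirror the branch above: accept both the Jewel and Luminous prefixes.
    for raw in lines:
        line = raw.strip(' ')
        if line.startswith('recovery io') or line.startswith('recovery:'):
            yield 'recovery', line
        elif line.startswith('client io') or line.startswith('client:'):
            yield 'client', line
        elif line.startswith('cache io') or line.startswith('cache:'):
            yield 'cache', line

assert [k for k, _ in classify(JEWEL_LINES)] == ['client']
assert [k for k, _ in classify(LUMINOUS_LINES)] == ['client']
```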
@@ -417,9 +424,10 @@ class Ceph(checks.AgentCheck):
    with metrics regarding each monitor found, in the format
    {'monitor1': {'metric1': value1, ...}, 'monitor2': {'metric1': value1}}
    """
    # This data is not returned in Luminous or later.
    mon_metrics = {}
    for health_service in ceph_status.get('health', {}).get(
            'health', {}).get('health_services', []):
        for mon in health_service['mons']:
            store_stats = mon['store_stats']
            mon['name'] = safe_decode(mon['name'], incoming='utf-8')


@@ -0,0 +1 @@
ceph version 10.2.6 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) jewel (stable)
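This fixture is the Jewel form of `ceph --version`; the commit message's "check the installed version" step could look like the following sketch (the regex is an assumption about this line format, and also matches the Luminous fixture further down):

```python
import re

JEWEL_VERSION = ('ceph version 10.2.6 '
                 '(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) jewel (stable)')

def parse_ceph_version(line):
    # Extract the numeric version and, when present, the release codename.
    m = re.match(r'ceph version (\d+)\.(\d+)\.(\d+)\s*\([0-9a-f]+\)\s*(\w+)?',
                 line)
    if m is None:
        return None, None
    major, minor, patch, codename = m.groups()
    return (int(major), int(minor), int(patch)), codename

version, codename = parse_ceph_version(JEWEL_VERSION)
assert version < (12, 0, 0) and codename == 'jewel'  # Luminous is 12.x
```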


@@ -0,0 +1,219 @@
{
"pools": [
{
"id": 1,
"name": "cephfs_metadata",
"stats": {
"bytes_used": 47653204,
"dirty": 66133,
"kb_used": 46537,
"max_avail": 236492341248,
"objects": 66133,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 91122080,
"rd": 545015,
"rd_bytes": 1511467008,
"wr": 1823016,
"wr_bytes": 15099714560
}
},
{
"id": 2,
"name": "cephfs_data",
"stats": {
"bytes_used": 1743163949040,
"dirty": 1287159,
"kb_used": 1702308544,
"max_avail": 52729787449344,
"objects": 1287159,
"percent_used": 0.72,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 3419234369536,
"rd": 624295721,
"rd_bytes": 25137454158848,
"wr": 65020292,
"wr_bytes": 18614600047616
}
},
{
"id": 13,
"name": ".rgw.root",
"stats": {
"bytes_used": 1113,
"dirty": 4,
"kb_used": 2,
"max_avail": 52953687785472,
"objects": 4,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 2226,
"rd": 4806,
"rd_bytes": 3280896,
"wr": 4,
"wr_bytes": 4096
}
},
{
"id": 14,
"name": "default.rgw.control",
"stats": {
"bytes_used": 0,
"dirty": 8,
"kb_used": 0,
"max_avail": 52953687785472,
"objects": 8,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 0,
"rd": 0,
"rd_bytes": 0,
"wr": 0,
"wr_bytes": 0
}
},
{
"id": 15,
"name": "default.rgw.meta",
"stats": {
"bytes_used": 832,
"dirty": 6,
"kb_used": 1,
"max_avail": 52953687785472,
"objects": 6,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 1664,
"rd": 220,
"rd_bytes": 204800,
"wr": 147,
"wr_bytes": 19456
}
},
{
"id": 16,
"name": "default.rgw.log",
"stats": {
"bytes_used": 0,
"dirty": 207,
"kb_used": 0,
"max_avail": 52953687785472,
"objects": 207,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 0,
"rd": 21378199,
"rd_bytes": 21891063808,
"wr": 14245040,
"wr_bytes": 0
}
},
{
"id": 17,
"name": "default.rgw.buckets.index",
"stats": {
"bytes_used": 0,
"dirty": 1,
"kb_used": 0,
"max_avail": 52953687785472,
"objects": 1,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 0,
"rd": 562,
"rd_bytes": 580608,
"wr": 237,
"wr_bytes": 0
}
},
{
"id": 18,
"name": "default.rgw.buckets.data",
"stats": {
"bytes_used": 95250,
"dirty": 6,
"kb_used": 94,
"max_avail": 52953687785472,
"objects": 6,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 190500,
"rd": 162,
"rd_bytes": 291840,
"wr": 427,
"wr_bytes": 602112
}
},
{
"id": 19,
"name": "kubernetes",
"stats": {
"bytes_used": 4859,
"dirty": 8,
"kb_used": 5,
"max_avail": 79430529581056,
"objects": 8,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 8503,
"rd": 164,
"rd_bytes": 133120,
"wr": 90892,
"wr_bytes": 126869424128
}
},
{
"id": 20,
"name": "volumes",
"stats": {
"bytes_used": 0,
"dirty": 0,
"kb_used": 0,
"max_avail": 79430529581056,
"objects": 0,
"percent_used": 0.0,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 0,
"rd": 0,
"rd_bytes": 0,
"wr": 0,
"wr_bytes": 0
}
},
{
"id": 21,
"name": "benchmark",
"stats": {
"bytes_used": 37906503270,
"dirty": 33213250,
"kb_used": 37018070,
"max_avail": 79430529581056,
"objects": 33213250,
"percent_used": 0.02,
"quota_bytes": 0,
"quota_objects": 0,
"raw_bytes_used": 66925428736,
"rd": 2261777,
"rd_bytes": 3829217280,
"wr": 33280403,
"wr_bytes": 62774775808
}
}
],
"stats": {
"total_avail_bytes": 113108752654336,
"total_bytes": 121070163910656,
"total_objects": 34566782,
"total_used_bytes": 7961411256320
}
}
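Given one entry of this `ceph df detail -f json` style output, per-pool metrics such as `ceph.pool.utilization_perc` can be derived; a sketch assuming pool capacity is used plus still-available space (an assumption, not a documented formula):

```python
def pool_df_metrics(pool):
    # Map one 'pools' entry to a few of the documented ceph.pool.* metrics.
    stats = pool['stats']
    used = stats['bytes_used']
    avail = stats['max_avail']
    total = used + avail  # assumption: capacity = used + max_avail
    return {
        'ceph.pool.used_bytes': used,
        'ceph.pool.used_raw_bytes': stats['raw_bytes_used'],
        'ceph.pool.max_avail_bytes': avail,
        'ceph.pool.objects_count': stats['objects'],
        'ceph.pool.total_bytes': total,
        'ceph.pool.utilization_perc': 100.0 * used / total if total else 0.0,
    }

cephfs_data = {'name': 'cephfs_data',
               'stats': {'bytes_used': 1743163949040,
                         'raw_bytes_used': 3419234369536,
                         'max_avail': 52729787449344,
                         'objects': 1287159}}
print('%.2f%%' % pool_df_metrics(cephfs_data)['ceph.pool.utilization_perc'])
```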


@@ -0,0 +1,575 @@
{
"nodes": [
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 0,
"kb": 5886633532,
"kb_avail": 5479588668,
"kb_used": 407044864,
"name": "osd.0",
"pgs": 228,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.914731,
"var": 1.051531
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 1,
"kb": 5886633532,
"kb_avail": 5474891388,
"kb_used": 411742144,
"name": "osd.1",
"pgs": 204,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.994527,
"var": 1.063665
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 2,
"kb": 5886633532,
"kb_avail": 5443725372,
"kb_used": 442908160,
"name": "osd.2",
"pgs": 227,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 7.523964,
"var": 1.144177
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 3,
"kb": 5886633532,
"kb_avail": 5537865788,
"kb_used": 348767744,
"name": "osd.3",
"pgs": 196,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 5.92474,
"var": 0.900982
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 4,
"kb": 5886633532,
"kb_avail": 5503312828,
"kb_used": 383320704,
"name": "osd.4",
"pgs": 204,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.511713,
"var": 0.990243
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 5,
"kb": 5886633532,
"kb_avail": 5529989116,
"kb_used": 356644416,
"name": "osd.5",
"pgs": 197,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.058546,
"var": 0.92133
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 6,
"kb": 5886633532,
"kb_avail": 5560897596,
"kb_used": 325735936,
"name": "osd.6",
"pgs": 185,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 5.533484,
"var": 0.841483
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 7,
"kb": 5886633532,
"kb_avail": 5519587580,
"kb_used": 367045952,
"name": "osd.7",
"pgs": 205,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.235244,
"var": 0.9482
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 9,
"kb": 5886633532,
"kb_avail": 5516844732,
"kb_used": 369788800,
"name": "osd.9",
"pgs": 206,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.281838,
"var": 0.955286
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 12,
"kb": 5886633532,
"kb_avail": 5449338940,
"kb_used": 437294592,
"name": "osd.12",
"pgs": 230,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 7.428602,
"var": 1.129676
},
{
"crush_weight": 0.232788,
"depth": 2,
"device_class": "ssd",
"id": 30,
"kb": 249955652,
"kb_avail": 243443404,
"kb_used": 6512248,
"name": "osd.30",
"pgs": 1030,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 2.605361,
"var": 0.3962
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 8,
"kb": 5886633532,
"kb_avail": 5492111612,
"kb_used": 394521920,
"name": "osd.8",
"pgs": 214,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.701996,
"var": 1.01918
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 10,
"kb": 5886633532,
"kb_avail": 5458873660,
"kb_used": 427759872,
"name": "osd.10",
"pgs": 218,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 7.26663,
"var": 1.105044
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 11,
"kb": 5886633532,
"kb_avail": 5481880892,
"kb_used": 404752640,
"name": "osd.11",
"pgs": 194,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.875791,
"var": 1.045609
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 13,
"kb": 5886633532,
"kb_avail": 5497830588,
"kb_used": 388802944,
"name": "osd.13",
"pgs": 218,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.604844,
"var": 1.004406
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 14,
"kb": 5886633532,
"kb_avail": 5555946108,
"kb_used": 330687424,
"name": "osd.14",
"pgs": 190,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 5.617598,
"var": 0.854274
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 15,
"kb": 5886633532,
"kb_avail": 5480719740,
"kb_used": 405913792,
"name": "osd.15",
"pgs": 225,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.895517,
"var": 1.048609
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 16,
"kb": 5886633532,
"kb_avail": 5509113980,
"kb_used": 377519552,
"name": "osd.16",
"pgs": 192,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.413166,
"var": 0.975257
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 17,
"kb": 5886633532,
"kb_avail": 5524070460,
"kb_used": 362563072,
"name": "osd.17",
"pgs": 199,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.15909,
"var": 0.936619
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 18,
"kb": 5886633532,
"kb_avail": 5483380732,
"kb_used": 403252800,
"name": "osd.18",
"pgs": 229,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 6.850313,
"var": 1.041734
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 19,
"kb": 5886633532,
"kb_avail": 5469845436,
"kb_used": 416788096,
"name": "osd.19",
"pgs": 210,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 7.080245,
"var": 1.076701
},
{
"crush_weight": 0.232788,
"depth": 2,
"device_class": "ssd",
"id": 31,
"kb": 249955652,
"kb_avail": 244499580,
"kb_used": 5456072,
"name": "osd.31",
"pgs": 1030,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 2.182816,
"var": 0.331943
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 20,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.20",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 21,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.21",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 22,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.22",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 23,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.23",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 24,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.24",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 25,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.25",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 26,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.26",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 27,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.27",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 28,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.28",
"pgs": 0,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 5.482391,
"depth": 2,
"device_class": "hdd",
"id": 29,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.29",
"pgs": 0,
"pool_weights": {},
"reweight": 1.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
},
{
"crush_weight": 0.232788,
"depth": 2,
"device_class": "ssd",
"id": 32,
"kb": 0,
"kb_avail": 0,
"kb_used": 0,
"name": "osd.32",
"pgs": 0,
"pool_weights": {},
"reweight": 0.0,
"type": "osd",
"type_id": 0,
"utilization": 0.0,
"var": 0.0
}
],
"stray": [],
"summary": {
"average_utilization": 6.575872,
"dev": 2.305638,
"max_var": 1.144177,
"min_var": 0.0,
"total_kb": 118232581944,
"total_kb_avail": 110457758200,
"total_kb_used": 7774823744
}
}
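Each `nodes` entry above carries everything needed for the per-OSD capacity metrics; a sketch of the mapping (treating the `kb*` fields as KiB, hence the `* 1024`, is an assumption from the field names):

```python
def osd_df_metrics(node):
    # Map one 'nodes' entry from `ceph osd df -f json` to ceph.osd.* metrics.
    return {
        'ceph.osd.crush_weight': node['crush_weight'],
        'ceph.osd.depth': node['depth'],
        'ceph.osd.reweight': node['reweight'],
        'ceph.osd.total_bytes': node['kb'] * 1024,       # kb assumed KiB
        'ceph.osd.used_bytes': node['kb_used'] * 1024,
        'ceph.osd.avail_bytes': node['kb_avail'] * 1024,
        'ceph.osd.utilization_perc': node['utilization'],
        'ceph.osd.variance': node['var'],
        'ceph.osd.pgs_count': node['pgs'],
    }

osd0 = {'crush_weight': 5.482391, 'depth': 2, 'reweight': 1.0,
        'kb': 5886633532, 'kb_used': 407044864, 'kb_avail': 5479588668,
        'utilization': 6.914731, 'var': 1.051531, 'pgs': 228}
assert osd_df_metrics(osd0)['ceph.osd.pgs_count'] == 228
```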

File diff suppressed because it is too large.


@@ -0,0 +1,158 @@
{
"osd_perf_infos": [
{
"id": 2,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 1,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 12,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 19,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 11,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 4,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 10,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 0,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 13,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 15,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 9,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 17,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 6,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 14,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 3,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 7,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 30,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 8,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 31,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 18,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 5,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
},
{
"id": 16,
"perf_stats": {
"apply_latency_ms": 0,
"commit_latency_ms": 0
}
}
]
}
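The `ceph.osd.perf.*_latency_seconds` metric names imply a milliseconds-to-seconds conversion from this `ceph osd perf -f json` shape; a minimal sketch:

```python
def osd_perf_metrics(perf_dump):
    # Per-OSD apply/commit latency, converted from ms to seconds and
    # keyed by OSD id to match the documented metric names.
    out = {}
    for entry in perf_dump['osd_perf_infos']:
        stats = entry['perf_stats']
        out[entry['id']] = {
            'ceph.osd.perf.apply_latency_seconds':
                stats['apply_latency_ms'] / 1000.0,
            'ceph.osd.perf.commit_latency_seconds':
                stats['commit_latency_ms'] / 1000.0,
        }
    return out

sample = {'osd_perf_infos': [
    {'id': 2, 'perf_stats': {'apply_latency_ms': 0,
                             'commit_latency_ms': 0}}]}
assert osd_perf_metrics(sample)[2]['ceph.osd.perf.apply_latency_seconds'] == 0.0
```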


@@ -0,0 +1,79 @@
[
{
"client_io_rate": {},
"pool_id": 1,
"pool_name": "cephfs_metadata",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 2,
"pool_name": "cephfs_data",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 13,
"pool_name": ".rgw.root",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 14,
"pool_name": "default.rgw.control",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 15,
"pool_name": "default.rgw.meta",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 16,
"pool_name": "default.rgw.log",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 17,
"pool_name": "default.rgw.buckets.index",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 18,
"pool_name": "default.rgw.buckets.data",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 19,
"pool_name": "kubernetes",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 20,
"pool_name": "volumes",
"recovery": {},
"recovery_rate": {}
},
{
"client_io_rate": {},
"pool_id": 21,
"pool_name": "benchmark",
"recovery": {},
"recovery_rate": {}
}
]
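The empty `client_io_rate`/`recovery_rate` maps above are how an idle pool reports; a hedged sketch of mapping them to the `ceph.pool.client.*` and `ceph.pool.recovery.*` metrics (the inner key names are assumptions about the `ceph osd pool stats -f json` format):

```python
# Assumed key names inside client_io_rate / recovery_rate; absent keys
# simply mean there is no traffic of that kind at the moment.
CLIENT_KEYS = {'read_bytes_sec': 'ceph.pool.client.read_bytes_sec',
               'write_bytes_sec': 'ceph.pool.client.write_bytes_sec',
               'read_op_per_sec': 'ceph.pool.client.read_ops',
               'write_op_per_sec': 'ceph.pool.client.write_ops'}
RECOVERY_KEYS = {'recovering_objects_per_sec':
                     'ceph.pool.recovery.objects_per_sec',
                 'recovering_bytes_per_sec':
                     'ceph.pool.recovery.bytes_per_sec',
                 'recovering_keys_per_sec':
                     'ceph.pool.recovery.keys_per_sec'}

def pool_rate_metrics(pool_stats):
    out = {}
    for src, name in CLIENT_KEYS.items():
        if src in pool_stats.get('client_io_rate', {}):
            out[name] = pool_stats['client_io_rate'][src]
    for src, name in RECOVERY_KEYS.items():
        if src in pool_stats.get('recovery_rate', {}):
            out[name] = pool_stats['recovery_rate'][src]
    return out

idle = {'pool_name': 'volumes', 'client_io_rate': {}, 'recovery_rate': {}}
assert pool_rate_metrics(idle) == {}  # idle pools yield no rate metrics
```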


@@ -0,0 +1,176 @@
{
"election_epoch": 1768,
"fsid": "03d30763-0641-46ce-adfa-c7ff69d0101b",
"fsmap": {
"by_rank": [
{
"filesystem_id": 1,
"name": "mon1",
"rank": 0,
"status": "up:active"
}
],
"epoch": 91,
"id": 1,
"in": 1,
"max": 1,
"up": 1
},
"health": {
"checks": {
"OSD_DOWN": {
"severity": "HEALTH_WARN",
"summary": {
"message": "2 osds down"
}
},
"OSD_HOST_DOWN": {
"severity": "HEALTH_WARN",
"summary": {
"message": "1 host (11 osds) down"
}
},
"PG_DEGRADED": {
"severity": "HEALTH_WARN",
"summary": {
"message": "Degraded data redundancy: 9196074/70487094 objects degraded (13.046%), 2361 pgs unclean, 2361 pgs degraded, 2361 pgs undersized"
}
},
"POOL_APP_NOT_ENABLED": {
"severity": "HEALTH_WARN",
"summary": {
"message": "application not enabled on 2 pool(s)"
}
}
},
"overall_status": "HEALTH_WARN",
"status": "HEALTH_WARN"
},
"mgrmap": {
"active_addr": "10.4.99.100:6800/706",
"active_gid": 112994172,
"active_name": "admin",
"available": true,
"available_modules": [
"dashboard",
"prometheus",
"restful",
"status",
"zabbix"
],
"epoch": 88,
"modules": [
"restful",
"status"
],
"standbys": [
{
"available_modules": [
"dashboard",
"prometheus",
"restful",
"status",
"zabbix"
],
"gid": 112972449,
"name": "mon1"
},
{
"available_modules": [
"dashboard",
"prometheus",
"restful",
"status",
"zabbix"
],
"gid": 112994109,
"name": "osd1"
}
]
},
"monmap": {
"created": "2018-02-27 16:45:45.008887",
"epoch": 3,
"features": {
"optional": [],
"persistent": [
"kraken",
"luminous"
]
},
"fsid": "03d30763-0641-46ce-adfa-c7ff69d0101b",
"modified": "2018-02-27 16:46:36.506498",
"mons": [
{
"addr": "10.4.99.100:6789/0",
"name": "admin",
"public_addr": "10.4.99.100:6789/0",
"rank": 0
},
{
"addr": "10.4.99.101:6789/0",
"name": "mon1",
"public_addr": "10.4.99.101:6789/0",
"rank": 1
},
{
"addr": "10.4.99.102:6789/0",
"name": "osd1",
"public_addr": "10.4.99.102:6789/0",
"rank": 2
}
]
},
"osdmap": {
"osdmap": {
"epoch": 1884,
"full": false,
"nearfull": false,
"num_in_osds": 24,
"num_osds": 33,
"num_remapped_pgs": 14,
"num_up_osds": 22
}
},
"pgmap": {
"bytes_avail": 113108744396800,
"bytes_total": 121070163910656,
"bytes_used": 7961419513856,
"data_bytes": 1781122124601,
"degraded_objects": 9196074,
"degraded_ratio": 0.130465,
"degraded_total": 70487094,
"num_objects": 34566784,
"num_pgs": 3248,
"num_pools": 11,
"pgs_by_state": [
{
"count": 2361,
"state_name": "active+undersized+degraded"
},
{
"count": 873,
"state_name": "active+clean"
},
{
"count": 14,
"state_name": "active+clean+remapped"
}
]
},
"quorum": [
0,
1,
2
],
"quorum_names": [
"admin",
"mon1",
"osd1"
],
"servicemap": {
"epoch": 340,
"modified": "2019-03-26 10:59:13.523804",
"services": {}
}
}
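The `pgs_by_state` list above is where counters like `ceph.cluster.pgs.degraded_count` can come from; a sketch that tallies compound state names by substring (the substring matching is an assumption of this sketch):

```python
def pg_state_counts(pgmap):
    # Sum PG counts whose compound state name mentions each keyword.
    keywords = ('degraded', 'undersized', 'remapped', 'stale', 'scrubbing')
    counts = dict.fromkeys(keywords, 0)
    for entry in pgmap['pgs_by_state']:
        for k in keywords:
            if k in entry['state_name']:
                counts[k] += entry['count']
    return counts

pgmap = {'pgs_by_state': [
    {'count': 2361, 'state_name': 'active+undersized+degraded'},
    {'count': 873, 'state_name': 'active+clean'},
    {'count': 14, 'state_name': 'active+clean+remapped'}]}
counts = pg_state_counts(pgmap)
assert counts['degraded'] == 2361 and counts['remapped'] == 14
```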


@@ -0,0 +1,25 @@
  cluster:
    id:     03d30763-0641-46ce-adfa-c7ff69d0101b
    health: HEALTH_WARN
            2 osds down
            1 host (11 osds) down
            Degraded data redundancy: 9168695/70427256 objects degraded (13.019%), 2375 pgs unclean, 2375 pgs degraded, 2375 pgs undersized
            application not enabled on 2 pool(s)

  services:
    mon: 3 daemons, quorum admin,mon1,osd1
    mgr: admin(active), standbys: mon1, osd1
    mds: cephfs-1/1/1 up {0=mon1=up:active}
    osd: 33 osds: 22 up, 24 in

  data:
    pools:   11 pools, 3248 pgs
    objects: 33737k objects, 1636 GB
    usage:   7372 GB used, 102 TB / 110 TB avail
    pgs:     9168695/70427256 objects degraded (13.019%)
             2375 active+undersized+degraded
             873  active+clean

  io:
    client:   4246 B/s rd, 374 kB/s wr, 2 op/s rd, 16 op/s wr


@@ -0,0 +1 @@
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)

File diff suppressed because it is too large.