docs/doc/source/fault-mgmt/kubernetes/800-series-alarm-messages.rst
Ron Stone ba85b5cb3b Alarm and event updates
Update Fault doc based on delta between r6 and master
events.yaml
Add |PTS| abbrev to 100.119 alarm

Signed-off-by: Ron Stone <ronald.stone@windriver.com>
Change-Id: If811b019fb0067641ac68e8fca4b1e343c48bf3f
2022-06-27 10:33:32 -04:00

152 lines
3.9 KiB
ReStructuredText

.. rww1579702317136
.. _800-series-alarm-messages:
=========================
800 Series Alarm Messages
=========================
The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.
.. include:: /_includes/x00-series-alarm-messages.rest
.. _800-series-alarm-messages-table-zrd-tg5-v5:
.. list-table::
:widths: 6 25
:header-rows: 0
* - **Alarm ID: 800.001**
- Storage Alarm Condition:
1 mons down, quorum 1,2 controller-1,storage-0
* - Entity Instance
- cluster=<dist-fs-uuid>
* - Degrade Affecting Severity:
- None
* - Severity:
- C/M\*
* - Proposed Repair Action
- If problem persists, contact next level of support.
.. -----
..
.. .. list-table::
.. :widths: 6 25
.. :header-rows: 0
..
.. * - **Alarm ID: 800.003**
.. - Storage Alarm Condition: Quota/Space mismatch for the <tiername> tier.
.. The sum of Ceph pool quotas does not match the tier size.
.. * - Entity Instance
.. - cluster=<dist-fs-uuid>.tier=<tiername>
.. * - Degrade Affecting Severity:
.. - None
.. * - Severity:
.. - m
.. * - Proposed Repair Action
.. - Update ceph storage pool quotas to use all available tier space.
-----
.. list-table::
:widths: 6 25
:header-rows: 0
* - **Alarm ID: 800.010**
- Potential data loss. No available OSDs in storage replication group.
* - Entity Instance
- cluster=<dist-fs-uuid>.peergroup=<group-x>
* - Degrade Affecting Severity:
- None
* - Severity:
- C\*
* - Proposed Repair Action
- Ensure storage hosts from replication group are unlocked and available.
Check if OSDs of each storage host are up and running. If problem
persists contact next level of support.
-----
.. list-table::
:widths: 6 25
:header-rows: 0
* - **Alarm ID: 800.011**
- Loss of replication in peergroup.
* - Entity Instance
- cluster=<dist-fs-uuid>.peergroup=<group-x>
* - Degrade Affecting Severity:
- None
* - Severity:
- M\*
* - Proposed Repair Action
- Ensure storage hosts from replication group are unlocked and available.
Check if OSDs of each storage host are up and running. If problem
persists contact next level of support.
-----
.. list-table::
:widths: 6 25
:header-rows: 0
* - **Alarm ID: 800.102**
- Storage Alarm Condition:
PV configuration <error/failed to apply\> on <hostname>.
Reason: <detailed reason\>.
* - Entity Instance
- pv=<pv\_uuid>
* - Degrade Affecting Severity:
- None
* - Severity:
- C/M\*
* - Proposed Repair Action
- Remove failed PV and associated Storage Device then recreate them.
-----
.. list-table::
:widths: 6 25
:header-rows: 0
* - **Alarm ID: 800.103**
- Storage Alarm Condition:
\[ Metadata usage for LVM thin pool <VG name>/<Pool name> exceeded
threshold and automatic extension failed.
Metadata usage for LVM thin pool <VG name>/<Pool name> exceeded
threshold \]; threshold x%, actual y%.
* - Entity Instance
- <hostname>.lvmthinpool=<VG name>/<Pool name>
* - Degrade Affecting Severity:
- None
* - Severity:
- C\*
* - Proposed Repair Action
- Increase Storage Space Allotment for Cinder on the 'lvm' backend.
Consult the user documentation for more details. If problem persists,
contact next level of support.
-----
.. list-table::
:widths: 6 25
:header-rows: 0
* - **Alarm ID: 800.104**
- Storage Alarm Condition:
<storage-backend-name> configuration failed to apply on host: <host-uuid>.
* - Degrade Affecting Severity:
- None
* - Severity:
- C\*
* - Proposed Repair Action
- Update backend setting to reapply configuration. Consult the user
documentation for more details. If problem persists, contact next level
of support.