Fault Management guide

Removed OSX and MS Visual Code metadata files and directories.

Change-Id: I4e6024767cc072fcb3fc8d88f0724f02819cebc8
Signed-off-by: Stone <ronald.stone@windriver.com>

parent f6f21a4056
commit 10e4b9ac86
doc/source/_includes/data-networks-overview.rest (new file)

@@ -0,0 +1 @@

.. This file must exist to satisfy build requirements.
doc/source/_includes/openstack-alarm-messages-xxxs.rest (new file)

@@ -0,0 +1,36 @@

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

For more information, see :ref:`Overview <openstack-fault-management-overview>`.

In the following tables, the severity of the alarms is represented by one or
more letters, as follows:

.. _alarm-messages-300s-ul-jsd-jkg-vp:

- C: Critical

- M: Major

- m: Minor

- W: Warning

A slash-separated list of letters is used when the alarm can be triggered with
one of several severity levels.

An asterisk \(\*\) indicates the management-affecting severity, if any. A
management-affecting alarm is one that cannot be ignored at the indicated
severity level or higher by using relaxed alarm rules during an orchestrated
patch or upgrade operation.

Differences exist between the terminology emitted by some alarms and that
used in the CLI, GUI, and elsewhere in the documentation:

- References to provider networks in alarms refer to data networks.

- References to data networks in alarms refer to physical networks.

- References to tenant networks in alarms refer to project networks.
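The severity coding above lends itself to simple tooling. A minimal sketch (a hypothetical helper, not part of the fault management service) that decodes a code such as ``C/M*`` into named levels plus the management-affecting flag:

```python
# Severity letters as defined in the alarm tables.
SEVERITY_NAMES = {"C": "Critical", "M": "Major", "m": "Minor",
                  "W": "Warning", "NA": "Not applicable"}

def decode_severity(code):
    """Decode a severity code such as 'C/M*' into (name, management_affecting) pairs.

    A slash separates alternative severity levels; a trailing asterisk marks
    the management-affecting severity.
    """
    levels = []
    for part in code.split("/"):
        part = part.strip()
        mgmt_affecting = part.endswith("*")
        letter = part.rstrip("*")
        levels.append((SEVERITY_NAMES.get(letter, letter), mgmt_affecting))
    return levels

print(decode_severity("C/M*"))  # [('Critical', False), ('Major', True)]
```

For example, ``C/M*`` decodes to Critical plus a management-affecting Major.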
doc/source/_includes/openstack-customer-log-messages-xxxs (new file)

@@ -0,0 +1,13 @@

The Customer Logs include events that do not require immediate user action.

The following types of events are included in the Customer Logs. The severity
of the events is represented in the table by one or more letters, as follows:

- C: Critical

- M: Major

- m: Minor

- W: Warning

- NA: Not applicable

@@ -0,0 +1,15 @@

The Customer Logs include events that do not require immediate user action.

The following types of events are included in the Customer Logs. The severity
of the events is represented in the table by one or more letters, as follows:

.. _customer-log-messages-401s-services-ul-jsd-jkg-vp:

- C: Critical

- M: Major

- m: Minor

- W: Warning

- NA: Not applicable
doc/source/_includes/troubleshooting-log-collection.rest (new file)

@@ -0,0 +1 @@

.. This file must exist to satisfy build requirements.
doc/source/_includes/x00-series-alarm-messages.rest (new file)

@@ -0,0 +1,33 @@

.. rsg1586183719424

.. _alarm-messages-overview:

Alarm messages are numerically coded by the type of alarm.

For more information, see
:ref:`Fault Management Overview <fault-management-overview>`.

In the alarm description tables, the severity of the alarms is represented by
one or more letters, as follows:

.. _alarm-messages-overview-ul-jsd-jkg-vp:

- C: Critical

- M: Major

- m: Minor

- W: Warning

A slash-separated list of letters is used when the alarm can be triggered with
one of several severity levels.

An asterisk \(\*\) indicates the management-affecting severity, if any. A
management-affecting alarm is one that cannot be ignored at the indicated
severity level or higher by using relaxed alarm rules during an orchestrated
patch or upgrade operation.

.. note::
   **Degrade Affecting Severity: Critical** indicates a node will be
   degraded if the alarm reaches a Critical level.
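Because alarm messages are numerically coded by type, the series can be recovered directly from the ID prefix. A small illustrative helper (hypothetical, assuming the ``<series>.<number>`` form used throughout these tables):

```python
from collections import defaultdict

def alarm_series(alarm_id):
    """Return the numeric series of an alarm ID, e.g. '100.101' -> 100."""
    series, _, _ = alarm_id.partition(".")
    return int(series)

def group_by_series(alarm_ids):
    """Group alarm IDs by their numeric series, preserving input order."""
    groups = defaultdict(list)
    for alarm_id in alarm_ids:
        groups[alarm_series(alarm_id)].append(alarm_id)
    return dict(groups)

print(group_by_series(["100.101", "100.103", "200.001"]))
```

This is only a convenience for post-processing exported alarm lists; the platform itself assigns the IDs.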
doc/source/fault/100-series-alarm-messages.rst (new file)

@@ -0,0 +1,336 @@

.. jsy1579701868527

.. _100-series-alarm-messages:

=========================
100 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _100-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.101**
     - Platform CPU threshold exceeded; threshold x%, actual y%.

       CRITICAL @ 95%

       MAJOR @ 90%
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - Critical
   * - Severity:
     - C/M\*
   * - Proposed Repair Action
     - Monitor and if condition persists, contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.103**
     - Memory threshold exceeded; threshold x%, actual y%.

       CRITICAL @ 90%

       MAJOR @ 80%
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - Critical
   * - Severity:
     - C/M
   * - Proposed Repair Action
     - Monitor and if condition persists, contact next level of support; may
       require additional memory on Host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.104**
     - File System threshold exceeded; threshold x%, actual y%.

       CRITICAL @ 90%

       MAJOR @ 80%
   * - Entity Instance
     - host=<hostname>.filesystem=<mount-dir>
   * - Degrade Affecting Severity:
     - Critical
   * - Severity:
     - C\*/M
   * - Proposed Repair Action
     - Monitor and if condition persists, consider adding additional physical
       volumes to the volume group.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.105**
     - <fs\_name> filesystem is not added on both controllers and/or does not
       have the same size: <hostname>.
   * - Entity Instance
     - fs\_name=<image-conversion>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C/M\*
   * - Proposed Repair Action
     - Add image-conversion filesystem on both controllers.

       Consult the System Administration Manual for more details.

       If problem persists, contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.106**
     - 'OAM' Port failed.
   * - Entity Instance
     - host=<hostname>.port=<port-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.107**
     - 'OAM' Interface degraded.

       or

       'OAM' Interface failed.
   * - Entity Instance
     - host=<hostname>.interface=<if-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C or M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.108**
     - 'MGMT' Port failed.
   * - Entity Instance
     - host=<hostname>.port=<port-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.109**
     - 'MGMT' Interface degraded.

       or

       'MGMT' Interface failed.
   * - Entity Instance
     - host=<hostname>.interface=<if-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C or M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.110**
     - 'CLUSTER-HOST' Port failed.
   * - Entity Instance
     - host=<hostname>.port=<port-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C or M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.111**
     - 'CLUSTER-HOST' Interface degraded.

       or

       'CLUSTER-HOST' Interface failed.
   * - Entity Instance
     - host=<hostname>.interface=<if-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C or M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.112**
     - 'DATA-VRS' Port down.
   * - Entity Instance
     - host=<hostname>.port=<port-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - M
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.113**
     - 'DATA-VRS' Interface degraded.

       or

       'DATA-VRS' Interface down.
   * - Entity Instance
     - host=<hostname>.interface=<if-name>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C or M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent
       equipment.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.114**
     - NTP configuration does not contain any valid or reachable NTP servers.
       The alarm is raised regardless of NTP enabled/disabled status.

       NTP address <IP address> is not a valid or reachable NTP server.

       Connectivity to external PTP Clock Synchronization is lost.
   * - Entity Instance
     - host=<hostname>.ntp

       host=<hostname>.ntp=<IP address>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M or m
   * - Proposed Repair Action
     - Monitor and if condition persists, contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.118**
     - Controller cannot establish connection with remote logging server.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - m
   * - Proposed Repair Action
     - Ensure Remote Log Server IP is reachable from Controller through OAM
       interface; otherwise contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 100.119**
     - Major: PTP configuration or out-of-tolerance time-stamping conditions.

       Minor: PTP out-of-tolerance time-stamping condition.
   * - Entity Instance
     - host=<hostname>.ptp OR host=<hostname>.ptp=no-lock

       OR

       host=<hostname>.ptp=<interface>.unsupported=hardware-timestamping

       OR

       host=<hostname>.ptp=<interface>.unsupported=software-timestamping

       OR

       host=<hostname>.ptp=<interface>.unsupported=legacy-timestamping

       OR

       host=<hostname>.ptp=out-of-tolerance
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M or m
   * - Proposed Repair Action
     - Monitor and, if condition persists, contact next level of support.
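Entity instance IDs in these tables follow a dotted key=value convention (for example, ``host=<hostname>.port=<port-name>``). A minimal parsing sketch, assuming keys and values contain no ``.`` or ``=`` characters (values such as mount directories would need a more careful split):

```python
def parse_entity_instance(entity_id):
    """Split 'host=controller-0.port=eth0' into {'host': 'controller-0', 'port': 'eth0'}.

    Assumes the simple dotted key=value form; not robust against values
    that themselves contain '.' (e.g. filesystem mount paths).
    """
    fields = {}
    for segment in entity_id.split("."):
        key, _, value = segment.partition("=")
        fields[key] = value
    return fields

print(parse_entity_instance("host=controller-0.port=eth0"))
```

The helper name is illustrative only; it is not part of any platform API.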
doc/source/fault/200-series-alarm-messages.rst (new file)

@@ -0,0 +1,402 @@

.. uof1579701912856

.. _200-series-alarm-messages:

=========================
200 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _200-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.001**
     - <hostname> was administratively locked to take it out-of-service.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - W\*
   * - Proposed Repair Action
     - Administratively unlock Host to bring it back in-service.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.004**
     - <hostname> experienced a service-affecting failure.

       Host is being auto recovered by Reboot.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - If auto-recovery is consistently unable to recover host to the
       unlocked-enabled state contact next level of support or lock and replace
       failing host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.005**
     - Degrade:

       <hostname> is experiencing intermittent 'Management Network'
       communication failures that have exceeded its lower alarming threshold.

       Failure:

       <hostname> is experiencing a persistent Critical 'Management Network'
       communication failure.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M\* (Degrade) or C\* (Failure)
   * - Proposed Repair Action
     - Check 'Management Network' connectivity and support for multicast
       messaging. If problem consistently occurs after that and Host is reset,
       then contact next level of support or lock and replace failing host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.006**
     - Main Process Monitor Daemon Failure \(Major\)

       <hostname> 'Process Monitor' \(pmond\) process is not running or
       functioning properly. The system is trying to recover this process.

       Monitored Process Failure \(Critical/Major/Minor\)

       Critical: <hostname> Critical '<processname>' process has failed and
       could not be auto-recovered gracefully. Auto-recovery progression by
       host reboot is required and in progress.

       Major: <hostname> is degraded due to the failure of its '<processname>'
       process. Auto recovery of this Major process is in progress.

       Minor:

       <hostname> '<processname>' process has failed. Auto recovery of this
       Minor process is in progress.

       <hostname> '<processname>' process has failed. Manual recovery is
       required.

       ptp4l/phc2sys process failure. Manual recovery is required.
   * - Entity Instance
     - host=<hostname>.process=<processname>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C/M/m\*
   * - Proposed Repair Action
     - If this alarm does not automatically clear after some time and continues
       to be asserted after Host is locked and unlocked then contact next level
       of support for root cause analysis and recovery.

       If problem consistently occurs after Host is locked and unlocked then
       contact next level of support for root cause analysis and recovery.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.007**
     - Critical: \(with host degrade\)

       Host is degraded due to a 'Critical' out-of-tolerance reading from the
       '<sensorname>' sensor.

       Major: \(with host degrade\)

       Host is degraded due to a 'Major' out-of-tolerance reading from the
       '<sensorname>' sensor.

       Minor:

       Host is reporting a 'Minor' out-of-tolerance reading from the
       '<sensorname>' sensor.
   * - Entity Instance
     - host=<hostname>.sensor=<sensorname>
   * - Degrade Affecting Severity:
     - Critical
   * - Severity:
     - C/M/m
   * - Proposed Repair Action
     - If problem consistently occurs after Host is power cycled and/or reset,
       contact next level of support or lock and replace failing host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.009**
     - Degrade:

       <hostname> is experiencing intermittent 'Cluster-host Network'
       communication failures that have exceeded its lower alarming threshold.

       Failure:

       <hostname> is experiencing a persistent Critical 'Cluster-host Network'
       communication failure.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M\* (Degrade) or C\* (Failure)
   * - Proposed Repair Action
     - Check 'Cluster-host Network' connectivity and support for multicast
       messaging. If problem consistently occurs after that and Host is reset,
       then contact next level of support or lock and replace failing host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.010**
     - <hostname> access to board management module has failed.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - W
   * - Proposed Repair Action
     - Check Host's board management configuration and connectivity.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.011**
     - <hostname> experienced a configuration failure during initialization.
       Host is being re-configured by Reboot.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - If auto-recovery is consistently unable to recover host to the
       unlocked-enabled state contact next level of support or lock and
       replace failing host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.012**
     - <hostname> controller function has an in-service failure while compute
       services remain healthy.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Lock and then Unlock host to recover. Avoid using 'Force Lock' action
       as that will impact compute services running on this host. If lock action
       fails then contact next level of support to investigate and recover.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.013**
     - <hostname> compute service of the only available controller is not
       operational. Auto-recovery is disabled. Degrading host instead.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - Major
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Enable second controller and Switch Activity \(Swact\) over to it as
       soon as possible. Then Lock and Unlock host to recover its local compute
       service.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.014**
     - The Hardware Monitor was unable to load, configure and monitor one
       or more hardware sensors.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - m
   * - Proposed Repair Action
     - Check Board Management Controller provisioning. Try reprovisioning the
       BMC. If problem persists try power cycling the host and then the entire
       server including the BMC power. If problem persists then contact next
       level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 200.015**
     - Unable to read one or more sensor groups from this host's board
       management controller.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M
   * - Proposed Repair Action
     - Check board management connectivity and try rebooting the board
       management controller. If problem persists contact next level of
       support or lock and replace failing host.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 210.001**
     - System Backup in progress.
   * - Entity Instance
     - host=controller
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - m\*
   * - Proposed Repair Action
     - No action required.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 250.001**
     - <hostname> Configuration is out-of-date.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Administratively lock and unlock <hostname> to update config.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 250.003**
     - Kubernetes certificates rotation failed on host <hostname>.
   * - Entity Instance
     - host=<hostname>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M/W
   * - Proposed Repair Action
     - Rotate Kubernetes certificates manually.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 270.001**
     - Host <host\_name> compute services failure\[, reason = <reason\_text>\]
   * - Entity Instance
     - host=<host\_name>.services=compute
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Wait for host services recovery to complete; if problem persists contact
       next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 280.001**
     - <subcloud> is offline.
   * - Entity Instance
     - subcloud=<subcloud>
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Wait for subcloud to become online; if problem persists contact next
       level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 280.002**
     - <subcloud> <resource> sync status is out-of-sync.
   * - Entity Instance
     - \[subcloud=<subcloud>.resource=<compute> \| <network> \| <platform>
       \| <volumev2>\]
   * - Degrade Affecting Severity:
     - None
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - If problem persists contact next level of support.
@@ -0,0 +1,120 @@

.. lzz1579291773073

.. _200-series-maintenance-customer-log-messages:

============================================
200 Series Maintenance Customer Log Messages
============================================

The Customer Logs include events that do not require immediate user action.

The following types of events are included in the Customer Logs. The severity
of the events is represented in the table by one or more letters, as follows:

.. _200-series-maintenance-customer-log-messages-ul-jsd-jkg-vp:

- C: Critical

- M: Major

- m: Minor

- W: Warning

- NA: Not applicable

.. _200-series-maintenance-customer-log-messages-table-zgf-jvw-v5:

.. table:: Table 1. Customer Log Messages
   :widths: auto

   +-----------------+------------------------------------------------------------------+----------+
   | Log ID          | Description                                                      | Severity |
   +                 +------------------------------------------------------------------+----------+
   |                 | Entity Instance ID                                               |          |
   +=================+==================================================================+==========+
   | 200.020         | <hostname> has been 'discovered' on the network                  | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.event=discovered                                 |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.020         | <hostname> has been 'added' to the system                        | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.event=add                                        |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.020         | <hostname> has 'entered' multi-node failure avoidance            | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.event=mnfa\_enter                                |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.020         | <hostname> has 'exited' multi-node failure avoidance             | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.event=mnfa\_exit                                 |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> board management controller has been 'provisioned'    | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=provision                                |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> board management controller has been 're-provisioned' | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=reprovision                              |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> board management controller has been 'de-provisioned' | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=deprovision                              |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'unlock' request                               | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=unlock                                   |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'reboot' request                               | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=reboot                                   |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'reset' request                                | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=reset                                    |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'power-off' request                            | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=power-off                                |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'power-on' request                             | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=power-on                                 |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'reinstall' request                            | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=reinstall                                |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'force-lock' request                           | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=force-lock                               |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'delete' request                               | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=delete                                   |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.021         | <hostname> manual 'controller switchover' request                | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.command=swact                                    |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.022         | <hostname> is now 'disabled'                                     | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.state=disabled                                   |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.022         | <hostname> is now 'enabled'                                      | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.state=enabled                                    |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.022         | <hostname> is now 'online'                                       | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.status=online                                    |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.022         | <hostname> is now 'offline'                                      | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.status=offline                                   |          |
   +-----------------+------------------------------------------------------------------+----------+
   | 200.022         | <hostname> is 'disabled-failed' to the system                    | NA       |
   |                 |                                                                  |          |
   |                 | host=<hostname>.status=failed                                    |          |
   +-----------------+------------------------------------------------------------------+----------+
53
doc/source/fault/300-series-alarm-messages.rst
Normal file
@ -0,0 +1,53 @@
.. zwe1579701930425

.. _300-series-alarm-messages:

=========================
300 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the
overall health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _300-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 300.001**
      - 'Data' Port failed.
    * - Entity Instance
      - host=<hostname>.port=<port-uuid>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Check cabling and far-end port configuration and status on adjacent
        equipment.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 300.002**
      - 'Data' Interface degraded.

        or

        'Data' Interface failed.
    * - Entity Instance
      - host=<hostname>.interface=<if-uuid>
    * - Degrade Affecting Severity:
      - Critical
    * - Severity:
      - C/M\*
    * - Proposed Repair Action
      - Check cabling and far-end port configuration and status on adjacent
        equipment.

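The severity notation used in these tables (for example ``M\*`` or ``C/M\*``) can be decoded mechanically from the legend (C, M, m, W, with a trailing asterisk marking a management-affecting severity). A minimal sketch; the function name and mapping are illustrative, not part of the product:

```python
# Decode severity cells such as "M*" or "C/M*" from the alarm tables.
# Letter-to-name mapping follows the legend: C, M, m, W.
SEVERITY_NAMES = {"C": "Critical", "M": "Major", "m": "Minor", "W": "Warning"}

def decode_severity(cell: str):
    """Return (list of severity names, management_affecting flag)."""
    mgmt_affecting = cell.endswith("*")          # asterisk = management-affecting
    letters = cell.rstrip("*").split("/")        # slash-separated alternatives
    return [SEVERITY_NAMES[letter] for letter in letters], mgmt_affecting

print(decode_severity("C/M*"))  # (['Critical', 'Major'], True)
```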
69
doc/source/fault/400-series-alarm-messages.rst
Normal file
@ -0,0 +1,69 @@
.. ots1579702138430

.. _400-series-alarm-messages:

=========================
400 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _400-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 400.003**
      - License key is not installed; a valid license key is required for
        operation.

        or

        License key has expired or is invalid; a valid license key is required
        for operation.

        or

        Evaluation license key will expire on <date>; there are <num\_days> days
        remaining in this evaluation.

        or

        Evaluation license key will expire on <date>; there is only 1 day
        remaining in this evaluation.
    * - Entity Instance:
      - host=<hostname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - C\*
    * - Proposed Repair Action
      - Contact next level of support to obtain a new license key.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 400.005**
      - Communication failure detected with peer over port <linux-ifname>.

        or

        Communication failure detected with peer over port <linux-ifname>
        within the last 30 seconds.
    * - Entity Instance:
      - host=<hostname>.network=<mgmt \| oam \| cluster-host>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Check cabling and far-end port configuration and status on adjacent
        equipment.

81
doc/source/fault/400-series-customer-log-messages.rst
Normal file
@ -0,0 +1,81 @@
.. pgb1579292662158

.. _400-series-customer-log-messages:

================================
400 Series Customer Log Messages
================================

The Customer Logs include events that do not require immediate user action.

The following types of events are included in the Customer Logs. The severity
of the events is represented in the table by one or more letters, as follows:

.. _400-series-customer-log-messages-ul-jsd-jkg-vp:

- C: Critical

- M: Major

- m: Minor

- W: Warning

- NA: Not applicable

.. _400-series-customer-log-messages-table-zgf-jvw-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 400.003**
      - License key has expired or is invalid

        or

        Evaluation license key will expire on <date>

        or

        License key is valid
    * - Entity Instance
      - host=<host\_name>
    * - Severity:
      - C

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 400.005**
      - Communication failure detected with peer over port <port> on host
        <host name>

        or

        Communication failure detected with peer over port <port> on host
        <host name> within the last <X> seconds

        or

        Communication established with peer over port <port> on host
        <host name>
    * - Entity Instance
      - host=<host\_name>.network=<network>
    * - Severity:
      - C

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 400.007**
      - Swact or swact-force
    * - Entity Instance
      - host=<host\_name>
    * - Severity:
      - C

49
doc/source/fault/500-series-alarm-messages.rst
Normal file
@ -0,0 +1,49 @@
.. xpx1579702157578

.. _500-series-alarm-messages:

=========================
500 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _500-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 500.100**
      - TPM initialization failed on host.
    * - Entity Instance
      - tenant=<tenant-uuid>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M
    * - Proposed Repair Action
      - Reinstall HTTPS certificate; if problem persists contact next level of
        support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 500.101**
      - Developer patch certificate enabled.
    * - Entity Instance
      - host=controller
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M
    * - Proposed Repair Action
      - Reinstall system to disable developer certificate and remove untrusted
        patches.

118
doc/source/fault/750-series-alarm-messages.rst
Normal file
@ -0,0 +1,118 @@
.. cta1579702173704

.. _750-series-alarm-messages:

=========================
750 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _750-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 750.001**
      - Application upload failure.
    * - Entity Instance
      - k8s\_application=<appname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - W
    * - Proposed Repair Action
      - Check the system inventory log for the cause.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 750.002**
      - Application apply failure.
    * - Entity Instance
      - k8s\_application=<appname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M
    * - Proposed Repair Action
      - Retry applying the application. If the issue persists, check the
        system inventory log for the cause.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 750.003**
      - Application remove failure.
    * - Entity Instance
      - k8s\_application=<appname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M
    * - Proposed Repair Action
      - Retry removing the application. If the issue persists, check the
        system inventory log for the cause.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 750.004**
      - Application apply in progress.
    * - Entity Instance
      - k8s\_application=<appname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - W
    * - Proposed Repair Action
      - No action is required.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 750.005**
      - Application update in progress.
    * - Entity Instance
      - k8s\_application=<appname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - W
    * - Proposed Repair Action
      - No action is required.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 750.006**
      - Automatic application re-apply is pending.
    * - Entity Instance
      - k8s\_application=<appname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - W
    * - Proposed Repair Action
      - Ensure all hosts are either locked or unlocked. When the system is
        stable the application will automatically be reapplied.

152
doc/source/fault/800-series-alarm-messages.rst
Normal file
@ -0,0 +1,152 @@
.. rww1579702317136

.. _800-series-alarm-messages:

=========================
800 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _800-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.001**
      - Storage Alarm Condition:

        1 mons down, quorum 1,2 controller-1,storage-0
    * - Entity Instance
      - cluster=<dist-fs-uuid>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - C/M\*
    * - Proposed Repair Action
      - If problem persists, contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.003**
      - Storage Alarm Condition: Quota/Space mismatch for the <tiername> tier.
        The sum of Ceph pool quotas does not match the tier size.
    * - Entity Instance
      - cluster=<dist-fs-uuid>.tier=<tiername>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - m
    * - Proposed Repair Action
      - Update Ceph storage pool quotas to use all available tier space.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.010**
      - Potential data loss. No available OSDs in storage replication group.
    * - Entity Instance
      - cluster=<dist-fs-uuid>.peergroup=<group-x>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - C\*
    * - Proposed Repair Action
      - Ensure storage hosts from replication group are unlocked and available.
        Check if OSDs of each storage host are up and running. If problem
        persists contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.011**
      - Loss of replication in peergroup.
    * - Entity Instance
      - cluster=<dist-fs-uuid>.peergroup=<group-x>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Ensure storage hosts from replication group are unlocked and available.
        Check if OSDs of each storage host are up and running. If problem
        persists contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.102**
      - Storage Alarm Condition:

        PV configuration <error/failed to apply\> on <hostname>.
        Reason: <detailed reason\>.
    * - Entity Instance
      - pv=<pv\_uuid>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - C/M\*
    * - Proposed Repair Action
      - Remove failed PV and associated Storage Device then recreate them.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.103**
      - Storage Alarm Condition:

        \[ Metadata usage for LVM thin pool <VG name>/<Pool name> exceeded
        threshold and automatic extension failed

        Metadata usage for LVM thin pool <VG name>/<Pool name> exceeded
        threshold \]; threshold x%, actual y%.
    * - Entity Instance
      - <hostname>.lvmthinpool=<VG name>/<Pool name>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - C\*
    * - Proposed Repair Action
      - Increase Storage Space Allotment for Cinder on the 'lvm' backend.
        Consult the user documentation for more details. If problem persists,
        contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.104**
      - Storage Alarm Condition:

        <storage-backend-name> configuration failed to apply on host:
        <host-uuid>.
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - C\*
    * - Proposed Repair Action
      - Update backend setting to reapply configuration. Consult the user
        documentation for more details. If problem persists, contact next level
        of support.

260
doc/source/fault/900-series-alarm-messages.rst
Normal file
@ -0,0 +1,260 @@
.. pti1579702342696

.. _900-series-alarm-messages:

=========================
900 Series Alarm Messages
=========================

The system inventory and maintenance service reports system changes with
different degrees of severity. Use the reported alarms to monitor the overall
health of the system.

.. include:: ../_includes/x00-series-alarm-messages.rest

.. _900-series-alarm-messages-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.001**
      - Patching operation in progress.
    * - Entity Instance
      - host=controller
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - m\*
    * - Proposed Repair Action
      - Complete reboots of affected hosts.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.002**
      - Obsolete patch in system.
    * - Entity Instance
      - host=<hostname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - W\*
    * - Proposed Repair Action
      - Remove and delete obsolete patches.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.003**
      - Patch host install failure.
    * - Entity Instance
      - host=<hostname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Undo patching operation.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.004**
      - Host version mismatch.
    * - Entity Instance
      - host=<hostname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Reinstall host to update applied load.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.005**
      - System Upgrade in progress.
    * - Entity Instance
      - host=controller
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - m\*
    * - Proposed Repair Action
      - No action required.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.101**
      - Software update auto-apply in progress.
    * - Entity Instance
      - sw-update
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Wait for software update auto-apply to complete; if problem persists
        contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.102**
      - Software update auto-apply aborting.
    * - Entity Instance
      - host=<hostname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Wait for software update auto-apply abort to complete; if problem
        persists contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.103**
      - Software update auto-apply failed.
    * - Entity Instance
      - host=<hostname>
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Attempt to apply software updates manually; if problem persists contact
        next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.201**
      - Software upgrade auto-apply in progress.
    * - Entity Instance
      - orchestration=sw-upgrade
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Wait for software upgrade auto-apply to complete; if problem persists
        contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.202**
      - Software upgrade auto-apply aborting.
    * - Entity Instance
      - orchestration=sw-upgrade
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Wait for software upgrade auto-apply abort to complete; if problem
        persists contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.203**
      - Software upgrade auto-apply failed.
    * - Entity Instance
      - orchestration=sw-upgrade
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Attempt to apply software upgrade manually; if problem persists contact
        next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.301**
      - Firmware Update auto-apply in progress.
    * - Entity Instance
      - orchestration=fw-upgrade
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Wait for firmware update auto-apply to complete; if problem persists
        contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.302**
      - Firmware Update auto-apply aborting.
    * - Entity Instance
      - orchestration=fw-upgrade
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Wait for firmware update auto-apply abort to complete; if problem
        persists contact next level of support.

-----

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 900.303**
      - Firmware Update auto-apply failed.
    * - Entity Instance
      - orchestration=fw-upgrade
    * - Degrade Affecting Severity:
      - None
    * - Severity:
      - M\*
    * - Proposed Repair Action
      - Attempt to apply firmware update manually; if problem persists
        contact next level of support.

@ -0,0 +1,168 @@
.. bdq1579700719122

.. _900-series-orchestration-customer-log-messages:

==============================================
900 Series Orchestration Customer Log Messages
==============================================

The Customer Logs include events that do not require immediate user action.

The following types of events are included in the Customer Logs. The severity
of the events is represented in the table by one or more letters, as follows:

.. _900-series-orchestration-customer-log-messages-ul-jsd-jkg-vp:

- C: Critical

- M: Major

- m: Minor

- W: Warning

- NA: Not applicable

.. _900-series-orchestration-customer-log-messages-table-zgf-jvw-v5:

.. table:: Table 1. Customer Log Messages
    :widths: auto

    +-------------------+--------------------------------------------+----------+
    | Log ID            | Description                                | Severity |
    +                   +--------------------------------------------+----------+
    |                   | Entity Instance ID                                    |
    +===================+============================================+==========+
    | 900.111           | Software update auto-apply start           | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.112           | Software update auto-apply in progress     | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.113           | Software update auto-apply rejected        | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.114           | Software update auto-apply canceled        | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.115           | Software update auto-apply failed          | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.116           | Software update auto-apply completed       | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.117           | Software update auto-apply abort           | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.118           | Software update auto-apply aborting        | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.119           | Software update auto-apply abort rejected  | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.120           | Software update auto-apply abort failed    | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.121           | Software update auto-apply aborted         | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.211           | Software upgrade auto-apply start          | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.212           | Software upgrade auto-apply in progress    | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.213           | Software upgrade auto-apply rejected       | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.214           | Software upgrade auto-apply canceled       | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.215           | Software upgrade auto-apply failed         | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.216           | Software upgrade auto-apply completed      | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.217           | Software upgrade auto-apply abort          | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.218           | Software upgrade auto-apply aborting       | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.219           | Software upgrade auto-apply abort rejected | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.220           | Software upgrade auto-apply abort failed   | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.221           | Software upgrade auto-apply aborted        | C        |
    |                   |                                            |          |
    |                   | orchestration=sw-upgrade                   |          |
    +-------------------+--------------------------------------------+----------+
    | 900.311           | Firmware update auto-apply start           | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.312           | Firmware update auto-apply in progress     | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.313           | Firmware update auto-apply rejected        | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.314           | Firmware update auto-apply canceled        | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.315           | Firmware update auto-apply failed          | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.316           | Firmware update auto-apply completed       | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.317           | Firmware update auto-apply abort           | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.318           | Firmware update auto-apply aborting        | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.319           | Firmware update auto-apply abort rejected  | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.320           | Firmware update auto-apply abort failed    | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+
    | 900.321           | Firmware update auto-apply aborted         | C        |
    |                   |                                            |          |
    |                   | orchestration=fw-update                    |          |
    +-------------------+--------------------------------------------+----------+

@ -0,0 +1,111 @@
.. xti1552680491532

.. _adding-an-snmp-community-string-using-the-cli:

==========================================
Add an SNMP Community String Using the CLI
==========================================

To enable SNMP services you need to define one or more SNMP community strings
using the command line interface.

.. rubric:: |context|

No default community strings are defined on |prod| after the initial
commissioning of the cluster. This means that no SNMP operations are enabled
by default.

The following exercise illustrates the system commands available to manage and
query SNMP community strings. It uses the string **commstr1** as an example.

.. caution::
    For security, do not use the string **public**, or other community strings
    that could easily be guessed.
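One simple way to obtain a hard-to-guess community string is to derive it from random bytes. For example, a sketch using the standard `openssl` tool (any equivalent source of randomness works just as well):

```shell
# Generate 16 random bytes and print them as 32 hexadecimal characters,
# suitable for use as a non-guessable SNMP community string.
openssl rand -hex 16
```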
.. rubric:: |prereq|

All commands must be executed on the active controller's console, which can be
accessed using the OAM floating IP address. You must acquire Keystone **admin**
credentials in order to execute the commands.

.. rubric:: |proc|

#. Add the SNMP community string commstr1 to the system.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-comm-add -c commstr1
      +-----------+--------------------------------------+
      | Property  | Value                                |
      +-----------+--------------------------------------+
      | access    | ro                                   |
      | uuid      | eccf5729-e400-4305-82e2-bdf344eb868d |
      | community | commstr1                             |
      | view      | .1                                   |
      +-----------+--------------------------------------+

   The following are attributes associated with the new community string:

   **access**
       The SNMP access type. In |prod| all community strings provide read-only
       access.

   **uuid**
       The UUID associated with the community string.

   **community**
       The community string value.

   **view**
       This is always the full MIB tree.

#. List available community strings.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-comm-list
      +----------------+--------------------+--------+
      | SNMP community | View               | Access |
      +----------------+--------------------+--------+
      | commstr1       | .1                 | ro     |
      +----------------+--------------------+--------+

#. Query details of a specific community string.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-comm-show commstr1
      +------------+--------------------------------------+
      | Property   | Value                                |
      +------------+--------------------------------------+
      | access     | ro                                   |
      | created_at | 2014-08-14T21:12:10.037637+00:00     |
      | uuid       | eccf5729-e400-4305-82e2-bdf344eb868d |
      | community  | commstr1                             |
      | view       | .1                                   |
      +------------+--------------------------------------+

#. Delete a community string.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-comm-delete commstr1
      Deleted community commstr1

.. rubric:: |result|

Community strings in |prod| provide query access to any SNMP monitor
workstation that can reach the controller's OAM address on UDP port 161.

You can verify SNMP access using any monitor tool. For example, the freely
available command :command:`snmpwalk` can be issued from any host to list
the state of all SNMP Object Identifiers \(OID\):

.. code-block:: none

   $ snmpwalk -v 2c -c commstr1 10.10.10.100 > oids.txt

In this example, 10.10.10.100 is the |prod| OAM floating IP address. The output,
|
||||
which is a large file, is redirected to the file oids.txt.
|
||||
|
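Once the OIDs are saved to a file, they are easy to post-process. The
following is a minimal sketch, assuming net-snmp's typical ``OID = TYPE:
value`` output layout; the sample lines and the ``summarize_snmpwalk`` helper
are illustrative, not part of |prod|:

```python
from collections import Counter

def summarize_snmpwalk(lines):
    """Count OID entries per top-level MIB module in snmpwalk-style output."""
    counts = Counter()
    for line in lines:
        oid, sep, _value = line.partition(" = ")
        if sep:
            # e.g. "SNMPv2-MIB::sysName.0" belongs to module "SNMPv2-MIB"
            counts[oid.split("::")[0]] += 1
    return counts

# Hypothetical sample of snmpwalk output lines (as saved in oids.txt):
sample = [
    'SNMPv2-MIB::sysDescr.0 = STRING: "controller-0"',
    'SNMPv2-MIB::sysName.0 = STRING: "controller-0"',
    'IF-MIB::ifNumber.0 = INTEGER: 4',
]
print(summarize_snmpwalk(sample))
```

In practice you would read the lines from oids.txt instead of the inline
sample; a quick per-module count like this is a convenient sanity check that
the walk covered the expected subtrees.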
61
doc/source/fault/cli-commands-and-paged-output.rst
Normal file
@ -0,0 +1,61 @@

.. idb1552680603462

.. _cli-commands-and-paged-output:

=============================
CLI Commands and Paged Output
=============================

Some CLI commands page their output. You can use options to limit the paging
or to disable it, which is useful for scripts.

CLI fault management commands that perform paging include:

.. _cli-commands-and-paged-output-ul-wjz-y4q-bw:

- :command:`fm event-list`

- :command:`fm event-suppress`

- :command:`fm event-suppress-list`

- :command:`fm event-unsuppress`

- :command:`fm event-unsuppress-all`

To turn paging off, use the ``--nopaging`` option with the above commands.
This option is useful when writing bash scripts.

.. _cli-commands-and-paged-output-section-N10074-N1001C-N10001:

--------
Examples
--------

The following examples demonstrate the behavior that results from using, or
not using, the paging options.

This produces a paged list of events:

.. code-block:: none

   ~(keystone_admin)$ fm event-list

This produces a list of events without paging:

.. code-block:: none

   ~(keystone_admin)$ fm event-list --nopaging

This produces a paged list of 50 events:

.. code-block:: none

   ~(keystone_admin)$ fm event-list --limit 50

This produces a list of 50 events without paging:

.. code-block:: none

   ~(keystone_admin)$ fm event-list --limit 50 --nopaging
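The interaction of the two options can be pictured with a small sketch. This
is a hypothetical ``fetch_events`` helper that mimics the spirit of
``--limit`` and ``--nopaging``; it is not the fm CLI's actual implementation:

```python
def fetch_events(events, limit=None, page_size=20, nopaging=False):
    """Yield pages of events; a single unpaged result when nopaging is set."""
    if limit is not None:
        events = events[:limit]          # --limit N caps the total rows
    if nopaging:
        yield events                     # --nopaging: everything in one page
        return
    for i in range(0, len(events), page_size):
        yield events[i:i + page_size]    # otherwise emit fixed-size pages

# 45 events, capped at 30, paged 20 at a time -> pages of 20 and 10
pages = list(fetch_events(list(range(45)), limit=30, page_size=20))
print([len(p) for p in pages])
```

Note that ``--limit`` bounds how many rows are returned in total, while
``--nopaging`` only changes how those rows are presented; the two compose.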
89
doc/source/fault/configuring-snmp-trap-destinations.rst
Normal file
@ -0,0 +1,89 @@

.. sjb1552680530874

.. _configuring-snmp-trap-destinations:

================================
Configure SNMP Trap Destinations
================================

SNMP trap destinations are hosts configured in |prod| to receive unsolicited
SNMP notifications.

.. rubric:: |context|

Destination hosts are specified by IP address, or by host name if it can be
properly resolved by |prod|. Notifications are sent to the hosts using a
designated community string so that they can be validated.

.. rubric:: |proc|

#. Configure IP address 10.10.10.1 to receive SNMP notifications using the
   community string commstr1.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-trapdest-add -c commstr1 --ip_address 10.10.10.1
      +------------+--------------------------------------+
      | Property   | Value                                |
      +------------+--------------------------------------+
      | uuid       | c7b6774e-7f45-40f5-bcca-3668de2a186f |
      | ip_address | 10.10.10.1                           |
      | community  | commstr1                             |
      | type       | snmpv2c_trap                         |
      | port       | 162                                  |
      | transport  | udp                                  |
      +------------+--------------------------------------+

   The following attributes are associated with the new trap destination:

   **uuid**
      The UUID associated with the trap destination object.

   **ip\_address**
      The trap destination IP address.

   **community**
      The community string value to be associated with the notifications.

   **type**
      snmpv2c\_trap, the only supported message type for SNMP traps.

   **port**
      The destination UDP port that SNMP notifications are sent to.

   **transport**
      The transport protocol used to send notifications.

#. List defined trap destinations.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-trapdest-list
      +------------+----------------+------+--------------+-----------+
      | IP Address | SNMP Community | Port | Type         | Transport |
      +------------+----------------+------+--------------+-----------+
      | 10.10.10.1 | commstr1       | 162  | snmpv2c_trap | udp       |
      +------------+----------------+------+--------------+-----------+

#. Query access details of a specific trap destination.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-trapdest-show 10.10.10.1
      +------------+--------------------------------------+
      | Property   | Value                                |
      +------------+--------------------------------------+
      | uuid       | c7b6774e-7f45-40f5-bcca-3668de2a186f |
      | ip_address | 10.10.10.1                           |
      | community  | commstr1                             |
      | type       | snmpv2c_trap                         |
      | port       | 162                                  |
      | transport  | udp                                  |
      +------------+--------------------------------------+

#. Disable the sending of SNMP notifications to a specific IP address.

   .. code-block:: none

      ~(keystone_admin)$ system snmp-trapdest-delete 10.10.10.1
      Deleted ip 10.10.10.1
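As the ``transport``/``port`` attributes indicate, trap delivery is just a
UDP datagram sent to the destination's configured port (162 by default). The
toy sketch below illustrates only that transport-level behavior: it binds an
ephemeral loopback port (binding 162 itself would require privileges) and
receives one datagram. Real SNMPv2c traps are ASN.1/BER-encoded messages,
which this sketch deliberately does not construct or parse:

```python
import socket

def roundtrip(payload: bytes) -> bytes:
    """Send one UDP datagram to a local 'trap destination' and return it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    listener.bind(("127.0.0.1", 0))        # ephemeral port stands in for 162
    listener.settimeout(5)
    port = listener.getsockname()[1]
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender.sendto(payload, ("127.0.0.1", port))   # the "trap" on the wire
    data, _addr = listener.recvfrom(4096)
    sender.close()
    listener.close()
    return data

print(roundtrip(b"fake-trap"))
```

Because UDP is connectionless, |prod| gets no acknowledgment that a trap was
received; the community string carried in each notification is what lets the
receiving manager validate it.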
34
doc/source/fault/deleting-an-alarm-using-the-cli.rst
Normal file
@ -0,0 +1,34 @@

.. cpy1552680695138

.. _deleting-an-alarm-using-the-cli:

=============================
Delete an Alarm Using the CLI
=============================

You can manually delete an alarm that is not automatically cleared by the
system.

.. rubric:: |context|

Do not manually delete an alarm unless it is absolutely clear that there is
no reason for the alarm to be active.

You can use the :command:`fm alarm-delete` command to manually delete an
alarm that remains active/set for no apparent reason, which may happen in
rare conditions. Alarms usually clear automatically when the related trigger
or fault condition is corrected.

.. rubric:: |proc|

.. _deleting-an-alarm-using-the-cli-steps-clp-fzw-nkb:

- To delete an alarm, use the :command:`fm alarm-delete` command.

  For example:

  .. code-block:: none

     ~(keystone_admin)$ fm alarm-delete 4ab5698a-19cb-4c17-bd63-302173fef62c

  Substitute the UUID of the alarm you wish to delete.
26
doc/source/fault/enabling-snmp-support.rst
Normal file
@ -0,0 +1,26 @@

.. nat1580220934509

.. _enabling-snmp-support:

===================
Enable SNMP Support
===================

SNMP support must be enabled before you can begin using it to monitor a
system.

.. rubric:: |context|

To obtain a workable SNMP configuration, you must use the command line
interface on the active controller to complete the following steps.

.. rubric:: |proc|

#. Define at least one SNMP community string.

   See |fault-doc|: :ref:`Adding an SNMP Community String Using the CLI <adding-an-snmp-community-string-using-the-cli>` for details.

#. Configure at least one SNMP trap destination.

   This allows alarms and logs to be reported as they happen.

   For more information, see :ref:`Configuring SNMP Trap Destinations <configuring-snmp-trap-destinations>`.
33
doc/source/fault/events-suppression-overview.rst
Normal file
@ -0,0 +1,33 @@

.. pmt1552680681730

.. _events-suppression-overview:

===========================
Events Suppression Overview
===========================

All alarms are unsuppressed by default. A suppressed alarm is excluded from
the Active Alarms and Events displays, subject to the **Suppression Status**
filter available in the Horizon Web interface, the CLI, and the REST APIs,
and is not included in the active alarm counts.

.. warning::
   Suppressing an alarm will result in the system NOT notifying the operator
   of this particular fault.

The Events Suppression page, available from **Admin** \> **Fault Management**
\> **Events Suppression** in the left-hand pane, shows the suppression status
of each event type and lets you suppress or unsuppress each event type.

As shown below, the Events Suppression page lists each event type by ID, and
provides a description of the event and a current status indicator. Each
event can be suppressed using the **Suppress Event** button.

You can sort events by clicking the **Event ID**, **Description**, and
**Status** column headers. You can also use these as filtering criteria from
the **Search** field.

.. figure:: figures/uty1463514747661.png
   :scale: 70 %
   :alt: Event Suppression
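The effect of suppression on the active alarm counts can be sketched as a
simple filter. The data model below is hypothetical (the ``event_id`` and
``severity`` field names are illustrative only, not |prod|'s internal
schema):

```python
def active_alarm_counts(alarms, suppressed_ids):
    """Count active alarms per severity, skipping suppressed event types."""
    counts = {}
    for alarm in alarms:
        if alarm["event_id"] in suppressed_ids:
            continue                     # suppressed alarms are not counted
        sev = alarm["severity"]
        counts[sev] = counts.get(sev, 0) + 1
    return counts

# Illustrative active-alarm list; suppressing event type 100.101 removes
# both of its instances from the counts.
alarms = [
    {"event_id": "100.101", "severity": "major"},
    {"event_id": "100.104", "severity": "critical"},
    {"event_id": "100.101", "severity": "major"},
]
print(active_alarm_counts(alarms, suppressed_ids={"100.101"}))
```

This is why the warning above matters: once an event type is suppressed,
every instance of it disappears from the counts, not just a single alarm.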
69
doc/source/fault/fault-management-overview.rst
Normal file
@ -0,0 +1,69 @@

.. yrq1552337051689

.. _fault-management-overview:

=========================
Fault Management Overview
=========================

An admin user can view |prod-long| fault management alarms and logs in order
to monitor and respond to fault conditions.

See :ref:`Alarm Messages <100-series-alarm-messages>` for the list of alarms
and :ref:`Customer Log Messages <200-series-maintenance-customer-log-messages>`
for the list of customer logs reported by |prod|.

You can access active and historical alarms, and customer logs, using the
CLI, GUI, REST APIs, and SNMP.

To use the CLI, see :ref:`Viewing Active Alarms Using the CLI
<viewing-active-alarms-using-the-cli>` and :ref:`Viewing the Event Log Using
the CLI <viewing-the-event-log-using-the-cli>`.

Using the GUI, you can obtain fault management information in a number of
places.

.. _fault-management-overview-ul-nqw-hbp-mx:

- The Global Alarm Banner in the page header of all screens provides the
  active alarm counts for all alarm severities. See :ref:`The Global Alarm
  Banner <the-global-alarm-banner>`.

- The Fault Management pages, available from **Admin** \> **Fault
  Management** in the left-hand pane, provide access to the following:

  - **Admin** \> **Fault Management** \> **Active Alarms**—Alarms that are
    currently set, and require user action to clear them. For more
    information about active alarms, see :ref:`Viewing Active Alarms Using
    the CLI <viewing-active-alarms-using-the-cli>` and :ref:`Deleting an
    Alarm Using the CLI <deleting-an-alarm-using-the-cli>`.

  - **Admin** \> **Fault Management** \> **Events**—The event log
    consolidates historical alarms that have occurred in the past, that is,
    both set and clear events of active alarms, as well as customer logs.

    For more about the event log, which includes historical alarms and
    customer logs, see :ref:`Viewing the Event Log Using Horizon
    <viewing-the-event-log-using-horizon>`.

  - **Admin** \> **Fault Management** \> **Events Suppression**—Individual
    events can be put into a suppressed state or an unsuppressed state. A
    suppressed alarm is excluded from the Active Alarms and Events displays.
    All alarms are unsuppressed by default. An event can be suppressed or
    unsuppressed using the Horizon Web interface, the CLI, or REST APIs.

- The Data Network Topology view provides real-time alarm information for
  data networks and associated worker hosts and data/pci-passthru/pci-sriov
  interfaces.

.. xreflink For more information, see |datanet-doc|: :ref:`The Data Network Topology View <the-data-network-topology-view>`.

To use SNMP, see :ref:`SNMP Overview <snmp-overview>`.
BIN
doc/source/fault/figures/nlc1463584178366.png
Normal file
Binary file not shown. After Width: | Height: | Size: 18 KiB
BIN
doc/source/fault/figures/psa1567524091300.png
Normal file
Binary file not shown. After Width: | Height: | Size: 83 KiB
BIN
doc/source/fault/figures/uty1463514747661.png
Normal file
Binary file not shown. After Width: | Height: | Size: 36 KiB
BIN
doc/source/fault/figures/xyj1558447807645.png
Normal file
Binary file not shown. After Width: | Height: | Size: 3.7 KiB
71
doc/source/fault/index.rs1
Normal file
@ -0,0 +1,71 @@
============================
|prod-long| Fault Management
============================

- Fault Management Overview

  - :ref:`Fault Management Overview <fault-management-overview>`

- The Global Alarm Banner

  - :ref:`The Global Alarm Banner <the-global-alarm-banner>`

- Viewing Active Alarms

  - :ref:`Viewing Active Alarms Using Horizon <viewing-active-alarms-using-horizon>`
  - :ref:`Viewing Active Alarms Using the CLI <viewing-active-alarms-using-the-cli>`
  - :ref:`Viewing Alarm Details Using the CLI <viewing-alarm-details-using-the-cli>`

- Viewing the Event Log

  - :ref:`Viewing the Event Log Using Horizon <viewing-the-event-log-using-horizon>`
  - :ref:`Viewing the Event Log Using the CLI <viewing-the-event-log-using-the-cli>`

- Deleting an Alarm

  - :ref:`Deleting an Alarm Using the CLI <deleting-an-alarm-using-the-cli>`

- Events Suppression

  - :ref:`Events Suppression Overview <events-suppression-overview>`
  - :ref:`Suppressing and Unsuppressing Events <suppressing-and-unsuppressing-events>`
  - :ref:`Viewing Suppressed Alarms Using the CLI <viewing-suppressed-alarms-using-the-cli>`
  - :ref:`Suppressing an Alarm Using the CLI <suppressing-an-alarm-using-the-cli>`
  - :ref:`Unsuppressing an Alarm Using the CLI <unsuppressing-an-alarm-using-the-cli>`

- CLI Commands and Paged Output

  - :ref:`CLI Commands and Paged Output <cli-commands-and-paged-output>`

- SNMP

  - :ref:`SNMP Overview <snmp-overview>`
  - :ref:`Enabling SNMP Support <enabling-snmp-support>`
  - :ref:`Traps <traps>`

    - :ref:`Configuring SNMP Trap Destinations <configuring-snmp-trap-destinations>`

  - :ref:`SNMP Event Table <snmp-event-table>`
  - :ref:`Adding an SNMP Community String Using the CLI <adding-an-snmp-community-string-using-the-cli>`
  - :ref:`Setting SNMP Identifying Information <setting-snmp-identifying-information>`

- :ref:`Troubleshooting Log Collection <troubleshooting-log-collection>`

- Cloud Platform Alarm Messages

  - :ref:`Alarm Messages Overview <alarm-messages-overview>`
  - :ref:`100 Series Alarm Messages <100-series-alarm-messages>`
  - :ref:`200 Series Alarm Messages <200-series-alarm-messages>`
  - :ref:`300 Series Alarm Messages <300-series-alarm-messages>`
  - :ref:`400 Series Alarm Messages <400-series-alarm-messages>`
  - :ref:`500 Series Alarm Messages <500-series-alarm-messages>`
  - :ref:`750 Series Alarm Messages <750-series-alarm-messages>`
  - :ref:`800 Series Alarm Messages <800-series-alarm-messages>`
  - :ref:`900 Series Alarm Messages <900-series-alarm-messages>`

- Cloud Platform Customer Log Messages

  - :ref:`200 Series Maintenance Customer Log Messages <200-series-maintenance-customer-log-messages>`
  - :ref:`400 Series Customer Log Messages <400-series-customer-log-messages>`
  - :ref:`900 Series Orchestration Customer Log Messages <900-series-orchestration-customer-log-messages>`
161
doc/source/fault/index.rst
Normal file
@ -0,0 +1,161 @@
.. Fault Management file, created by
   sphinx-quickstart on Thu Sep 3 15:14:59 2020.
   You can adapt this file completely to your liking, but it should at least
   contain the root `toctree` directive.

================
Fault Management
================

--------------------
StarlingX Kubernetes
--------------------

.. toctree::
   :maxdepth: 1

   fault-management-overview

*****************
The global banner
*****************

.. toctree::
   :maxdepth: 1

   the-global-alarm-banner

*********************
Viewing active alarms
*********************

.. toctree::
   :maxdepth: 1

   viewing-active-alarms-using-horizon
   viewing-active-alarms-using-the-cli
   viewing-alarm-details-using-the-cli

*********************
Viewing the event log
*********************

.. toctree::
   :maxdepth: 1

   viewing-the-event-log-using-horizon
   viewing-the-event-log-using-the-cli

*****************
Deleting an alarm
*****************

.. toctree::
   :maxdepth: 1

   deleting-an-alarm-using-the-cli

*****************
Event suppression
*****************

.. toctree::
   :maxdepth: 1

   events-suppression-overview
   suppressing-and-unsuppressing-events
   viewing-suppressed-alarms-using-the-cli
   suppressing-an-alarm-using-the-cli
   unsuppressing-an-alarm-using-the-cli

*****************************
CLI commands and paged output
*****************************

.. toctree::
   :maxdepth: 1

   cli-commands-and-paged-output

****
SNMP
****

.. toctree::
   :maxdepth: 1

   snmp-overview
   enabling-snmp-support
   traps
   configuring-snmp-trap-destinations
   snmp-event-table
   adding-an-snmp-community-string-using-the-cli
   setting-snmp-identifying-information

******************************
Troubleshooting log collection
******************************

.. toctree::
   :maxdepth: 1

   troubleshooting-log-collection

**************
Alarm messages
**************

.. toctree::
   :maxdepth: 1

   100-series-alarm-messages
   200-series-alarm-messages
   300-series-alarm-messages
   400-series-alarm-messages
   500-series-alarm-messages
   750-series-alarm-messages
   800-series-alarm-messages
   900-series-alarm-messages

************
Log messages
************

.. toctree::
   :maxdepth: 1

   200-series-maintenance-customer-log-messages
   400-series-customer-log-messages
   900-series-orchestration-customer-log-messages

-------------------
StarlingX OpenStack
-------------------

.. toctree::
   :maxdepth: 1

   openstack-fault-management-overview

************************
OpenStack alarm messages
************************

.. toctree::
   :maxdepth: 1

   openstack-alarm-messages-300s
   openstack-alarm-messages-400s
   openstack-alarm-messages-700s
   openstack-alarm-messages-800s

*******************************
OpenStack customer log messages
*******************************

.. toctree::
   :maxdepth: 1

   openstack-customer-log-messages-270s-virtual-machines
   openstack-customer-log-messages-401s-services
   openstack-customer-log-messages-700s-virtual-machines
135
doc/source/fault/openstack-alarm-messages-300s.rst
Normal file
@ -0,0 +1,135 @@

.. slf1579788051430

.. _alarm-messages-300s:

=====================
Alarm Messages - 300s
=====================

.. include:: ../_includes/openstack-alarm-messages-xxxs.rest

.. _alarm-messages-300s-table-zrd-tg5-v5:

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.003**
     - Networking Agent not responding.
   * - Entity Instance
     - host=<hostname>.agent=<agent-uuid>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - If condition persists, attempt to clear issue by administratively locking and unlocking the Host.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.004**
     - No enabled compute host with connectivity to provider network.
   * - Entity Instance
     - host=<hostname>.providernet=<pnet-uuid>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Enable compute hosts with required provider network connectivity.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.005**
     - Communication failure detected over provider network x% for ranges y% on host z%.

       or

       Communication failure detected over provider network x% on host z%.
   * - Entity Instance
     - providernet=<pnet-uuid>.host=<hostname>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Check neighbor switch port VLAN assignments.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.010**
     - ML2 Driver Agent non-reachable

       or

       ML2 Driver Agent reachable but non-responsive

       or

       ML2 Driver Agent authentication failure

       or

       ML2 Driver Agent is unable to sync Neutron database
   * - Entity Instance
     - host=<hostname>.ml2driver=<driver>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Monitor and if condition persists, contact next level of support.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.012**
     - Openflow Controller connection failed.
   * - Entity Instance
     - host=<hostname>.openflow-controller=<uri>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent equipment.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.013**
     - No active Openflow controller connections found for this network.

       or

       One or more Openflow controller connections in disconnected state for this network.
   * - Entity Instance
     - host=<hostname>.openflow-network=<name>
   * - Severity:
     - C, M\*
   * - Proposed Repair Action
     - host=<hostname>.openflow-network=<name>

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.015**
     - No active OVSDB connections found.
   * - Entity Instance
     - host=<hostname>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Check cabling and far-end port configuration and status on adjacent equipment.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 300.016**
     - Dynamic routing agent x% lost connectivity to peer y%
   * - Entity Instance
     - host=<hostname>,agent=<agent-uuid>,bgp-peer=<bgp-peer>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - If condition persists, fix connectivity to peer.
55
doc/source/fault/openstack-alarm-messages-400s.rst
Normal file
@ -0,0 +1,55 @@

.. msm1579788069384

.. _alarm-messages-400s:

=====================
Alarm Messages - 400s
=====================

.. include:: ../_includes/openstack-alarm-messages-xxxs.rest

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 400.001**
     - Service group failure; <list\_of\_affected\_services>.

       or

       Service group degraded; <list\_of\_affected\_services>.

       or

       Service group warning; <list\_of\_affected\_services>.
   * - Entity Instance
     - service\_domain=<domain\_name>.service\_group=<group\_name>.host=<hostname>
   * - Severity:
     - C/M/m\*
   * - Proposed Repair Action
     - Contact next level of support.

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 400.002**
     - Service group loss of redundancy; expected <num> standby member<s> but no standby members available.

       or

       Service group loss of redundancy; expected <num> standby member<s> but only <num> standby member<s> available.

       or

       Service group loss of redundancy; expected <num> active member<s> but no active members available.

       or

       Service group loss of redundancy; expected <num> active member<s> but only <num> active member<s> available.
   * - Entity Instance
     - service\_domain=<domain\_name>.service\_group=<group\_name>
   * - Severity:
     - M\*
   * - Proposed Repair Action
     - Bring a controller node back in to service, otherwise contact next level of support.
275
doc/source/fault/openstack-alarm-messages-700s.rst
Normal file
@ -0,0 +1,275 @@

.. uxo1579788086872

.. _alarm-messages-700s:

=====================
Alarm Messages - 700s
=====================

.. include:: ../_includes/openstack-alarm-messages-xxxs.rest

.. _alarm-messages-700s-table-zrd-tg5-v5:

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.001**
     - Instance <instance\_name> owned by <tenant\_name> has failed on host <host\_name>

       or

       Instance <instance\_name> owned by <tenant\_name> has failed to schedule
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - The system will attempt recovery; no repair action required.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.002**
     - Instance <instance\_name> owned by <tenant\_name> is paused on host <host\_name>.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Unpause the instance.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.003**
     - Instance <instance\_name> owned by <tenant\_name> is suspended on host <host\_name>.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Resume the instance.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.004**
     - Instance <instance\_name> owned by <tenant\_name> is stopped on host <host\_name>.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Start the instance.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.005**
     - Instance <instance\_name> owned by <tenant\_name> is rebooting on host <host\_name>.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Wait for reboot to complete; if problem persists contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.006**
     - Instance <instance\_name> owned by <tenant\_name> is rebuilding on host <host\_name>.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Wait for rebuild to complete; if problem persists contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.007**
     - Instance <instance\_name> owned by <tenant\_name> is evacuating from host <host\_name>.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Wait for evacuate to complete; if problem persists contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.008**
     - Instance <instance\_name> owned by <tenant\_name> is live migrating from host <host\_name>
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - W\*
   * - Proposed Repair Action
     - Wait for live migration to complete; if problem persists contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.009**
     - Instance <instance\_name> owned by <tenant\_name> is cold migrating from host <host\_name>
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Wait for cold migration to complete; if problem persists contact next level of support.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.010**
     - Instance <instance\_name> owned by <tenant\_name> has been cold-migrated to host <host\_name> waiting for confirmation.
   * - Entity Instance
     - tenant=<tenant-uuid>.instance=<instance-uuid>
   * - Severity:
     - C\*
   * - Proposed Repair Action
     - Confirm or revert cold-migrate of instance.

-----

.. list-table::
   :widths: 6 15
   :header-rows: 0

   * - **Alarm ID: 700.011**
     - Instance <instance\_name> owned by <tenant\_name> is reverting cold migrate to host <host\_name>
   * - Entity Instance
|
||||
- tenant=<tenant-uuid>.instance=<instance-uuid>
|
||||
* - Severity:
|
||||
- C\*
|
||||
* - Proposed Repair Action
|
||||
- Wait for cold migration revert to complete; if problem persists contact
|
||||
next level of support.
|
||||
|
||||
-----
|
||||
|
||||
.. list-table::
|
||||
:widths: 6 15
|
||||
:header-rows: 0
|
||||
|
||||
* - **Alarm ID: 700.012**
|
||||
- Instance <instance\_name> owned by <tenant\_name> is resizing on host
|
||||
<host\_name>
|
||||
* - Entity Instance
|
||||
- tenant=<tenant-uuid>.instance=<instance-uuid>
|
||||
* - Severity:
|
||||
- C\*
|
||||
* - Proposed Repair Action
|
||||
- Wait for resize to complete; if problem persists contact next level of
|
||||
support.
|
||||
|
||||
-----
|
||||
|
||||
.. list-table::
|
||||
:widths: 6 15
|
||||
:header-rows: 0
|
||||
|
||||
* - **Alarm ID: 700.013**
|
||||
- Instance <instance\_name> owned by <tenant\_name> has been resized on
|
||||
host <host\_name> waiting for confirmation.
|
||||
* - Entity Instance
|
||||
- tenant=<tenant-uuid>.instance=<instance-uuid>
|
||||
* - Severity:
|
||||
- C\*
|
||||
* - Proposed Repair Action
|
||||
- Confirm or revert resize of instance.
|
||||
|
||||
-----
|
||||
|
||||
.. list-table::
|
||||
:widths: 6 15
|
||||
:header-rows: 0
|
||||
|
||||
* - **Alarm ID: 700.014**
|
||||
- Instance <instance\_name> owned by <tenant\_name> is reverting resize
|
||||
on host <host\_name>.
|
||||
* - Entity Instance
|
||||
- tenant=<tenant-uuid>.instance=<instance-uuid>
|
||||
* - Severity:
|
||||
- C\*
|
||||
* - Proposed Repair Action
|
||||
- Wait for resize revert to complete; if problem persists contact next
|
||||
level of support.
|
||||
|
||||
-----
|
||||
|
||||
.. list-table::
|
||||
:widths: 6 15
|
||||
:header-rows: 0
|
||||
|
||||
* - **Alarm ID: 700.016**
|
||||
- Multi-Node Recovery Mode
|
||||
* - Entity Instance
|
||||
- subsystem=vim
|
||||
* - Severity:
|
||||
- m\*
|
||||
* - Proposed Repair Action
|
||||
- Wait for the system to exit out of this mode.
|
||||
|
||||
-----
|
||||
|
||||
.. list-table::
|
||||
:widths: 6 15
|
||||
:header-rows: 0
|
||||
|
||||
* - **Alarm ID: 700.017**
|
||||
- Server group <server\_group\_name> <policy> policy was not satisfied.
|
||||
* - Entity Instance
|
||||
- server-group<server-group-uuid>
|
||||
* - Severity:
|
||||
- M
|
||||
* - Proposed Repair Action
|
||||
- Migrate instances in an attempt to satisfy the policy; if problem
|
||||
persists contact next level of support.
98
doc/source/fault/openstack-alarm-messages-800s.rst
Normal file
@ -0,0 +1,98 @@
.. tsh1579788106505
.. _alarm-messages-800s:

=====================
Alarm Messages - 800s
=====================

.. include:: ../_includes/openstack-alarm-messages-xxxs.rest

.. _alarm-messages-800s-table-zrd-tg5-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.002**
      - Image storage media is full: There is not enough disk space on the image storage media.

        or

        Instance <instance name\> snapshot failed: There is not enough disk space on the image storage media.

        or

        Supplied <attrs\> \(<supplied\>\) and <attrs\> generated from uploaded image \(<actual\>\) did not match. Setting image status to 'killed'.

        or

        Error in store configuration. Adding images to store is disabled.

        or

        Forbidden upload attempt: <exception\>

        or

        Insufficient permissions on image storage media: <exception\>

        or

        Denying attempt to upload image larger than <size\> bytes.

        or

        Denying attempt to upload image because it exceeds the quota: <exception\>

        or

        Received HTTP error while uploading image <image\_id\>

        or

        Client disconnected before sending all data to backend

        or

        Failed to upload image <image\_id\>
    * - Entity Instance
      - image=<image-uuid>, instance=<instance-uuid>

        or

        tenant=<tenant-uuid>, instance=<instance-uuid>
    * - Severity:
      - W\*
    * - Proposed Repair Action
      - If problem persists, contact next level of support.

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.100**
      - Storage Alarm Condition:

        Cinder I/O Congestion is above normal range and is building
    * - Entity Instance
      - cinder\_io\_monitor
    * - Severity:
      - M
    * - Proposed Repair Action
      - Reduce the I/O load on the Cinder LVM backend. Use Cinder QoS mechanisms on high usage volumes.

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Alarm ID: 800.101**
      - Storage Alarm Condition:

        Cinder I/O Congestion is high and impacting guest performance
    * - Entity Instance
      - cinder\_io\_monitor
    * - Severity:
      - C\*
    * - Proposed Repair Action
      - Reduce the I/O load on the Cinder LVM backend. Cinder actions may fail until congestion is reduced. Use Cinder QoS mechanisms on high usage volumes.

@ -0,0 +1,38 @@
.. ftb1579789103703
.. _customer-log-messages-270s-virtual-machines:

=============================================
Customer Log Messages 270s - Virtual Machines
=============================================

.. include:: ../_includes/openstack-customer-log-messages-xxxs.rest

.. _customer-log-messages-270s-virtual-machines-table-zgf-jvw-v5:

.. list-table:: Table 1. Customer Log Messages - Virtual Machines
    :widths: 6 15 6
    :header-rows: 1

    * - Log ID
      - Description / Entity Instance ID
      - Severity
    * - 270.101
      - Host <host\_name> compute services failure\[, reason = <reason\_text>\]

        tenant=<tenant-uuid>.instance=<instance-uuid>
      - C
    * - 270.102
      - Host <host\_name> compute services enabled

        tenant=<tenant-uuid>.instance=<instance-uuid>
      - C
    * - 270.103
      - Host <host\_name> compute services disabled

        tenant=<tenant-uuid>.instance=<instance-uuid>
      - C
    * - 275.001
      - Host <host\_name> hypervisor is now <administrative\_state>-<operational\_state>

        tenant=<tenant-uuid>.instance=<instance-uuid>
      - C

See also :ref:`Customer Log Messages 700s - Virtual Machines <customer-log-messages-700s-virtual-machines>`.

@ -0,0 +1,45 @@
.. hwr1579789203684
.. _customer-log-messages-401s-services:

=====================================
Customer Log Messages 401s - Services
=====================================

.. include:: ../_includes/openstack-customer-log-messages-xxxs.rest

.. _customer-log-messages-401s-services-table-zgf-jvw-v5:

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Log Message: 401.001**
      - Service group <group> state change from <state> to <state> on host <host\_name>
    * - Entity Instance
      - service\_domain=<domain>.service\_group=<group>.host=<host\_name>
    * - Severity:
      - C

.. list-table::
    :widths: 6 15
    :header-rows: 0

    * - **Log Message: 401.002**
      - Service group <group> loss of redundancy; expected <X> standby member but no standby members available.

        or

        Service group <group> loss of redundancy; expected <X> standby member but only <Y> standby member\(s\) available.

        or

        Service group <group> has no active members available; expected <X> active member\(s\)

        or

        Service group <group> loss of redundancy; expected <X> active member\(s\) but only <Y> active member\(s\) available.
    * - Entity Instance
      - service\_domain=<domain>.service\_group=<group>
    * - Severity:
      - C

@ -0,0 +1,480 @@
|
||||
|
||||
.. qfy1579789227230
|
||||
.. _customer-log-messages-700s-virtual-machines:
|
||||
|
||||
=============================================
|
||||
Customer Log Messages 700s - Virtual Machines
|
||||
=============================================
|
||||
|
||||
.. include:: ../_includes/openstack-customer-log-messages-xxxs.rest
|
||||
|
||||
.. _customer-log-messages-700s-virtual-machines-table-zgf-jvw-v5:
|
||||
|
||||
.. table:: Table 1. Customer Log Messages
|
||||
:widths: auto
|
||||
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| Log ID | Description | Severity |
|
||||
+ +------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| | Entity Instance ID | |
|
||||
+==========+====================================================================================================================================================================================+==========+
|
||||
| 700.101 | Instance <instance\_name> is enabled on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.102 | Instance <instance\_name> owned by <tenant\_name> has failed\[, reason = <reason\_text>\]. | C |
|
||||
| | Instance <instance\_name> owned by <tenant\_name> has failed to schedule\[, reason = <reason\_text>\] | |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.103 | Create issued by <tenant\_name\> or by the system against <instance\_name> owned by <tenant\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.104 | Creating instance <instance\_name> owned by <tenant\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.105 | Create rejected for instance <instance\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.106 | Create canceled for instance <instance\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.107 | Create failed for instance <instance\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.108 | Instance <instance\_name> owned by <tenant\_name> has been created | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.109 | Delete issued by <tenant\_name\> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.110 | Deleting instance <instance\_name> owned by <tenant\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.111 | Delete rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.112 | Delete canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.113 | Delete failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.114 | Deleted instance <instance\_name> owned by <tenant\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.115 | Pause issued by <tenant\_name\> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.116 | Pause inprogress for instance <instance\_name> on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.117 | Pause rejected for instance <instance\_name> enabled on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.118 | Pause canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.119 | Pause failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.120 | Pause complete for instance <instance\_name> now paused on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.121 | Unpause issued by <tenant\_name\> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.122 | Unpause inprogress for instance <instance\_name> on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.123 | Unpause rejected for instance <instance\_name> paused on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.124 | Unpause canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.125 | Unpause failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.126 | Unpause complete for instance <instance\_name> now enabled on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.127 | Suspend issued by <tenant\_name\> or by the system> against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.128 | Suspend inprogress for instance <instance\_name> on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.129 | Suspend rejected for instance <instance\_name> enabled on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.130 | Suspend canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.131 | Suspend failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.132 | Suspend complete for instance <instance\_name> now suspended on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.133 | Resume issued by <tenant\_name\> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.134 | Resume inprogress for instance <instance\_name> on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.135 | Resume rejected for instance <instance\_name> suspended on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.136 | Resume canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.137 | Resume failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.138 | Resume complete for instance <instance\_name> now enabled on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.139 | Start issued by <tenant\_name\> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.140 | Start inprogress for instance <instance\_name> on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.141 | Start rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.142 | Start canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.143 | Start failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.144 | Start complete for instance <instance\_name> now enabled on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.145 | Stop issued by <tenant\_name>\ or by the system or by the instance against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.146 | Stop inprogress for instance <instance\_name> on host <host\_name> | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.147 | Stop rejected for instance <instance\_name> enabled on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
|
||||
| 700.148 | Stop canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
|
||||
| | | |
|
||||
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
|
||||
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.149 | Stop failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.150 | Stop complete for instance <instance\_name> now disabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.151 | Live-Migrate issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> from host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.152 | Live-Migrate inprogress for instance <instance\_name> from host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.153 | Live-Migrate rejected for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.154 | Live-Migrate canceled for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.155 | Live-Migrate failed for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.156 | Live-Migrate complete for instance <instance\_name> now enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.157 | Cold-Migrate issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> from host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.158 | Cold-Migrate inprogress for instance <instance\_name> from host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.159 | Cold-Migrate rejected for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.160 | Cold-Migrate canceled for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.161 | Cold-Migrate failed for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.162 | Cold-Migrate complete for instance <instance\_name> now enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.163 | Cold-Migrate-Confirm issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.164 | Cold-Migrate-Confirm inprogress for instance <instance\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.165 | Cold-Migrate-Confirm rejected for instance <instance\_name> now enabled on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.166 | Cold-Migrate-Confirm canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.167 | Cold-Migrate-Confirm failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.168 | Cold-Migrate-Confirm complete for instance <instance\_name> enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.169 | Cold-Migrate-Revert issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.170 | Cold-Migrate-Revert inprogress for instance <instance\_name> from host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.171 | Cold-Migrate-Revert rejected for instance <instance\_name> now on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.172 | Cold-Migrate-Revert canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.173 | Cold-Migrate-Revert failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.174 | Cold-Migrate-Revert complete for instance <instance\_name> now enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.175 | Evacuate issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.176 | Evacuating instance <instance\_name> owned by <tenant\_name> from host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.177 | Evacuate rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.178 | Evacuate canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.179 | Evacuate failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.180 | Evacuate complete for instance <instance\_name> now enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.181 | Reboot <\(soft-reboot\) or \(hard-reboot\)> issued by <tenant\_name> or by the system or by the instance against instance <instance\_name> owned by | C |
| | <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.182 | Reboot inprogress for instance <instance\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.183 | Reboot rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.184 | Reboot canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.185 | Reboot failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.186 | Reboot complete for instance <instance\_name> now enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.187 | Rebuild issued by <tenant\_name> or by the system against instance <instance\_name> using image <image\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.188 | Rebuild inprogress for instance <instance\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.189 | Rebuild rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.190 | Rebuild canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.191 | Rebuild failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.192 | Rebuild complete for instance <instance\_name> now enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.193 | Resize issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.194 | Resize inprogress for instance <instance\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.195 | Resize rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.196 | Resize canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.197 | Resize failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.198 | Resize complete for instance <instance\_name> enabled on host <host\_name> waiting for confirmation | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.199 | Resize-Confirm issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.200 | Resize-Confirm inprogress for instance <instance\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.201 | Resize-Confirm rejected for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.202 | Resize-Confirm canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.203 | Resize-Confirm failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.204 | Resize-Confirm complete for instance <instance\_name> enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.205 | Resize-Revert issued by <tenant\_name> or by the system against instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.206 | Resize-Revert inprogress for instance <instance\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.207 | Resize-Revert rejected for instance <instance\_name> owned by <tenant\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.208 | Resize-Revert canceled for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.209 | Resize-Revert failed for instance <instance\_name> on host <host\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.210 | Resize-Revert complete for instance <instance\_name> enabled on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.214 | Instance <instance\_name> has been renamed to <new\_instance\_name> owned by <tenant\_name> on host <host\_name> | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.215 | Guest Health Check failed for instance <instance\_name>\[, reason = <reason\_text>\] | C |
| | | |
| | tenant=<tenant-uuid>.instance=<instance-uuid> | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.216 | Entered Multi-Node Recovery Mode | C |
| | | |
| | subsystem-vim | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
| 700.217 | Exited Multi-Node Recovery Mode | C |
| | | |
| | subsystem-vim | |
+----------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------+
See also :ref:`Customer Log Messages 270s - Virtual Machines <customer-log-messages-270s-virtual-machines>`
19
doc/source/fault/openstack-fault-management-overview.rst
Normal file
@ -0,0 +1,19 @@
.. ekn1458933172232

.. _openstack-fault-management-overview:

========
Overview
========

|prod-os| is a containerized application running on top of |prod|.

All Fault Management related interfaces for displaying alarms and logs,
suppressing/unsuppressing events, enabling SNMP and enabling remote log
collection are available on the |prod| REST APIs, CLIs and/or GUIs.

.. xreflink See :ref:`Fault Management Overview <platform-fault-management-overview>` for details on these interfaces.

This section provides the list of OpenStack-related alarms and customer logs
that are monitored and reported for the |prod-os| application through the
|prod| fault management interfaces.
30
doc/source/fault/setting-snmp-identifying-information.rst
Normal file
@ -0,0 +1,30 @@
.. tie1580219717420

.. _setting-snmp-identifying-information:

================================
Set SNMP Identifying Information
================================

You can set SNMP system information, including name, location, and contact
details.

.. rubric:: |proc|

- Use the following command syntax to set the **sysContact** attribute.

  .. code-block:: none

     ~(keystone_admin)$ system modify --contact <site-contact>

- Use the following command syntax to set the **sysLocation** attribute.

  .. code-block:: none

     ~(keystone_admin)$ system modify --location <location>

- Use the following command syntax to set the **sysName** attribute.

  .. code-block:: none

     ~(keystone_admin)$ system modify --name <system-name>
44
doc/source/fault/snmp-event-table.rst
Normal file
@ -0,0 +1,44 @@
.. rdr1552680506097

.. _snmp-event-table:

================
SNMP Event Table
================

|prod| supports SNMP active and historical alarms, and customer logs, in an
event table.

The event table contains historical alarms \(sets and clears\) and customer
logs. It does not contain active alarms. Each entry in the table includes the
following variables:

.. _snmp-event-table-ul-y1w-4lk-qq:

- <UUID>

- <EventID>

- <State>

- <EntityInstanceID>

- <DateAndTime>

- <EventSeverity>

- <ReasonText>

- <EventType>

- <ProbableCause>

- <ProposedRepairAction>

- <ServiceAffecting>

- <SuppressionAllowed>

.. note::
   The previous SNMP Historical Alarm Table and the SNMP Customer Log Table
   are still supported, but are marked as deprecated in the MIB.
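The <DateAndTime> variable is conventionally an SNMP DateAndTime octet string as defined in RFC 2579 (8 octets for local time, 11 with a UTC offset). As an illustrative sketch only, not part of the product interfaces, a minimal Python decoder for that encoding:

```python
from datetime import datetime, timedelta, timezone

def decode_date_and_time(octets: bytes) -> datetime:
    """Decode an RFC 2579 DateAndTime value (8 or 11 octets)."""
    if len(octets) not in (8, 11):
        raise ValueError("DateAndTime must be 8 or 11 octets")
    year = (octets[0] << 8) | octets[1]          # year is a 2-octet integer
    dt = datetime(year, octets[2], octets[3],    # month, day
                  octets[4], octets[5], octets[6],  # hour, minute, second
                  octets[7] * 100_000)           # deci-seconds -> microseconds
    if len(octets) == 11:                        # optional UTC offset
        sign = 1 if chr(octets[8]) == '+' else -1
        offset = sign * timedelta(hours=octets[9], minutes=octets[10])
        dt = dt.replace(tzinfo=timezone(offset))
    return dt

# 0x07E5 = 2021; decodes to 2021-05-26 13:30:15 UTC
raw = bytes([0x07, 0xE5, 5, 26, 13, 30, 15, 0, ord('+'), 0, 0])
print(decode_date_and_time(raw).isoformat())
```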
136
doc/source/fault/snmp-overview.rst
Normal file
@ -0,0 +1,136 @@
|
||||
|
||||
.. gzl1552680561274

.. _snmp-overview:

=============
SNMP Overview
=============

|prod| can generate SNMP traps for |prod| Alarm Events and Customer Log Events.

This includes alarms based on hardware sensors monitored by board management
controllers.

.. xreflink For more information, see |node-doc|: :ref:`Sensors Tab <sensors-tab>`.

.. contents::
   :local:
   :depth: 1

.. _snmp-overview-section-N10027-N1001F-N10001:

------------------
About SNMP Support
------------------

Support for Simple Network Management Protocol \(SNMP\) is implemented as
follows:

.. _snmp-overview-ul-bjv-cjd-cp:

- Access is disabled by default, and must be enabled manually from the
  command-line interface.

- Access is available using the controller node's floating OAM IP address,
  over the standard SNMP UDP port 161.

- The supported version is SNMPv2c.

- Access is read-only for all SNMP communities.

- All SNMP communities have access to the entire OID tree; there is no
  support for VIEWS.

- The supported SNMP operations are GET, GETNEXT, GETBULK, and SNMPv2C-TRAP2.

- The SNMP SET operation is not supported.

For information on enabling SNMP support, see
:ref:`Enabling SNMP Support <enabling-snmp-support>`.

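The read operations listed above can be illustrated with a toy walk over a
sorted OID table. This is a sketch of SNMP retrieval semantics in plain
Python, not a real SNMP client, and the example OIDs and values are assumed
for illustration:

```python
# Toy model of the read-only operations listed above (GET, GETNEXT, GETBULK)
# over an OID -> value table. Illustrative only; a real agent orders OIDs
# numerically, which simple string sorting merely approximates here.
import bisect

oid_table = {
    "1.3.6.1.2.1.1.1.0": "example sysDescr",  # value assumed for illustration
    "1.3.6.1.2.1.1.3.0": 123456,              # sysUpTime (assumed)
    "1.3.6.1.2.1.1.7.0": 72,                  # sysServices
}
sorted_oids = sorted(oid_table)

def snmp_get(oid):
    """GET: return the value bound to an exact OID, or None."""
    return oid_table.get(oid)

def snmp_getnext(oid):
    """GETNEXT: return the (oid, value) pair ordered after the given OID."""
    i = bisect.bisect_right(sorted_oids, oid)
    if i == len(sorted_oids):
        return None
    nxt = sorted_oids[i]
    return nxt, oid_table[nxt]

def snmp_getbulk(oid, max_repetitions):
    """GETBULK: repeated GETNEXT, up to max_repetitions results."""
    out, cur = [], oid
    for _ in range(max_repetitions):
        nxt = snmp_getnext(cur)
        if nxt is None:
            break
        out.append(nxt)
        cur = nxt[0]
    return out

print(snmp_getnext("1.3.6.1.2.1.1.1.0")[0])  # 1.3.6.1.2.1.1.3.0
```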

.. _snmp-overview-section-N10099-N1001F-N10001:

-----------------------
SNMPv2-MIB \(RFC 3418\)
-----------------------

Support for the basic standard MIB for SNMP entities is limited to the System
and SNMP groups, as follows:

.. _snmp-overview-ul-ulb-ypl-hp:

- System Group, **.iso.org.dod.internet.mgmt.mib-2.system**

- SNMP Group, **.iso.org.dod.internet.mgmt.mib-2.snmp**

- coldStart and warmStart Traps

The following system attributes are used in support of the SNMP implementation.
They can be displayed using the :command:`system show` command.

**contact**
    A read-write system attribute used to populate the **sysContact**
    attribute of the SNMP System group.

**location**
    A read-write system attribute used to populate the **sysLocation**
    attribute of the SNMP System group.

**name**
    A read-write system attribute used to populate the **sysName** attribute
    of the SNMP System group.

**software\_version**
    A read-only system attribute set automatically by the system. Its value is
    used to populate the **sysDescr** attribute of the SNMP System group.

For information on setting the **sysContact**, **sysLocation**, and **sysName**
attributes, see
:ref:`Setting SNMP Identifying Information <setting-snmp-identifying-information>`.

The following SNMP attributes are set as follows:

**sysObjectId**
    Set to **iso.org.dod.internet.private.enterprise.wrs.titanium**
    \(1.3.6.1.4.1.1.2\).

**sysUpTime**
    Set to the up time of the active controller.

**sysServices**
    Set to the nominal value of 72 to indicate that the host provides services
    at layers 1 to 7.

.. _snmp-overview-section-N100C9-N1001F-N10001:

--------------------------
Wind River Enterprise MIBs
--------------------------

|prod| supports the Wind River Enterprise Registration and Alarm MIBs.

**Enterprise Registration MIB, wrsEnterpriseReg.mib**
    Defines the Wind River Systems \(WRS\) hierarchy underneath
    **iso\(1\).org\(3\).dod\(6\).internet\(1\).private\(4\).enterprise\(1\)**.
    This hierarchy is administered as follows:

    - **.wrs\(731\)**, the IANA-registered enterprise code for Wind River
      Systems

    - **.wrs\(731\).wrsCommon\(1\).wrs<Module\>\(1-...\)**,
      defined in wrsCommon<Module\>.mib.

    - **.wrs\(731\).wrsProduct\(2-...\)**, defined in wrs<Product\>.mib.

**Alarm MIB, wrsAlarmMib.mib**
    Defines the common TRAP and ALARM MIBs for |org| products.
    The definition includes textual conventions, an active alarm table, a
    historical alarm table, a customer log table, and traps.

    **Textual Conventions**
        Semantic statements used to simplify definitions in the active alarm
        table and traps components of the MIB.

    **Tables**
        See :ref:`SNMP Event Table <snmp-event-table>` for detailed
        descriptions.

    **Traps**
        See :ref:`Traps <traps>` for detailed descriptions.

47
doc/source/fault/suppressing-an-alarm-using-the-cli.rst
Normal file
@ -0,0 +1,47 @@

.. ani1552680633324

.. _suppressing-an-alarm-using-the-cli:

===============================
Suppress an Alarm Using the CLI
===============================

You can use the CLI to prevent a monitored system parameter from generating
unnecessary alarms.

.. rubric:: |proc|

#. Use the :command:`fm event-suppress` command to suppress a single alarm or
   multiple alarms by ID.

   .. code-block:: none

      ~(keystone_admin)$ fm event-suppress [--nowrap] --alarm_id <alarm-id>[,<alarm-id>] \
      [--nopaging] [--uuid]

   where

   **<alarm-id>**
       is a comma-separated list of alarm IDs to suppress

   **--nowrap**
       disables output wrapping

   **--nopaging**
       disables paged output

   **--uuid**
       includes the alarm type UUIDs in the output

   An error message is generated in the case of an invalid
   <alarm-id>: **Alarm ID not found: <alarm-id\>**.

   If more than one Alarm ID is specified and at least one is invalid, the
   suppress command is not applied \(none of the specified Alarm IDs are
   suppressed\).

   .. note::
       Suppressing an alarm will result in the system *not* notifying the
       operator of this particular fault.

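The all-or-nothing validation described above can be sketched as follows.
This is illustrative Python mirroring the documented behavior, not the fm
client code, and the known alarm IDs are assumed examples:

```python
# Sketch of the documented behavior: if any requested alarm ID is unknown,
# no suppression is applied at all. Illustrative only.

KNOWN_ALARM_IDS = {"100.101", "100.103", "100.104"}  # example IDs (assumed)

def suppress(alarm_ids, suppressed):
    """Suppress all given IDs, or none if any ID is invalid."""
    bad = [a for a in alarm_ids if a not in KNOWN_ALARM_IDS]
    if bad:
        # Mirrors the documented error message format.
        return "Alarm ID not found: %s" % bad[0]
    suppressed.update(alarm_ids)
    return None

suppressed = set()
print(suppress(["100.101", "999.999"], suppressed))  # Alarm ID not found: 999.999
print(sorted(suppressed))                            # [] - nothing was suppressed
print(suppress(["100.101", "100.104"], suppressed))  # None
print(sorted(suppressed))                            # ['100.101', '100.104']
```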
37
doc/source/fault/suppressing-and-unsuppressing-events.rst
Normal file
@ -0,0 +1,37 @@

.. sla1552680666298

.. _suppressing-and-unsuppressing-events:

==============================
Suppress and Unsuppress Events
==============================

You can set events to a suppressed state and toggle them back to unsuppressed.

.. rubric:: |proc|

#. Open the Events Suppression page, available from **Admin** \>
   **Fault Management** \> **Events Suppression** in the left-hand pane.

   The Events Suppression page appears. It shows the suppression status of
   each event type and provides controls for suppressing or unsuppressing
   each event, depending on the current status of the event.

#. Locate the event ID that you want to suppress.

#. Click the **Suppress Event** button for that event.

   You are prompted to confirm that you want to suppress the event.

   .. caution::
       Suppressing an alarm will result in the system *not* notifying the
       operator of this particular fault.

#. Click **Suppress Event** in the Confirm Suppress Event dialog box.

   The Events Suppression tab is refreshed to show the selected event ID with
   a status of Suppressed, as shown below. The **Suppress Event** button is
   replaced by **Unsuppress Event**, providing a way to toggle the event back
   to unsuppressed.

   .. image:: figures/nlc1463584178366.png

25
doc/source/fault/the-global-alarm-banner.rst
Normal file
@ -0,0 +1,25 @@

.. wtg1552680748451

.. _the-global-alarm-banner:

=======================
The Global Alarm Banner
=======================

The |prod| Horizon Web interface provides an active alarm counts banner in the
page header of all screens.

The global alarm banner provides a high-level indicator of faults on the
system that is always visible, regardless of what page you are on in the GUI.
The banner provides a color-coded snapshot of current active alarm counts for
each alarm severity.

.. image:: figures/xyj1558447807645.png

.. note::
    Suppressed alarms are not shown. For more about suppressed alarms, see
    :ref:`Events Suppression Overview <events-suppression-overview>`.

Clicking the alarm banner opens the Fault Management page, where more
detailed information about the alarms is provided.

63
doc/source/fault/traps.rst
Normal file
@ -0,0 +1,63 @@

.. lmy1552680547012

.. _traps:

=====
Traps
=====

|prod| supports SNMP traps. Traps send unsolicited information to monitoring
software when significant events occur.

The following traps are defined.

.. _traps-ul-p1j-tvn-c5:

- **wrsAlarmCritical**

- **wrsAlarmMajor**

- **wrsAlarmMinor**

- **wrsAlarmWarning**

- **wrsAlarmMessage**

- **wrsAlarmClear**

- **wrsAlarmHierarchicalClear**

.. note::
    Customer Logs always result in **wrsAlarmMessage** traps.

For Critical, Major, Minor, Warning, and Message traps, all variables in the
active alarm table are included as varbinds \(variable bindings\), where each
varbind is a pair of fields consisting of an object identifier and a value
for the object.

For the Clear trap, varbinds include only the following variables:

.. _traps-ul-uks-byn-nkb:

- <AlarmID>

- <EntityInstanceID>

- <DateAndTime>

- <ReasonText>

For the HierarchicalClear trap, varbinds include only the following variables:

.. _traps-ul-isn-fyn-nkb:

- <EntityInstanceID>

- <DateAndTime>

- <ReasonText>

For all alarms, the Notification Type is based on the severity of the trap or
alarm. This facilitates interaction with most SNMP trap viewers, which
typically use the Notification Type to drive the coloring of traps, that is,
red for critical, yellow for minor, and so on.

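The per-trap varbind rules above can be sketched as a small lookup table.
This is illustrative Python only; the field names approximate the alarm table
variables and are not the authoritative MIB definitions:

```python
# Sketch of the varbind selection rules described above. Field names are
# illustrative; see wrsAlarmMib.mib for the authoritative definitions.

ALARM_FIELDS = [
    "AlarmID", "EntityInstanceID", "DateAndTime", "AlarmSeverity",
    "ReasonText", "EventType", "ProbableCause", "ProposedRepairAction",
    "ServiceAffecting", "SuppressionAllowed",
]

# Which fields each trap carries, per the rules above.
VARBINDS_BY_TRAP = {
    "wrsAlarmCritical": ALARM_FIELDS,
    "wrsAlarmMajor": ALARM_FIELDS,
    "wrsAlarmMinor": ALARM_FIELDS,
    "wrsAlarmWarning": ALARM_FIELDS,
    "wrsAlarmMessage": ALARM_FIELDS,
    "wrsAlarmClear": ["AlarmID", "EntityInstanceID", "DateAndTime", "ReasonText"],
    "wrsAlarmHierarchicalClear": ["EntityInstanceID", "DateAndTime", "ReasonText"],
}

def varbinds_for(trap_name, alarm):
    """Return the (field, value) pairs a given trap would include."""
    return [(f, alarm.get(f)) for f in VARBINDS_BY_TRAP[trap_name]]

alarm = {"AlarmID": "100.104", "EntityInstanceID": "host=controller-0",
         "DateAndTime": "2014-06-25T16:58:57", "ReasonText": "/dev/sda3 full"}
print([f for f, _ in varbinds_for("wrsAlarmHierarchicalClear", alarm)])
# ['EntityInstanceID', 'DateAndTime', 'ReasonText']
```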
99
doc/source/fault/troubleshooting-log-collection.rst
Normal file
@ -0,0 +1,99 @@

.. ley1552581824091

.. _troubleshooting-log-collection:

===========================
Troubleshoot Log Collection
===========================

The |prod| log collection tool gathers detailed information.

.. contents::
   :local:
   :depth: 1

.. _troubleshooting-log-collection-section-N10061-N1001C-N10001:

------------------------------
Collect Tool Caveats and Usage
------------------------------

.. _troubleshooting-log-collection-ul-dpj-bxp-jdb:

- Log in as **sysadmin**, not as root, on the active controller and use the
  :command:`collect` command.

- All usage options can be found by using the following command:

  .. code-block:: none

     (keystone_admin)$ collect --help

- For |prod| Simplex or Duplex systems, use the following command:

  .. code-block:: none

     (keystone_admin)$ collect --all

- For |prod| Standard systems, use the following commands:

  - For a small deployment \(less than two worker nodes\):

    .. code-block:: none

       (keystone_admin)$ collect --all

  - For large deployments:

    .. code-block:: none

       (keystone_admin)$ collect --list host1 host2 host3

- For systems with an up-time of more than 2 months, use the date range
  options.

  Use --start-date to collect logs on and after a given date:

  .. code-block:: none

     (keystone_admin)$ collect [--start-date | -s] <YYYYMMDD>

  Use --end-date to collect logs on and before a given date:

  .. code-block:: none

     (keystone_admin)$ collect [--end-date | -e] <YYYYMMDD>

- To prefix the name of the collect tarball, so that a particular
  :command:`collect` can be identified easily when several are present, use
  the following command:

  .. code-block:: none

     (keystone_admin)$ collect [--name | -n] <prefix>

  For example, the following prepends **TEST1** to the name of the tarball:

  .. code-block:: none

     (keystone_admin)$ collect --name TEST1
     [sudo] password for sysadmin:
     collecting data from 1 host(s): controller-0
     collecting controller-0_20200316.155805 ... done (00:01:39 56M)
     creating user-named tarball /scratch/TEST1_20200316.155805.tar ... done (00:01:39 56M)

- Prior to using the :command:`collect` command, the nodes must be
  unlocked-enabled or disabled-online, and must have been unlocked at
  least once.

- For a node that is rebooting indefinitely, lock the node and wait for it to
  reach the disabled-online state before collecting its logs.

- You may be required to run the :command:`collect` command locally if the
  collect tool running from the active controller node fails to collect
  logs from one of the system nodes. Execute the :command:`collect` command
  using the console or BMC connection on the node that displays the failure.

.. only:: partner

   .. include:: ../_includes/troubleshooting-log-collection.rest

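The date-range options above take dates in <YYYYMMDD> form. As a quick
sketch, such a range can be validated before passing it to
:command:`collect` (illustrative Python helper, not part of the tool):

```python
# Validate a YYYYMMDD date range of the kind accepted by the collect
# date options described above. Illustrative helper, not part of collect.
from datetime import datetime

def parse_yyyymmdd(text):
    """Parse a date in the YYYYMMDD form used by --start-date/--end-date."""
    return datetime.strptime(text, "%Y%m%d").date()

def validate_range(start, end):
    """Return the parsed pair, rejecting an inverted range."""
    s, e = parse_yyyymmdd(start), parse_yyyymmdd(end)
    if s > e:
        raise ValueError("start date %s is after end date %s" % (start, end))
    return s, e

print(validate_range("20200101", "20200316"))
# (datetime.date(2020, 1, 1), datetime.date(2020, 3, 16))
```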
41
doc/source/fault/unsuppressing-an-alarm-using-the-cli.rst
Normal file
@ -0,0 +1,41 @@

.. maj1552680619436

.. _unsuppressing-an-alarm-using-the-cli:

=================================
Unsuppress an Alarm Using the CLI
=================================

If you need to reactivate a suppressed alarm, you can do so using the CLI.

.. rubric:: |proc|

- Use the :command:`fm event-unsuppress` CLI command to unsuppress a
  currently suppressed alarm.

  .. code-block:: none

     ~(keystone_admin)$ fm event-unsuppress [--nowrap] --alarm_id <alarm-id>[,<alarm-id>] \
     [--nopaging] [--uuid]

  where

  **<alarm-id>**
      is a comma-separated list of Alarm IDs of alarms to unsuppress

  **--nowrap**
      disables output wrapping

  **--nopaging**
      disables paged output

  **--uuid**
      includes the alarm type UUIDs in the output

  Alarm types with the specified <alarm-id\(s\)> will be unsuppressed.

  You can unsuppress all currently suppressed alarms using the following
  command:

  .. code-block:: none

     ~(keystone_admin)$ fm event-unsuppress --all [--nopaging] [--uuid]

47
doc/source/fault/viewing-active-alarms-using-horizon.rst
Normal file
@ -0,0 +1,47 @@

.. sqv1552680735693

.. _viewing-active-alarms-using-horizon:

================================
View Active Alarms Using Horizon
================================

The |prod| Horizon Web interface provides a page for viewing active alarms.

Alarms are fault conditions that have a state; they are set and cleared by the
system as a result of monitoring and detecting a change in a fault condition.
Active alarms are alarms that are in the set condition. Active alarms typically
require user action to be cleared, for example, replacing a faulty cable or
removing files from a nearly full filesystem.

.. note::
    For data networks and worker host data interfaces, you can also use the
    Data Network Topology view to monitor active alarms.

.. xreflink For more information, see |datanet-doc|: :ref:`The Data Network Topology View <the-data-network-topology-view>`.

.. rubric:: |proc|

.. _viewing-active-alarms-using-horizon-steps-n43-ssf-pkb:

#. Select **Admin** \> **Fault Management** \> **Active Alarms** in the left
   pane.

   The currently active alarms are displayed in a table, by default sorted by
   severity with the most critical alarms at the top. A color-coded summary
   count of active alarms is shown at the top of the active alarm tab as well.

   You can change the sorting of entries by clicking on the column titles.
   For example, to sort the table by timestamp, click **Timestamp**. The
   entries are re-sorted by timestamp.

   Suppressed alarms are excluded by default from the table. Suppressed alarms
   can be included or excluded in the table with the **Show Suppressed** and
   **Hide Suppressed** filter buttons at the top right of the table. The
   suppression filter buttons are only shown when one or more alarms are
   suppressed.

   The **Suppression Status** column is only shown in the table when the
   **Show Suppressed** filter button is selected.

#. Click the Alarm ID of an alarm entry in the table to display the details
   of the alarm.

192
doc/source/fault/viewing-active-alarms-using-the-cli.rst
Normal file
@ -0,0 +1,192 @@

.. pdd1551804388161

.. _viewing-active-alarms-using-the-cli:

================================
View Active Alarms Using the CLI
================================

You can use the CLI to find information about currently active system alarms.

.. rubric:: |context|

.. note::
   You can also use the :command:`fm alarm-summary` command to view the count
   of alarms and warnings for the system.

To review detailed information about a specific alarm instance, see
:ref:`Viewing Alarm Details Using the CLI <viewing-alarm-details-using-the-cli>`.

.. rubric:: |proc|

.. _viewing-active-alarms-using-the-cli-steps-gsj-prg-pkb:

#. Log in with administrative privileges.

   .. code-block:: none

      $ source /etc/platform/openrc

#. Run the :command:`fm alarm-list` command to view alarms.

   The command syntax is:

   .. code-block:: none

      fm alarm-list [--nowrap] [-q <QUERY>] [--uuid] [--include_suppress] [--mgmt_affecting] [--degrade_affecting]

   **--nowrap**
       Prevents word-wrapping of output. This option is useful when output
       will be piped to another process.

   **-q**
       <QUERY> is a query string used to filter the list output. The typical
       OpenStack CLI syntax for this query string is used: a combination of
       attribute, operator, and value. For example, severity=warning filters
       alarms with a severity of warning. More complex queries can be built.
       See the upstream OpenStack CLI documentation for more details on
       <QUERY> string syntax, and see the additional query examples below.

       You can use one of the following --query command filters to view
       specific subsets of alarms, or a particular alarm:

       .. table::
          :widths: auto

          +------------------------------------------------+----------------------------------------------------------------------+
          | Query Filter                                   | Comment                                                              |
          +================================================+======================================================================+
          | :command:`uuid=<uuid\>`                        | Query alarms by UUID, for example:                                   |
          |                                                |                                                                      |
          |                                                | .. code-block:: none                                                 |
          |                                                |                                                                      |
          |                                                |    ~(keystone_admin)$ fm alarm-list --query uuid=4ab5698a-19cb...    |
          +------------------------------------------------+----------------------------------------------------------------------+
          | :command:`alarm\_id=<alarm id\>`               | Query alarms by alarm ID, for example:                               |
          |                                                |                                                                      |
          |                                                | .. code-block:: none                                                 |
          |                                                |                                                                      |
          |                                                |    ~(keystone_admin)$ fm alarm-list --query alarm_id=100.104         |
          +------------------------------------------------+----------------------------------------------------------------------+
          | :command:`alarm\_type=<type\>`                 | Query alarms by type, for example:                                   |
          |                                                |                                                                      |
          |                                                | .. code-block:: none                                                 |
          |                                                |                                                                      |
          |                                                |    ~(keystone_admin)$ fm alarm-list --query \                        |
          |                                                |    alarm_type=operational-violation                                  |
          +------------------------------------------------+----------------------------------------------------------------------+
          | :command:`entity\_type\_id=<type id\>`         | Query alarms by entity type ID, for example:                         |
          |                                                |                                                                      |
          |                                                | .. code-block:: none                                                 |
          |                                                |                                                                      |
          |                                                |    ~(keystone_admin)$ fm alarm-list --query \                        |
          |                                                |    entity_type_id=system.host                                       |
          +------------------------------------------------+----------------------------------------------------------------------+
          | :command:`entity\_instance\_id=<instance id\>` | Query alarms by entity instance ID, for example:                     |
          |                                                |                                                                      |
          |                                                | .. code-block:: none                                                 |
          |                                                |                                                                      |
          |                                                |    ~(keystone_admin)$ fm alarm-list --query \                        |
          |                                                |    entity_instance_id=host=worker-0                                  |
          +------------------------------------------------+----------------------------------------------------------------------+
          | :command:`severity=<severity\>`                | Query alarms by severity type, for example:                          |
          |                                                |                                                                      |
          |                                                | .. code-block:: none                                                 |
          |                                                |                                                                      |
          |                                                |    ~(keystone_admin)$ fm alarm-list --query severity=warning         |
          |                                                |                                                                      |
          |                                                | The valid severity types are critical, major, minor, and warning.    |
          +------------------------------------------------+----------------------------------------------------------------------+

       Query command filters can be combined into a single expression
       separated by semicolons, as illustrated in the following example:

       .. code-block:: none

          ~(keystone_admin)$ fm alarm-list -q 'alarm_id=400.002;entity_instance_id=service_domain=controller.service_group=directory-services'

   **--uuid**
       Lists the active alarms with a unique UUID for each alarm, so that the
       UUID can be used to display alarm details with the
       :command:`fm alarm-show <UUID>` command.

   **--include\_suppress**
       Includes suppressed alarms in the list, so that all active alarms are
       displayed. Suppressed alarms are displayed with their Alarm ID set to
       S<\(alarm-id\)>.

   **--mgmt\_affecting**
       Management-affecting alarms prevent some critical administrative
       actions from being performed, for example, software upgrades. Using
       the --mgmt\_affecting option adds a 'Management Affecting' column to
       the output, which indicates whether or not each alarm is
       management-affecting.

   **--degrade\_affecting**
       Includes degrade-affecting status in the output.

   The following example shows alarm UUIDs.

   .. code-block:: none

      ~(keystone_admin)$ fm alarm-list --uuid
      +--------------+-------+------------------+---------------+----------+-----------+
      | UUID         | Alarm | Reason Text      | Entity ID     | Severity | Time      |
      |              | ID    |                  |               |          | Stamp     |
      +--------------+-------+------------------+---------------+----------+-----------+
      | 6056e290-    | 200.  | compute-0 was    | host=         | warning  | 2019      |
      | 2e56-        | 001   | administratively | compute-0     |          | -08-29T   |
      | 4e22-b07a-   |       | locked to take   |               |          | 17:00:16. |
      | ff9cf4fbd81a |       | it out-of        |               |          | 363072    |
      |              |       | -service.        |               |          |           |
      |              |       |                  |               |          |           |
      |              |       |                  |               |          |           |
      | 0a8a4aec-    | 100.  | NTP address      | host=         | minor    | 2019      |
      | a2cb-        | 114   | 2607:5300:201:3  | controller-1. |          | -08-29T   |
      | 46aa-8498-   |       | is not a valid   | ntp=          |          | 15:44:44. |
      | 9ed9b6448e0c |       | or a reachable   | 2607:5300:    |          | 773704    |
      |              |       | NTP server.      | 201:3         |          |           |
      |              |       |                  |               |          |           |
      |              |       |                  |               |          |           |
      +--------------+-------+------------------+---------------+----------+-----------+

   This command shows a column to track the management-affecting severity of
   each alarm type.

   .. code-block:: none

      ~(keystone_admin)$ fm alarm-list --mgmt_affecting
      +-------+-------------------+---------------+----------+------------+-------------+
      | Alarm | Reason Text       | Entity ID     | Severity | Management | Time Stamp  |
      | ID    |                   |               |          | Affecting  |             |
      +-------+-------------------+---------------+----------+------------+-------------+
      | 100.  | Platform Memory   | host=         | major    | False      | 2019-05-21T |
      | 103   | threshold         | controller-0. |          |            | 13:15:26.   |
      |       | exceeded ;        | numa=node0    |          |            | 464231      |
      |       | threshold 80%,    |               |          |            |             |
      |       | actual 80%        |               |          |            |             |
      |       |                   |               |          |            |             |
      | 100.  | Platform Memory   | host=         | major    | False      | 2019-05-21T |
      | 103   | threshold         | controller-0  |          |            | 13:15:26.   |
      |       | exceeded ;        |               |          |            | 456738      |
      |       | threshold 80%,    |               |          |            |             |
      |       | actual 80%        |               |          |            |             |
      |       |                   |               |          |            |             |
      | 200.  | controller-0 is   | host=         | major    | True       | 2019-05-20T |
      | 006   | degraded due to   | controller-0. |          |            | 23:56:51.   |
      |       | the failure of    | process=ceph  |          |            | 557509      |
      |       | its 'ceph (osd.0, | (osd.0, )     |          |            |             |
      |       | )' process. Auto  |               |          |            |             |
      |       | recovery of this  |               |          |            |             |
      |       | major process is  |               |          |            |             |
      |       | in progress.      |               |          |            |             |
      |       |                   |               |          |            |             |
      | 200.  | controller-0 was  | host=         | warning  | True       | 2019-05-17T |
      | 001   | administratively  | controller-0  |          |            | 14:17:32.   |
      |       | locked to take it |               |          |            | 794640      |
      |       | out-of-service.   |               |          |            |             |
      |       |                   |               |          |            |             |
      +-------+-------------------+---------------+----------+------------+-------------+

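As a sketch, the semicolon-joined query expressions shown above can be
assembled programmatically. This is an illustrative Python helper, not part
of the fm client; the attribute names are the ones documented above:

```python
# Build an fm alarm-list -q expression from attribute/value pairs, using the
# semicolon-separated syntax shown above. Illustrative helper only.

def build_query(**filters):
    """Join attribute=value pairs with ';' as fm alarm-list -q expects."""
    return ";".join("%s=%s" % (k, v) for k, v in sorted(filters.items()))

q = build_query(alarm_id="400.002",
                entity_instance_id="service_domain=controller.service_group=directory-services")
print(q)
# alarm_id=400.002;entity_instance_id=service_domain=controller.service_group=directory-services
```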
56
doc/source/fault/viewing-alarm-details-using-the-cli.rst
Normal file
@ -0,0 +1,56 @@

.. kfs1580755127017

.. _viewing-alarm-details-using-the-cli:

================================
View Alarm Details Using the CLI
================================

You can view detailed information to help troubleshoot an alarm.

.. rubric:: |proc|

- Use the following command to view details about an alarm.

  .. code-block:: none

     fm alarm-show <uuid>

  <uuid> is the UUID of the alarm to query. Use the :command:`fm alarm-list`
  command to obtain UUIDs, as described in
  :ref:`Viewing Active Alarms Using the CLI <viewing-active-alarms-using-the-cli>`.

  .. code-block:: none

     ~(keystone_admin)$ fm alarm-show 4ab5698a-19cb-4c17-bd63-302173fef62c
     +------------------------+-------------------------------------------------+
     | Property               | Value                                           |
     +------------------------+-------------------------------------------------+
     | alarm_id               | 100.104                                         |
     | alarm_state            | set                                             |
     | alarm_type             | operational-violation                           |
     | entity_instance_id     | system=hp380-1_4.host=controller-0              |
     | entity_type_id         | system.host                                     |
     | probable_cause         | threshold-crossed                               |
     | proposed_repair_action | /dev/sda3 check usage                           |
     | reason_text            | /dev/sda3 critical threshold set (0.00 MB left) |
     | service_affecting      | False                                           |
     | severity               | critical                                        |
     | suppression            | True                                            |
     | timestamp              | 2014-06-25T16:58:57.324613                      |
     | uuid                   | 4ab5698a-19cb-4c17-bd63-302173fef62c            |
     +------------------------+-------------------------------------------------+

  The pair of attributes **\(alarm\_id, entity\_instance\_id\)** uniquely
  identifies an active alarm:

  **alarm\_id**
      An ID identifying the particular alarm condition. Note that some alarm
      conditions, such as *administratively locked*, can be raised by more
      than one entity-instance-id.

  **entity\_instance\_id**
      Type and instance information of the object raising the alarm: a
      period-separated list of \(key, value\) pairs representing the
      containment structure of the overall entity instance. This structure
      is used for processing hierarchical clearing of alarms.

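As a sketch, the containment structure described above can be parsed and used
for prefix-based hierarchical clearing like this. The Python below is
illustrative only, not product code, and naively assumes that keys and values
contain no periods:

```python
# Illustrative sketch: parse an entity_instance_id into its (key, value)
# pairs and decide whether a hierarchical clear at a given containment
# prefix would clear a particular alarm. Not a product API.

def parse_entity_instance_id(eid):
    """Split 'system=hp380-1_4.host=controller-0' into (key, value) pairs.

    Naive split on '.'; assumes keys and values contain no periods.
    """
    pairs = []
    for part in eid.split("."):
        key, _, value = part.partition("=")
        pairs.append((key, value))
    return pairs

def hierarchical_clear_matches(clear_eid, alarm_eid):
    """A hierarchical clear at a prefix (e.g. 'host=controller-0') clears
    alarms whose entity_instance_id starts with that containment prefix."""
    prefix = parse_entity_instance_id(clear_eid)
    target = parse_entity_instance_id(alarm_eid)
    return target[:len(prefix)] == prefix

print(parse_entity_instance_id("system=hp380-1_4.host=controller-0"))
# [('system', 'hp380-1_4'), ('host', 'controller-0')]
print(hierarchical_clear_matches("host=controller-0",
                                 "host=controller-0.numa=node0"))
# True
```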
49
doc/source/fault/viewing-suppressed-alarms-using-the-cli.rst
Normal file
@ -0,0 +1,49 @@

.. ohs1552680649558

.. _viewing-suppressed-alarms-using-the-cli:

====================================
View Suppressed Alarms Using the CLI
====================================

Alarms may be suppressed. List them to determine if any need to be unsuppressed
or otherwise managed.

.. rubric:: |proc|

.. _viewing-suppressed-alarms-using-the-cli-steps-hyn-g1x-nkb:

- Use the :command:`fm event-suppress-list` CLI command to view a list of
  all currently suppressed alarms.

  This command shows all alarm IDs along with their suppression status.

  .. code-block:: none

     ~(keystone_admin)$ fm event-suppress-list [--nopaging] [--uuid] [--include-unsuppressed]

  where

  **--nopaging**
      disables paged output; see :ref:`CLI Commands and Paged Output <cli-commands-and-paged-output>`

  **--uuid**
      includes the alarm type UUIDs in the output

  **--include-unsuppressed**
      includes unsuppressed alarm types in the output. By default, only
      suppressed alarm types are shown.

  For example:

  .. code-block:: none

     [sysadmin@controller-0 ~(keystone_admin)] fm event-suppress-list
     +----------+-------------+
     | Event ID | Status      |
     +----------+-------------+
     | 100.101  | suppressed  |
     | 100.103  | suppressed  |
     | 100.105  | suppressed  |
     | ...      | ...         |
     +----------+-------------+

55
doc/source/fault/viewing-the-event-log-using-horizon.rst
Normal file
@ -0,0 +1,55 @@

.. ubf1552680722858

.. _viewing-the-event-log-using-horizon:

================================
View the Event Log Using Horizon
================================

The |prod| Horizon Web interface provides a convenient way to work with
historical alarms and customer logs.

.. rubric:: |context|

The event log consolidates historical alarm events, that is, the sets and
clears of alarms that have occurred in the past, and customer logs.

Customer logs capture important system events and provide useful information
to the administrator for the purposes of overall fault management. Customer
log events do not have a state and do not typically require administrator
action; for example, they may report a failed login attempt or the fact
that a container was evacuated to another host.

Customer logs and historical alarms' set and clear actions are held in a
buffer, with older entries discarded as needed to release logging space.

.. rubric:: |proc|

#. Select **Admin** \> **Fault Management** \> **Events** in the left pane.

   The Events window appears. By default, the Events screen shows all events,
   including both historical set/clear alarms and logs, with the most recent
   events at the top.

#. Use the filter selections from the search field to select the information
   you want to view.

   Use the **All Events**, **Alarm Events**, and **Log Events** filter buttons
   to display all events, only historical alarm set/clear events, or only
   customer log events. By default, all events are displayed.
   Suppressed events are by default excluded from the table. Suppressed events
   can be included or excluded in the table with the **Show Suppressed** and
   **Hide Suppressed** filter buttons at the top right of the table. The
   suppression filter buttons are only shown when one or more events are
   suppressed.

   The **Suppression Status** column is only shown in the table when the
   **Show Suppressed** filter button is selected.

   .. image:: figures/psa1567524091300.png

   You can sort the entries by clicking on the column titles. For example, to
   sort the view of the entries by severity, click **Severity**; the entries
   are re-sorted and grouped by severity.

#. Click the arrow to the left of an event entry in the table for an expanded
   view of event details.

doc/source/fault/viewing-the-event-log-using-the-cli.rst (new file, 183 lines)
.. fcv1552680708686

.. _viewing-the-event-log-using-the-cli:

================================
View the Event Log Using the CLI
================================

You can use CLI commands to work with historical alarms and logs in the
event log.

.. rubric:: |proc|

.. _viewing-the-event-log-using-the-cli-steps-v3r-stf-pkb:

#. Log in with administrative privileges.

   .. code-block:: none

      $ source /etc/platform/openrc

#. Use the :command:`fm event-list` command to view historical alarm
   sets/clears and logs. By default, only unsuppressed events are shown.

   For more about event suppression, see
   :ref:`Events Suppression Overview <events-suppression-overview>`.

   The syntax of the command is:

   .. code-block:: none

      fm event-list [-q <QUERY>] [-l <NUMBER>] [--alarms] [--logs] [--include_suppress]

   Optional arguments:

   **-q QUERY, --query QUERY**
      A filter of the form key\[op\]data\_type::value. data\_type is
      optional, but if supplied must be string, integer, float, or boolean.

   **-l NUMBER, --limit NUMBER**
      Maximum number of event logs to return.

   **--alarms**
      Show historical alarm sets/clears only.

   **--logs**
      Show customer logs only.

   **--include\_suppress**
      Show suppressed alarms as well as unsuppressed alarms.

   **--uuid**
      Include the unique event UUID in the listing so that it can be used
      to display event details with :command:`fm event-show <uuid>`.

   **--nopaging**
      Disable output paging.

   For details on CLI paging, see
   :ref:`CLI Commands and Paged Output <cli-commands-and-paged-output>`.
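The key\[op\]data\_type::value query shape described above can be sketched as a small parser. This is an illustrative assumption, not the fm client's actual implementation, and the operator set shown (``=`` and ``!=``) is a guess for demonstration:

```python
import re

# Hedged sketch of the documented query shape: key[op]data_type::value,
# where data_type (string|integer|float|boolean) is optional.
# The operators accepted here (= and !=) are assumptions for illustration.
QUERY_RE = re.compile(
    r"^(?P<key>\w+)"
    r"(?P<op>=|!=)"
    r"(?:(?P<type>string|integer|float|boolean)::)?"
    r"(?P<value>.+)$"
)

def parse_query(expr):
    """Split one -q/--query expression into key, op, optional type, value."""
    m = QUERY_RE.match(expr)
    if not m:
        raise ValueError("bad query expression: %r" % expr)
    return m.groupdict()

print(parse_query("event_log_id=string::100.103"))
```

A type-less expression such as ``severity=major`` parses the same way, with the ``type`` field left unset.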
   For example:

   .. code-block:: none

      [sysadmin@controller-0 ~(keystone_admin)]$ fm event-list -l 5
      +-----------+-----+-----+--------------------+-----------------+---------+
      |Time Stamp |State|Event|Reason Text         |Entity Instance  |Severity |
      |           |     |Log  |                    |ID               |         |
      |           |     |ID   |                    |                 |         |
      +-----------+-----+-----+--------------------+-----------------+---------+
      |2019-05-21T| set |100. |Platform Memory     |host=controller-0|major    |
      | 13:15:26. |     |103  |threshold exceeded ;|numa=node0       |         |
      | 464231    |     |     |threshold 80%,actual|                 |         |
      |           |     |     |80%                 |                 |         |
      |           |     |     |                    |                 |         |
      |2019-05-21T| set | 100.|Platform Memory     |host=controller-0|major    |
      | 13:15:26. |     | 103 |threshold exceeded; |                 |         |
      | 456738    |     |     |threshold 80%,actual|                 |         |
      |           |     |     |80%                 |                 |         |
      |           |     |     |                    |                 |         |
      |2019-05-21T|clear| 100.|Platform Memory     |host=controller-0|major    |
      | 13:07:26. |     | 103 |threshold exceeded; |numa=node0       |         |
      | 658374    |     |     |threshold 80%,actual|                 |         |
      |           |     |     |79%                 |                 |         |
      |           |     |     |                    |                 |         |
      |2019-05-21T|clear| 100.|Platform Memory     |host=controller-0|major    |
      | 13:07:26. |     | 103 |threshold exceeded; |                 |         |
      | 656608    |     |     |threshold 80%,actual|                 |         |
      |           |     |     |79%                 |                 |         |
      |           |     |     |                    |                 |         |
      |2019-05-21T| set | 100.|Platform Memory     |host=controller-0|major    |
      | 13:05:26. |     | 103 |threshold exceeded; |numa=node0       |         |
      | 481240    |     |     |threshold 80%,actual|                 |         |
      |           |     |     |79%                 |                 |         |
      |           |     |     |                    |                 |         |
      +-----------+-----+-----+--------------------+-----------------+---------+

   .. note::
      You can also use the ``--nopaging`` option to avoid paging long event
      lists.

   In the following example, the :command:`fm event-list` command shows
   alarms only; the **State** column indicates either **set** or **clear**.

   .. code-block:: none

      [sysadmin@controller-0 ~(keystone_admin)]$ fm event-list -l 5 --alarms
      +-------------+-------+-------+--------------------+---------------+----------+
      | Time Stamp  | State | Event | Reason Text        | Entity        | Severity |
      |             |       | Log   |                    | Instance ID   |          |
      |             |       | ID    |                    |               |          |
      +-------------+-------+-------+--------------------+---------------+----------+
      | 2019-05-21T | set   | 100.  | Platform Memory    | host=         | major    |
      | 13:15:26.   |       | 103   | threshold exceeded | controller-0. |          |
      | 464231      |       |       | ; threshold 80%,   | numa=node0    |          |
      |             |       |       | actual 80%         |               |          |
      |             |       |       |                    |               |          |
      | 2019-05-21T | set   | 100.  | Platform Memory    | host=         |          |
      | 13:15:26.   |       | 103   | threshold exceeded | controller-0  | major    |
      | 456738      |       |       | ; threshold 80%,   |               |          |
      |             |       |       | actual 80%         |               |          |
      |             |       |       |                    |               |          |
      | 2019-05-21T | clear | 100.  | Platform Memory    | host=         |          |
      | 13:07:26.   |       | 103   | threshold exceeded | controller-0. | major    |
      | 658374      |       |       | ; threshold 80%,   | numa=node0    |          |
      |             |       |       | actual 79%         |               |          |
      |             |       |       |                    |               |          |
      | 2019-05-21T | clear | 100.  | Platform Memory    | host=         |          |
      | 13:07:26.   |       | 103   | threshold exceeded | controller-0  | major    |
      | 656608      |       |       | ; threshold 80%,   |               |          |
      |             |       |       | actual 79%         |               |          |
      |             |       |       |                    |               |          |
      | 2019-05-21T | set   | 100.  | Platform Memory    | host=         |          |
      | 13:05:26.   |       | 103   | threshold exceeded | controller-0. | major    |
      | 481240      |       |       | ; threshold 80%,   | numa=node0    |          |
      |             |       |       | actual 79%         |               |          |
      |             |       |       |                    |               |          |
      +-------------+-------+-------+--------------------+---------------+----------+

   In the following example, the :command:`fm event-list` command shows logs
   only; the **State** column indicates **log**.

   .. code-block:: none

      [sysadmin@controller-0 ~(keystone_admin)]$ fm event-list -l 5 --logs
      +-------------+-------+-------+---------------------+---------------+----------+
      | Time Stamp  | State | Event | Reason Text         | Entity        | Severity |
      |             |       | Log   |                     | Instance ID   |          |
      |             |       | ID    |                     |               |          |
      +-------------+-------+-------+---------------------+---------------+----------+
      | 2019-05-21T | log   | 700.  | Exited Multi-Node   | subsystem=vim | critical |
      | 00:50:29.   |       | 217   | Recovery Mode       |               |          |
      | 525068      |       |       |                     |               |          |
      |             |       |       |                     |               |          |
      | 2019-05-21T | log   | 700.  | Entered Multi-Node  | subsystem=vim | critical |
      | 00:49:49.   |       | 216   | Recovery Mode       |               |          |
      | 979021      |       |       |                     |               |          |
      |             |       |       |                     |               |          |
      | 2019-05-21T | log   | 401.  | Service group vim-  | service       |          |
      | 00:49:31.   |       | 002   | services redundancy | _domain=      | critical |
      | 205116      |       |       | restored            | controller.   |          |
      |             |       |       |                     | service_group |          |
      |             |       |       |                     | =vim-         |          |
      |             |       |       |                     | services      |          |
      |             |       |       |                     |               |          |
      | 2019-05-21T | log   | 401.  | Service group vim-  | service       |          |
      | 00:49:30.   |       | 001   | services state      | _domain=      | critical |
      | 003221      |       |       | change from go-     | controller.   |          |
      |             |       |       | active to active on | service_group |          |
      |             |       |       | host controller-0   | =vim-services |          |
      |             |       |       |                     | .host=        |          |
      |             |       |       |                     | controller-0  |          |
      |             |       |       |                     |               |          |
      | 2019-05-21T | log   | 401.  | Service group       | service       |          |
      | 00:49:29.   |       | 002   | controller-services | _domain=      | critical |
      | 950524      |       |       | redundancy restored | controller.   |          |
      |             |       |       |                     | service       |          |
      |             |       |       |                     | _group=       |          |
      |             |       |       |                     | controller    |          |
      |             |       |       |                     | -services     |          |
      |             |       |       |                     |               |          |
      +-------------+-------+-------+---------------------+---------------+----------+
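Because long cell values wrap across physical lines in these listings, scripted consumers must fold wrapped rows back into one record per event. The following minimal Python sketch is illustrative only, not part of the fm client; the embedded sample table stands in for captured ``fm event-list`` output:

```python
# Hedged sketch: fold the wrapped rows of an "fm event-list" table back into
# one record per event. Records are assumed to be separated by all-blank
# rows, as in the listings above; wrapped cell text rejoins with a space.

SAMPLE = """\
+-----------+-----+-----+--------------------+
|Time Stamp |State|Event|Reason Text         |
+-----------+-----+-----+--------------------+
|2019-05-21T| set |100. |Platform Memory     |
| 13:15:26. |     |103  |threshold exceeded  |
|           |     |     |                    |
|2019-05-21T|clear|100. |Platform Memory     |
| 13:07:26. |     |103  |threshold exceeded  |
+-----------+-----+-----+--------------------+
"""

def parse_events(text):
    """Return one list of cell strings per logical table row."""
    records, current, borders = [], None, 0
    for line in text.splitlines():
        if line.startswith("+"):
            borders += 1      # table border line
            current = None
            continue
        if borders < 2 or not line.startswith("|"):
            continue          # skip the header block and stray lines
        cells = [c.strip() for c in line.strip("|").split("|")]
        if not any(cells):
            current = None    # blank spacer row ends the record
            continue
        if current is None:
            current = cells
            records.append(current)
        else:
            # Continuation line: append wrapped text to the open record.
            for i, c in enumerate(cells):
                if c:
                    current[i] = (current[i] + " " + c).strip()
    return records

for row in parse_events(SAMPLE):
    print(row)
```

With the sample above this yields two records, one ``set`` and one ``clear``, each with its timestamp and reason text rejoined.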
@@ -57,6 +57,15 @@ Configuration

   configuration/index

----------------
Fault Management
----------------

.. toctree::
   :maxdepth: 2

   fault/index

----------------
Operation guides
----------------
@@ -91,18 +100,13 @@ General information
Governance
----------

StarlingX is a top-level Open Infrastructure Foundation confirmed project that
is governed by two separate bodies: The `Open Infrastructure Foundation Board of
Directors`_ and the `StarlingX Technical Steering Committee`_.

See `StarlingX Governance`_ for additional information about StarlingX project
governance.

.. _`Open Infrastructure Foundation Board of Directors`: https://openinfra.dev/about/board/
.. _`StarlingX Technical Steering Committee`: https://docs.starlingx.io/governance/reference/tsc/
.. _`StarlingX Governance`: https://docs.starlingx.io/governance/