Fault Management doc
Added Data Networks toctree.
Changed case on doc title in top level index.
Changed doc directory to fault-mgmt.
Added Distributed Cloud section.
Broke out "OpenStack Fault Management Overview" statement about remote log collection to conditionally included file.
Incorporated patch 6 review comments.
Also implemented rST :abbr: for first instance of SNMP in each file.
Changed port number and community string in two SNMP walk examples.

Change-Id: I1afd71265e752c4c9a54bf2dc9a173b3e17332a7
Signed-off-by: Stone <ronald.stone@windriver.com>
@ -6,8 +6,8 @@
|
||||
Add an SNMP Community String Using the CLI
|
||||
==========================================
|
||||
|
||||
To enable SNMP services you need to define one or more SNMP community strings
|
||||
using the command line interface.
|
||||
To enable :abbr:`SNMP (Simple Network Management Protocol)` services you need
|
||||
to define one or more SNMP community strings using the command line interface.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
@ -0,0 +1,24 @@
.. gge1558616301307
.. _alarms-management-for-distributed-cloud:

=======================================
Alarms Management for Distributed Cloud
=======================================

The System Controller collects alarm summaries from subclouds.

You can monitor and review a summary count of alarms from all systems by using
either the CLI or the Horizon Web interface.

The System Controller polls all subclouds periodically for alarm summaries.

Alarm summaries are gathered if a subcloud is online. However, they are not
gathered for a subcloud that has never been moved to the Managed state. In
this case, alarm counts are not available for the subcloud and dashes are shown
instead.

You can access detailed alarm information for a subcloud from the System
Controller page by clicking **Alarm and Event Details** for the subcloud from
Horizon. This action automatically switches the interface from the System
Controller page to the subcloud page.
@ -24,8 +24,8 @@ CLI fault management commands that perform paging include:
|
||||
- :command:`fm event-unsuppress-all`
|
||||
|
||||
|
||||
To turn paging off, use the --nopaging option for the above commands. The
|
||||
--nopaging option is useful for bash script writers.
|
||||
To turn paging off, use the ``--nopaging`` option for the above commands. The
|
||||
``--nopaging`` option is useful for bash script writers.
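
Since the note above is aimed at script writers, the following is a minimal Python sketch of calling a paging command non-interactively. It assumes the ``fm`` client is on the ``PATH`` and that ``--nopaging`` is accepted after the subcommand and its arguments. The helper names ``build_fm_argv`` and ``run_fm`` are hypothetical and are not part of the product.

```python
import subprocess
from typing import List, Sequence


def build_fm_argv(subcommand: str, extra: Sequence[str] = (),
                  nopaging: bool = True) -> List[str]:
    """Build an argument list for the fm CLI (hypothetical helper)."""
    argv = ["fm", subcommand, *extra]
    if nopaging:
        # Assumption: --nopaging follows the subcommand and its arguments.
        argv.append("--nopaging")
    return argv


def run_fm(subcommand: str, extra: Sequence[str] = ()) -> str:
    """Run an fm command with paging disabled and return its standard output."""
    result = subprocess.run(
        build_fm_argv(subcommand, extra),
        check=True,           # raise CalledProcessError on a non-zero exit code
        capture_output=True,  # collect output instead of writing to the terminal
        text=True,            # decode bytes to str
    )
    return result.stdout
```

A script would call ``run_fm`` with one of the paging commands listed above and process the returned text, without the pager waiting for keyboard input.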
|
||||
|
||||
.. _cli-commands-and-paged-output-section-N10074-N1001C-N10001:
|
||||
|
@ -0,0 +1,90 @@
.. hmg1558616220923
.. _cli-commands-for-dc-alarms-management:

===================================================
CLI Commands for Distributed Cloud Alarm Management
===================================================

You can use the CLI to review alarm summaries for the Distributed Cloud.

.. _cli-commands-for-alarms-management-ul-ncv-m4y-fdb:

- To show the status of all subclouds, as well as a summary count of alarms
  and warnings for each one, use the :command:`dcmanager alarm summary`
  command.

  For example:

  .. code-block:: none

      ~(keystone_admin)$ dcmanager alarm summary
      +------------+-----------------+--------------+--------------+----------+----------+
      | NAME       | CRITICAL_ALARMS | MAJOR_ALARMS | MINOR_ALARMS | WARNINGS | STATUS   |
      +------------+-----------------+--------------+--------------+----------+----------+
      | subcloud-5 | 0               | 2            | 0            | 0        | degraded |
      | subcloud-1 | 0               | 0            | 0            | 0        | OK       |
      +------------+-----------------+--------------+--------------+----------+----------+

  System Controller alarms and warnings are not included.

  The status is one of the following:

  **OK**
      There are no alarms or warnings, or only warnings.

  **degraded**
      There are minor or major alarms.

  **critical**
      There are critical alarms.

- To show the count of alarms and warnings for the System Controller, use the
  :command:`fm alarm-summary` command.

  For example:

  .. code-block:: none

      ~(keystone_admin)$ fm alarm-summary
      +-----------------+--------------+--------------+----------+
      | Critical Alarms | Major Alarms | Minor Alarms | Warnings |
      +-----------------+--------------+--------------+----------+
      | 0               | 0            | 0            | 0        |
      +-----------------+--------------+--------------+----------+

  The following command is equivalent to :command:`fm alarm-summary`,
  providing a count of alarms and warnings for the System Controller:

  .. code-block:: none

      fm --os-region-name RegionOne alarm-summary

- To show the alarm and warning count for a specific subcloud only, add the
  ``--os-region-name`` parameter and supply the region name.

  For example:

  .. code-block:: none

      ~(keystone_admin)$ fm --os-region-name subcloud2 --os-auth-url http://192.168.121.2:5000/v3 alarm-summary
      +-----------------+--------------+--------------+----------+
      | Critical Alarms | Major Alarms | Minor Alarms | Warnings |
      +-----------------+--------------+--------------+----------+
      | 0               | 0            | 0            | 0        |
      +-----------------+--------------+--------------+----------+

- To list the alarms for a subcloud:

  .. code-block:: none

      ~(keystone_admin)$ fm --os-region-name subcloud2 --os-auth-url http://192.168.121.2:5000/v3 alarm-list
      +----------+--------------------------------------------+-------------------+----------+-------------------+
      | Alarm ID | Reason Text                                | Entity ID         | Severity | Time Stamp        |
      +----------+--------------------------------------------+-------------------+----------+-------------------+
      | 250.001  | controller-0 Configuration is out-of-date. | host=controller-0 | major    | 2018-02-06T21:37: |
      |          |                                            |                   |          | 32.650217         |
      |          |                                            |                   |          |                   |
      | 250.001  | controller-1 Configuration is out-of-date. | host=controller-1 | major    | 2018-02-06T21:37: |
      |          |                                            |                   |          | 29.121674         |
      |          |                                            |                   |          |                   |
      +----------+--------------------------------------------+-------------------+----------+-------------------+
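
The status rules for the subcloud alarm summary can be expressed as a short Python sketch. The function names ``subcloud_status`` and ``parse_summary`` are hypothetical, and the precedence (critical is checked before degraded) is an assumption inferred from the status definitions rather than taken from the product source.

```python
from typing import Dict, Tuple


def subcloud_status(critical: int, major: int, minor: int, warnings: int) -> str:
    """Derive the STATUS value from alarm counts, following the documented rules."""
    if critical > 0:
        return "critical"
    if major > 0 or minor > 0:
        return "degraded"
    # Warnings alone leave the status at OK.
    return "OK"


def parse_summary(table: str) -> Dict[str, Tuple[int, int, int, int, str]]:
    """Parse the text table printed by the alarm summary command into a dict."""
    rows = {}
    for line in table.splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        # Data rows have six cells and a numeric second cell; borders,
        # the header row and the prompt line are skipped.
        if len(cells) == 6 and cells[1].isdigit():
            name, critical, major, minor, warnings, status = cells
            rows[name] = (int(critical), int(major), int(minor), int(warnings), status)
    return rows
```

Applied to the example output, ``subcloud-5`` (two major alarms) maps to ``degraded`` and ``subcloud-1`` (no alarms) maps to ``OK``, matching the STATUS column.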
@ -6,8 +6,8 @@
|
||||
Configure SNMP Trap Destinations
|
||||
================================
|
||||
|
||||
SNMP trap destinations are hosts configured in |prod| to receive unsolicited
|
||||
SNMP notifications.
|
||||
:abbr:`SNMP (Simple Network Management Protocol)` trap destinations are hosts
|
||||
configured in |prod| to receive unsolicited SNMP notifications.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
@ -11,8 +11,10 @@ system.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
||||
Manually deleting an alarm should not be done unless it is absolutely
|
||||
clear that there is no reason for the alarm to be active.
|
||||
.. warning::
|
||||
|
||||
Manually deleting an alarm should not be done unless it is absolutely
|
||||
clear that there is no reason for the alarm to be active.
|
||||
|
||||
You can use the command :command:`fm alarm-delete` to manually delete an alarm
|
||||
that remains active/set for no apparent reason, which may happen in rare
|
@ -6,7 +6,8 @@
|
||||
Enable SNMP Support
|
||||
===================
|
||||
|
||||
SNMP support must be enabled before you can begin using it to monitor a system.
|
||||
:abbr:`SNMP (Simple Network Management Protocol)` support must be enabled
|
||||
before you can begin using it to monitor a system.
|
||||
|
||||
.. rubric:: |context|
|
||||
|
||||
@ -17,10 +18,12 @@ interface on the active controller to complete the following steps.
|
||||
|
||||
#. Define at least one SNMP community string.
|
||||
|
||||
See |fault-doc|: :ref:`Adding an SNMP Community String Using the CLI <adding-an-snmp-community-string-using-the-cli>` for details.
|
||||
See |fault-doc|: :ref:`Adding an SNMP Community String Using the CLI
|
||||
<adding-an-snmp-community-string-using-the-cli>` for details.
|
||||
|
||||
#. Configure at least one SNMP trap destination.
|
||||
|
||||
This will allow alarms and logs to be reported as they happen.
|
||||
|
||||
For more information, see :ref:`Configuring SNMP Trap Destinations <configuring-snmp-trap-destinations>`.
|
||||
For more information, see :ref:`Configuring SNMP Trap Destinations
|
||||
<configuring-snmp-trap-destinations>`.
|
@ -15,7 +15,7 @@ alarms and :ref:`Customer Log Messages
|
||||
for the list of customer logs reported by |prod|.
|
||||
|
||||
You can access active and historical alarms, and customer logs using the CLI,
|
||||
GUI, REST APIs and SNMP.
|
||||
GUI, REST APIs and :abbr:`SNMP (Simple Network Management Protocol)`.
|
||||
|
||||
To use the CLI, see
|
||||
:ref:`Viewing Active Alarms Using the CLI
|
BIN
doc/source/fault-mgmt/figures/psa1420475905055.png
Normal file
After Width: | Height: | Size: 39 KiB |
@ -88,10 +88,21 @@ SNMP
|
||||
enabling-snmp-support
|
||||
traps
|
||||
configuring-snmp-trap-destinations
|
||||
snmp-active-alarm-table
|
||||
snmp-event-table
|
||||
adding-an-snmp-community-string-using-the-cli
|
||||
setting-snmp-identifying-information
|
||||
|
||||
**********************************
|
||||
Distributed Cloud alarm management
|
||||
**********************************
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
alarms-management-for-distributed-cloud
|
||||
cli-commands-for-dc-alarms-management
|
||||
|
||||
******************************
|
||||
Troubleshooting log collection
|
||||
******************************
|
@ -2,18 +2,23 @@
|
||||
.. ekn1458933172232
|
||||
.. _openstack-fault-management-overview:
|
||||
|
||||
========
|
||||
Overview
|
||||
========
|
||||
===================================
|
||||
OpenStack Fault Management Overview
|
||||
===================================
|
||||
|
||||
|prod-os| is a containerized application running on top of |prod|.
|
||||
|
||||
All Fault Management related interfaces for displaying alarms and logs,
|
||||
suppressing/unsuppressing events, enabling SNMP and enabling remote log
|
||||
collection are available on the |prod| REST APIs, CLIs and/or GUIs.
|
||||
|
||||
.. xreflink See :ref:`Fault Management Overview <platform-fault-management-overview>` for details on these interfaces.
|
||||
|
||||
This section provides the list of OpenStack related Alarms and Customer Logs
|
||||
that are monitored and reported for the |prod-os| application through the
|
||||
|prod| fault management interfaces.
|
||||
|prod| fault management interfaces.
|
||||
|
||||
All Fault Management related interfaces for displaying alarms and logs,
|
||||
suppressing/unsuppressing events, and enabling :abbr:`SNMP (Simple Network
|
||||
Management Protocol)` are available on the |prod| REST APIs, :abbr:`CLIs
|
||||
(Command Line Interfaces)` and/or GUIs.
|
||||
|
||||
.. :only: partner
|
||||
|
||||
.. include:: ../_includes/openstack-fault-management-overview.rest
|
@ -6,8 +6,8 @@
|
||||
Set SNMP Identifying Information
|
||||
================================
|
||||
|
||||
You can set SNMP system information including name, location and contact
|
||||
details.
|
||||
You can set :abbr:`SNMP (Simple Network Management Protocol)` system
|
||||
information including name, location and contact details.
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
82
doc/source/fault-mgmt/snmp-active-alarm-table.rst
Normal file
@ -0,0 +1,82 @@
.. rst1448309104743
.. _rst1448309104743:

========================
SNMP Active Alarms Table
========================

|prod| supports the Active Alarm table from the Wind River Alarm MIB via
:abbr:`SNMP (Simple Network Management Protocol)`.

The active alarm table contains a list of all active or set alarms in the
system. Each entry in the table includes the following variables:

- <UUID>

- <AlarmID>

- <EntityInstanceID>

- <DateAndTime>

- <AlarmSeverity>

- <ReasonText>

- <EventType>

- <ProbableCause>

- <ProposedRepairAction>

- <ServiceAffecting>

- <SuppressionAllowed>

An external SNMP Manager can examine the Active Alarm table contents by doing
an SNMP Walk of the table.

For example, the following output from the :command:`snmpwalk` CLI tool shows
a table with three rows (that is, three active alarms).

.. code-block:: none

    $ snmpwalk -v2c -c public udp:10.10.10.2:161 WRS-ALARM-MIB::wrsAlarmActiveTable

    WRS-ALARM-MIB::wrsAlarmActiveIndex.1 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 1
    WRS-ALARM-MIB::wrsAlarmActiveIndex.2 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 2
    WRS-ALARM-MIB::wrsAlarmActiveIndex.3 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 3
    WRS-ALARM-MIB::wrsAlarmActiveUuid.1 = STRING: 742c2d64-df2e-4feb-8607-1ae6de11f15
    WRS-ALARM-MIB::wrsAlarmActiveUuid.2 = STRING: 742c2d64-df2e-4feb-8607-1ae6de11f15
    WRS-ALARM-MIB::wrsAlarmActiveUuid.3 = STRING: 742c2d64-df2e-4feb-8607-1ae6de11f15
    WRS-ALARM-MIB::wrsAlarmActiveAlarmId.1 = STRING: "100.114"
    WRS-ALARM-MIB::wrsAlarmActiveAlarmId.2 = STRING: "100.114"
    WRS-ALARM-MIB::wrsAlarmActiveAlarmId.3 = STRING: "100.114"
    WRS-ALARM-MIB::wrsAlarmActiveEntityInstanceId.1 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.host=controller-0.ntp
    WRS-ALARM-MIB::wrsAlarmActiveEntityInstanceId.2 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.host=controller-0.ntp=162.159.200.123
    WRS-ALARM-MIB::wrsAlarmActiveEntityInstanceId.3 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.host=controller-0.ntp=213.199.225.40
    WRS-ALARM-MIB::wrsAlarmActiveDateAndTime.1 = STRING: 2020-11-11,13:8:4.0,+0:0
    WRS-ALARM-MIB::wrsAlarmActiveDateAndTime.2 = STRING: 2020-11-13,13:13:53.0,+0:0
    WRS-ALARM-MIB::wrsAlarmActiveDateAndTime.3 = STRING: 2020-11-13,13:13:53.0,+0:0
    WRS-ALARM-MIB::wrsAlarmActiveAlarmSeverity.1 = INTEGER: major(3)
    WRS-ALARM-MIB::wrsAlarmActiveAlarmSeverity.2 = INTEGER: minor(2)
    WRS-ALARM-MIB::wrsAlarmActiveAlarmSeverity.3 = INTEGER: minor(2)
    WRS-ALARM-MIB::wrsAlarmActiveReasonText.1 = STRING: NTP configuration does not contain any valid or reachable NTP servers.
    WRS-ALARM-MIB::wrsAlarmActiveReasonText.2 = STRING: NTP address 162.159.200.123 is not a valid or a reachable NTP server.
    WRS-ALARM-MIB::wrsAlarmActiveReasonText.3 = STRING: NTP address 213.199.225.40 is not a valid or a reachable NTP server.
    WRS-ALARM-MIB::wrsAlarmActiveEventType.1 = INTEGER: operationalViolation(7)
    WRS-ALARM-MIB::wrsAlarmActiveEventType.2 = INTEGER: operationalViolation(7)
    WRS-ALARM-MIB::wrsAlarmActiveEventType.3 = INTEGER: operationalViolation(7)
    WRS-ALARM-MIB::wrsAlarmActiveProbableCause.1 = INTEGER: threshold-crossed(50)
    WRS-ALARM-MIB::wrsAlarmActiveProbableCause.2 = INTEGER: threshold-crossed(50)
    WRS-ALARM-MIB::wrsAlarmActiveProbableCause.3 = INTEGER: threshold-crossed(50)
    WRS-ALARM-MIB::wrsAlarmActiveProposedRepairAction.1 = STRING: Monitor and if condition persists, contact next level of support.
    WRS-ALARM-MIB::wrsAlarmActiveProposedRepairAction.2 = STRING: Monitor and if condition persists, contact next level of support.
    WRS-ALARM-MIB::wrsAlarmActiveProposedRepairAction.3 = STRING: Monitor and if condition persists, contact next level of support.
    WRS-ALARM-MIB::wrsAlarmActiveServiceAffecting.1 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsAlarmActiveServiceAffecting.2 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsAlarmActiveServiceAffecting.3 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsAlarmActiveSuppressionAllowed.1 = INTEGER: true(1)
    WRS-ALARM-MIB::wrsAlarmActiveSuppressionAllowed.2 = INTEGER: true(1)
    WRS-ALARM-MIB::wrsAlarmActiveSuppressionAllowed.3 = INTEGER: true(1)
doc/source/fault-mgmt/snmp-event-table.rst
Normal file
@ -0,0 +1,128 @@
.. rdr1552680506097
.. _snmp-event-table:

================
SNMP Event Table
================

|prod| supports the Event table from the Wind River Alarm MIB via :abbr:`SNMP
(Simple Network Management Protocol)`.

The Event table contains a historic list of all alarm events (SETs and CLEARs)
and customer log events.

Each entry in the table includes the following variables:

.. _snmp-event-table-ul-y1w-4lk-qq:

- <UUID>

- <EventID>

- <State>

- <EntityInstanceID>

- <DateAndTime>

- <EventSeverity>

- <ReasonText>

- <EventType>

- <ProbableCause>

- <ProposedRepairAction>

- <ServiceAffecting>

- <SuppressionAllowed>

An external SNMP Manager can examine the Event table contents by doing an SNMP
Walk of the table.

For example, the following is the output of the :command:`snmpwalk` CLI tool.

.. code-block:: none

    $ snmpwalk -v2c -c public udp:10.10.10.2:161 WRS-ALARM-MIB::wrsEventTable

    WRS-ALARM-MIB::wrsEventIndex.1 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 1
    WRS-ALARM-MIB::wrsEventIndex.2 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 2
    WRS-ALARM-MIB::wrsEventIndex.3 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 3
    WRS-ALARM-MIB::wrsEventIndex.4 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 4
    WRS-ALARM-MIB::wrsEventIndex.5 = Wrong Type (should be Gauge32 or Unsigned32): INTEGER: 5
    ...
    WRS-ALARM-MIB::wrsEventUuid.1 = STRING:
    WRS-ALARM-MIB::wrsEventUuid.2 = STRING: a8711827-ca55-420e-bac5-d5ad6598275
    WRS-ALARM-MIB::wrsEventUuid.3 = STRING: a8711827-ca55-420e-bac5-d5ad6598275
    WRS-ALARM-MIB::wrsEventUuid.4 = STRING: a8711827-ca55-420e-bac5-d5ad6598275
    WRS-ALARM-MIB::wrsEventUuid.5 = STRING: a8711827-ca55-420e-bac5-d5ad6598275
    ...
    WRS-ALARM-MIB::wrsEventEventId.1 = STRING: "200.022"
    WRS-ALARM-MIB::wrsEventEventId.2 = STRING: "750.004"
    WRS-ALARM-MIB::wrsEventEventId.3 = STRING: "750.004"
    WRS-ALARM-MIB::wrsEventEventId.4 = STRING: "750.004"
    WRS-ALARM-MIB::wrsEventEventId.5 = STRING: "750.004"
    ...
    WRS-ALARM-MIB::wrsEventState.1 = INTEGER: log(3)
    WRS-ALARM-MIB::wrsEventState.2 = INTEGER: set(1)
    WRS-ALARM-MIB::wrsEventState.3 = INTEGER: clear(0)
    WRS-ALARM-MIB::wrsEventState.4 = INTEGER: set(1)
    WRS-ALARM-MIB::wrsEventState.5 = INTEGER: clear(0)
    ...
    WRS-ALARM-MIB::wrsEventEntityInstanceId.1 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.host=controller-0.status=online
    WRS-ALARM-MIB::wrsEventEntityInstanceId.2 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.k8s_application=nginx-ingress-controller
    WRS-ALARM-MIB::wrsEventEntityInstanceId.3 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.k8s_application=nginx-ingress-controller
    WRS-ALARM-MIB::wrsEventEntityInstanceId.4 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.k8s_application=cert-manager
    WRS-ALARM-MIB::wrsEventEntityInstanceId.5 = STRING: system=7dd633ba-96f9-47ef-8531-983e4ca89fa3.k8s_application=cert-manager
    ...
    WRS-ALARM-MIB::wrsEventDateAndTime.1 = STRING: 2020-11-7,21:31:32.0,+0:0
    WRS-ALARM-MIB::wrsEventDateAndTime.2 = STRING: 2020-11-7,21:34:33.0,+0:0
    WRS-ALARM-MIB::wrsEventDateAndTime.3 = STRING: 2020-11-7,21:41:24.0,+0:0
    WRS-ALARM-MIB::wrsEventDateAndTime.4 = STRING: 2020-11-7,21:41:45.0,+0:0
    WRS-ALARM-MIB::wrsEventDateAndTime.5 = STRING: 2020-11-7,21:43:4.0,+0:0
    ...
    WRS-ALARM-MIB::wrsEventSeverity.1 = INTEGER: not-applicable(0)
    WRS-ALARM-MIB::wrsEventSeverity.2 = INTEGER: warning(1)
    WRS-ALARM-MIB::wrsEventSeverity.3 = INTEGER: warning(1)
    WRS-ALARM-MIB::wrsEventSeverity.4 = INTEGER: warning(1)
    WRS-ALARM-MIB::wrsEventSeverity.5 = INTEGER: warning(1)
    ...
    WRS-ALARM-MIB::wrsEventReasonText.1 = STRING: controller-0 is now 'online'
    WRS-ALARM-MIB::wrsEventReasonText.2 = STRING: Application Apply In Progress
    WRS-ALARM-MIB::wrsEventReasonText.3 = STRING: Application Apply In Progress
    WRS-ALARM-MIB::wrsEventReasonText.4 = STRING: Application Apply In Progress
    WRS-ALARM-MIB::wrsEventReasonText.5 = STRING: Application Apply In Progress
    ...
    WRS-ALARM-MIB::wrsEventEventType.1 = INTEGER: other(0)
    WRS-ALARM-MIB::wrsEventEventType.2 = INTEGER: other(0)
    WRS-ALARM-MIB::wrsEventEventType.3 = INTEGER: other(0)
    WRS-ALARM-MIB::wrsEventEventType.4 = INTEGER: other(0)
    WRS-ALARM-MIB::wrsEventEventType.5 = INTEGER: other(0)
    ...
    WRS-ALARM-MIB::wrsEventProbableCause.1 = INTEGER: not-applicable(0)
    WRS-ALARM-MIB::wrsEventProbableCause.2 = INTEGER: not-applicable(0)
    WRS-ALARM-MIB::wrsEventProbableCause.3 = INTEGER: not-applicable(0)
    WRS-ALARM-MIB::wrsEventProbableCause.4 = INTEGER: not-applicable(0)
    WRS-ALARM-MIB::wrsEventProbableCause.5 = INTEGER: not-applicable(0)
    ...
    WRS-ALARM-MIB::wrsEventProposedRepairAction.1 = STRING:
    WRS-ALARM-MIB::wrsEventProposedRepairAction.2 = STRING: No action required.
    WRS-ALARM-MIB::wrsEventProposedRepairAction.3 = STRING: No action required.
    WRS-ALARM-MIB::wrsEventProposedRepairAction.4 = STRING: No action required.
    WRS-ALARM-MIB::wrsEventProposedRepairAction.5 = STRING: No action required.
    ...
    WRS-ALARM-MIB::wrsEventServiceAffecting.1 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsEventServiceAffecting.2 = INTEGER: true(1)
    WRS-ALARM-MIB::wrsEventServiceAffecting.3 = INTEGER: true(1)
    WRS-ALARM-MIB::wrsEventServiceAffecting.4 = INTEGER: true(1)
    WRS-ALARM-MIB::wrsEventServiceAffecting.5 = INTEGER: true(1)
    ...
    WRS-ALARM-MIB::wrsEventSuppressionAllowed.1 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsEventSuppressionAllowed.2 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsEventSuppressionAllowed.3 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsEventSuppressionAllowed.4 = INTEGER: false(0)
    WRS-ALARM-MIB::wrsEventSuppressionAllowed.5 = INTEGER: false(0)
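
The ``DateAndTime`` values in the walk output are not zero-padded (for example ``21:43:4.0``), so they cannot be passed directly to an ISO 8601 parser. The following is a minimal Python sketch, assuming the ``year-month-day,hour:minute:second.deciseconds,offset`` layout seen in the example. The function name ``parse_date_and_time`` is hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches values such as "2020-11-7,21:43:4.0,+0:0" (fields are not zero-padded).
DATE_RE = re.compile(
    r"^(\d+)-(\d+)-(\d+),(\d+):(\d+):(\d+)\.(\d),([+-])(\d+):(\d+)$"
)


def parse_date_and_time(text: str) -> datetime:
    """Convert a DateAndTime display string into a timezone-aware datetime."""
    match = DATE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised DateAndTime value: {text!r}")
    year, month, day, hour, minute, second, deci = (
        int(group) for group in match.groups()[:7]
    )
    sign = 1 if match.group(8) == "+" else -1
    offset = sign * timedelta(hours=int(match.group(9)), minutes=int(match.group(10)))
    # One decisecond is 100,000 microseconds.
    return datetime(year, month, day, hour, minute, second,
                    deci * 100_000, tzinfo=timezone(offset))
```

Parsed timestamps can then be sorted or compared, for example to order SET and CLEAR events for the same entity instance.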
@ -6,7 +6,11 @@
|
||||
SNMP Overview
|
||||
=============
|
||||
|
||||
|prod| can generate SNMP traps for |prod| Alarm Events and Customer Log Events.
|
||||
|prod| can generate :abbr:`SNMP (Simple Network Management Protocol)` traps for
|
||||
|prod| Alarm Events and Customer Log Events.
|
||||
|
||||
|prod| also supports SNMP GETs and WALKs of an Active Alarm table and a
|
||||
historical Event (alarm SET/CLEAR and log) table.
|
||||
|
||||
This includes alarms based on hardware sensors monitored by board management
|
||||
controllers.
|
@ -6,8 +6,9 @@
|
||||
Traps
|
||||
=====
|
||||
|
||||
|prod| supports SNMP traps. Traps send unsolicited information to monitoring
|
||||
software when significant events occur.
|
||||
|prod| supports :abbr:`SNMP (Simple Network Management Protocol)` traps. Traps
|
||||
send unsolicited information to monitoring software when significant events
|
||||
occur.
|
||||
|
||||
The following traps are defined.
|
||||
|
||||
@ -28,7 +29,9 @@ The following traps are defined.
|
||||
- **wrsAlarmHierarchicalClear**
|
||||
|
||||
.. note::
|
||||
Customer Logs always result in **wrsAlarmMessage** traps.
|
||||
Customer Logs always result in **wrsAlarmMessage** traps. |prod| uses Wind
|
||||
River Systems (**wrs**) Enterprise Registration and Alarm MIBs. See
|
||||
:ref:`SNMP Overview <snmp-overview>` for details.
|
||||
|
||||
For Critical, Major, Minor, Warning, and Message traps, all variables in the
|
||||
active alarm table are included as varbinds \(variable bindings\), where each
|
@ -53,13 +53,13 @@ Collect Tool Caveats and Usage
|
||||
|
||||
- For systems with an up-time of more than 2 months, use the date range options.
|
||||
|
||||
Use --start-date for the collection of logs on and after a given date:
|
||||
Use ``--start-date`` for the collection of logs on and after a given date:
|
||||
|
||||
.. code-block:: none
|
||||
|
||||
(keystone_admin)$ collect [--start-date | -s] <YYYYMMDD>
|
||||
|
||||
Use --end-date for the collection of logs on and before a given date :
|
||||
Use ``--end-date`` for the collection of logs on and before a given date:
|
||||
|
||||
.. code-block:: none
|
||||
|
@ -20,16 +20,16 @@ If you need to reactivate a suppressed alarm, you can do so using the CLI.
|
||||
|
||||
where
|
||||
|
||||
**<alarm-id>**
|
||||
is a comma separated list of **Alarm ID** s of alarms to unsuppress.
|
||||
``<alarm-id>``
|
||||
is a comma-separated **Alarm ID** list of alarms to unsuppress.
|
||||
|
||||
**--nowrap**
|
||||
``--nowrap``
|
||||
disables output wrapping.
|
||||
|
||||
**--nopaging**
|
||||
``--nopaging``
|
||||
disables paged output.
|
||||
|
||||
**--uuid**
|
||||
``--uuid``
|
||||
includes the alarm type UUIDs in the output.
|
||||
|
||||
Alarm type\(s\) with the specified <alarm-id\(s\)> will be unsuppressed.
|
@ -15,7 +15,8 @@ You can use the CLI to find information about currently active system alarms.
|
||||
of alarms and warnings for the system.
|
||||
|
||||
To review detailed information about a specific alarm instance, see
|
||||
:ref:`Viewing Alarm Details Using the CLI <viewing-alarm-details-using-the-cli>`.
|
||||
:ref:`Viewing Alarm Details Using the CLI
|
||||
<viewing-alarm-details-using-the-cli>`.
|
||||
|
||||
.. rubric:: |proc|
|
||||
|
||||
@ -48,56 +49,56 @@ To review detailed information about a specific alarm instance, see
|
||||
for more details on <QUERY> string syntax. Also see additional query
|
||||
examples below.
|
||||
|
||||
You can use one of the following --query command filters to view
|
||||
You can use one of the following ``--query`` command filters to view
|
||||
specific subsets of alarms, or a particular alarm:
|
||||
|
||||
.. table::
|
||||
:widths: auto
|
||||
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| Query Filter | Comment |
|
||||
+============================================================================+============================================================================+
|
||||
| :command:`uuid=<uuid\>` | Query alarms by UUID, for example: |
|
||||
| | |
|
||||
| | .. code-block:: none |
|
||||
| | |
|
||||
| | ~(keystone_admin)$ fm alarm-list --query uuid=4ab5698a-19cb... |
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| :command:`alarm\_id=<alarm id\>` | Query alarms by alarm ID, for example: |
|
||||
| | |
|
||||
| | .. code-block:: none |
|
||||
| | |
|
||||
| | ~(keystone_admin)$ fm alarm-list --query alarm_id=100.104 |
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| :command:`alarm\_type=<type\>` | Query alarms by type, for example: |
|
||||
| | |
|
||||
| | .. code-block:: none |
|
||||
| | |
|
||||
| | ~(keystone_admin)$ fm alarm-list --query \ |
|
||||
| | alarm_type=operational-violation |
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| :command:`entity\_type\_id=<type id\>` | Query alarms by entity type ID, for example: |
|
||||
| | |
|
||||
| | .. code-block:: none |
|
||||
| | |
|
||||
| | ~(keystone_admin)$ fm alarm-list --query \ |
|
||||
| | entity_type_id=system.host |
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| :command:`entity\_instance\_id=<instance id\>` | Query alarms by entity instance id, for example: |
|
||||
| | |
|
||||
| | .. code-block:: none |
|
||||
| | |
|
||||
| | ~(keystone_admin)$ fm alarm-list --query \ |
|
||||
| | entity_instance_id=host=worker-0 |
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| :command:`severity=<severity\>` | Query alarms by severity type, for example: |
|
||||
| | |
|
||||
| | .. code-block:: none |
|
||||
| | |
|
||||
| | ~(keystone_admin)$ fm alarm-list --query severity=warning |
|
||||
| | |
|
||||
| | The valid severity types are critical, major, minor, and warning. |
|
||||
+----------------------------------------------------------------------------+----------------------------------------------------------------------------+
|
||||
+-----------------------------------------------------+----------------------------------------------------------------------------+
|
||||
| Query Filter | Comment |
|
||||
+=====================================================+============================================================================+
| :command:`uuid=<uuid\>`                             | Query alarms by UUID, for example:                                         |
|                                                     |                                                                            |
|                                                     | .. code-block:: none                                                       |
|                                                     |                                                                            |
|                                                     |    ~(keystone_admin)$ fm alarm-list --query uuid=4ab5698a-19cb...          |
+-----------------------------------------------------+----------------------------------------------------------------------------+
| :command:`alarm\_id=<alarm id\>`                    | Query alarms by alarm ID, for example:                                     |
|                                                     |                                                                            |
|                                                     | .. code-block:: none                                                       |
|                                                     |                                                                            |
|                                                     |    ~(keystone_admin)$ fm alarm-list --query alarm_id=100.104               |
+-----------------------------------------------------+----------------------------------------------------------------------------+
| :command:`alarm\_type=<type\>`                      | Query alarms by type, for example:                                         |
|                                                     |                                                                            |
|                                                     | .. code-block:: none                                                       |
|                                                     |                                                                            |
|                                                     |    ~(keystone_admin)$ fm alarm-list --query \                              |
|                                                     |    alarm_type=operational-violation                                        |
+-----------------------------------------------------+----------------------------------------------------------------------------+
| :command:`entity\_type\_id=<type id\>`              | Query alarms by entity type ID, for example:                               |
|                                                     |                                                                            |
|                                                     | .. code-block:: none                                                       |
|                                                     |                                                                            |
|                                                     |    ~(keystone_admin)$ fm alarm-list --query \                              |
|                                                     |    entity_type_id=system.host                                              |
+-----------------------------------------------------+----------------------------------------------------------------------------+
| :command:`entity\_instance\_id=<instance id\>`      | Query alarms by entity instance id, for example:                           |
|                                                     |                                                                            |
|                                                     | .. code-block:: none                                                       |
|                                                     |                                                                            |
|                                                     |    ~(keystone_admin)$ fm alarm-list --query \                              |
|                                                     |    entity_instance_id=host=worker-0                                        |
+-----------------------------------------------------+----------------------------------------------------------------------------+
| :command:`severity=<severity\>`                     | Query alarms by severity type, for example:                                |
|                                                     |                                                                            |
|                                                     | .. code-block:: none                                                       |
|                                                     |                                                                            |
|                                                     |    ~(keystone_admin)$ fm alarm-list --query severity=warning               |
|                                                     |                                                                            |
|                                                     | The valid severity types are critical, major, minor, and warning.          |
+-----------------------------------------------------+----------------------------------------------------------------------------+

Query command filters can be combined into a single expression
separated by semicolons, as illustrated in the following example:
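
Separately from the documented CLI example, the combining rule can be sketched in Python. This is a minimal illustration, not part of the ``fm`` tooling: the helper name ``build_alarm_query`` is hypothetical, and it assumes only what the text states, namely that each filter is written as ``key=value`` and that filters are joined with semicolons. The filter keys and severity values are the ones listed in the table.

```python
# Hypothetical helper (not part of the fm CLI): joins key=value filters
# with semicolons, as described for `fm alarm-list --query`.

# Filter keys taken from the query table in this document.
VALID_KEYS = (
    "uuid",
    "alarm_id",
    "alarm_type",
    "entity_type_id",
    "entity_instance_id",
    "severity",
)

# Severity values the document lists as valid.
VALID_SEVERITIES = ("critical", "major", "minor", "warning")


def build_alarm_query(filters):
    """Return a query expression such as 'alarm_id=100.104;severity=warning'."""
    terms = []
    for key, value in filters.items():
        if key not in VALID_KEYS:
            raise ValueError(f"unknown filter key: {key}")
        if key == "severity" and value not in VALID_SEVERITIES:
            raise ValueError(f"invalid severity: {value}")
        terms.append(f"{key}={value}")
    # Dictionaries keep insertion order, so filters appear as they were given.
    return ";".join(terms)


query = build_alarm_query({"alarm_id": "100.104", "severity": "warning"})
# Quote the expression: an unquoted semicolon is a shell command separator.
print(f"fm alarm-list --query '{query}'")
```

The expression is quoted in the printed command because an unquoted semicolon would otherwise be interpreted by the shell as a command separator.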
@ -111,7 +112,7 @@ To review detailed information about a specific alarm instance, see
their Alarm ID set to S<\(alarm-id\)>.

**--uuid**
    The --uuid option on the :command:`fm alarm-list` command lists the
    The ``--uuid`` option on the :command:`fm alarm-list` command lists the
    active alarm list with unique UUIDs for each alarm such that this
    UUID can be used in display alarm details with the
    :command:`fm alarm-show` <UUID> command.
@ -122,7 +123,7 @@ To review detailed information about a specific alarm instance, see
**--mgmt\_affecting**
    Management affecting alarms prevent some critical administrative
    actions from being performed. For example, software upgrades. Using the
    --mgmt\_affecting option will list an additional column in the output,
    ``--mgmt\_affecting`` option will list an additional column in the output,
    'Management Affecting', which indicates whether the alarm is management
    affecting or not.

@ -133,7 +134,7 @@ To review detailed information about a specific alarm instance, see

.. code-block:: none

   ~(keystone_admin)$ fm alarm-list --uuid
   ~(keystone_admin)$ fm alarm-list ``--uuid``
   +--------------+-------+------------------+---------------+----------+-----------+
   | UUID         | Alarm | Reason Text      | Entity ID     | Severity | Time      |
   |              | ID    |                  |               |          | Stamp     |
@ -189,4 +190,4 @@ To review detailed information about a specific alarm instance, see
   |       | locked to take it |               |          |            | 794640      |
   |       | out-of-service.   |               |          |            |             |
   |       |                   |               |          |            |             |
   +-------+-------------------+---------------+----------+------------+-------------+
   +-------+-------------------+---------------+----------+------------+-------------+
@ -32,27 +32,27 @@ You can use CLI commands to work with historical alarms and logs in the event lo

Optional arguments:

**-q QUERY, --query QUERY**
``-q QUERY, --query QUERY``
    \- key\[op\]data\_type::value; list. data\_type is optional, but if
    supplied must be string, integer, float, or boolean.

**-l NUMBER, --limit NUMBER**
``-l NUMBER, --limit NUMBER``
    Maximum number of event logs to return.

**--alarms**
``--alarms``
    Show historical alarms set/clears only.

**--logs**
``--logs``
    Show customer logs only.

**--include\_suppress**
``--include\_suppress``
    Show suppressed alarms as well as unsuppressed alarms.

**--uuid**
``--uuid``
    Include the unique event UUID in the listing such that it can be used
    in displaying event details with :command:`fm event-show` <uuid>.

**-nopaging**
``-nopaging``
    Disable output paging.

For details on CLI paging, see
@ -96,7 +96,7 @@ You can use CLI commands to work with historical alarms and logs in the event lo
+-----------+-----+-----+--------------------+-----------------+---------+

.. note::
    You can also use the --nopaging option to avoid paging long event
    You can also use the ``--nopaging`` option to avoid paging long event
    lists.
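
As a rough illustration of how the optional arguments documented for :command:`fm event-list` fit together, the Python sketch below assembles an argument list from them. The function name ``event_list_argv`` and its parameters are hypothetical; only the flag spellings (``--query``, ``--limit``, ``--alarms``, ``--logs``, ``--include_suppress``, ``--uuid``, ``--nopaging``) come from this document, and the sketch builds the command without running it.

```python
# Hypothetical helper: builds (but does not run) an `fm event-list` command
# from the optional arguments described in this document.


def event_list_argv(query=None, limit=None, alarms=False, logs=False,
                    include_suppress=False, uuid=False, nopaging=False):
    """Return the command as a list of arguments, one element per token."""
    argv = ["fm", "event-list"]
    if query is not None:
        # Query expression, for example "severity=warning".
        argv += ["--query", query]
    if limit is not None:
        # Maximum number of event logs to return.
        argv += ["--limit", str(int(limit))]
    if alarms:
        argv.append("--alarms")            # historical alarm sets/clears only
    if logs:
        argv.append("--logs")              # customer logs only
    if include_suppress:
        argv.append("--include_suppress")  # also show suppressed alarms
    if uuid:
        argv.append("--uuid")              # include the event UUID column
    if nopaging:
        argv.append("--nopaging")          # disable output paging
    return argv


print(" ".join(event_list_argv(limit=10, alarms=True, uuid=True, nopaging=True)))
```

Returning a list rather than a single string keeps each argument separate, so a query expression containing semicolons needs no shell quoting if the list is later passed to something like ``subprocess.run``.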

In the following example, the :command:`fm event-list` command shows
@ -1,44 +0,0 @@

.. rdr1552680506097
.. _snmp-event-table:

================
SNMP Event Table
================

|prod| supports SNMP active and historical alarms, and customer logs, in an
event table.

The event table contains historical alarms \(sets and clears\) alarms and
customer logs. It does not contain active alarms. Each entry in the table
includes the following variables:

.. _snmp-event-table-ul-y1w-4lk-qq:

- <UUID>

- <EventID>

- <State>

- <EntityInstanceID>

- <DateAndTime>

- <EventSeverity>

- <ReasonText>

- <EventType>

- <ProbableCause>

- <ProposedRepairAction>

- <ServiceAffecting>

- <SuppressionAllowed>

.. note::
    The previous SNMP Historical Alarm Table and the SNMP Customer Log Table
    are still supported but marked as deprecated in the MIB.
@ -58,13 +58,22 @@ Configuration
   configuration/index

----------------
Fault Management
Fault management
----------------

.. toctree::
   :maxdepth: 2

   fault/index
   fault-mgmt/index

------------------------------------------------
Data Network Configuration and Management Guides
------------------------------------------------

.. toctree::
   :maxdepth: 2

   datanet/index

----------------
Operation guides