[docs] Edits the StackLight InfluxDB-Grafana plugin guide structure
Edits the table of contents and overall structure of the StackLight InfluxDB-Grafana plugin for Fuel documentation.

Change-Id: Icc9d73fca27513e8336cca1486bfc83634b5d116
@@ -6,7 +6,7 @@ source_suffix = '.rst'
|
||||
master_doc = 'index'
|
||||
|
||||
project = u'The StackLight InfluxDB-Grafana plugin for Fuel'
|
||||
copyright = u'2015, Mirantis Inc.'
|
||||
copyright = u'2016, Mirantis Inc.'
|
||||
|
||||
version = '0.10'
|
||||
release = '0.10.0'
|
||||
|
||||
135
doc/source/configure_plugin.rst
Normal file
@@ -0,0 +1,135 @@
|
||||
.. _plugin_configuration:
|
||||
|
||||
Plugin configuration
|
||||
--------------------
|
||||
|
||||
To configure the **StackLight InfluxDB-Grafana Plugin**, you need to follow these steps:
|
||||
|
||||
1. `Create a new environment
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/create-environment/start-create-env.html>`_.
|
||||
|
||||
2. Click on the *Settings* tab of the Fuel web UI and select the *Other* category.
|
||||
|
||||
3. Scroll down through the settings until you find the **InfluxDB-Grafana Server
|
||||
Plugin** section. You should see a page like this:
|
||||
|
||||
.. image:: ../images/influx_grafana_settings.png
|
||||
:width: 800
|
||||
|
||||
4. Tick the **InfluxDB-Grafana Plugin** box and fill in the required fields as indicated below.
|
||||
|
||||
a. Specify the number of days of retention for your data.
|
||||
b. Specify the InfluxDB admin password (called root password in the InfluxDB documentation).
|
||||
c. Specify the database name (default is lma).
|
||||
d. Specify the InfluxDB username and password.
|
||||
e. Specify the Grafana username and password.
|
||||
|
||||
5. Since the introduction of Grafana 2.6.0, the plugin uses a MySQL database
|
||||
to store its configuration data such as the dashboard templates.
|
||||
|
||||
a. Select **Local MySQL** if you want to create the Grafana database using the MySQL server
|
||||
of the OpenStack control-plane. Otherwise, select **Remote server** and specify
|
||||
the fully qualified name or IP address of the MySQL server you want to use.
|
||||
b. Then, specify the MySQL database name, username and password that will be used
|
||||
to access that database.
|
||||
|
||||
6. Tick the *Enable TLS for Grafana* box if you want to encrypt your
|
||||
Grafana credentials (username, password). Then, fill in the required
|
||||
fields as indicated below.
|
||||
|
||||
.. image:: ../images/tls_settings.png
|
||||
:width: 800
|
||||
|
||||
a. Specify the DNS name of the Grafana server. This parameter is used
|
||||
to create a link in the Fuel dashboard to the Grafana server.
|
||||
#. Specify the location of a PEM file that contains the certificate
|
||||
and the private key of the Grafana server that will be used in TLS handshakes
|
||||
with the client.
|
||||
|
||||
7. Tick the *Use LDAP for Grafana authentication* box if you want to authenticate
|
||||
via LDAP to Grafana. Then, fill in the required fields as indicated below.
|
||||
|
||||
.. image:: ../images/ldap_auth.png
|
||||
:width: 800
|
||||
|
||||
a. Select the *LDAPS* button if you want to enable LDAP authentication
|
||||
over SSL.
|
||||
#. Specify one or several LDAP server addresses separated by a space. Those
|
||||
addresses must be accessible from the node where Grafana is installed.
|
||||
Note that addresses external to the *management network* are not routable
|
||||
by default (see the note below).
|
||||
#. Specify the LDAP server port number or leave it empty to use the defaults.
|
||||
#. Specify the *Bind DN* of a user who has search privileges on the LDAP server.
|
||||
#. Specify the password of the user identified by the *Bind DN* above.
|
||||
#. Specify the *Base DN* in the Directory Information Tree (DIT) from where
|
||||
to search for users.
|
||||
#. Specify a valid user search filter, for example ``(uid=%s)``.
The search should return a unique user entry.
|
||||
|
||||
You can further restrict access to Grafana to those users who
|
||||
are members of a specific LDAP group.
|
||||
|
||||
a. Tick the *Enable group-based authorization* box.
|
||||
#. Specify the LDAP group *Base DN* in the DIT from where to search
|
||||
for groups.
|
||||
#. Specify the LDAP group search filter.
|
||||
Example ``(&(objectClass=posixGroup)(memberUid=%s))``
|
||||
#. Specify the CN of the LDAP group that will be mapped to the *admin role*.
#. Specify the CN of the LDAP group that will be mapped to the *viewer role*.
|
||||
|
||||
Users who have the *admin role* can modify the Grafana dashboards
|
||||
or create new ones. Users who have the *viewer role* can only
|
||||
visualize the Grafana dashboards.
|
||||
|
||||
7. `Configure your environment
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/configure-environment.html>`_.
|
||||
|
||||
.. note:: By default, StackLight is configured to use the *management network*
|
||||
of the so-called `Default Node Network Group
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/configure-environment/network-settings.html>`_.
|
||||
While this default setup may be appropriate for small deployments or
|
||||
evaluation purposes, it is recommended not to use this network
|
||||
for StackLight in production. It is instead recommended to create a network
|
||||
dedicated to StackLight using the `networking templates
|
||||
<https://docs.mirantis.com/openstack/fuel/fuel-8.0/operations.html#using-networking-templates>`_
|
||||
capability of Fuel. Using a dedicated network for StackLight will
|
||||
improve performance and reduce the monitoring footprint on the
|
||||
control-plane. It will also facilitate access to the Grafana UI
|
||||
after deployment as the *management network* is not routable.
|
||||
|
||||
8. Click the *Nodes* tab and assign the *InfluxDB_Grafana* role
|
||||
to the node(s) where you want to install the plugin.
|
||||
|
||||
You can see in the example below that the *InfluxDB_Grafana*
|
||||
role is assigned to three nodes alongside the
|
||||
*Alerting_Infrastructure* and the *Elasticsearch_Kibana* roles.
|
||||
Here, the three plugins of the LMA toolchain backend servers are
|
||||
installed on the same nodes. You can assign the *InfluxDB_Grafana*
|
||||
role to either one node (standalone install) or three nodes for HA.
|
||||
|
||||
.. image:: ../images/influx_grafana_role.png
|
||||
:width: 800
|
||||
|
||||
.. note:: Installing the InfluxDB server on more than three nodes
|
||||
is currently not possible using the Fuel plugin.
|
||||
Similarly, installing the InfluxDB server on two nodes
|
||||
is not recommended to avoid split-brain situations in the Raft
|
||||
consensus of the InfluxDB cluster as well as the *Pacemaker* cluster
|
||||
which is responsible for the VIP address failover.
|
||||
Note also that it is possible to add or remove nodes
|
||||
with the *InfluxDB_Grafana* role in the cluster after deployment.
|
||||
|
||||
9. `Adjust the disk partitioning if necessary
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/configure-environment/customize-partitions.html>`_.
|
||||
|
||||
By default, the InfluxDB-Grafana Plugin allocates:
|
||||
|
||||
* 20% of the first available disk for the operating system by honoring
|
||||
a range of 15GB minimum to 50GB maximum.
|
||||
* 10GB for */var/log*.
|
||||
* At least 30 GB for the InfluxDB database in */var/lib/influxdb*.
|
||||
|
||||
10. `Deploy your environment
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/deploy-environment.html>`_.
|
||||
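The default disk allocation described in the partitioning step above can be sketched as follows. This is only an illustration, not the plugin's actual code: the function name ``default_allocation``, the use of whole gigabytes, and the assumption that the InfluxDB volume receives whatever remains on the disk are all hypothetical.

```python
MIN_OS_GB = 15        # documented lower bound for the operating system volume
MAX_OS_GB = 50        # documented upper bound for the operating system volume
LOG_GB = 10           # documented size of /var/log
MIN_INFLUXDB_GB = 30  # documented minimum for /var/lib/influxdb


def default_allocation(disk_gb):
    """Return the volume sizes (in GB) for a disk of ``disk_gb`` GB.

    Raises ValueError when the disk cannot hold the documented minimums.
    """
    # 20% of the disk for the operating system, clamped to 15-50 GB.
    os_gb = min(max(disk_gb // 5, MIN_OS_GB), MAX_OS_GB)
    # Assumption: InfluxDB receives everything that is left on the disk.
    influxdb_gb = disk_gb - os_gb - LOG_GB
    if influxdb_gb < MIN_INFLUXDB_GB:
        raise ValueError(
            "disk too small: only %d GB left for InfluxDB" % influxdb_gb
        )
    return {
        "os": os_gb,
        "/var/log": LOG_GB,
        "/var/lib/influxdb": influxdb_gb,
    }
```

Under these assumptions, a 100 GB disk yields 20 GB for the system, 10 GB for the logs, and 70 GB for InfluxDB, while a 54 GB disk is rejected, which is consistent with the 55 GB minimum stated in the requirements.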
33
doc/source/definitions.rst
Normal file
@@ -0,0 +1,33 @@
|
||||
.. _definitions:
|
||||
|
||||
Key terms
|
||||
---------
|
||||
|
||||
The table below lists the key terms and acronyms that are used
|
||||
in this document.
|
||||
|
||||
+---------------------+-------------------------------------------------------+
| **Terms & acronyms**| **Definition**                                        |
+=====================+=======================================================+
| The Collector       | The StackLight Collector is a smart monitoring agent  |
|                     | running on every node. It collects and processes      |
|                     | the metrics of your OpenStack environment.            |
+---------------------+-------------------------------------------------------+
| InfluxDB            | InfluxDB is a time-series, metrics, and analytics     |
|                     | open-source database (MIT license). It is written in  |
|                     | Go and has no external dependencies.                  |
|                     | InfluxDB is targeted at use cases for DevOps, metrics,|
|                     | sensor data, and real-time analytics.                 |
+---------------------+-------------------------------------------------------+
| Grafana             | Grafana is an Apache 2.0 licensed general purpose     |
|                     | dashboard and graph composer. It is focused on        |
|                     | providing rich ways to visualize metrics time-series, |
|                     | mainly through graphs but supports other ways to      |
|                     | visualize data through a pluggable panel architecture.|
|                     |                                                       |
|                     | It has rich support for Graphite, InfluxDB, and       |
|                     | OpenTSDB and also supports other data sources through |
|                     | plugins. Grafana is most commonly used for            |
|                     | infrastructure monitoring, application monitoring, and|
|                     | metric analytics.                                     |
+---------------------+-------------------------------------------------------+
|
||||
@@ -1,21 +1,37 @@
|
||||
================================================================
|
||||
Welcome to the StackLight InfluxDB-Grafana Plugin Documentation!
|
||||
================================================================
|
||||
=========================================================================
|
||||
Welcome to the StackLight InfluxDB-Grafana plugin for Fuel documentation!
|
||||
=========================================================================
|
||||
|
||||
User documentation
|
||||
==================
|
||||
Overview
|
||||
~~~~~~~~
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 2
|
||||
:maxdepth: 1
|
||||
|
||||
overview
|
||||
releases
|
||||
installation
|
||||
user
|
||||
intro
|
||||
definitions
|
||||
requirements
|
||||
limitations
|
||||
release_notes
|
||||
licenses
|
||||
appendix
|
||||
references
|
||||
|
||||
Indices and Tables
|
||||
==================
|
||||
Installing and configuring StackLight InfluxDB-Grafana plugin
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
* :ref:`search`
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
install_intro
|
||||
install
|
||||
configure_plugin
|
||||
verification
|
||||
|
||||
Using StackLight InfluxDB-Grafana plugin
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
.. toctree::
|
||||
:maxdepth: 1
|
||||
|
||||
usage
|
||||
troubleshooting
|
||||
@@ -1,10 +1,7 @@
|
||||
.. _user_installation:
|
||||
|
||||
Installation Guide
|
||||
==================
|
||||
|
||||
InfluxDB-Grafana Fuel Plugin installation using the RPM file of the Fuel Plugins Catalog
|
||||
----------------------------------------------------------------------------------------
|
||||
Install using the RPM file of the Fuel plugins catalog
|
||||
------------------------------------------------------
|
||||
|
||||
To install the StackLight InfluxDB-Grafana Fuel Plugin using the RPM file of the Fuel Plugins
|
||||
Catalog, you need to follow these steps:
|
||||
@@ -30,8 +27,8 @@ Catalog, you need to follow these steps:
|
||||
---|----------------------|----------|----------------
|
||||
1 | influxdb_grafana | 0.10.0 | 4.0.0
|
||||
|
||||
StackLight InfluxDB-Grafana Fuel Plugin installtion from source
|
||||
---------------------------------------------------------------
|
||||
Install from source
|
||||
-------------------
|
||||
|
||||
Alternatively, you may want to build the RPM file of the plugin from source if,
|
||||
for example, you want to test the latest features of the master branch or customize the plugin.
|
||||
@@ -78,15 +75,4 @@ node so that you won't have to copy that file later on.
|
||||
|
||||
7. Now that you have created the RPM file, you can install the plugin using the `fuel plugins --install` command::
|
||||
|
||||
[root@fuel ~] fuel plugins --install ./fuel-plugin-influxdb-grafana/*.noarch.rpm
|
||||
|
||||
StackLight InfluxDB-Grafana Fuel plugin software components
|
||||
-----------------------------------------------------------
|
||||
|
||||
+----------------+-------------------------------------+
|
||||
| Components | Version |
|
||||
+================+=====================================+
|
||||
| InfluxDB | v0.11.1 for Ubuntu (64-bit) |
|
||||
+----------------+-------------------------------------+
|
||||
| Grafana | v3.0.4 for Ubuntu (64-bit) |
|
||||
+----------------+-------------------------------------+
|
||||
[root@fuel ~] fuel plugins --install ./fuel-plugin-influxdb-grafana/*.noarch.rpm
|
||||
19
doc/source/install_intro.rst
Normal file
@@ -0,0 +1,19 @@
|
||||
Introduction
|
||||
------------
|
||||
|
||||
You can install the StackLight InfluxDB-Grafana plugin using one of the
|
||||
following options:
|
||||
|
||||
• Install using the RPM file
|
||||
• Install from source
|
||||
|
||||
The following is a list of software components installed by the StackLight
|
||||
InfluxDB-Grafana plugin:
|
||||
|
||||
+----------------+-------------------------------------+
| Components     | Version                             |
+================+=====================================+
| InfluxDB       | v0.11.1 for Ubuntu (64-bit)         |
+----------------+-------------------------------------+
| Grafana        | v3.0.4 for Ubuntu (64-bit)          |
+----------------+-------------------------------------+
|
||||
28
doc/source/intro.rst
Normal file
@@ -0,0 +1,28 @@
|
||||
.. _intro:
|
||||
|
||||
Introduction
|
||||
------------
|
||||
|
||||
The **StackLight InfluxDB-Grafana Fuel Plugin** is used to install and configure
|
||||
InfluxDB and Grafana which collectively provide access to the
|
||||
metrics analytics of Mirantis OpenStack.
|
||||
InfluxDB is a powerful distributed time-series database
|
||||
to store and search metrics time-series. The metrics analytics are used to
|
||||
visualize the time-series and the annotations produced by the StackLight Collector.
|
||||
The annotations contain insightful information about the faults and anomalies
|
||||
that resulted in a change of state for the clusters of nodes and services
|
||||
of the OpenStack environment.
|
||||
|
||||
The InfluxDB-Grafana Plugin is an indispensable tool for answering
|
||||
the questions "what has changed in my OpenStack environment, when and why?".
|
||||
Grafana is installed with a collection of predefined dashboards for each
|
||||
of the OpenStack services that are monitored.
|
||||
Among those dashboards, the *Main Dashboard* provides a single pane of glass
|
||||
overview of your OpenStack environment status.
|
||||
|
||||
InfluxDB and Grafana are key components
|
||||
of the `LMA Toolchain project <https://launchpad.net/lma-toolchain>`_
|
||||
as shown in the figure below.
|
||||
|
||||
.. image:: ../images/toolchain_map.png
|
||||
:width: 430pt
|
||||
@@ -1,10 +1,10 @@
|
||||
.. _licenses:
|
||||
|
||||
Licenses
|
||||
========
|
||||
--------
|
||||
|
||||
Third Party Components
|
||||
----------------------
|
||||
++++++++++++++++++++++
|
||||
|
||||
+----------+-----------------------+-----------+
|
||||
| Name | Project Web Site | License |
|
||||
@@ -15,7 +15,7 @@ Third Party Components
|
||||
+----------+-----------------------+-----------+
|
||||
|
||||
Puppet modules
|
||||
--------------
|
||||
++++++++++++++
|
||||
|
||||
+---------+--------------------------------------------------+-----------+
|
||||
| Name | Project Web Site | License |
|
||||
|
||||
9
doc/source/limitations.rst
Normal file
@@ -0,0 +1,9 @@
|
||||
.. _plugin_limitations:
|
||||
|
||||
Limitations
|
||||
-----------
|
||||
|
||||
Currently, the size of an InfluxDB cluster the Fuel plugin can deploy is limited to three nodes. In addition to this,
|
||||
each node of the InfluxDB cluster is configured to run under the *meta* node role and the *data* node role. Therefore,
|
||||
it is not possible to separate the nodes participating in the Raft consensus cluster from
|
||||
the nodes accessing the data replicas.
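Because every node takes part in the Raft consensus, the cluster size determines how many node failures the cluster can survive: Raft needs a majority of nodes to agree. The arithmetic can be sketched as follows (the function names are illustrative and are not part of InfluxDB or of the plugin):

```python
def raft_quorum(nodes):
    """Smallest number of nodes that forms a majority."""
    return nodes // 2 + 1


def tolerated_failures(nodes):
    """Number of nodes that can fail while a majority remains."""
    return nodes - raft_quorum(nodes)


for size in (1, 2, 3):
    print(size, raft_quorum(size), tolerated_failures(size))
```

With these definitions, a two-node cluster tolerates no more failures than a single node, whereas a three-node cluster tolerates the loss of one node.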
|
||||
@@ -1,85 +0,0 @@
|
||||
.. _user_overview:
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
The **StackLight InfluxDB-Grafana Fuel Plugin** is used to install and configure
|
||||
InfluxDB and Grafana which collectively provide access to the
|
||||
metrics analytics of Mirantis OpenStack.
|
||||
InfluxDB is a powerful distributed time-series database
|
||||
to store and search metrics time-series. The metrics analytics are used to
|
||||
visualize the time-series and the annotations produced by the StackLight Collector.
|
||||
The annotations contain insightful information about the faults and anomalies
|
||||
that resulted in a change of state for the clusters of nodes and services
|
||||
of the OpenStack environment.
|
||||
|
||||
The InfluxDB-Grafana Plugin is an indispensable tool to answering
|
||||
the questions "what has changed in my OpenStack environment, when and why?".
|
||||
Grafana is installed with a collection of predefined dashboards for each
|
||||
of the OpenStack services that are monitored.
|
||||
Among those dashboards, the *Main Dashboard* provides a single pane of glass
|
||||
overview of your OpenStack environment status.
|
||||
|
||||
InfluxDB and Grafana are key components
|
||||
of the `LMA Toolchain project <https://launchpad.net/lma-toolchain>`_
|
||||
as shown in the figure below.
|
||||
|
||||
.. image:: ../images/toolchain_map.png
|
||||
:align: center
|
||||
|
||||
.. _plugin_requirements:
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
+------------------------+--------------------------------------------------------------------------------------------+
|
||||
| **Requirement** | **Version/Comment** |
|
||||
+========================+============================================================================================+
|
||||
| Disk space | The plugin’s specification requires to provision at least 15GB of disk space for the |
|
||||
| | system, 10GB for the logs and 30GB for the database. The installation of the |
|
||||
| | plugin will fail if there is less than 55GB of disk space available on the node. |
|
||||
+------------------------+--------------------------------------------------------------------------------------------+
|
||||
| Mirantis OpenStack | 8.0, 9.0 |
|
||||
+------------------------+--------------------------------------------------------------------------------------------+
|
||||
| Hardware configuration | The hardware configuration (RAM, CPU, disk(s)) required by this plugin depends on the size |
|
||||
| | of your cloud environment and other factors like the retention policy. An average |
|
||||
| | setup would require a quad-core server with 8 GB of RAM and access to a 500-1000 IOPS disk.|
|
||||
| | Please check the `InfluxDB Hardware Sizing Guide |
|
||||
| | <https://docs.influxdata.com/influxdb/v0.10/guides/hardware_sizing/>`_ for additional |
|
||||
| | sizing information. |
|
||||
| | |
|
||||
| | It is also highly recommended to use a dedicated disk for your data storage. Otherwise, |
|
||||
| | The InfluxDB-Grafana Plugin will use the root filesystem by default. |
|
||||
+------------------------+--------------------------------------------------------------------------------------------+
|
||||
|
||||
Limitations
|
||||
-----------
|
||||
|
||||
Currently, the size of an InfluxDB cluster the Fuel plugin can deploy is limited to three nodes. In addition to this,
|
||||
each node of the InfluxDB cluster is configured to run under the *meta* node role and the *data* node role. Therefore,
|
||||
it is not possible to separate the nodes participating in the Raft consensus cluster from
|
||||
the nodes accessing the data replicas.
|
||||
|
||||
Key terms, acronyms and abbreviations
|
||||
-------------------------------------
|
||||
|
||||
+----------------------+--------------------------------------------------------------------------------------------+
|
||||
| **Terms & acronyms** | **Definition** |
|
||||
+======================+============================================================================================+
|
||||
| The Collector | The StackLight Collector is a smart monitoring agent running on every node which collects |
|
||||
| | and process the metrics of your OpenStack environment. |
|
||||
+----------------------+--------------------------------------------------------------------------------------------+
|
||||
| InfluxDB | InfluxDB is a time-series, metrics, and analytics open-source database (MIT license). |
|
||||
| | It’s written in Go and has no external dependencies. |
|
||||
| | |
|
||||
| | InfluxDB is targeted at use cases for DevOps, metrics, sensor data, and real-time |
|
||||
| | analytics. |
|
||||
+----------------------+--------------------------------------------------------------------------------------------+
|
||||
| Grafana | Grafana is an (Apache 2.0 Licensed) general purpose dashboard and graph composer. |
|
||||
| | It's focused on providing rich ways to visualize metrics time-series, mainly though graphs |
|
||||
| | but supports other ways to visualize data through a pluggable panel architecture. |
|
||||
| | |
|
||||
| | It currently has rich support for Graphite, InfluxDB and OpenTSDB and also supports other |
|
||||
| | data sources via plugins. Grafana is most commonly used for infrastructure monitoring, |
|
||||
| | application monitoring and metric analytics. |
|
||||
+----------------------+--------------------------------------------------------------------------------------------+
|
||||
@@ -1,8 +1,8 @@
|
||||
.. _user_appendix:
|
||||
.. _references:
|
||||
|
||||
Appendix
|
||||
========
|
||||
References
|
||||
----------
|
||||
|
||||
* The `InfluxDB-Grafana plugin <https://github.com/openstack/fuel-plugin-influxdb-grafana>`_ project at GitHub.
|
||||
* The official `InfluxDB documentation <https://influxdb.com/docs/v0.9/>`_.
|
||||
* The official `Grafana documentation <http://docs.grafana.org/v3.0>`_.
|
||||
* The `InfluxDB-Grafana plugin <https://github.com/openstack/fuel-plugin-influxdb-grafana>`_ project at GitHub
|
||||
* The official `InfluxDB documentation <https://influxdb.com/docs/v0.9/>`_
|
||||
* The official `Grafana documentation <http://docs.grafana.org/v3.0>`_
|
||||
@@ -1,10 +1,10 @@
|
||||
.. _releases:
|
||||
.. _release_notes:
|
||||
|
||||
Release Notes
|
||||
=============
|
||||
Release notes
|
||||
-------------
|
||||
|
||||
Version 0.10.0
|
||||
--------------
|
||||
0.10.0
|
||||
++++++
|
||||
|
||||
* Changes
|
||||
|
||||
@@ -16,8 +16,8 @@ Version 0.10.0
|
||||
* Upgrade to InfluxDB v0.11.1.
|
||||
* Upgrade to Grafana v3.0.4.
|
||||
|
||||
Version 0.9.0
|
||||
-------------
|
||||
0.9.0
|
||||
+++++
|
||||
|
||||
- A new dashboard for hypervisor metrics.
|
||||
- A new dashboard for InfluxDB cluster.
|
||||
@@ -27,8 +27,8 @@ Version 0.9.0
|
||||
- Add support for InfluxDB clustering (beta state).
|
||||
- Use MySQL as Grafana backend to support HA.
|
||||
|
||||
Version 0.8.0
|
||||
-------------
|
||||
0.8.0
|
||||
+++++
|
||||
|
||||
- Add support for the "influxdb_grafana" Fuel Plugin role instead of
|
||||
the "base-os" role which had several limitations.
|
||||
@@ -38,7 +38,7 @@ Version 0.8.0
|
||||
- Several dashboard visualisation improvements.
|
||||
- A new self-monitoring dashboard.
|
||||
|
||||
Version 0.7.0
|
||||
-------------
|
||||
0.7.0
|
||||
+++++
|
||||
|
||||
- Initial release of the plugin. This is a beta version.
|
||||
- Initial release of the plugin. This is a beta version.
|
||||
27
doc/source/requirements.rst
Normal file
@@ -0,0 +1,27 @@
|
||||
.. _plugin_requirements:
|
||||
|
||||
Requirements
|
||||
------------
|
||||
|
||||
+-----------------------+-----------------------------------------------------------------------+
| **Requirement**       | **Version/Comment**                                                   |
+=======================+=======================================================================+
| Disk space            | The plugin’s specification requires provisioning at least 15GB of disk|
|                       | space for the system, 10GB for the logs and 30GB for the database. The|
|                       | installation of the plugin will fail if there is less than 55GB of    |
|                       | disk space available on the node.                                     |
+-----------------------+-----------------------------------------------------------------------+
| Mirantis OpenStack    | 8.0, 9.0                                                              |
+-----------------------+-----------------------------------------------------------------------+
| Hardware configuration| The hardware configuration (RAM, CPU, disk(s)) required by this plugin|
|                       | depends on the size of your cloud environment and other factors like  |
|                       | the retention policy. An average setup would require a quad-core      |
|                       | server with 8 GB of RAM and access to a 500-1000 IOPS disk.           |
|                       | See the `InfluxDB Hardware Sizing Guide                               |
|                       | <https://docs.influxdata.com/influxdb/v0.10/guides/hardware_sizing/>`_|
|                       | for additional sizing information.                                    |
|                       |                                                                       |
|                       | It is also highly recommended to use a dedicated disk for your data   |
|                       | storage. Otherwise, the InfluxDB-Grafana Plugin will use the root     |
|                       | filesystem by default.                                                |
+-----------------------+-----------------------------------------------------------------------+
|
||||
51
doc/source/troubleshooting.rst
Normal file
@@ -0,0 +1,51 @@
|
||||
.. _troubleshooting:
|
||||
|
||||
Troubleshooting
|
||||
---------------
|
||||
|
||||
If you get no data in Grafana, follow these troubleshooting tips.
|
||||
|
||||
#. First, check that the LMA Collector is running properly by following the
|
||||
LMA Collector troubleshooting instructions in the
|
||||
`LMA Collector Fuel Plugin User Guide <http://fuel-plugin-lma-collector.readthedocs.org/>`_.
|
||||
|
||||
#. Check that the nodes are able to connect to the InfluxDB cluster via the VIP address
|
||||
(see above how to get the InfluxDB cluster VIP address) on port *8086*::
|
||||
|
||||
root@node-2:~# curl -I http://<VIP>:8086/ping
|
||||
|
||||
The server should return a 204 HTTP status::
|
||||
|
||||
HTTP/1.1 204 No Content
|
||||
Request-Id: cdc3c545-d19d-11e5-b457-000000000000
|
||||
X-Influxdb-Version: 0.10.0
|
||||
Date: Fri, 12 Feb 2016 15:32:19 GMT
|
||||
|
||||
#. Check that the InfluxDB cluster VIP address is up and running::
|
||||
|
||||
root@node-1:~# crm resource status vip__influxdb
|
||||
resource vip__influxdb is running on: node-1.test.domain.local
|
||||
|
||||
#. Check that the InfluxDB service is started on all nodes of the cluster::
|
||||
|
||||
root@node-1:~# service influxdb status
|
||||
influxdb Process is running [ OK ]
|
||||
|
||||
#. If not, (re)start it::
|
||||
|
||||
root@node-1:~# service influxdb start
|
||||
Starting the process influxdb [ OK ]
|
||||
influxdb process was started [ OK ]
|
||||
|
||||
#. Check that the Grafana server is running::
|
||||
|
||||
root@node-1:~# service grafana-server status
|
||||
* grafana is running
|
||||
|
||||
#. If not, (re)start it::
|
||||
|
||||
root@node-1:~# service grafana-server start
|
||||
* Starting Grafana Server
|
||||
|
||||
#. If none of the above solves the problem, check the logs in ``/var/log/influxdb/influxdb.log``
|
||||
and ``/var/log/grafana/grafana.log`` to find out what might have gone wrong.
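The connectivity check against the InfluxDB VIP address described in the steps above can also be scripted. The following is a minimal sketch that uses only the Python standard library. The helper names ``ping_url``, ``is_healthy``, and ``influxdb_is_up`` are hypothetical, and you must supply the VIP address of your own cluster.

```python
import urllib.error
import urllib.request


def ping_url(vip, port=8086):
    """Build the InfluxDB ping URL for the given VIP address."""
    return f"http://{vip}:{port}/ping"


def is_healthy(status_code):
    """A successful ping is answered with 204 No Content."""
    return status_code == 204


def influxdb_is_up(vip, port=8086, timeout=5.0):
    """Send a HEAD request (like ``curl -I``) and check the status code."""
    request = urllib.request.Request(ping_url(vip, port), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return is_healthy(response.status)
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling ``influxdb_is_up("<VIP>")`` from a node returns ``True`` only when the server answers the ping with a 204 status. Any other outcome suggests continuing with the remaining checks in this list.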
|
||||
217
doc/source/usage.rst
Normal file
@@ -0,0 +1,217 @@
|
||||
.. _usage:
|
||||
|
||||
Exploring your time-series with Grafana
|
||||
---------------------------------------
|
||||
|
||||
The InfluxDB-Grafana Plugin comes with a collection of predefined
|
||||
dashboards you can use to visualize the time-series stored in InfluxDB.
|
||||
|
||||
Please check the LMA Collector documentation for a complete list of all the
|
||||
`metrics time-series <http://fuel-plugin-lma-collector.readthedocs.org/en/latest/appendix_b.html>`_
|
||||
that are collected and stored in InfluxDB.
|
||||
|
||||
The Main Dashboard
|
||||
++++++++++++++++++
|
||||
|
||||
We suggest you start with the **Main Dashboard**, as shown
|
||||
below, as an entry to the other dashboards.
|
||||
The **Main Dashboard** provides a single pane of glass from where you can visualize the
|
||||
overall health status of your OpenStack services such as Nova and Cinder
|
||||
but also HAProxy, MySQL and RabbitMQ to name a few.
|
||||
|
||||
.. image:: ../images/grafana_main.png
|
||||
:width: 800
|
||||
|
||||
As you can see, the **Main Dashboard** (as most dashboards) provides
|
||||
a drop down menu list in the upper left corner of the window
|
||||
from where you can pick a particular metric dimension such as
|
||||
the *controller name* or the *device name* you want to select.
|
||||
|
||||
In the example above, the system metrics of *node-48* are
|
||||
being displayed in the dashboard.
|
||||
|
||||
Within the **OpenStack Services** row, each of the services
|
||||
represented can be assigned one of five statuses.
|
||||
|
||||
.. note:: The precise determination of a service health status depends
|
||||
on the correlation policies implemented for that service by a `Global Status Evaluation (GSE)
|
||||
plugin <http://fuel-plugin-lma-collector.readthedocs.org/en/latest/alarms.html#cluster-policies>`_.
|
||||
|
||||
The meaning associated with a service health status is the following:
|
||||
|
||||
- **Down**: One or several primary functions of a service
|
||||
cluster have failed. For example,
|
||||
all API endpoints of a service cluster like Nova
|
||||
or Cinder have failed.
|
||||
- **Critical**: One or several primary functions of a
|
||||
service cluster are severely degraded. The quality
|
||||
of service delivered to the end-user should be severely
|
||||
impacted.
|
||||
- **Warning**: One or several primary functions of a
|
||||
service cluster are slightly degraded. The quality
|
||||
of service delivered to the end-user should be slightly
|
||||
impacted.
|
||||
- **Unknown**: There is not enough data to infer the actual
|
||||
health status of a service cluster.
|
||||
- **Okay**: None of the above was found to be true.
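One way to reason about the five statuses is as an ordered severity scale. The sketch below is illustrative only. The numeric ordering (in particular where *Unknown* sits relative to *Warning*) and the ``worst_status`` helper are assumptions, not the actual correlation logic of the GSE plugin, which is driven by the cluster policies referenced in the note above.

```python
from enum import IntEnum


class HealthStatus(IntEnum):
    """Service health statuses, ordered by assumed severity."""
    OKAY = 0
    WARNING = 1
    UNKNOWN = 2
    CRITICAL = 3
    DOWN = 4


def worst_status(statuses):
    """Return the most severe status, or OKAY when none are given."""
    return max(statuses, default=HealthStatus.OKAY)
```

Under this assumed ordering, a group of services reporting *Okay*, *Warning*, and *Critical* would be summarized as *Critical*.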
|
||||
|
||||
The **Virtual Compute Resources** row provides an overview of
|
||||
the amount of virtual resources being used by the compute nodes
|
||||
including the number of virtual CPUs, the amount of memory
|
||||
and disk space being used as well as the amount of virtual
|
||||
resources remaining available to create new instances.
|
||||
|
||||
The **System** row provides an overview of the amount of physical
|
||||
resources being used on the control plane (the controller cluster).
|
||||
You can select a specific controller using the
|
||||
controller's drop down list in the left corner of the toolbar.
|
||||
|
||||
The **Ceph** row provides an overview of the resources usage
|
||||
and current health status of the Ceph cluster when it is deployed
|
||||
in the OpenStack environment.
|
||||
|
||||
The **Main Dashboard** is also an entry point to access more detailed
|
||||
dashboards for each of the OpenStack services that are monitored.
|
||||
For example, if you click on the *Nova box*, the **Nova
|
||||
Dashboard** is displayed.
|
||||
|
||||
.. image:: ../images/grafana_nova.png
|
||||
:width: 800
|
||||
|
||||
The Nova dashboard
|
||||
++++++++++++++++++
|
||||
|
||||
The **Nova Dashboard** provides a detailed view of the
|
||||
Nova service's related metrics.
|
||||
|
||||
The **Service Status** row provides information about the Nova service
|
||||
cluster health status as a whole including the status of the API frontend
|
||||
(the HAProxy public VIP), a counter of HTTP 5xx errors,
|
||||
the HTTP requests response time and status code.
|
||||
|
||||
The **Nova API** row provides information about the current health status of
|
||||
the API backends (nova-api, ec2-api, ...).
|
||||
|
||||
The **Nova Services** row provides information about the current and
|
||||
historical status of the Nova *workers*.
|
||||
|
||||
The **Instances** row provides information about the number of active
instances, the number of instances in error, and instance creation time statistics.
|
||||
|
||||
The **Resources** row provides various virtual resources usage indicators.
|
||||
|
||||
Self-monitoring dashboards
|
||||
++++++++++++++++++++++++++
|
||||
|
||||
The first **Self-Monitoring Dashboard** was introduced in LMA 0.8.
|
||||
The intent of the self-monitoring dashboards is to bring operational
|
||||
insights about how the monitoring system itself (the toolchain) performs overall.
|
||||
|
||||
The **Self-Monitoring Dashboard** provides information about the *hekad*
|
||||
and *collectd* processes.
|
||||
In particular, it gives information about the amount of system resources
|
||||
consumed by these processes, the time allocated to the Lua plugins
|
||||
running within *hekad*, the amount of messages being processed and
|
||||
the time it takes to process those messages.
|
||||
|
||||
Again, it is possible to select a particular node view using the drop down
|
||||
menu list.
|
||||
|
||||
With LMA 0.9, we have introduced two new dashboards.
|
||||
|
||||
#. The **Elasticsearch Cluster Dashboard** provides information about
|
||||
the overall health status of the Elasticsearch cluster including
|
||||
the state of the shards, the number of pending tasks and various resources
|
||||
usage metrics.
|
||||
|
||||
#. The **InfluxDB Cluster Dashboard** provides statistics about the InfluxDB
|
||||
processes running in the InfluxDB cluster including various resources usage metrics.
|
||||
|
||||
|
||||
The hypervisor dashboard
|
||||
++++++++++++++++++++++++
|
||||
|
||||
LMA 0.9 introduces a new **Hypervisor Dashboard** which brings operational
|
||||
insights about the virtual instances managed through *libvirt*.
|
||||
As shown in the figure below, the **Hypervisor Dashboard** assembles a
|
||||
view of various *libvirt* metrics. A dropdown menu list allows you to pick
|
||||
a particular instance UUID running on a particular node. In the
|
||||
example below, the metrics for the instance id *ba844a75-b9db-4c2f-9cb9-0b083fe03fb7*
|
||||
running on *node-4* are displayed.
|
||||
|
||||
.. image:: ../images/grafana_hypervisor.png
|
||||
:width: 800
|
||||
|
||||
Check the LMA Collector documentation for additional information about the
|
||||
`libvirt metrics <http://fuel-plugin-lma-collector.readthedocs.org/en/latest/appendix_b.html#libvirt>`_
|
||||
that are displayed in the **Hypervisor Dashboard**.
|
||||
|
||||
Other dashboards
|
||||
++++++++++++++++
|
||||
|
||||
In total there are 19 different dashboards you can use to
|
||||
explore different time-series facets of your OpenStack environment.
|
||||
|
||||
Viewing faults and anomalies
|
||||
++++++++++++++++++++++++++++
|
||||
|
||||
The LMA Toolchain is capable of detecting a number of service-affecting
|
||||
conditions such as the faults and anomalies that occurred in your OpenStack
|
||||
environment.
|
||||
Those conditions are reported in annotations that are displayed in
|
||||
Grafana. The Grafana annotations contain a textual
|
||||
representation of the alarm (or set of alarms) that were triggered
|
||||
by the Collectors for a service.
|
||||
In other words, the annotations contain valuable insights
|
||||
that you could use to diagnose and
|
||||
troubleshoot problems. Furthermore, with the Grafana annotations,
|
||||
the system makes a distinction between what is estimated as a
|
||||
direct root cause versus what is estimated as an indirect
|
||||
root cause. This is internally represented in a dependency graph.
|
||||
There are first degree dependencies used to describe situations
|
||||
whereby the health status of an entity
|
||||
strictly depends on the health status of another entity. For
|
||||
example Nova as a service has first degree dependencies
|
||||
with the nova-api endpoints and the nova-scheduler workers. But
|
||||
there are also second degree dependencies whereby the health
|
||||
status of an entity doesn't strictly depend on the health status
|
||||
of another entity, although it might, depending on other operations
|
||||
being performed. For example, by default we declared that Nova
|
||||
has a second degree dependency with Neutron. As a result, the
|
||||
health status of Nova will not be directly impacted by the health
|
||||
status of Neutron but the annotation will provide
|
||||
a root cause analysis hint. Let's assume a situation
|
||||
where Nova has changed from *okay* to *critical* status (because of
|
||||
5xx HTTP errors) and that Neutron has been in *down* status for a while.
|
||||
In this case, the Nova dashboard will display an annotation showing that
|
||||
Nova has changed to a *critical* status because the system has detected
|
||||
5xx errors and that it may be due to the fact that Neutron is *down*.
|
||||
An example of what an annotation looks like is shown below.
|
||||
|
||||
.. image:: ../images/grafana_nova_annot.png
|
||||
:width: 800
|
||||
|
||||
This annotation shows that the health status of Nova is *down*
|
||||
because there is no *nova-api* service backend (viewed from HAProxy)
|
||||
that is *up*.
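
The distinction between first degree and second degree dependencies can be
illustrated with a minimal sketch. This is not the actual GSE implementation:
the ``Service`` class and the ``root_cause_hints`` function are hypothetical
names, and the rule that any status other than *okay* counts as unhealthy is
an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Service:
    """Hypothetical model of an entity in the dependency graph."""
    name: str
    status: str = "okay"
    first_degree: list = field(default_factory=list)   # strict dependencies
    second_degree: list = field(default_factory=list)  # non-strict dependencies


def root_cause_hints(service, services):
    """Return (direct, indirect) lists of unhealthy dependency names.

    Assumption: a dependency is unhealthy when its status is not "okay".
    """
    def unhealthy(names):
        return [n for n in names if services[n].status != "okay"]

    return unhealthy(service.first_degree), unhealthy(service.second_degree)


# Scenario from the text: Nova is critical while Neutron has been down.
services = {
    "nova-api": Service("nova-api"),
    "nova-scheduler": Service("nova-scheduler"),
    "neutron": Service("neutron", status="down"),
}
nova = Service(
    "nova",
    status="critical",
    first_degree=["nova-api", "nova-scheduler"],
    second_degree=["neutron"],
)

direct, indirect = root_cause_hints(nova, services)
print("direct:", direct, "indirect:", indirect)
```

In this scenario no first degree dependency is unhealthy, so Neutron is only
reported as a possible indirect root cause.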
|
||||
|
||||
Hiding nodes from dashboards
|
||||
++++++++++++++++++++++++++++
|
||||
|
||||
When you remove a node from the environment, it is still displayed in
|
||||
the 'server' and 'controller' drop-down lists. To hide it from the list
|
||||
you need to edit the associated InfluxDB query in the *templating* section.
|
||||
For example, if you want to remove *node-1*, you need to add the following
|
||||
condition to the *where* clause::
|
||||
|
||||
and hostname != 'node-1'
|
||||
|
||||
|
||||
.. image:: ../images/remove_controllers_from_templating.png
|
||||
|
||||
If you want to hide more than one node you can add more conditions like this::
|
||||
|
||||
and hostname != 'node-1' and hostname != 'node-2'
|
||||
|
||||
This should be done for all dashboards that display the deleted node and you
|
||||
need to save them afterwards.
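
When several nodes have to be hidden, the extra conditions can be generated
rather than typed by hand. The following is a minimal sketch; the
``hide_nodes_clause`` helper is a hypothetical name and is not part of the
plugin. It only builds the text to append to the *where* clause of the
templating query.

```python
def hide_nodes_clause(hostnames):
    """Build the conditions to append to the WHERE clause of the query.

    Hostnames containing a single quote are rejected so that the
    generated text cannot break out of the quoted string.
    """
    conditions = []
    for hostname in hostnames:
        if "'" in hostname:
            raise ValueError("invalid hostname: {!r}".format(hostname))
        conditions.append("and hostname != '{}'".format(hostname))
    return " ".join(conditions)


print(hide_nodes_clause(["node-1", "node-2"]))
```

The generated text still has to be pasted into the templating query of each
dashboard that displays the deleted node.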
|
||||
@@ -1,499 +0,0 @@
|
||||
.. _user_guide:
|
||||
|
||||
User Guide
|
||||
==========
|
||||
|
||||
.. _plugin_configuration:
|
||||
|
||||
Plugin configuration
|
||||
--------------------
|
||||
|
||||
To configure the **StackLight InfluxDB-Grafana Plugin**, you need to follow these steps:
|
||||
|
||||
1. `Create a new environment
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/create-environment/start-create-env.html>`_.
|
||||
|
||||
2. Click on the *Settings* tab of the Fuel web UI and select the *Other* category.
|
||||
|
||||
3. Scroll down through the settings until you find the **InfluxDB-Grafana Server
|
||||
Plugin** section. You should see a page like this:
|
||||
|
||||
.. image:: ../images/influx_grafana_settings.png
|
||||
:width: 800
|
||||
|
||||
4. Tick the **InfluxDB-Grafana Plugin** box and fill in the required fields as indicated below.
|
||||
|
||||
a. Specify the number of days of retention for your data.
|
||||
b. Specify the InfluxDB admin password (called root password in the InfluxDB documentation).
|
||||
c. Specify the database name (default is lma).
|
||||
d. Specify the InfluxDB username and password.
|
||||
e. Specify the Grafana username and password.
|
||||
|
||||
5. Since the introduction of Grafana 2.6.0, the plugin now uses a MySQL database
|
||||
to store its configuration data such as the dashboard templates.
|
||||
|
||||
a. Select **Local MySQL** if you want to create the Grafana database using the MySQL server
|
||||
of the OpenStack control-plane. Otherwise, select **Remote server** and specify
|
||||
the fully qualified name or IP address of the MySQL server you want to use.
|
||||
b. Then, specify the MySQL database name, username and password that will be used
|
||||
to access that database.
|
||||
|
||||
6. Tick the *Enable TLS for Grafana* box if you want to encrypt your
|
||||
Grafana credentials (username, password). Then, fill in the required
|
||||
fields as indicated below.
|
||||
|
||||
.. image:: ../images/tls_settings.png
|
||||
:width: 800
|
||||
|
||||
a. Specify the DNS name of the Grafana server. This parameter is used
|
||||
to create a link in the Fuel dashboard to the Grafana server.
|
||||
#. Specify the location of a PEM file that contains the certificate
|
||||
and the private key of the Grafana server that will be used in TLS handshakes
|
||||
with the client.
|
||||
|
||||
7. Tick the *Use LDAP for Grafana authentication* box if you want to authenticate
|
||||
via LDAP to Grafana. Then, fill in the required fields as indicated below.
|
||||
|
||||
.. image:: ../images/ldap_auth.png
|
||||
:width: 800
|
||||
|
||||
a. Select the *LDAPS* button if you want to enable LDAP authentication
|
||||
over SSL.
|
||||
#. Specify one or several LDAP server addresses separated by a space. Those
|
||||
addresses must be accessible from the node where Grafana is installed.
|
||||
Note that addresses external to the *management network* are not routable
|
||||
by default (see the note below).
|
||||
#. Specify the LDAP server port number or leave it empty to use the defaults.
|
||||
#. Specify the *Bind DN* of a user who has search privileges on the LDAP server.
|
||||
#. Specify the password of the user identified by the *Bind DN* above.
|
||||
#. Specify the *Base DN* in the Directory Information Tree (DIT) from where
|
||||
to search for users.
|
||||
#. Specify a valid user search filter (ex. (uid=%s)).
|
||||
The result of the search should return a unique user entry.
|
||||
#. Specify a valid search filter to search for users.
|
||||
Example ``(uid=%s)``
|
||||
|
||||
You can further restrict access to Grafana to those users who
|
||||
are members of a specific LDAP group.
|
||||
|
||||
a. Tick the *Enable group-based authorization* box.
|
||||
#. Specify the LDAP group *Base DN* in the DIT from where to search
|
||||
for groups.
|
||||
#. Specify the LDAP group search filter.
|
||||
Example ``(&(objectClass=posixGroup)(memberUid=%s))``
|
||||
#. Specify the CN of the LDAP group that will be mapped to the *admin role*
|
||||
#. Specify the CN of the LDAP group that will be mapped to the *viewer role*
|
||||
|
||||
Users who have the *admin role* can modify the Grafana dashboards
|
||||
or create new ones. Users who have the *viewer role* can only
|
||||
visualise the Grafana dashboards.
|
||||
|
||||
7. `Configure your environment
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/configure-environment.html>`_.
|
||||
|
||||
.. note:: By default, StackLight is configured to use the *management network*,
|
||||
of the so-called `Default Node Network Group
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/configure-environment/network-settings.html>`_.
|
||||
While this default setup may be appropriate for small deployments or
|
||||
evaluation purposes, it is recommended not to use this network
|
||||
for StackLight in production. It is instead recommended to create a network
|
||||
dedicated to StackLight using the `networking templates
|
||||
<https://docs.mirantis.com/openstack/fuel/fuel-8.0/operations.html#using-networking-templates>`_
|
||||
capability of Fuel. Using a dedicated network for StackLight will
|
||||
improve performance and reduce the monitoring footprint on the
|
||||
control-plane. It will also facilitate access to the Grafana UI
|
||||
after deployment as the *management network* is not routable.
|
||||
|
||||
8. Click the *Nodes* tab and assign the *InfluxDB_Grafana* role
|
||||
to the node(s) where you want to install the plugin.
|
||||
|
||||
You can see in the example below that the *InfluxDB_Grafana*
|
||||
role is assigned to three nodes alongside the
|
||||
*Alerting_Infrastructure* and the *Elasticsearch_Kibana* roles.
|
||||
Here, the three plugins of the LMA toolchain backend servers are
|
||||
installed on the same nodes. You can assign the *InfluxDB_Grafana*
|
||||
role to either one node (standalone install) or three nodes for HA.
|
||||
|
||||
.. image:: ../images/influx_grafana_role.png
|
||||
:width: 800
|
||||
|
||||
.. note:: Installing the InfluxDB server on more than three nodes
|
||||
is currently not possible using the Fuel plugin.
|
||||
Similarly, installing the InfluxDB server on two nodes
|
||||
is not recommended to avoid split-brain situations in the Raft
|
||||
consensus of the InfluxDB cluster as well as the *Pacemaker* cluster
|
||||
which is responsible for the VIP address failover.
|
||||
Note also that it is possible to add or remove nodes
|
||||
with the *InfluxDB_Grafana* role in the cluster after deployment.
|
||||
|
||||
9. `Adjust the disk partitioning if necessary
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/configure-environment/customize-partitions.html>`_.
|
||||
|
||||
By default, the InfluxDB-Grafana Plugin allocates:
|
||||
|
||||
* 20% of the first available disk for the operating system by honoring
|
||||
a range of 15GB minimum to 50GB maximum.
|
||||
* 10GB for */var/log*.
|
||||
* At least 30 GB for the InfluxDB database in */var/lib/influxdb*.
|
||||
|
||||
10. `Deploy your environment
|
||||
<http://docs.openstack.org/developer/fuel-docs/userdocs/fuel-user-guide/deploy-environment.html>`_.
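
The default allocation rule for the operating system partition (20% of the
first available disk, kept within 15 GB and 50 GB) can be sketched as a small
calculation. The ``os_partition_gb`` function is a hypothetical name, and
reading the rule as a simple clamp is an assumption based on the description
above, not on the plugin's code.

```python
def os_partition_gb(disk_gb):
    """Return the assumed default size (GB) of the operating system partition.

    Takes 20% of the disk, then keeps the result between 15 GB and 50 GB.
    """
    twenty_percent = disk_gb * 20 / 100
    return min(max(twenty_percent, 15.0), 50.0)


for size in (50, 100, 500):
    print(size, "GB disk ->", os_partition_gb(size), "GB for the OS")
```

Under that reading, a 50 GB disk is raised to the 15 GB minimum and a 500 GB
disk is capped at the 50 GB maximum.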
|
||||
|
||||
.. _plugin_install_verification:
|
||||
|
||||
Plugin verification
|
||||
-------------------
|
||||
|
||||
Be aware that depending on the number of nodes and deployment setup,
|
||||
deploying a Mirantis OpenStack environment can typically take anything
|
||||
from 30 minutes to several hours. But once your deployment is complete,
|
||||
you should see a notification message indicating that your deployment
|
||||
successfully completed as in the figure below.
|
||||
|
||||
.. image:: ../images/deployment_notification.png
|
||||
:width: 800
|
||||
|
||||
Verifying InfluxDB
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You should verify that the InfluxDB cluster is running properly.
|
||||
First, you need to retrieve the InfluxDB cluster VIP address.
|
||||
Here is how to proceed.
|
||||
|
||||
#. On the Fuel Master node, find the IP address of a node where the InfluxDB
|
||||
server is installed using the following command::
|
||||
|
||||
[root@fuel ~]# fuel nodes
|
||||
id | status | name | cluster | ip | mac | roles |
|
||||
---|----------|------------------|---------|------------|-----|------------------|
|
||||
1 | ready | Untitled (fa:87) | 1 | 10.109.0.8 | ... | influxdb_grafana |
|
||||
2 | ready | Untitled (12:aa) | 1 | 10.109.0.3 | ... | influxdb_grafana |
|
||||
3 | ready | Untitled (4e:6e) | 1 | 10.109.0.7 | ... | influxdb_grafana |
|
||||
|
||||
|
||||
#. Then ``ssh`` to any one of these nodes (ex. *node-1*) and type the command::
|
||||
|
||||
root@node-1:~# hiera lma::influxdb::vip
|
||||
10.109.1.4
|
||||
|
||||
This tells you that the VIP address of your InfluxDB cluster is *10.109.1.4*.
|
||||
|
||||
#. With that VIP address type the command::
|
||||
|
||||
root@node-1:~# /usr/bin/influx -database lma -password lmapass \
|
||||
--username root -host 10.109.1.4 -port 8086
|
||||
Visit https://enterprise.influxdata.com to register for updates,
|
||||
InfluxDB server management, and monitoring.
|
||||
Connected to http://10.109.1.4:8086 version 0.10.0
|
||||
InfluxDB shell 0.10.0
|
||||
>
|
||||
|
||||
As you can see, executing */usr/bin/influx* will start an interactive CLI and automatically connect to
|
||||
the InfluxDB server. Then if you type::
|
||||
|
||||
> show series
|
||||
|
||||
You should see a dump of all the time-series collected so far.
|
||||
Then, if you type::
|
||||
|
||||
> show servers
|
||||
name: data_nodes
|
||||
----------------
|
||||
id http_addr tcp_addr
|
||||
1 node-1:8086 node-1:8088
|
||||
3 node-2:8086 node-2:8088
|
||||
5 node-3:8086 node-3:8088
|
||||
|
||||
name: meta_nodes
|
||||
----------------
|
||||
id http_addr tcp_addr
|
||||
1 node-1:8091 node-1:8088
|
||||
2 node-2:8091 node-2:8088
|
||||
4 node-3:8091 node-3:8088
|
||||
|
||||
You should see a list of the nodes participating in the `InfluxDB cluster
|
||||
<https://docs.influxdata.com/influxdb/v0.10/guides/clustering/>`_ with their roles (data or meta).
|
||||
|
||||
|
||||
Verifying Grafana
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
From the Fuel Dashboard, click on the **Grafana** link (or enter the IP address
|
||||
and port number if your DNS is not set up).
|
||||
The first time you access Grafana, you are requested to
|
||||
authenticate using your credentials.
|
||||
|
||||
.. image:: ../images/grafana_login.png
|
||||
:width: 800
|
||||
|
||||
Then you should be redirected to the *Grafana Home Page*
|
||||
from where you can select a dashboard as shown below.
|
||||
|
||||
.. image:: ../images/grafana_home.png
|
||||
:width: 800
|
||||
|
||||
Exploring your time-series with Grafana
|
||||
---------------------------------------
|
||||
|
||||
The InfluxDB-Grafana Plugin comes with a collection of predefined
|
||||
dashboards you can use to visualize the time-series stored in InfluxDB.
|
||||
|
||||
Please check the LMA Collector documentation for a complete list of all the
|
||||
`metrics time-series <http://fuel-plugin-lma-collector.readthedocs.org/en/latest/appendix_b.html>`_
|
||||
that are collected and stored in InfluxDB.
|
||||
|
||||
The Main Dashboard
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
We suggest you start with the **Main Dashboard**, as shown
|
||||
below, as an entry to the other dashboards.
|
||||
The **Main Dashboard** provides a single pane of glass from where you can visualize the
|
||||
overall health status of your OpenStack services such as Nova and Cinder
|
||||
but also HAProxy, MySQL and RabbitMQ to name a few.
|
||||
|
||||
.. image:: ../images/grafana_main.png
|
||||
:width: 800
|
||||
|
||||
As you can see, the **Main Dashboard** (as most dashboards) provides
|
||||
a drop down menu list in the upper left corner of the window
|
||||
from where you can pick a particular metric dimension such as
|
||||
the *controller name* or the *device name* you want to select.
|
||||
|
||||
In the example above, the system metrics of *node-48* are
|
||||
being displayed in the dashboard.
|
||||
|
||||
Within the **OpenStack Services** row, each of the services
|
||||
represented can be assigned five different statuses.
|
||||
|
||||
.. note:: The precise determination of a service health status depends
|
||||
on the correlation policies implemented for that service by a `Global Status Evaluation (GSE)
|
||||
plugin <http://fuel-plugin-lma-collector.readthedocs.org/en/latest/alarms.html#cluster-policies>`_.
|
||||
|
||||
The meaning associated with a service health status is the following:
|
||||
|
||||
- **Down**: One or several primary functions of a service
  cluster have failed. For example,
  all API endpoints of a service cluster like Nova
  or Cinder have failed.
|
||||
- **Critical**: One or several primary functions of a
|
||||
service cluster are severely degraded. The quality
|
||||
of service delivered to the end-user should be severely
|
||||
impacted.
|
||||
- **Warning**: One or several primary functions of a
|
||||
service cluster are slightly degraded. The quality
|
||||
of service delivered to the end-user should be slightly
|
||||
impacted.
|
||||
- **Unknown**: There is not enough data to infer the actual
|
||||
health status of a service cluster.
|
||||
- **Okay**: None of the above was found to be true.
|
||||
|
||||
The **Virtual Compute Resources** row provides an overview of
|
||||
the amount of virtual resources being used by the compute nodes
|
||||
including the number of virtual CPUs, the amount of memory
|
||||
and disk space being used as well as the amount of virtual
|
||||
resources remaining available to create new instances.
|
||||
|
||||
The "System" row provides an overview of the amount of physical
|
||||
resources being used on the control plane (the controller cluster).
|
||||
You can select a specific controller using the
|
||||
controller's drop down list in the left corner of the toolbar.
|
||||
|
||||
The "Ceph" row provides an overview of the resources usage
|
||||
and current health status of the Ceph cluster when it is deployed
|
||||
in the OpenStack environment.
|
||||
|
||||
The **Main Dashboard** is also an entry point to access more detailed
|
||||
dashboards for each of the OpenStack services that are monitored.
|
||||
For example, if you click on the *Nova box*, the **Nova
|
||||
Dashboard** is displayed.
|
||||
|
||||
.. image:: ../images/grafana_nova.png
|
||||
:width: 800
|
||||
|
||||
The Nova Dashboard
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The **Nova Dashboard** provides a detailed view of the
|
||||
Nova service's related metrics.
|
||||
|
||||
The **Service Status** row provides information about the Nova service
|
||||
cluster health status as a whole including the status of the API frontend
|
||||
(the HAProxy public VIP), a counter of HTTP 5xx errors,
|
||||
the HTTP requests response time and status code.
|
||||
|
||||
The **Nova API** row provides information about the current health status of
|
||||
the API backends (nova-api, ec2-api, ...).
|
||||
|
||||
The **Nova Services** row provides information about the current and
|
||||
historical status of the Nova *workers*.
|
||||
|
||||
The **Instances** row provides information about the number of active
instances, the number of instances in error, and instance creation time statistics.
|
||||
|
||||
The **Resources** row provides various virtual resources usage indicators.
|
||||
|
||||
Self-Monitoring Dashboards
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The first **Self-Monitoring Dashboard** was introduced in LMA 0.8.
|
||||
The intent of the self-monitoring dashboards is to bring operational
|
||||
insights about how the monitoring system itself (the toolchain) performs overall.
|
||||
|
||||
The **Self-Monitoring Dashboard** provides information about the *hekad*
|
||||
and *collectd* processes.
|
||||
In particular, it gives information about the amount of system resources
|
||||
consumed by these processes, the time allocated to the Lua plugins
|
||||
running within *hekad*, the amount of messages being processed and
|
||||
the time it takes to process those messages.
|
||||
|
||||
Again, it is possible to select a particular node view using the drop down
|
||||
menu list.
|
||||
|
||||
With LMA 0.9, we have introduced two new dashboards.
|
||||
|
||||
#. The **Elasticsearch Cluster Dashboard** provides information about
|
||||
the overall health status of the Elasticsearch cluster including
|
||||
the state of the shards, the number of pending tasks and various resources
|
||||
usage metrics.
|
||||
|
||||
#. The **InfluxDB Cluster Dashboard** provides statistics about the InfluxDB
|
||||
processes running in the InfluxDB cluster including various resources usage metrics.
|
||||
|
||||
|
||||
The Hypervisor Dashboard
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
LMA 0.9 introduces a new **Hypervisor Dashboard** which brings operational
|
||||
insights about the virtual instances managed through *libvirt*.
|
||||
As shown in the figure below, the **Hypervisor Dashboard** assembles a
|
||||
view of various *libvirt* metrics. A dropdown menu list allows you to pick
|
||||
a particular instance UUID running on a particular node. In the
|
||||
example below, the metrics for the instance id *ba844a75-b9db-4c2f-9cb9-0b083fe03fb7*
|
||||
running on *node-4* are displayed.
|
||||
|
||||
.. image:: ../images/grafana_hypervisor.png
|
||||
:width: 800
|
||||
|
||||
Check the LMA Collector documentation for additional information about the
|
||||
`libvirt metrics <http://fuel-plugin-lma-collector.readthedocs.org/en/latest/appendix_b.html#libvirt>`_
|
||||
that are displayed in the **Hypervisor Dashboard**.
|
||||
|
||||
Other Dashboards
|
||||
~~~~~~~~~~~~~~~~
|
||||
|
||||
In total there are 19 different dashboards you can use to
|
||||
explore different time-series facets of your OpenStack environment.
|
||||
|
||||
Viewing Faults and Anomalies
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
The LMA Toolchain is capable of detecting a number of service-affecting
|
||||
conditions such as the faults and anomalies that occurred in your OpenStack
|
||||
environment.
|
||||
Those conditions are reported in annotations that are displayed in
|
||||
Grafana. The Grafana annotations contain a textual
|
||||
representation of the alarm (or set of alarms) that were triggered
|
||||
by the Collectors for a service.
|
||||
In other words, the annotations contain valuable insights
|
||||
that you could use to diagnose and
|
||||
troubleshoot problems. Furthermore, with the Grafana annotations,
|
||||
the system makes a distinction between what is estimated as a
|
||||
direct root cause versus what is estimated as an indirect
|
||||
root cause. This is internally represented in a dependency graph.
|
||||
There are first degree dependencies used to describe situations
|
||||
whereby the health status of an entity
|
||||
strictly depends on the health status of another entity. For
|
||||
example Nova as a service has first degree dependencies
|
||||
with the nova-api endpoints and the nova-scheduler workers. But
|
||||
there are also second degree dependencies whereby the health
|
||||
status of an entity doesn't strictly depend on the health status
|
||||
of another entity, although it might, depending on other operations
|
||||
being performed. For example, by default we declared that Nova
|
||||
has a second degree dependency with Neutron. As a result, the
|
||||
health status of Nova will not be directly impacted by the health
|
||||
status of Neutron but the annotation will provide
|
||||
a root cause analysis hint. Let's assume a situation
|
||||
where Nova has changed from *okay* to *critical* status (because of
|
||||
5xx HTTP errors) and that Neutron has been in *down* status for a while.
|
||||
In this case, the Nova dashboard will display an annotation showing that
|
||||
Nova has changed to a *critical* status because the system has detected
|
||||
5xx errors and that it may be due to the fact that Neutron is *down*.
|
||||
An example of what an annotation looks like is shown below.
|
||||
|
||||
.. image:: ../images/grafana_nova_annot.png
|
||||
:width: 800
|
||||
|
||||
This annotation shows that the health status of Nova is *down*
|
||||
because there is no *nova-api* service backend (viewed from HAProxy)
|
||||
that is *up*.
|
||||
|
||||
Hiding nodes from dashboards
|
||||
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
||||
|
||||
When you remove a node from the environment, it is still displayed in
|
||||
the 'server' and 'controller' drop-down lists. To hide it from the list
|
||||
you need to edit the associated InfluxDB query in the *templating* section.
|
||||
For example, if you want to remove *node-1*, you need to add the following
|
||||
condition to the *where* clause::
|
||||
|
||||
and hostname != 'node-1'
|
||||
|
||||
|
||||
.. image:: ../images/remove_controllers_from_templating.png
|
||||
|
||||
If you want to hide more than one node you can add more conditions like this::
|
||||
|
||||
and hostname != 'node-1' and hostname != 'node-2'
|
||||
|
||||
This should be done for all dashboards that display the deleted node and you
|
||||
need to save them afterwards.
|
||||
|
||||
Troubleshooting
|
||||
---------------
|
||||
|
||||
If you get no data in Grafana, follow these troubleshooting tips.
|
||||
|
||||
#. First, check that the LMA Collector is running properly by following the
|
||||
LMA Collector troubleshooting instructions in the
|
||||
`LMA Collector Fuel Plugin User Guide <http://fuel-plugin-lma-collector.readthedocs.org/>`_.
|
||||
|
||||
#. Check that the nodes are able to connect to the InfluxDB cluster via the VIP address
|
||||
(see above how to get the InfluxDB cluster VIP address) on port *8086*::
|
||||
|
||||
root@node-2:~# curl -I http://<VIP>:8086/ping
|
||||
|
||||
The server should return a 204 HTTP status::
|
||||
|
||||
HTTP/1.1 204 No Content
|
||||
Request-Id: cdc3c545-d19d-11e5-b457-000000000000
|
||||
X-Influxdb-Version: 0.10.0
|
||||
Date: Fri, 12 Feb 2016 15:32:19 GMT
|
||||
|
||||
#. Check that the InfluxDB cluster VIP address is up and running::
|
||||
|
||||
root@node-1:~# crm resource status vip__influxdb
|
||||
resource vip__influxdb is running on: node-1.test.domain.local
|
||||
|
||||
#. Check that the InfluxDB service is started on all nodes of the cluster::
|
||||
|
||||
root@node-1:~# service influxdb status
|
||||
influxdb Process is running [ OK ]
|
||||
|
||||
#. If not, (re)start it::
|
||||
|
||||
root@node-1:~# service influxdb start
|
||||
Starting the process influxdb [ OK ]
|
||||
influxdb process was started [ OK ]
|
||||
|
||||
#. Check that the Grafana server is running::
|
||||
|
||||
root@node-1:~# service grafana-server status
|
||||
* grafana is running
|
||||
|
||||
#. If not, (re)start it::
|
||||
|
||||
root@node-1:~# service grafana-server start
|
||||
* Starting Grafana Server
|
||||
|
||||
#. If none of the above solves the problem, check the logs in ``/var/log/influxdb/influxdb.log``
|
||||
and ``/var/log/grafana/grafana.log`` to find out what might have gone wrong.
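
The connectivity check against the ``/ping`` endpoint can also be scripted.
The following is a minimal sketch that relies only on the Python standard
library; the ``ping_url`` and ``influxdb_is_reachable`` functions are
hypothetical helpers, and the only facts taken from this guide are the port
(*8086*), the ``/ping`` path and the expected 204 status.

```python
import urllib.request


def ping_url(vip, port=8086):
    """Return the URL of the InfluxDB ping endpoint for the given VIP."""
    return "http://{}:{}/ping".format(vip, port)


def influxdb_is_reachable(vip, port=8086, timeout=5):
    """Send a HEAD request (like ``curl -I``) and expect a 204 status."""
    request = urllib.request.Request(ping_url(vip, port), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 204
    except OSError:
        # Covers connection errors, timeouts and HTTP error responses.
        return False
```

For instance, ``influxdb_is_reachable("10.109.1.4")`` would probe the VIP
address used in the examples of this guide from the node where it is run.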
|
||||
92
doc/source/verification.rst
Normal file
@@ -0,0 +1,92 @@
|
||||
.. _verification:
|
||||
|
||||
Plugin verification
|
||||
-------------------
|
||||
|
||||
Be aware that depending on the number of nodes and deployment setup,
|
||||
deploying a Mirantis OpenStack environment can typically take anything
|
||||
from 30 minutes to several hours. But once your deployment is complete,
|
||||
you should see a notification message indicating that your deployment
|
||||
successfully completed as in the figure below.
|
||||
|
||||
.. image:: ../images/deployment_notification.png
|
||||
:width: 800
|
||||
|
||||
Verifying InfluxDB
|
||||
~~~~~~~~~~~~~~~~~~
|
||||
|
||||
You should verify that the InfluxDB cluster is running properly.
|
||||
First, you need to retrieve the InfluxDB cluster VIP address.
|
||||
Here is how to proceed.
|
||||
|
||||
#. On the Fuel Master node, find the IP address of a node where the InfluxDB
|
||||
server is installed using the following command::
|
||||
|
||||
[root@fuel ~]# fuel nodes
|
||||
id | status | name | cluster | ip | mac | roles |
|
||||
---|----------|------------------|---------|------------|-----|------------------|
|
||||
1 | ready | Untitled (fa:87) | 1 | 10.109.0.8 | ... | influxdb_grafana |
|
||||
2 | ready | Untitled (12:aa) | 1 | 10.109.0.3 | ... | influxdb_grafana |
|
||||
3 | ready | Untitled (4e:6e) | 1 | 10.109.0.7 | ... | influxdb_grafana |
|
||||
|
||||
|
||||
#. Then ``ssh`` to any one of these nodes (ex. *node-1*) and type the command::
|
||||
|
||||
root@node-1:~# hiera lma::influxdb::vip
|
||||
10.109.1.4
|
||||
|
||||
This tells you that the VIP address of your InfluxDB cluster is *10.109.1.4*.
|
||||
|
||||
#. With that VIP address type the command::
|
||||
|
||||
root@node-1:~# /usr/bin/influx -database lma -password lmapass \
|
||||
--username root -host 10.109.1.4 -port 8086
|
||||
Visit https://enterprise.influxdata.com to register for updates,
|
||||
InfluxDB server management, and monitoring.
|
||||
Connected to http://10.109.1.4:8086 version 0.10.0
|
||||
InfluxDB shell 0.10.0
|
||||
>
|
||||
|
||||
As you can see, executing */usr/bin/influx* will start an interactive CLI and automatically connect to
|
||||
the InfluxDB server. Then if you type::
|
||||
|
||||
> show series
|
||||
|
||||
You should see a dump of all the time-series collected so far.
|
||||
Then, if you type::
|
||||
|
||||
> show servers
|
||||
name: data_nodes
|
||||
----------------
|
||||
id http_addr tcp_addr
|
||||
1 node-1:8086 node-1:8088
|
||||
3 node-2:8086 node-2:8088
|
||||
5 node-3:8086 node-3:8088
|
||||
|
||||
name: meta_nodes
|
||||
----------------
|
||||
id http_addr tcp_addr
|
||||
1 node-1:8091 node-1:8088
|
||||
2 node-2:8091 node-2:8088
|
||||
4 node-3:8091 node-3:8088
|
||||
|
||||
You should see a list of the nodes participating in the `InfluxDB cluster
|
||||
<https://docs.influxdata.com/influxdb/v0.10/guides/clustering/>`_ with their roles (data or meta).
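
If you want to check the cluster membership from a script, the text printed
by ``show servers`` can be parsed. The following is a minimal sketch that
assumes the output has exactly the layout shown above; the
``parse_show_servers`` function is a hypothetical helper and is not shipped
with the plugin or with InfluxDB.

```python
def parse_show_servers(output):
    """Group host names by section (for example data_nodes, meta_nodes)."""
    groups = {}
    current = None
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith("-"):
            continue  # skip blank lines and the dashed separators
        if line.startswith("name:"):
            current = line.split(":", 1)[1].strip()
            groups[current] = []
            continue
        fields = line.split()
        if current is None or fields[0] == "id":
            continue  # skip the column header row
        http_addr = fields[1]
        groups[current].append(http_addr.split(":")[0])
    return groups


SAMPLE = """\
name: data_nodes
----------------
id http_addr tcp_addr
1 node-1:8086 node-1:8088
3 node-2:8086 node-2:8088
5 node-3:8086 node-3:8088

name: meta_nodes
----------------
id http_addr tcp_addr
1 node-1:8091 node-1:8088
2 node-2:8091 node-2:8088
4 node-3:8091 node-3:8088
"""

nodes = parse_show_servers(SAMPLE)
print(nodes)
```

With a three-node deployment, each of the two groups is expected to list the
same three hosts.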
|
||||
|
||||
|
||||
Verifying Grafana
|
||||
~~~~~~~~~~~~~~~~~
|
||||
|
||||
From the Fuel Dashboard, click on the **Grafana** link (or enter the IP address
|
||||
and port number if your DNS is not set up).
|
||||
The first time you access Grafana, you are requested to
|
||||
authenticate using your credentials.
|
||||
|
||||
.. image:: ../images/grafana_login.png
|
||||
:width: 800
|
||||
|
||||
Then you should be redirected to the *Grafana Home Page*
|
||||
from where you can select a dashboard as shown below.
|
||||
|
||||
.. image:: ../images/grafana_home.png
|
||||
:width: 800
|
||||