Add some documentation for statsd
Change-Id: Ie93681bec4c12bfda5c3c31d34e3cb02176e252d
@@ -1,5 +1,5 @@
-Configuration of Node Pool Manager and Worker
-=============================================
+Configuration of Services
+=========================

Options can be specified either via the command line, or with a configuration
file, or both. Options given on the command line will override any options
@@ -273,4 +273,50 @@ Pool Manager Command Line Options
   Enable verbose output. Normally, only errors are logged. This enables
   additional logging, but not as much as the :option:`-d` option.

Statsd Command Line Options
---------------------------

.. program:: libra_statsd.py

.. option:: --api_server <HOST:PORT>

   The hostname or IP address and port, separated by a colon, for use with
   the HP REST API driver. Can be specified multiple times for multiple
   servers. This option is also used by the hp_rest alerting driver.

.. option:: --server <HOST:PORT>

   Used to specify the Gearman job server hostname and port. This option
   can be used multiple times to specify multiple job servers.

.. option:: --driver <DRIVER LIST>

   The drivers to be used for alerting. This option can be used multiple
   times to specify multiple drivers.

.. option:: --ping_interval <PING_INTERVAL>

   How often, in seconds, to run a ping check of the load balancers. Default: 60.

.. option:: --repair_interval <REPAIR_INTERVAL>

   How often, in seconds, to check whether damaged load balancers have been
   repaired. Default: 180.

.. option:: --datadog_api_key <KEY>

   The API key to be used for the datadog driver.

.. option:: --datadog_app_key <KEY>

   The application key to be used for the datadog driver.

.. option:: --datadog_message_tail <TEXT>

   Text to append to each alerting message, such as a list of users to
   alert (using the @user@email.com format). Used by the datadog
   driver.

.. option:: --datadog_tags <TAGS>

   A list of tags to be used for the datadog driver.
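
As a rough illustration of how the repeatable options and the interval defaults above behave, here is a minimal sketch using Python's standard ``argparse`` module. It is not the parser ``libra_statsd.py`` actually uses; the ``build_parser`` helper, the sample addresses and the ``datadog`` driver value are assumptions made for the example.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the statsd options documented above."""
    parser = argparse.ArgumentParser(prog="libra_statsd.py")
    # Options documented as repeatable are collected into lists.
    parser.add_argument("--api_server", action="append", default=[],
                        metavar="HOST:PORT")
    parser.add_argument("--server", action="append", default=[],
                        metavar="HOST:PORT")
    parser.add_argument("--driver", action="append", default=[],
                        metavar="DRIVER")
    # Intervals are in seconds; the defaults follow the documentation.
    parser.add_argument("--ping_interval", type=int, default=60)
    parser.add_argument("--repair_interval", type=int, default=180)
    parser.add_argument("--datadog_api_key", metavar="KEY")
    parser.add_argument("--datadog_app_key", metavar="KEY")
    parser.add_argument("--datadog_message_tail", metavar="TEXT")
    parser.add_argument("--datadog_tags", metavar="TAGS")
    return parser


# Example invocation with made-up addresses and driver names.
args = build_parser().parse_args([
    "--server", "10.0.0.1:4730",
    "--server", "10.0.0.2:4730",
    "--driver", "hp_rest",
    "--driver", "datadog",
    "--ping_interval", "30",
])
print(args.server)           # both job servers, in the order given
print(args.driver)           # both alerting drivers
print(args.ping_interval)    # overridden on the command line
print(args.repair_interval)  # falls back to the documented default
```

Each repeated ``--server`` or ``--driver`` adds one entry to the corresponding list, while an interval that is not given keeps its documented default.
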
@@ -7,4 +7,5 @@ Load Balancer as a Service Device Tools
   introduction
   worker/index
   pool_mgm/index
   statsd/index
   config
24
doc/statsd/about.rst
Normal file
@@ -0,0 +1,24 @@
Description
===========

Purpose
-------

Libra Statsd is a monitoring system for the health of load balancers. It
can query many load balancers in parallel and supports a pluggable architecture
for different methods of reporting.

Design
------

Statsd currently performs only an advanced "ping" style of monitoring. By default it
gets a list of ONLINE load balancers from the API server and sends a
Gearman message to the worker of each one. The worker tests its own HAProxy
instance and reports success or failure. If there is a failure, or the Gearman
message times out, this is sent to the alerting backends. A further
scheduled run, every three minutes by default, re-tests the failed
devices to see whether they have been repaired. If they have, this triggers a
'repaired' notice.

Alerting is done using a plugin system which can have multiple plugins enabled
at the same time.
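
The ping and repair runs described in the design can be modelled roughly as follows. This is a toy sketch, not the Libra implementation: it replaces the API server and the Gearman round trip with plain callables, runs sequentially rather than in parallel, and the ``Monitor`` and ``RecordingDriver`` names are invented for illustration. Skipping devices that are already marked as failed during the ping run is also an assumption.

```python
class Monitor:
    """Toy model of the ping/repair cycle (hypothetical, sequential)."""

    def __init__(self, list_online, ping, drivers):
        self.list_online = list_online  # callable returning ONLINE device IDs
        self.ping = ping                # callable: device_id -> bool, may raise TimeoutError
        self.drivers = drivers          # alerting backends
        self.failed = set()             # devices waiting for a repair check

    def _healthy(self, device_id):
        try:
            return bool(self.ping(device_id))
        except TimeoutError:
            # A timed-out message is treated like a reported failure.
            return False

    def ping_run(self):
        """Test every ONLINE device and alert on each new failure."""
        for device_id in self.list_online():
            if device_id in self.failed:
                continue  # assumption: known failures wait for the repair run
            if not self._healthy(device_id):
                self.failed.add(device_id)
                for driver in self.drivers:
                    driver.send_alert("ping test failed", device_id)

    def repair_run(self):
        """Re-test failed devices and send a 'repaired' notice on recovery."""
        for device_id in sorted(self.failed):  # sorted() copies, so discard is safe
            if self._healthy(device_id):
                self.failed.discard(device_id)
                for driver in self.drivers:
                    driver.send_repair("ping test passed", device_id)


class RecordingDriver:
    """Stand-in alerting backend that records what it is told."""

    def __init__(self):
        self.events = []

    def send_alert(self, message, device_id):
        self.events.append(("alert", device_id))

    def send_repair(self, message, device_id):
        self.events.append(("repair", device_id))


health = {1: True, 2: False}
driver = RecordingDriver()
monitor = Monitor(lambda: [1, 2], lambda device_id: health[device_id], [driver])

monitor.ping_run()    # device 2 fails its test
health[2] = True
monitor.repair_run()  # device 2 has recovered
print(driver.events)  # [('alert', 2), ('repair', 2)]
```

In a real deployment the two runs would be driven by timers set from ``--ping_interval`` and ``--repair_interval``.
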
56
doc/statsd/drivers.rst
Normal file
@@ -0,0 +1,56 @@
Statsd Drivers
==============

Introduction
------------

Statsd has a small driver API to be used for alerting. Multiple drivers can
be loaded at the same time to alert in multiple places.

Design
------

The base class called ``AlertDriver`` is used to create new drivers. Each driver
is supplied with ``self.logger`` to use for logging and ``self.args``, which
contains the arguments supplied to statsd. Drivers derived from this class need to
supply two methods:

.. py:class:: AlertDriver

    .. py:method:: send_alert(message, device_id)

        :param message: A message with details of the failure
        :param device_id: The ID of the device that has failed

    .. py:method:: send_repair(message, device_id)

        :param message: A message with details of the recovered load balancer
        :param device_id: The ID of the device that has been recovered


.. py:data:: known_drivers

    This is the dictionary that maps values for the
    :option:`--driver <libra_statsd.py --driver>` option
    to a class implementing the driver :py:class:`~AlertDriver` API
    for the statsd server. After implementing a new driver class, you simply add
    a new entry to this dictionary to make it a selectable option.

Dummy Driver
------------

This driver is used for simple testing/debugging. It echoes the message details
into statsd's log file.

Datadog Driver
--------------

The Datadog driver uses the Datadog API to send alerts into the Datadog event
stream. Alerts are sent as 'ERROR' and repairs as 'SUCCESS'.

HP REST Driver
--------------

This sends messages to the HP REST API server to mark nodes as ERROR/ONLINE.
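
To make the driver API more concrete, the sketch below shows what a new driver and its ``known_drivers`` entry might look like. It is a self-contained approximation: the ``AlertDriver`` stand-in and its constructor signature are assumptions, and ``ListDriver`` and the ``"list"`` key are invented for the example, so the real base class in the Libra source should be consulted before relying on any of it.

```python
import logging
from types import SimpleNamespace


class AlertDriver:
    """Stand-in for the base class described above (constructor is assumed)."""

    def __init__(self, logger, args):
        self.logger = logger  # logger supplied by statsd
        self.args = args      # parsed statsd arguments

    def send_alert(self, message, device_id):
        raise NotImplementedError

    def send_repair(self, message, device_id):
        raise NotImplementedError


class ListDriver(AlertDriver):
    """Hypothetical driver that logs each notice and keeps it in memory."""

    def __init__(self, logger, args):
        super().__init__(logger, args)
        self.notices = []

    def send_alert(self, message, device_id):
        self.logger.error("device %s failed: %s", device_id, message)
        self.notices.append(("ERROR", device_id, message))

    def send_repair(self, message, device_id):
        self.logger.info("device %s repaired: %s", device_id, message)
        self.notices.append(("ONLINE", device_id, message))


# Registry mapping --driver values to driver classes; the "list" key is made up.
known_drivers = {"list": ListDriver}

# Instantiate every driver named on the (simulated) command line.
args = SimpleNamespace(driver=["list"])
logger = logging.getLogger("statsd-sketch")
drivers = [known_drivers[name](logger, args) for name in args.driver]

drivers[0].send_alert("ping test failed", 42)
drivers[0].send_repair("ping test passed", 42)
print(drivers[0].notices)
```

Because statsd only ever calls ``send_alert`` and ``send_repair``, several such drivers can be instantiated side by side and each notice fanned out to all of them.
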
8
doc/statsd/index.rst
Normal file
@@ -0,0 +1,8 @@
Statsd Monitoring Daemon
========================

.. toctree::
   :maxdepth: 2

   about
   drivers