Add some documentation for statsd

Change-Id: Ie93681bec4c12bfda5c3c31d34e3cb02176e252d
Andrew Hutchings
2013-04-19 15:51:25 +01:00
committed by David Shrewsbury
parent 7a13c2a704
commit e1daeaddbc
5 changed files with 137 additions and 2 deletions


@@ -1,5 +1,5 @@
Configuration of Node Pool Manager and Worker
=============================================
Configuration of Services
=========================
Options can be specified either via the command line, or with a configuration
file, or both. Options given on the command line will override any options
@@ -273,4 +273,50 @@ Pool Manager Command Line Options
Enable verbose output. Normally, only errors are logged. This enables
additional logging, but not as much as the :option:`-d` option.
Statsd Command Line Options
---------------------------
.. program:: libra_statsd.py
.. option:: --api_server <HOST:PORT>
The hostname/IP and port, separated by a colon, for use with the HP REST API
driver. Can be specified multiple times for multiple servers. This
option is also used for the hp_rest alerting driver.
.. option:: --server <HOST:PORT>
Used to specify the Gearman job server hostname and port. This option
can be used multiple times to specify multiple job servers.
.. option:: --driver <DRIVER LIST>
The drivers to be used for alerting. This option can be used multiple
times to specify multiple drivers.
.. option:: --ping_interval <PING_INTERVAL>
How often to run a ping check of load balancers, in seconds. Default: 60.
.. option:: --repair_interval <REPAIR_INTERVAL>
How often to run a check to see if damaged load balancers have been
repaired, in seconds. Default: 180.
.. option:: --datadog_api_key <KEY>
The API key to be used for the datadog driver.
.. option:: --datadog_app_key <KEY>
The application key to be used for the datadog driver.
.. option:: --datadog_message_tail <TEXT>
Some text to add at the end of an alerting message such as a list of
users to alert (using @user@email.com format), used for the datadog
driver.
.. option:: --datadog_tags <TAGS>
A list of tags to be used for the datadog driver.
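
As an illustration of the repeatable options and defaults documented above, the following is a minimal sketch using Python's standard ``argparse`` module. It is not the parser that ``libra_statsd.py`` actually uses; the host addresses and the driver name ``dummy`` are made-up example values.

```python
import argparse

# Hypothetical parser mirroring a few of the documented options.
# This is an illustration only, not the real libra_statsd.py parser.
parser = argparse.ArgumentParser(prog="libra_statsd.py")
# Options that "can be used multiple times" are collected into lists.
parser.add_argument("--api_server", action="append", default=[],
                    metavar="HOST:PORT")
parser.add_argument("--server", action="append", default=[],
                    metavar="HOST:PORT")
parser.add_argument("--driver", action="append", default=[])
# Intervals are in seconds, with the documented defaults.
parser.add_argument("--ping_interval", type=int, default=60)
parser.add_argument("--repair_interval", type=int, default=180)

# Example values below are invented for the sketch.
args = parser.parse_args([
    "--server", "10.0.0.1:4730",
    "--server", "10.0.0.2:4730",
    "--driver", "dummy",
])

# Split each HOST:PORT pair on the last colon.
servers = [tuple(item.rsplit(":", 1)) for item in args.server]
```

Here the two ``--server`` values end up in one list, and the intervals fall back to 60 and 180 seconds because they were not given.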


@@ -7,4 +7,5 @@ Load Balancer as a Service Device Tools
introduction
worker/index
pool_mgm/index
statsd/index
config

24
doc/statsd/about.rst Normal file

@@ -0,0 +1,24 @@
Description
===========
Purpose
-------
Libra Statsd is a monitoring system for the health of load balancers. It
can query many load balancers in parallel and supports a pluggable architecture
for different methods of reporting.
Design
------
Statsd currently performs only an advanced "ping" style of monitoring. By
default it gets a list of ONLINE load balancers from the API server and sends a
gearman message to the worker of each one. The worker tests its own HAProxy
instance and reports success or failure. If there is a failure, or the gearman
message times out, this is sent to the alerting backends. A further scheduled
run, set to every three minutes, re-tests the failed devices to see if they
have been repaired. If they have, this triggers a 'repaired' notice.
Alerting is done using a plugin system which can have multiple plugins enabled
at the same time.
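
The ping and repair passes described above can be sketched roughly as follows. This is a simplified, single-process model and not the Statsd implementation: ``ping_round``, ``repair_round``, ``RecordingDriver`` and the ``check`` callable (standing in for the gearman round trip to a worker) are all hypothetical names. Skipping already-failed devices in the ping pass is also an assumption.

```python
class RecordingDriver:
    """Hypothetical alerting plugin that records notices in a list."""

    def __init__(self):
        self.events = []

    def send_alert(self, message, device_id):
        self.events.append(("alert", device_id))

    def send_repair(self, message, device_id):
        self.events.append(("repair", device_id))


def ping_round(devices, check, failed, drivers):
    """Test every device and alert all drivers about new failures."""
    for device_id in devices:
        if device_id in failed:
            continue  # assumption: failed devices wait for the repair pass
        if not check(device_id):
            failed.add(device_id)
            for driver in drivers:
                driver.send_alert("ping failed", device_id)


def repair_round(check, failed, drivers):
    """Re-test failed devices and send a 'repaired' notice on success."""
    for device_id in sorted(failed):  # sorted() copies, so we can mutate
        if check(device_id):
            failed.discard(device_id)
            for driver in drivers:
                driver.send_repair("ping succeeded", device_id)


# Simulated health state: device 2 is down at first, then recovers.
health = {1: True, 2: False}
driver = RecordingDriver()
failed = set()

ping_round([1, 2], health.get, failed, [driver])
health[2] = True
repair_round(health.get, failed, [driver])
```

In the real service the two passes would run on their own timers (``--ping_interval`` and ``--repair_interval``); the sketch simply calls them once each.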

56
doc/statsd/drivers.rst Normal file

@@ -0,0 +1,56 @@
Statsd Drivers
==============
Introduction
------------
Statsd has a small driver API to be used for alerting. Multiple drivers can
be loaded at the same time to alert in multiple places.
Design
------
The base class ``AlertDriver`` is used to create new drivers. Each driver is
supplied with ``self.logger`` to use for logging and ``self.args``, which
contains the arguments supplied to statsd. Drivers derived from it need to
supply two methods:
.. py:class:: AlertDriver
.. py:method:: send_alert(message, device_id)
:param message: A message with details of the failure
:param device_id: The ID of the device that has failed
.. py:method:: send_repair(message, device_id)
:param message: A message with details of the recovered load balancer
:param device_id: The ID of the device that has been recovered
.. py:data:: known_drivers
This is the dictionary that maps values for the
:option:`--driver <libra_statsd.py --driver>` option
to a class implementing the driver :py:class:`~AlertDriver` API
for the statsd server. After implementing a new driver class, you simply add
a new entry to this dictionary to make it a selectable option.
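
As a minimal sketch of the steps above, the following defines a driver and registers it. The ``AlertDriver`` stand-in only approximates the base class as described here (a constructor that stores ``logger`` and ``args`` is an assumption), and ``ListDriver`` and the ``"list"`` key are hypothetical names chosen for the example.

```python
import logging


class AlertDriver:
    """Stand-in for the base class; the constructor shape is assumed."""

    def __init__(self, logger, args):
        self.logger = logger
        self.args = args

    def send_alert(self, message, device_id):
        raise NotImplementedError

    def send_repair(self, message, device_id):
        raise NotImplementedError


class ListDriver(AlertDriver):
    """Hypothetical driver that logs notices and keeps them in memory."""

    def __init__(self, logger, args):
        super().__init__(logger, args)
        self.events = []

    def send_alert(self, message, device_id):
        self.logger.error("device %s failed: %s", device_id, message)
        self.events.append(("alert", device_id, message))

    def send_repair(self, message, device_id):
        self.logger.info("device %s repaired: %s", device_id, message)
        self.events.append(("repair", device_id, message))


# Assumed form of the mapping: --driver value -> driver class.
known_drivers = {"list": ListDriver}

# Look the class up by name, as the --driver option would.
driver = known_drivers["list"](logging.getLogger("statsd"), args=None)
driver.send_alert("ping timed out", 42)
driver.send_repair("ping succeeded", 42)
```

With such an entry in place, the new driver would be selected by passing its key to ``--driver``.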
Dummy Driver
------------
This driver is used for simple testing/debugging. It echoes the message details
into statsd's log file.
Datadog Driver
--------------
The Datadog driver uses the Datadog API to send alerts into the Datadog event
stream. Alerts are sent as 'ERROR' and repairs as 'SUCCESS'.
HP REST Driver
--------------
This sends messages to the HP REST API server to mark nodes as ERROR/ONLINE.

8
doc/statsd/index.rst Normal file

@@ -0,0 +1,8 @@
Statsd Monitoring Daemon
========================
.. toctree::
:maxdepth: 2
about
drivers