Add skeleton logstash module.

This new logstash module adds classes to install logstash agents and indexers as well as redis and elasticsearch. The configuration for each of these services is rudimentary, but it shouldn't be difficult to expand the configs and make them useful. Also, add a logstash.openstack.org node that will have an agent, indexer, web frontend, redis, and elasticsearch installed on it.

Change-Id: I25b635f088f99d45cfaa70ed122c6433d3784937
Reviewed-on: https://review.openstack.org/19871
Reviewed-by: Jeremy Stanley <fungi@yuggoth.org>
Approved: Clark Boylan <clark.boylan@gmail.com>
Reviewed-by: Clark Boylan <clark.boylan@gmail.com>
Tested-by: Jenkins
2013-01-16 15:01:25 -08:00 · 457a9d8764
commit 457a9d8764
parent 168b11398a
19 changed files with 1283 additions and 0 deletions
--- a/manifests/site.pp
+++ b/manifests/site.pp
@@ -180,6 +180,12 @@ node 'puppet-dashboard.openstack.org' {
  }
 }

+node 'logstash.openstack.org' {
+  class { 'openstack_project::logstash':
+    sysadmins => hiera('sysadmins'),
+  }
+}
+
 # A machine to serve static content.
 node 'static.openstack.org' {
  class { 'openstack_project::static':
--- a/modules/logstash/files/elasticsearch.yml
+++ b/modules/logstash/files/elasticsearch.yml
@@ -0,0 +1,365 @@
+##################### ElasticSearch Configuration Example #####################
+
+# This file contains an overview of various configuration settings,
+# targeted at operations staff. Application developers should
+# consult the guide at <http://elasticsearch.org/guide>.
+#
+# The installation procedure is covered at
+# <http://elasticsearch.org/guide/reference/setup/installation.html>.
+#
+# ElasticSearch comes with reasonable defaults for most settings,
+# so you can try it out without bothering with configuration.
+#
+# Most of the time, these defaults are just fine for running a production
+# cluster. If you're fine-tuning your cluster, or wondering about the
+# effect of certain configuration options, please _do ask_ on the
+# mailing list or IRC channel [http://elasticsearch.org/community].
+
+# Any element in the configuration can be replaced with environment variables
+# by placing them in ${...} notation. For example:
+#
+# node.rack: ${RACK_ENV_VAR}
+
+# See <http://elasticsearch.org/guide/reference/setup/configuration.html>
+# for information on supported formats and syntax for the configuration file.
+
+
+################################### Cluster ###################################
+
+# Cluster name identifies your cluster for auto-discovery. If you're running
+# multiple clusters on the same network, make sure you're using unique names.
+#
+# cluster.name: elasticsearch
+
+
+#################################### Node #####################################
+
+# Node names are generated dynamically on startup, so you're relieved
+# from configuring them manually. You can tie this node to a specific name:
+#
+# node.name: "Franz Kafka"
+
+# Every node can be configured to allow or deny being eligible as the master,
+# and to allow or deny to store the data.
+#
+# Allow this node to be eligible as a master node (enabled by default):
+#
+# node.master: true
+#
+# Allow this node to store data (enabled by default):
+#
+# node.data: true
+
+# You can exploit these settings to design advanced cluster topologies.
+#
+# 1. You want this node to never become a master node, only to hold data.
+#    This will be the "workhorse" of your cluster.
+#
+# node.master: false
+# node.data: true
+#
+# 2. You want this node to only serve as a master: to not store any data and
+#    to have free resources. This will be the "coordinator" of your cluster.
+#
+# node.master: true
+# node.data: false
+#
+# 3. You want this node to be neither master nor data node, but
+#    to act as a "search load balancer" (fetching data from nodes,
+#    aggregating results, etc.)
+#
+# node.master: false
+# node.data: false
+
+# Use the Cluster Health API [http://localhost:9200/_cluster/health], the
+# Node Info API [http://localhost:9200/_cluster/nodes] or GUI tools
+# such as <http://github.com/lukas-vlcek/bigdesk> and
+# <http://mobz.github.com/elasticsearch-head> to inspect the cluster state.
+
+# A node can have generic attributes associated with it, which can later be used
+# for customized shard allocation filtering, or allocation awareness. An attribute
+# is a simple key value pair, similar to node.key: value, here is an example:
+#
+# node.rack: rack314
+
+# By default, multiple nodes are allowed to start from the same installation location.
+# To disable this, set the following:
+# node.max_local_storage_nodes: 1
+
+
+#################################### Index ####################################
+
+# You can set a number of options (such as shard/replica options, mapping
+# or analyzer definitions, translog settings, ...) for indices globally,
+# in this file.
+#
+# Note, that it makes more sense to configure index settings specifically for
+# a certain index, either when creating it or by using the index templates API.
+#
+# See <http://elasticsearch.org/guide/reference/index-modules/> and
+# <http://elasticsearch.org/guide/reference/api/admin-indices-create-index.html>
+# for more information.
+
+# Set the number of shards (splits) of an index (5 by default):
+#
+# index.number_of_shards: 5
+
+# Set the number of replicas (additional copies) of an index (1 by default):
+#
+# index.number_of_replicas: 1
+
+# Note, that for development on a local machine, with small indices, it usually
+# makes sense to "disable" the distributed features:
+#
+# index.number_of_shards: 1
+# index.number_of_replicas: 0
+
+# These settings directly affect the performance of index and search operations
+# in your cluster. Assuming you have enough machines to hold shards and
+# replicas, the rule of thumb is:
+#
+# 1. Having more *shards* enhances the _indexing_ performance and allows to
+#    _distribute_ a big index across machines.
+# 2. Having more *replicas* enhances the _search_ performance and improves the
+#    cluster _availability_.
+#
+# The "number_of_shards" is a one-time setting for an index.
+#
+# The "number_of_replicas" can be increased or decreased anytime,
+# by using the Index Update Settings API.
+#
+# ElasticSearch takes care about load balancing, relocating, gathering the
+# results from nodes, etc. Experiment with different settings to fine-tune
+# your setup.
+
+# Use the Index Status API (<http://localhost:9200/A/_status>) to inspect
+# the index status.
+
+
+#################################### Paths ####################################
+
+# Path to directory containing configuration (this file and logging.yml):
+#
+# path.conf: /path/to/conf
+
+# Path to directory where to store index data allocated for this node.
+#
+# path.data: /path/to/data
+#
+# Can optionally include more than one location, causing data to be striped across
+# the locations (a la RAID 0) on a file level, favouring locations with most free
+# space on creation. For example:
+#
+# path.data: /path/to/data1,/path/to/data2
+
+# Path to temporary files:
+#
+# path.work: /path/to/work
+
+# Path to log files:
+#
+# path.logs: /path/to/logs
+
+# Path to where plugins are installed:
+#
+# path.plugins: /path/to/plugins
+
+
+#################################### Plugin ###################################
+
+# If a plugin listed here is not installed for current node, the node will not start.
+#
+# plugin.mandatory: mapper-attachments,lang-groovy
+
+
+################################### Memory ####################################
+
+# ElasticSearch performs poorly when JVM starts swapping: you should ensure that
+# it _never_ swaps.
+#
+# Set this property to true to lock the memory:
+#
+# bootstrap.mlockall: true
+
+# Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set
+# to the same value, and that the machine has enough memory to allocate
+# for ElasticSearch, leaving enough memory for the operating system itself.
+#
+# You should also make sure that the ElasticSearch process is allowed to lock
+# the memory, eg. by using `ulimit -l unlimited`.
+
+
+############################## Network And HTTP ###############################
+
+# ElasticSearch, by default, binds itself to the 0.0.0.0 address, and listens
+# on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node
+# communication. (the range means that if the port is busy, it will automatically
+# try the next port).
+
+# Set the bind address specifically (IPv4 or IPv6):
+#
+# network.bind_host: 192.168.0.1
+
+# Set the address other nodes will use to communicate with this node. If not
+# set, it is automatically derived. It must point to an actual IP address.
+#
+# network.publish_host: 192.168.0.1
+
+# Set both 'bind_host' and 'publish_host':
+#
+# network.host: 192.168.0.1
+
+# Set a custom port for the node to node communication (9300 by default):
+#
+# transport.tcp.port: 9300
+
+# Enable compression for all communication between nodes (disabled by default):
+#
+# transport.tcp.compress: true
+
+# Set a custom port to listen for HTTP traffic:
+#
+# http.port: 9200
+
+# Set a custom allowed content length:
+#
+# http.max_content_length: 100mb
+
+# Disable HTTP completely:
+#
+# http.enabled: false
+
+
+################################### Gateway ###################################
+
+# The gateway allows for persisting the cluster state between full cluster
+# restarts. Every change to the state (such as adding an index) will be stored
+# in the gateway, and when the cluster starts up for the first time,
+# it will read its state from the gateway.
+
+# There are several types of gateway implementations. For more information,
+# see <http://elasticsearch.org/guide/reference/modules/gateway>.
+
+# The default gateway type is the "local" gateway (recommended):
+#
+# gateway.type: local
+
+# Settings below control how and when to start the initial recovery process on
+# a full cluster restart (to reuse as much local data as possible when using shared
+# gateway).
+
+# Allow recovery process after N nodes in a cluster are up:
+#
+# gateway.recover_after_nodes: 1
+
+# Set the timeout to initiate the recovery process, once the N nodes
+# from previous setting are up (accepts time value):
+#
+# gateway.recover_after_time: 5m
+
+# Set how many nodes are expected in this cluster. Once these N nodes
+# are up (and recover_after_nodes is met), begin recovery process immediately
+# (without waiting for recover_after_time to expire):
+#
+# gateway.expected_nodes: 2
+
+
+############################# Recovery Throttling #############################
+
+# These settings allow to control the process of shards allocation between
+# nodes during initial recovery, replica allocation, rebalancing,
+# or when adding and removing nodes.
+
+# Set the number of concurrent recoveries happening on a node:
+#
+# 1. During the initial recovery
+#
+# cluster.routing.allocation.node_initial_primaries_recoveries: 4
+#
+# 2. During adding/removing nodes, rebalancing, etc
+#
+# cluster.routing.allocation.node_concurrent_recoveries: 2
+
+# Set to throttle throughput when recovering (eg. 100mb, by default unlimited):
+#
+# indices.recovery.max_size_per_sec: 0
+
+# Set to limit the number of open concurrent streams when
+# recovering a shard from a peer:
+#
+# indices.recovery.concurrent_streams: 5
+
+
+################################## Discovery ##################################
+
+# Discovery infrastructure ensures nodes can be found within a cluster
+# and master node is elected. Multicast discovery is the default.
+
+# Set to ensure a node sees N other master eligible nodes to be considered
+# operational within the cluster. Set this option to a higher value (2-4)
+# for large clusters (>3 nodes):
+#
+# discovery.zen.minimum_master_nodes: 1
+
+# Set the time to wait for ping responses from other nodes when discovering.
+# Set this option to a higher value on a slow or congested network
+# to minimize discovery failures:
+#
+# discovery.zen.ping.timeout: 3s
+
+# See <http://elasticsearch.org/guide/reference/modules/discovery/zen.html>
+# for more information.
+
+# Unicast discovery allows to explicitly control which nodes will be used
+# to discover the cluster. It can be used when multicast is not present,
+# or to restrict the cluster communication-wise.
+#
+# 1. Disable multicast discovery (enabled by default):
+#
+discovery.zen.ping.multicast.enabled: false
+#
+# 2. Configure an initial list of master nodes in the cluster
+#    to perform discovery when new nodes (master or data) are started:
+#
+# discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]
+discovery.zen.ping.unicast.hosts: ["localhost"]
+
+# EC2 discovery allows to use AWS EC2 API in order to perform discovery.
+#
+# You have to install the cloud-aws plugin for enabling the EC2 discovery.
+#
+# See <http://elasticsearch.org/guide/reference/modules/discovery/ec2.html>
+# for more information.
+#
+# See <http://elasticsearch.org/tutorials/2011/08/22/elasticsearch-on-ec2.html>
+# for a step-by-step tutorial.
+
+
+################################## Slow Log ##################################
+
+# Shard level query and fetch threshold logging.
+
+#index.search.slowlog.threshold.query.warn: 10s
+#index.search.slowlog.threshold.query.info: 5s
+#index.search.slowlog.threshold.query.debug: 2s
+#index.search.slowlog.threshold.query.trace: 500ms
+
+#index.search.slowlog.threshold.fetch.warn: 1s
+#index.search.slowlog.threshold.fetch.info: 800ms
+#index.search.slowlog.threshold.fetch.debug: 500ms
+#index.search.slowlog.threshold.fetch.trace: 200ms
+
+#index.indexing.slowlog.threshold.index.warn: 10s
+#index.indexing.slowlog.threshold.index.info: 5s
+#index.indexing.slowlog.threshold.index.debug: 2s
+#index.indexing.slowlog.threshold.index.trace: 500ms
+
+################################## GC Logging ################################
+
+#monitor.jvm.gc.ParNew.warn: 1000ms
+#monitor.jvm.gc.ParNew.info: 700ms
+#monitor.jvm.gc.ParNew.debug: 400ms
+
+#monitor.jvm.gc.ConcurrentMarkSweep.warn: 10s
+#monitor.jvm.gc.ConcurrentMarkSweep.info: 5s
+#monitor.jvm.gc.ConcurrentMarkSweep.debug: 2s
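Of the 365 lines added to elasticsearch.yml, only the two unicast discovery settings are uncommented. A minimal Python sketch for listing the effective settings of a file like this follows. The `effective_settings` helper is hypothetical and assumes flat one-line `key: value` entries; it is not a YAML parser.

```python
def effective_settings(text):
    """Return {key: value} for uncommented 'key: value' lines.

    Assumption: flat one-line settings only; nested YAML is not handled.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments (including commented-out settings).
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition(":")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# A short excerpt of the Discovery section from the file above.
sample = """\
# discovery.zen.minimum_master_nodes: 1
discovery.zen.ping.multicast.enabled: false
# discovery.zen.ping.unicast.hosts: ["host1", "host2:port"]
discovery.zen.ping.unicast.hosts: ["localhost"]
"""

for key, value in effective_settings(sample).items():
    print(f"{key} = {value}")
```

Run against the full file, a check like this would make it easy to confirm that everything else is left at the ElasticSearch defaults.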
--- a/modules/logstash/files/logstash-agent.conf
+++ b/modules/logstash/files/logstash-agent.conf
@@ -0,0 +1,24 @@
+# logstash - agent instance
+#
+# Copied from http://cookbook.logstash.net/recipes/using-upstart/
+
+description     "logstash agent instance"
+
+start on virtual-filesystems
+stop on runlevel [06]
+
+# Respawn it if the process exits
+respawn
+respawn limit 5 30
+expect fork
+
+# You need to chdir somewhere writable because logstash needs to unpack a few
+# temporary files on startup.
+chdir /opt/logstash
+
+script
+
+  # This runs logstash agent as the 'logstash' user
+  su -s /bin/sh -c 'exec "$0" "$@"' logstash -- /usr/bin/java -jar logstash.jar agent -f /etc/logstash/agent.conf --log /var/log/logstash/agent.log &
+  emit logstash-agent-running
+end script
--- a/modules/logstash/files/logstash-indexer.conf
+++ b/modules/logstash/files/logstash-indexer.conf
@@ -0,0 +1,24 @@
+# logstash - indexer instance
+#
+# Copied from http://cookbook.logstash.net/recipes/using-upstart/
+
+description     "logstash indexer instance"
+
+start on virtual-filesystems
+stop on runlevel [06]
+
+# Respawn it if the process exits
+respawn
+respawn limit 5 30
+expect fork
+
+# You need to chdir somewhere writable because logstash needs to unpack a few
+# temporary files on startup.
+chdir /opt/logstash
+
+script
+
+  # This runs logstash indexer as the 'logstash' user
+  su -s /bin/sh -c 'exec "$0" "$@"' logstash -- /usr/bin/java -jar logstash.jar agent -f /etc/logstash/indexer.conf --log /var/log/logstash/indexer.log &
+  emit logstash-indexer-running
+end script
--- a/modules/logstash/files/logstash-web.conf
+++ b/modules/logstash/files/logstash-web.conf
@@ -0,0 +1,24 @@
+# logstash - web instance
+#
+# Copied from http://cookbook.logstash.net/recipes/using-upstart/
+
+description     "logstash web instance"
+
+start on virtual-filesystems
+stop on runlevel [06]
+
+# Respawn it if the process exits
+respawn
+respawn limit 5 30
+expect fork
+
+# You need to chdir somewhere writable because logstash needs to unpack a few
+# temporary files on startup.
+chdir /opt/logstash
+
+script
+
+  # This runs logstash web as the 'logstash' user
+  su -s /bin/sh -c 'exec "$0" "$@"' logstash -- /usr/bin/java -jar logstash.jar web --backend elasticsearch://127.0.0.1/ --log /var/log/logstash/web.log &
+  emit logstash-web-running
+end script
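All three upstart jobs above launch Java through the same `sh -c 'exec "$0" "$@"'` idiom. Assuming `su` passes the arguments after the user name through to the shell, the first of them (`/usr/bin/java`) becomes `$0` and the rest become `"$@"`, so the shell replaces itself with the Java process. The Python sketch below demonstrates only the shell part of that idiom: `su` is left out because it needs root and a `logstash` user, and `echo` stands in for `/usr/bin/java`.

```python
import subprocess

# Same idiom as the job scripts, minus `su`.
# "echo" takes the place of /usr/bin/java and becomes $0;
# the remaining items become "$@".
result = subprocess.run(
    ["/bin/sh", "-c", 'exec "$0" "$@"',
     "echo", "agent", "-f", "/etc/logstash/agent.conf"],
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout, end="")
```

Because of the `exec`, no intermediate shell process is left between `su` and the program it starts.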
--- a/modules/logstash/files/redis.conf
+++ b/modules/logstash/files/redis.conf
@@ -0,0 +1,417 @@
+# Redis configuration file example
+
+# Note on units: when memory size is needed, it is possible to specify
+# it in the usual form of 1k 5GB 4M and so forth:
+#
+# 1k => 1000 bytes
+# 1kb => 1024 bytes
+# 1m => 1000000 bytes
+# 1mb => 1024*1024 bytes
+# 1g => 1000000000 bytes
+# 1gb => 1024*1024*1024 bytes
+#
+# units are case insensitive so 1GB 1Gb 1gB are all the same.
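The unit table above can be sketched as a small Python converter. The `parse_memory` name is hypothetical, and treating a bare number as a byte count is an assumption that the comment block does not state.

```python
import re

# Multipliers exactly as listed in the comment block above.
# The empty suffix (plain bytes) is an assumption.
_UNITS = {
    "": 1,
    "k": 1000, "kb": 1024,
    "m": 1000**2, "mb": 1024**2,
    "g": 1000**3, "gb": 1024**3,
}


def parse_memory(value):
    """Convert a size such as '1k', '4M' or '5GB' to a number of bytes."""
    match = re.fullmatch(r"(\d+)([a-z]*)", value.strip().lower())
    if not match or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognised memory size: {value!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


print(parse_memory("1k"))
print(parse_memory("1kb"))
print(parse_memory("1GB"))
```

Note the distinction the table draws: a single-letter suffix is a power of 1000, while the two-letter `b` form is a power of 1024.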
+
+# By default Redis does not run as a daemon. Use 'yes' if you need it.
+# Note that Redis will write a pid file in /var/run/redis.pid when daemonized.
+daemonize yes
+
+# When running daemonized, Redis writes a pid file in /var/run/redis.pid by
+# default. You can specify a custom pid file location here.
+pidfile /var/run/redis/redis-server.pid
+
+# Accept connections on the specified port, default is 6379.
+# If port 0 is specified Redis will not listen on a TCP socket.
+port 6379
+
+# If you want you can bind a single interface, if the bind option is not
+# specified all the interfaces will listen for incoming connections.
+#
+#bind 127.0.0.1
+
+# Specify the path for the unix socket that will be used to listen for
+# incoming connections. There is no default, so Redis will not listen
+# on a unix socket when not specified.
+#
+# unixsocket /var/run/redis/redis.sock
+
+# Close the connection after a client is idle for N seconds (0 to disable)
+timeout 300
+
+# Set the server verbosity level.
+# It can be one of:
+# debug (a lot of information, useful for development/testing)
+# verbose (many rarely useful info, but not a mess like the debug level)
+# notice (moderately verbose, what you want in production probably)
+# warning (only very important / critical messages are logged)
+loglevel notice
+
+# Specify the log file name. Also 'stdout' can be used to force
+# Redis to log on the standard output. Note that if you use standard
+# output for logging but daemonize, logs will be sent to /dev/null
+logfile /var/log/redis/redis-server.log
+
+# To enable logging to the system logger, just set 'syslog-enabled' to yes,
+# and optionally update the other syslog parameters to suit your needs.
+# syslog-enabled no
+
+# Specify the syslog identity.
+# syslog-ident redis
+
+# Specify the syslog facility.  Must be USER or between LOCAL0-LOCAL7.
+# syslog-facility local0
+
+# Set the number of databases. The default database is DB 0, you can select
+# a different one on a per-connection basis using SELECT <dbid> where
+# dbid is a number between 0 and 'databases'-1
+databases 16
+
+################################ SNAPSHOTTING  #################################
+#
+# Save the DB on disk:
+#
+#   save <seconds> <changes>
+#
+#   Will save the DB if both the given number of seconds and the given
+#   number of write operations against the DB occurred.
+#
+#   In the example below the behaviour will be to save:
+#   after 900 sec (15 min) if at least 1 key changed
+#   after 300 sec (5 min) if at least 10 keys changed
+#   after 60 sec if at least 10000 keys changed
+#
+#   Note: you can disable saving entirely by commenting out all the "save" lines.
+
+save 900 1
+save 300 10
+save 60 10000
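The three `save` lines above combine as alternatives: a snapshot is due as soon as any one rule has both its time and its change count satisfied. A minimal Python sketch of that rule follows. The `should_snapshot` helper is hypothetical, and the exact boundary comparisons (`>=`) are an assumption rather than something taken from the Redis source.

```python
# (seconds, changes) pairs from the three "save" lines above.
SAVE_RULES = [(900, 1), (300, 10), (60, 10000)]


def should_snapshot(seconds_since_save, changes_since_save, rules=SAVE_RULES):
    """True if any rule has both its time and change thresholds met."""
    return any(
        seconds_since_save >= seconds and changes_since_save >= changes
        for seconds, changes in rules
    )


# A busy server: 20000 writes in just over a minute.
print(should_snapshot(61, 20000))
# A quiet server: 5 writes in two minutes matches no rule yet.
print(should_snapshot(120, 5))
```

With these rules a nearly idle instance is still dumped every 15 minutes once a single key has changed.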
+
+# Compress string objects using LZF when dumping .rdb databases?
+# By default that's set to 'yes' as it's almost always a win.
+# If you want to save some CPU in the saving child set it to 'no' but
+# the dataset will likely be bigger if you have compressible values or keys.
+rdbcompression yes
+
+# The filename where to dump the DB
+dbfilename dump.rdb
+
+# The working directory.
+#
+# The DB will be written inside this directory, with the filename specified
+# above using the 'dbfilename' configuration directive.
+# 
+# Also the Append Only File will be created inside this directory.
+# 
+# Note that you must specify a directory here, not a file name.
+dir /var/lib/redis
+
+################################# REPLICATION #################################
+
+# Master-Slave replication. Use slaveof to make a Redis instance a copy of
+# another Redis server. Note that the configuration is local to the slave
+# so for example it is possible to configure the slave to save the DB with a
+# different interval, or to listen to another port, and so on.
+#
+# slaveof <masterip> <masterport>
+
+# If the master is password protected (using the "requirepass" configuration
+# directive below) it is possible to tell the slave to authenticate before
+# starting the replication synchronization process, otherwise the master will
+# refuse the slave request.
+#
+# masterauth <master-password>
+
+# When a slave loses the connection with the master, or when the replication
+# is still in progress, the slave can act in two different ways:
+#
+# 1) if slave-serve-stale-data is set to 'yes' (the default) the slave will
+#    still reply to client requests, possibly with out-of-date data, or the
+#    data set may just be empty if this is the first synchronization.
+#
+# 2) if slave-serve-stale-data is set to 'no' the slave will reply with
+#    an error "SYNC with master in progress" to all commands except
+#    INFO and SLAVEOF.
+#
+slave-serve-stale-data yes
+
+################################## SECURITY ###################################
+
+# Require clients to issue AUTH <PASSWORD> before processing any other
+# commands.  This might be useful in environments in which you do not trust
+# others with access to the host running redis-server.
+#
+# This should stay commented out for backward compatibility and because most
+# people do not need auth (e.g. they run their own servers).
+# 
+# Warning: since Redis is pretty fast an outside user can try up to
+# 150k passwords per second against a good box. This means that you should
+# use a very strong password otherwise it will be very easy to break.
+#
+# requirepass foobared
+
+# Command renaming.
+#
+# It is possible to change the name of dangerous commands in a shared
+# environment. For instance the CONFIG command may be renamed into something
+# hard to guess so that it will still be available for internal-use
+# tools but not available for general clients.
+#
+# Example:
+#
+# rename-command CONFIG b840fc02d524045429941cc15f59e41cb7be6c52
+#
+# It is also possible to completely kill a command by renaming it to
+# an empty string:
+#
+# rename-command CONFIG ""
+
+################################### LIMITS ####################################
+
+# Set the max number of connected clients at the same time. By default there
+# is no limit, and it's up to the number of file descriptors the Redis process
+# is able to open. The special value '0' means no limits.
+# Once the limit is reached Redis will close all the new connections sending
+# an error 'max number of clients reached'.
+#
+# maxclients 128
+
+# Don't use more memory than the specified amount of bytes.
+# When the memory limit is reached Redis will try to remove keys with an
+# EXPIRE set. It will try to start freeing keys that are going to expire
+# in little time and preserve keys with a longer time to live.
+# Redis will also try to remove objects from free lists if possible.
+#
+# If all this fails, Redis will start to reply with errors to commands
+# that will use more memory, like SET, LPUSH, and so on, and will continue
+# to reply to most read-only commands like GET.
+#
+# WARNING: maxmemory can be a good idea mainly if you want to use Redis as a
+# 'state' server or cache, not as a real DB. When Redis is used as a real
+# database the memory usage will grow over the weeks, it will be obvious if
+# it is going to use too much memory in the long run, and you'll have the time
+# to upgrade. With maxmemory after the limit is reached you'll start to get
+# errors for write operations, and this may even lead to DB inconsistency.
+#
+# maxmemory <bytes>
+
+# MAXMEMORY POLICY: how Redis selects what to remove when maxmemory
+# is reached. You can select among the following behaviors:
+# 
+# volatile-lru -> remove the key with an expire set using an LRU algorithm
+# allkeys-lru -> remove any key according to the LRU algorithm
+# volatile-random -> remove a random key with an expire set
+# allkeys-random -> remove a random key, any key
+# volatile-ttl -> remove the key with the nearest expire time (minor TTL)
+# noeviction -> don't expire at all, just return an error on write operations
+# 
+# Note: with any of these policies, Redis will return an error on write
+#       operations when there are no suitable keys for eviction.
+#
+#       At the time of writing these commands are: set setnx setex append
+#       incr decr rpush lpush rpushx lpushx linsert lset rpoplpush sadd
+#       sinter sinterstore sunion sunionstore sdiff sdiffstore zadd zincrby
+#       zunionstore zinterstore hset hsetnx hmset hincrby incrby decrby
+#       getset mset msetnx exec sort
+#
+# The default is:
+#
+# maxmemory-policy volatile-lru
+
+# LRU and minimal TTL algorithms are not precise algorithms but approximated
+# algorithms (in order to save memory), so you can select as well the sample
+# size to check. For instance, by default Redis will check three keys and
+# pick the one that was used less recently, you can change the sample size
+# using the following configuration directive.
+#
+# maxmemory-samples 3
+
+############################## APPEND ONLY MODE ###############################
+
+# By default Redis asynchronously dumps the dataset on disk. If you can live
+# with the idea that the latest records will be lost if something like a crash
+# happens this is the preferred way to run Redis. If instead you care a lot
+# about your data and don't want a single record to get lost you should
+# enable the append only mode: when this mode is enabled Redis will append
+# every write operation received in the file appendonly.aof. This file will
+# be read on startup in order to rebuild the full dataset in memory.
+#
+# Note that you can have both the async dumps and the append only file if you
+# like (you have to comment the "save" statements above to disable the dumps).
+# Still if append only mode is enabled Redis will load the data from the
+# log file at startup ignoring the dump.rdb file.
+#
+# IMPORTANT: See the BGREWRITEAOF command for how to rewrite the append
+# log file in background when it gets too big.
+
+appendonly no
+
+# The name of the append only file (default: "appendonly.aof")
+# appendfilename appendonly.aof
+
+# The fsync() call tells the Operating System to actually write data on disk
+# instead of waiting for more data in the output buffer. Some OS will really flush
+# data on disk, some other OS will just try to do it ASAP.
+#
+# Redis supports three different modes:
+#
+# no: don't fsync, just let the OS flush the data when it wants. Faster.
+# always: fsync after every write to the append only log . Slow, Safest.
+# everysec: fsync only if one second passed since the last fsync. Compromise.
+#
+# The default is "everysec" that's usually the right compromise between
+# speed and data safety. It's up to you to understand if you can relax this to
+# "no" that will let the operating system flush the output buffer when
+# it wants, for better performances (but if you can live with the idea of
+# some data loss consider the default persistence mode that's snapshotting),
+# or on the contrary, use "always" that's very slow but a bit safer than
+# everysec.
+#
+# If unsure, use "everysec".
+
+# appendfsync always
+appendfsync everysec
+# appendfsync no
+
+# When the AOF fsync policy is set to always or everysec, and a background
+# saving process (a background save or AOF log background rewriting) is
+# performing a lot of I/O against the disk, in some Linux configurations
+# Redis may block too long on the fsync() call. Note that there is no fix for
+# this currently, as even performing fsync in a different thread will block
+# our synchronous write(2) call.
+#
+# In order to mitigate this problem it's possible to use the following option
+# that will prevent fsync() from being called in the main process while a
+# BGSAVE or BGREWRITEAOF is in progress.
+#
+# This means that while another child is saving the durability of Redis is
+# the same as "appendfsync no", that in practical terms means that it is
+# possible to lose up to 30 seconds of log in the worst scenario (with the
+# default Linux settings).
+# 
+# If you have latency problems turn this to "yes". Otherwise leave it as
+# "no" that is the safest pick from the point of view of durability.
+no-appendfsync-on-rewrite no
+
+################################ VIRTUAL MEMORY ###############################
+
+# Virtual Memory allows Redis to work with datasets bigger than the actual
+# amount of RAM needed to hold the whole dataset in memory.
+# In order to do so frequently used keys are kept in memory while the other keys
+# are swapped into a swap file, similarly to what operating systems do
+# with memory pages.
+#
+# To enable VM just set 'vm-enabled' to yes, and set the following three
+# VM parameters according to your needs.
+
+vm-enabled no
+# vm-enabled yes
+
+# This is the path of the Redis swap file. As you can guess, swap files
+# can't be shared by different Redis instances, so make sure to use a swap
+# file for every redis process you are running. Redis will complain if the
+# swap file is already in use.
+#
+# The best kind of storage for the Redis swap file (that's accessed at random) 
+# is a Solid State Disk (SSD).
+#
+# *** WARNING *** if you are using a shared hosting the default of putting
+# the swap file under /tmp is not secure. Create a dir with access granted
+# only to Redis user and configure Redis to create the swap file there.
+vm-swap-file /var/lib/redis/redis.swap
+
+# vm-max-memory configures the VM to use at max the specified amount of
+# RAM. Everything that does not fit will be swapped on disk *if* possible, that
+# is, if there is still enough contiguous space in the swap file.
+#
+# With vm-max-memory 0 the system will swap everything it can. Not a good
+# default, just specify the max amount of RAM you can in bytes, but it's
+# better to leave some margin. For instance specify an amount of RAM
+# that's more or less between 60 and 80% of your free RAM.
+vm-max-memory 0
+
+# The Redis swap file is split into pages. An object can be saved using multiple
+# contiguous pages, but pages can't be shared between different objects.
+# So if your page is too big, small objects swapped out on disk will waste
+# a lot of space. If your page is too small, there is less space in the swap
+# file (assuming you configured the same number of total swap file pages).
+#
+# If you use a lot of small objects, use a page size of 64 or 32 bytes.
+# If you use a lot of big objects, use a bigger page size.
+# If unsure, use the default :)
+vm-page-size 32
+
+# Number of total memory pages in the swap file.
+# Given that the page table (a bitmap of free/used pages) is taken in memory,
+# every 8 pages on disk will consume 1 byte of RAM.
+#
+# The total swap size is vm-page-size * vm-pages
+#
+# With the default of 32-bytes memory pages and 134217728 pages Redis will
+# use a 4 GB swap file, that will use 16 MB of RAM for the page table.
+#
+# It's better to use the smallest acceptable value for your application,
+# but the default is large in order to work in most conditions.
+vm-pages 134217728
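The swap file and page table sizes quoted in the comment above follow directly from the two settings. A short Python check of that arithmetic is sketched below; the variable names are illustrative, and since `vm-enabled` is set to `no` in this file the values have no effect as committed.

```python
VM_PAGE_SIZE = 32        # bytes per page, from vm-page-size
VM_PAGES = 134217728     # number of pages, from vm-pages

# Total swap size is vm-page-size * vm-pages.
swap_bytes = VM_PAGE_SIZE * VM_PAGES

# The page table is a bitmap: one bit per page, so 8 pages per byte of RAM.
page_table_bytes = VM_PAGES // 8

print(swap_bytes // 2**30, "GiB swap file")
print(page_table_bytes // 2**20, "MiB page table")
```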
+
+# Max number of VM I/O threads running at the same time.
+# These threads are used to read/write data from/to the swap file. Since they
+# also encode and decode objects from disk to memory or the reverse, a bigger
+# number of threads can help with big objects even if they can't help with
+# I/O itself, as the physical device may not be able to cope with many
+# read/write operations at the same time.
+#
+# The special value of 0 turns off threaded I/O and enables the blocking
+# Virtual Memory implementation.
+vm-max-threads 4
+
+############################### ADVANCED CONFIG ###############################
+
+# Hashes are encoded in a special way (much more memory efficient) when they
+# have at most a given number of elements, and the biggest element does not
+# exceed a given threshold. You can configure these limits with the following
+# configuration directives.
+hash-max-zipmap-entries 512
+hash-max-zipmap-value 64
+
+# Similarly to hashes, small lists are also encoded in a special way in order
+# to save a lot of space. The special representation is only used when
+# you are under the following limits:
+list-max-ziplist-entries 512
+list-max-ziplist-value 64
+
+# Sets have a special encoding in just one case: when a set is composed
+# of just strings that happen to be integers in radix 10 in the range
+# of 64 bit signed integers.
+# The following configuration setting sets the limit on the size of the
+# set in order to use this special memory saving encoding.
+set-max-intset-entries 512
+
+# Active rehashing uses 1 millisecond every 100 milliseconds of CPU time in
+# order to help rehashing the main Redis hash table (the one mapping top-level
+# keys to values). The hash table implementation Redis uses (see dict.c)
+# performs lazy rehashing: the more operations you run on a hash table
+# that is rehashing, the more rehashing "steps" are performed, so if the
+# server is idle the rehashing never completes and some extra memory is used
+# by the hash table.
+#
+# The default is to use this millisecond 10 times every second in order to
+# actively rehash the main dictionaries, freeing memory when possible.
+#
+# If unsure:
+# use "activerehashing no" if you have hard latency requirements and it is
+# not a good thing in your environment that Redis can reply from time to time
+# to queries with a 2 millisecond delay.
+#
+# use "activerehashing yes" if you don't have such hard requirements but
+# want to free memory asap when possible.
+activerehashing yes
+
+################################## INCLUDES ###################################
+
+# Include one or more other config files here.  This is useful if you
+# have a standard template that goes to all Redis servers but also need
+# to customize a few per-server settings.  Include files can include
+# other files, so use this wisely.
+#
+# include /path/to/local.conf
+# include /path/to/other.conf
--- a/modules/logstash/manifests/agent.pp
+++ b/modules/logstash/manifests/agent.pp
@ -0,0 +1,49 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Class to install logstash agent (shipper).
+# conf_template accepts path to agent config template.
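+#
+# Example usage (a sketch; the template path shown is the one passed by
+# openstack_project::logstash in this change, substitute your own):
+#
+#   class { 'logstash::agent':
+#     conf_template => 'openstack_project/logstash/agent.conf.erb',
+#   }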
+#
+class logstash::agent (
+  $conf_template = 'logstash/agent.conf.erb'
+) {
+  include logstash
+
+  file { '/etc/init/logstash-agent.conf':
+    ensure  => present,
+    source  => 'puppet:///modules/logstash/logstash-agent.conf',
+    replace => true,
+    owner   => 'root',
+  }
+
+  file { '/etc/logstash/agent.conf':
+    ensure  => present,
+    content => template($conf_template),
+    replace => true,
+    owner   => 'logstash',
+    group   => 'logstash',
+    mode    => '0644',
+    require => Class['logstash'],
+  }
+
+  service { 'logstash-agent':
+    ensure    => running,
+    enable    => true,
+    subscribe => File['/etc/logstash/agent.conf'],
+    require   => [
+      Class['logstash'],
+      File['/etc/init/logstash-agent.conf'],
+    ],
+  }
+}
--- a/modules/logstash/manifests/elasticsearch.pp
+++ b/modules/logstash/manifests/elasticsearch.pp
@ -0,0 +1,55 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Class to install elasticsearch.
+#
+class logstash::elasticsearch {
+  # install java runtime
+  package { 'java7-runtime-headless':
+    ensure => present,
+  }
+
+  exec { 'get_elasticsearch_deb':
+    command => 'wget http://download.elasticsearch.org/elasticsearch/elasticsearch/elasticsearch-0.20.2.deb -O /tmp/elasticsearch-0.20.2.deb',
+    path    => '/bin:/usr/bin',
+    creates => '/tmp/elasticsearch-0.20.2.deb',
+  }
+
+  # install elasticsearch
+  package { 'elasticsearch':
+    ensure   => present,
+    source   => '/tmp/elasticsearch-0.20.2.deb',
+    provider => 'dpkg',
+    require  => [
+      Package['java7-runtime-headless'],
+      Exec['get_elasticsearch_deb'],
+    ],
+  }
+
+  file { '/etc/elasticsearch/elasticsearch.yml':
+    ensure  => present,
+    source  => 'puppet:///modules/logstash/elasticsearch.yml',
+    replace => true,
+    owner   => 'root',
+    group   => 'root',
+    mode    => '0644',
+    require => Package['elasticsearch'],
+  }
+
+  service { 'elasticsearch':
+    ensure    => running,
+    require   => Package['elasticsearch'],
+    subscribe => File['/etc/elasticsearch/elasticsearch.yml'],
+  }
+}
--- a/modules/logstash/manifests/indexer.pp
+++ b/modules/logstash/manifests/indexer.pp
@ -0,0 +1,49 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Class to install logstash indexer.
+# conf_template accepts path to indexer config template.
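+#
+# Example usage (a sketch; the template path shown is the one passed by
+# openstack_project::logstash in this change, substitute your own):
+#
+#   class { 'logstash::indexer':
+#     conf_template => 'openstack_project/logstash/indexer.conf.erb',
+#   }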
+#
+class logstash::indexer (
+  $conf_template = 'logstash/indexer.conf.erb'
+) {
+  include logstash
+
+  file { '/etc/init/logstash-indexer.conf':
+    ensure  => present,
+    source  => 'puppet:///modules/logstash/logstash-indexer.conf',
+    replace => true,
+    owner   => 'root',
+  }
+
+  file { '/etc/logstash/indexer.conf':
+    ensure  => present,
+    content => template($conf_template),
+    replace => true,
+    owner   => 'logstash',
+    group   => 'logstash',
+    mode    => '0644',
+    require => Class['logstash'],
+  }
+
+  service { 'logstash-indexer':
+    ensure    => running,
+    enable    => true,
+    subscribe => File['/etc/logstash/indexer.conf'],
+    require   => [
+      Class['logstash'],
+      File['/etc/init/logstash-indexer.conf'],
+    ],
+  }
+}
--- a/modules/logstash/manifests/init.pp
+++ b/modules/logstash/manifests/init.pp
@ -0,0 +1,71 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Class to install common logstash items.
+#
+class logstash {
+  group { 'logstash':
+    ensure => present,
+  }
+
+  user { 'logstash':
+    ensure     => present,
+    comment    => 'Logstash User',
+    home       => '/opt/logstash',
+    gid        => 'logstash',
+    shell      => '/bin/bash',
+    membership => 'minimum',
+    require    => Group['logstash'],
+  }
+
+  file { '/opt/logstash':
+    ensure  => directory,
+    owner   => 'logstash',
+    group   => 'logstash',
+    mode    => '0644',
+    require => User['logstash'],
+  }
+
+  exec { 'get_logstash_jar':
+    command => 'wget http://logstash.objects.dreamhost.com/release/logstash-1.1.9-monolithic.jar -O /opt/logstash/logstash.jar',
+    path    => '/bin:/usr/bin',
+    creates => '/opt/logstash/logstash.jar',
+    require => File['/opt/logstash'],
+  }
+
+  file { '/opt/logstash/logstash.jar':
+    ensure  => present,
+    owner   => 'logstash',
+    group   => 'logstash',
+    mode    => '0644',
+    require => [
+      User['logstash'],
+      Exec['get_logstash_jar'],
+    ],
+  }
+
+  file { '/var/log/logstash':
+    ensure => directory,
+    owner  => 'logstash',
+    group  => 'logstash',
+    mode   => '0644',
+  }
+
+  file { '/etc/logstash':
+    ensure => directory,
+    owner  => 'logstash',
+    group  => 'logstash',
+    mode   => '0644',
+  }
+}
--- a/modules/logstash/manifests/redis.pp
+++ b/modules/logstash/manifests/redis.pp
@ -0,0 +1,41 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Class to install redis.
+#
+class logstash::redis {
+  # TODO(clarkb): Access to redis should be controlled at a network level
+  # (with iptables) and with client authentication. Put this in place before
+  # opening redis to external clients.
+
+  package { 'redis-server':
+    ensure => present,
+  }
+
+  file { '/etc/redis/redis.conf':
+    ensure  => present,
+    source  => 'puppet:///modules/logstash/redis.conf',
+    replace => true,
+    owner   => 'root',
+    group   => 'root',
+    mode    => '0644',
+    require => Package['redis-server'],
+  }
+
+  service { 'redis-server':
+    ensure    => running,
+    require   => Package['redis-server'],
+    subscribe => File['/etc/redis/redis.conf'],
+  }
+}
--- a/modules/logstash/manifests/web.pp
+++ b/modules/logstash/manifests/web.pp
@ -0,0 +1,55 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Class to run logstash web front end.
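+#
+# Example usage (a sketch; 'logs.example.org' and the admin address are
+# hypothetical values, and both parameters default to values derived
+# from $::fqdn when omitted):
+#
+#   class { 'logstash::web':
+#     vhost_name  => 'logs.example.org',
+#     serveradmin => 'webmaster@logs.example.org',
+#   }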
+#
+class logstash::web (
+  $vhost_name = $::fqdn,
+  $serveradmin = "webmaster@${::fqdn}"
+) {
+  include apache
+  apache::vhost { $vhost_name:
+    port     => 80,
+    docroot  => 'MEANINGLESS ARGUMENT',
+    priority => '50',
+    template => 'logstash/logstash.vhost.erb',
+  }
+  a2mod { 'rewrite':
+    ensure => present,
+  }
+  a2mod { 'proxy':
+    ensure => present,
+  }
+  a2mod { 'proxy_http':
+    ensure => present,
+  }
+
+  include logstash
+
+  file { '/etc/init/logstash-web.conf':
+    ensure  => present,
+    source  => 'puppet:///modules/logstash/logstash-web.conf',
+    replace => true,
+    owner   => 'root',
+  }
+
+  service { 'logstash-web':
+    ensure  => running,
+    enable  => true,
+    require => [
+      Class['logstash'],
+      File['/etc/init/logstash-web.conf'],
+    ],
+  }
+}
--- a/modules/logstash/templates/agent.conf.erb
+++ b/modules/logstash/templates/agent.conf.erb
@ -0,0 +1,9 @@
+input {
+  stdin {
+    type => "stdin-type"
+  }
+}
+
+output {
+  redis { host => "127.0.0.1" data_type => "list" key => "logstash" }
+}
--- a/modules/logstash/templates/indexer.conf.erb
+++ b/modules/logstash/templates/indexer.conf.erb
@ -0,0 +1,18 @@
+input {
+  redis {
+    host => "127.0.0.1"
+    type => "redis-input"
+    # these settings should match the output of the agent
+    data_type => "list"
+    key => "logstash"
+
+    # We use json_event here since the sender is a logstash agent
+    format => "json_event"
+  }
+}
+
+output {
+  elasticsearch {
+    host => "127.0.0.1"
+  }
+}
--- a/modules/logstash/templates/logstash.vhost.erb
+++ b/modules/logstash/templates/logstash.vhost.erb
@ -0,0 +1,13 @@
+<VirtualHost <%= scope.lookupvar("::logstash::web::vhost_name") %>:80>
+             ServerName <%= scope.lookupvar("::logstash::web::vhost_name") %>
+             ServerAdmin <%= scope.lookupvar("::logstash::web::serveradmin") %>
+
+             ErrorLog ${APACHE_LOG_DIR}/<%= scope.lookupvar("::logstash::web::vhost_name") %>-error.log
+
+             LogLevel warn
+
+             CustomLog ${APACHE_LOG_DIR}/<%= scope.lookupvar("::logstash::web::vhost_name") %>-access.log combined
+
+             ProxyPass / http://127.0.0.1:9292/ retry=0
+             ProxyPassReverse / http://127.0.0.1:9292/
+</VirtualHost>
--- a/modules/openstack_project/manifests/logstash.pp
+++ b/modules/openstack_project/manifests/logstash.pp
@ -0,0 +1,34 @@
+# Copyright 2013 Hewlett-Packard Development Company, L.P.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+#
+# Logstash indexer server glue class.
+#
+class openstack_project::logstash (
+  $sysadmins = []
+) {
+  class { 'openstack_project::server':
+    iptables_public_tcp_ports => [22, 80],
+    sysadmins                 => $sysadmins,
+  }
+
+  class { 'logstash::agent':
+    conf_template => 'openstack_project/logstash/agent.conf.erb',
+  }
+  class { 'logstash::indexer':
+    conf_template => 'openstack_project/logstash/indexer.conf.erb',
+  }
+  include logstash::redis
+  include logstash::elasticsearch
+  include logstash::web
+}
--- a/modules/openstack_project/templates/logstash/agent.conf.erb
+++ b/modules/openstack_project/templates/logstash/agent.conf.erb
@ -0,0 +1,8 @@
+input {
+  syslog {
+    type => syslog
+    port => 5514
+  }
+}
+
+<%= scope.function_template(['openstack_project/logstash/redis-output.conf.erb']) %>
--- a/modules/openstack_project/templates/logstash/indexer.conf.erb
+++ b/modules/openstack_project/templates/logstash/indexer.conf.erb
@ -0,0 +1,18 @@
+input {
+  redis {
+    host => "127.0.0.1"
+    type => "redis-input"
+    # these settings should match the output of the agent
+    data_type => "list"
+    key => "logstash"
+
+    # We use json_event here since the sender is a logstash agent
+    format => "json_event"
+  }
+}
+
+output {
+  elasticsearch {
+    host => "127.0.0.1"
+  }
+}
--- a/modules/openstack_project/templates/logstash/redis-output.conf.erb
+++ b/modules/openstack_project/templates/logstash/redis-output.conf.erb
@ -0,0 +1,3 @@
+output {
+  redis { host => "127.0.0.1" data_type => "list" key => "logstash" }
+}