Sandy Walsh 2012-10-26 15:00:50 -03:00
StackTach is a debugging tool for OpenStack Nova.
It takes events from AMQP and inserts them into a database, which the StackTach django project serves for web display.
StackTach workers consume the monitor.info and monitor.error rabbit message queues.
It's important that they keep running; otherwise the queues fill up and slow down the
whole nova environment.
There's a django app in the 'stacktach' directory that lets you see what's
going on in your nova install, and there are two flavors of worker processes
that collect the amqp messages for display in the stacktach django app.
The app was originally designed as a service so that different customers could
monitor their nova deployments, so it has 'tenants'.
After installing the django app like most django apps, if you navigate to it,
it'll prompt you to make a new tenant. It'll be given the number 1. You can
view it in the browser by appending /1 onto the end of the url.
That tenant ID then needs to be passed to one of the worker setups.
The original worker setup uses the single script 'worker.py'. It depends on
the 'kombu' lib to talk to your amqp server and uses eventlet green threads,
one for each nova deploy.
The newer version consists of the scripts:
* start_workers.py - Starts up sub-processes of worker_new.py's
* worker_conf.py - Holds the configs for your nova deploys
* worker_new.py - The actual worker code
This version was written to address stability issues in the original.
It depends on 'amqplib' for talking to the amqp broker.
It uses subprocess instead of threading to fire up the workers, one sub-proc
for each nova deploy.
The newer version seems to have a memory leak, so it needs restarting
occasionally, but it seems to be more stable overall than the original.
Before use, put your nova install details into 'worker_conf.py'.
If you need to start them manually:
sudo -i
cd /path/to/stacktach/install
export DJANGO_SETTINGS_MODULE=stacktach.settings
python start_workers.py
The start_workers.py imports the list of deployments to consume from worker_conf.py,
then it starts a sub process of worker.py for each deploy.
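The one-sub-process-per-deploy approach can be sketched roughly as follows. This is a minimal illustration, not the code in start_workers.py: the function names, the idea of passing the deployment name as the last command-line argument, and the stand-in command are all assumptions.

```python
import subprocess
import sys


def start_workers(deployment_names, command):
    """Start one child process per deployment name.

    `command` is the argv prefix of the worker program. Appending the
    deployment name as the final argument is an assumption made for
    this sketch, not how the real scripts are known to be invoked.
    """
    workers = {}
    for name in deployment_names:
        workers[name] = subprocess.Popen(command + [name])
    return workers


def wait_for_workers(workers):
    """Block until every child exits and return {name: exit_code}."""
    return dict((name, proc.wait()) for name, proc in workers.items())
```

Using sub-processes rather than threads means a crash or leak in one worker does not take the others down with it, and a single worker can be restarted on its own.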
Nice way to see the logs in a screen session:
ls *.log | grep "global\|cell" | xargs watch --differences tail -n2

# StackTach
StackTach is a debugging / monitoring utility for OpenStack ([Open]StackTach[ometer]). StackTach can work with multiple datacenters including multi-cell deployments.
## Overview
OpenStack has the ability to publish notifications to a RabbitMQ exchange as they occur. So, rather than poring through reams of logs across multiple servers, you can now watch requests travel through the system from a single location.
A detailed description of the notifications published by OpenStack [is available here](http://wiki.openstack.org/SystemUsageData)
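To give a feel for what the worker pulls out of a notification, here is a small sketch. The key names (`event_type`, `publisher_id`, `_context_request_id`, `payload`, `instance_id`) are the ones the worker code in this repo reads; the function name and the sample values are made up for illustration.

```python
def summarize_notification(body):
    """Pull the commonly used fields out of a notification dictionary."""
    publisher = body["publisher_id"]
    # publisher_id looks like "<service>.<host>"; the host may contain dots.
    parts = publisher.split(".")
    service = parts[0]
    host = ".".join(parts[1:]) if len(parts) > 1 else None
    payload = body.get("payload", {})
    return {
        "event": body["event_type"],
        "service": service,
        "host": host,
        "request_id": body.get("_context_request_id"),
        "instance": payload.get("instance_id"),
    }


# Invented sample notification, for illustration only.
sample = {
    "event_type": "compute.instance.create.start",
    "publisher_id": "compute.host-1.example.com",
    "_context_request_id": "req-123",
    "payload": {"instance_id": "abc"},
}
summary = summarize_notification(sample)
```

Real notifications carry many more fields, and the instance UUID can appear under other keys, so treat this as a starting point only.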
StackTach has three primary components:
1. The Worker daemon. Consumes the notifications from the Rabbit queue and writes them to a SQL database.
1. The Web UI, which is a Django application. Provides a real-time display of notifications as they are consumed by the worker. Also provides for point-and-click analysis of the events for following related events.
1. Stacky, the command line tool. Operators and Admins aren't big fans of web interfaces. StackTach also exposes a REST interface which Stacky can use to provide output suitable for tail/grep post-processing.
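The REST calls in this repo return JSON as a list of rows with the header row first (for example, the deployments call returns `["#", "Name"]` followed by one row per deployment). A client could turn that into grep-friendly text along these lines. This is only a sketch: `format_table` is a hypothetical helper, not part of Stacky, and it assumes every row has the same number of columns.

```python
import json


def format_table(rows):
    """Render a list of rows (header first) as aligned plain-text columns."""
    if not rows:
        return ""
    cells = [[str(cell) for cell in row] for row in rows]
    # Column width is the widest cell in that column.
    widths = [max(len(row[i]) for row in cells) for i in range(len(cells[0]))]
    lines = []
    for row in cells:
        padded = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append(" | ".join(padded).rstrip())
    return "\n".join(lines)


# Invented response body, shaped like the deployments listing.
response_body = '[["#", "Name"], [1, "east_coast.prod.global"], [2, "east_coast.prod.cell1"]]'
table = format_table(json.loads(response_body))
```

One row per line with a fixed separator keeps the output easy to pipe through tail, grep or cut.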
## Installing StackTach
### The "Hurry Up" Install Guide
1. Create a database for StackTach to use. By default, StackTach assumes MySQL, but you can modify the settings.py file to use a different database.
1. Install django and the other required libraries listed in `./etc/pip-requires.txt` (I hope I got 'em all)
1. Clone this repo
1. Copy and configure the config files in `./etc` (see below for details)
1. Create the necessary database tables (python manage.py syncdb). You don't need an administrator account since there are no user profiles used.
1. Configure OpenStack to publish Notifications back into RabbitMQ (see below)
1. Restart the OpenStack services.
1. Run the Worker to start consuming messages. (see below)
1. Run the web server (python manage.py runserver)
1. Point your browser to `` (the default server location)
1. Click on stuff, see what happens. You can't hurt anything, it's all read-only.
Of course, this is only suitable for playing around. If you want to get serious about deploying StackTach you should set up a proper webserver and database on standalone servers. There is a lot of data that gets collected by StackTach (depending on your deployment size) ... be warned. Keep an eye on DB size.
#### The Config Files
There are two config files for StackTach. The first one tells us where the second one is. A sample of these two files is in `./etc/sample_*`
The `sample_stacktach_config.sh` shell script defines the necessary environment variables StackTach needs. Most of these are just information about the database (assuming MySQL) but some are a little different.
`STACKTACH_INSTALL_DIR` should point to where StackTach is running out of. In most cases this will be your repo directory, but it could be elsewhere if you're going for a proper deployment.
The StackTach worker needs to know which RabbitMQ servers to listen to. This information is stored in the deployment file. `STACKTACH_DEPLOYMENTS_FILE` should point to this json file. To learn more about the deployments file, see further down.
Finally, `DJANGO_SETTINGS_MODULE` tells Django where to get its configuration from. This should point to the `settings.py` file. You shouldn't have to do much with the `settings.py` file, since most of what it needs is in these environment variables.
The `sample_stacktach_worker_config.json` file tells StackTach where each of the RabbitMQ servers are that it needs to get events from. In most cases you'll only have one entry in this file, but for large multi-cell deployments, this file can get pretty large. It's also handy for setting up one StackTach for each developer environment.
The file is in json format and the main configuration is under the `"deployments"` key, which should contain a list of deployment dictionaries.
A blank worker config file would look like this:
{"deployments": [] }
But that's not much fun. A deployment entry would look like this:
{"deployments": [
    {
        "name": "east_coast.prod.cell1",
        "rabbit_host": "",
        "rabbit_port": 5672,
        "rabbit_userid": "rabbit",
        "rabbit_password": "rabbit",
        "rabbit_virtual_host": "/"
    }]
}
where, *name* is whatever you want to call your deployment, and *rabbit_<>* are the connectivity details for your rabbit server. It should be the same information in your `nova.conf` file that OpenStack is using. Note, json has no concept of comments, so using `#`, `//` or `/* */` as a comment won't work.
You can add as many deployments as you like. When `./worker/start_workers.py` runs, it will start a process for each deployment defined.
That's it.
Once you have this working well, you should download and install Stacky and play with the command line tool.
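
If you want to sanity-check a deployments file before starting the workers, something like the following would do. It is a sketch under assumptions: `load_deployments` is a hypothetical helper (not part of StackTach), and treating every key from the sample entry as required is a choice made here for illustration.

```python
import json

# Keys taken from the sample deployment entry; treating all of them as
# required is an assumption of this sketch.
REQUIRED_KEYS = ("name", "rabbit_host", "rabbit_port", "rabbit_userid",
                 "rabbit_password", "rabbit_virtual_host")


def load_deployments(path):
    """Read the worker config file and return its list of deployments."""
    with open(path) as handle:
        config = json.load(handle)
    deployments = config.get("deployments", [])
    for index, deployment in enumerate(deployments):
        missing = [key for key in REQUIRED_KEYS if key not in deployment]
        if missing:
            raise ValueError("deployment #%d is missing: %s"
                             % (index, ", ".join(missing)))
    return deployments
```

Because `json.load` rejects comments and trailing commas, running this against your file also catches the most common hand-editing mistakes.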

export STACKTACH_DB_NAME="stacktach"
export STACKTACH_DB_PASSWORD="password"
export STACKTACH_INSTALL_DIR="/srv/www/stacktach/"
export STACKTACH_DEPLOYMENTS_FILE="/srv/www/stacktach/stacktach_worker_config.json"
export DJANGO_SETTINGS_MODULE="settings"

{"deployments": [
    {
        "name": "east_coast.prod.global",
        "rabbit_host": "",
        "rabbit_port": 5672,
        "rabbit_userid": "rabbit",
        "rabbit_password": "rabbit",
        "rabbit_virtual_host": "/"
    },
    {
        "name": "east_coast.prod.cell1",
        "rabbit_host": "",
        "rabbit_port": 5672,
        "rabbit_userid": "rabbit",
        "rabbit_password": "rabbit",
        "rabbit_virtual_host": "/"
    }]
}

# Django settings for stproject project.
import os
db_name = os.environ['STACKTACH_DB_NAME']
db_username = os.environ['STACKTACH_DB_USERNAME']
db_password = os.environ['STACKTACH_DB_PASSWORD']
install_dir = os.environ['STACKTACH_INSTALL_DIR']
DEBUG = True
# ('Your Name', 'your_email@example.com'),
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': db_name,
        'USER': db_username,
        'PASSWORD': db_password,
        'HOST': '',  # Set to empty string for localhost.
        'PORT': '',  # Set to empty string for default.
    }
}
# Local time zone for this installation. Choices can be found here:
# http://en.wikipedia.org/wiki/List_of_tz_zones_by_name
# although not all choices may be available on all operating systems.
# On Unix systems, a value of None will cause Django to use the same
# timezone as the operating system.
# If running in a Windows environment this must be set to the same as your
# system time zone.
TIME_ZONE = 'America/Chicago'
# Language code for this installation. All choices can be found here:
# http://www.i18nguy.com/unicode/language-identifiers.html
# If you set this to False, Django will make some optimizations so as not
# to load the internationalization machinery.
USE_I18N = True
# If you set this to False, Django will not format dates, numbers and
# calendars according to the current locale
USE_L10N = True
# Absolute filesystem path to the directory that will hold user-uploaded files.
# Example: "/home/media/media.lawrence.com/media/"
# URL that handles the media served from MEDIA_ROOT. Make sure to use a
# trailing slash.
# Examples: "http://media.lawrence.com/media/", "http://example.com/media/"
# Absolute path to the directory static files should be collected to.
# Don't put anything in this directory yourself; store your static files
# in apps' "static/" subdirectories and in STATICFILES_DIRS.
# Example: "/home/media/media.lawrence.com/static/"
# URL prefix for static files.
# Example: "http://media.lawrence.com/static/"
STATIC_URL = '/static/'
# URL prefix for admin static files -- CSS, JavaScript and images.
# Make sure to use a trailing slash.
# Examples: "http://foo.com/static/admin/", "/static/admin/".
#ADMIN_MEDIA_PREFIX = '/static/admin/'
# Additional locations of static files
# Put strings here, like "/home/html/static" or "C:/www/django/static".
# Always use forward slashes, even on Windows.
# Don't forget to use absolute paths, not relative paths.
STATICFILES_DIRS = [install_dir + "static",]
# List of finder classes that know how to find static files in
# various locations.
# 'django.contrib.staticfiles.finders.DefaultStorageFinder',
# Make this unique, and don't share it with anybody.
SECRET_KEY = 'x=rgdy5@(*!$e5ou0j!q104+m5mt1d%ud9ujyykhklss7*um3t'
# List of callables that know how to import templates from various sources.
# 'django.template.loaders.eggs.Loader',
ROOT_URLCONF = 'stacktach.urls'
# Put strings here, like "/home/html/django_templates" or "C:/www/django/templates".
# Always use forward slashes, even on Windows.
# Don't forget to use absolute paths, not relative paths.
TEMPLATE_DIRS = (install_dir + "templates",)
# A sample logging configuration. The only tangible logging
# performed by this configuration is to send an email to
# the site admins on every HTTP 500 error.
# See http://docs.djangoproject.com/en/dev/topics/logging for
# more details on how to customize your logging configuration.
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'handlers': {
        'mail_admins': {
            'level': 'ERROR',
            'class': 'django.utils.log.AdminEmailHandler'
        }
    },
    'loggers': {
        'django.request': {
            'handlers': ['mail_admins'],
            'level': 'ERROR',
            'propagate': True,
        },
    }
}

from django import forms
from django.db import models
class Tenant(models.Model):
email = models.CharField(max_length=50)
project_name = models.CharField(max_length=50)
nova_stats_template = models.CharField(max_length=200)
loggly_template = models.CharField(max_length=200)
tenant_id = models.AutoField(primary_key=True, unique=True)
class Deployment(models.Model):
name = models.CharField(max_length=50)
def get_or_create_deployment(name):
return Deployment.objects.get_or_create(name=name)
class RawData(models.Model):
tenant = models.ForeignKey(Tenant, db_index=True,
nova_tenant = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
deployment = models.ForeignKey(Deployment)
tenant = models.CharField(max_length=50, null=True, blank=True,
json = models.TextField()
routing_key = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
state = models.CharField(max_length=50, null=True,
state = models.CharField(max_length=20, null=True,
blank=True, db_index=True)
old_state = models.CharField(max_length=20, null=True,
blank=True, db_index=True)
old_task = models.CharField(max_length=30, null=True,
blank=True, db_index=True)
when = models.DateTimeField(db_index=True)
microseconds = models.IntegerField(default=0)
microseconds = models.IntegerField(default=0, db_index=True)
publisher = models.CharField(max_length=100, null=True,
blank=True, db_index=True)
event = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
instance = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
value = models.FloatField(null=True, blank=True, db_index=True)
units = models.CharField(max_length=10, null=True,
request_id = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
# Grouping ID is a number assigned and meant to fence-post
# a block of time. <set group id = 1> <do stuff> <set group id = 2> ...
# Later there will be REST call for setting this.
grouping_id = models.IntegerField(default=0, db_index=True)
# Nested calls can be grouped by a common transaction ID if you like.
# A calls B calls C calls D. These can all be linked with a common
# transaction ID if you like.
transaction_id = models.IntegerField(default=0, db_index=True)
def __repr__(self):
return self.event
class TenantForm(forms.ModelForm):
class Meta:
model = Tenant
fields = ('email', 'project_name', 'nova_stats_template', 'loggly_template')
class Lifecycle(models.Model):
instance = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
last_state = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
last_task_state = models.CharField(max_length=50, null=True,
blank=True, db_index=True)
last_raw = models.ForeignKey(RawData, null=True)
class Timing(models.Model):
name = models.CharField(max_length=50, db_index=True)
lifecycle = models.ForeignKey(Lifecycle)
start_raw = models.ForeignKey(RawData, related_name='+', null=True)
end_raw = models.ForeignKey(RawData, related_name='+', null=True)
start_when = models.DateTimeField(db_index=True, null=True)
start_ms = models.IntegerField(default=0)
end_when = models.DateTimeField(db_index=True, null=True)
end_ms = models.IntegerField(default=0)
diff_days = models.IntegerField(default=0)
diff_seconds = models.IntegerField(default=0)
diff_usecs = models.IntegerField(default=0)

import datetime
import json
from django.db.models import Q
from django.http import HttpResponse
import models
import views
SECS_PER_DAY = 60 * 60 * 24
def get_event_names():
return models.RawData.objects.values('event').distinct()
def get_host_names():
# TODO: We need to upgrade to Django 1.4 so we can get tenant id and
# host and just do distinct on host name.
# like: values('host', 'tenant_id').distinct('host')
# This will be more meaningful. Host by itself isn't really.
return models.RawData.objects.values('host').distinct()
def routing_key_type(key):
if key.endswith('error'):
return 'E'
return ' '
def get_deployments():
return models.Deployment.objects.all().order_by('name')
def show_timings_for_uuid(uuid):
    lifecycles = models.Lifecycle.objects.filter(instance=uuid)
    results = []
    for lc in lifecycles:
        timings = models.Timing.objects.filter(lifecycle=lc)
        if not timings:
            continue
        this = []
        this.append(["?", "Event", "Time (secs)"])
        for t in timings:
            # 'S' = start seen only, 'E' = end seen only, '.' = both seen.
            state = "?"
            if t.start_raw:
                state = 'S'
            if t.end_raw:
                state = 'E'
            if t.start_raw and t.end_raw:
                state = "."
            this.append([state, t.name, sec_to_time(seconds_from_timing(t))])
        results.append(this)
    return results
def seconds_from_timedelta(days, seconds, usecs):
us = usecs / 1000000.0
return (days * SECS_PER_DAY) + seconds + us
def seconds_from_timing(t):
return seconds_from_timedelta(t.diff_days, t.diff_seconds, t.diff_usecs)
def sec_to_time(fseconds):
    seconds = int(fseconds)
    usec = fseconds - seconds
    # Floor division keeps these values integral on any Python version.
    days = seconds // (60 * 60 * 24)
    seconds -= (days * (60 * 60 * 24))
    hours = seconds // (60 * 60)
    seconds -= (hours * (60 * 60))
    minutes = seconds // 60
    seconds -= (minutes * 60)
    usec = ('%.2f' % usec).lstrip('0')
    return "%dd %02d:%02d:%02d%s" % (days, hours, minutes, seconds, usec)
def rsp(data):
return HttpResponse(json.dumps(data))
def do_deployments(request):
deployments = get_deployments()
results = []
results.append(["#", "Name"])
for deployment in deployments:
results.append([deployment.id, deployment.name])
return rsp(results)
def do_events(request):
    events = get_event_names()
    results = []
    results.append(["Event Name"])
    for event in events:
        results.append([event['event']])
    return rsp(results)
def do_hosts(request):
    hosts = get_host_names()
    results = []
    results.append(["Host Name"])
    for host in hosts:
        results.append([host['host']])
    return rsp(results)
def do_uuid(request):
uuid = str(request.GET['uuid'])
related = models.RawData.objects.select_related(). \
filter(instance=uuid).order_by('when', 'microseconds')
results = []
results.append(["#", "?", "When", "Deployment", "Event", "Host",
"State", "State'", "Task'"])
for e in related:
results.append([e.id, routing_key_type(e.routing_key), str(e.when),
e.deployment.name, e.event,
e.host,
e.state, e.old_state, e.old_task])
return rsp(results)
def do_timings(request, name):
    results = []
    results.append([name, "Time"])
    timings = models.Timing.objects.select_related().filter(name=name) \
                    .exclude(Q(start_raw=None) | Q(end_raw=None)) \
                    .order_by('diff_days', 'diff_seconds', 'diff_usecs')
    for t in timings:
        seconds = seconds_from_timing(t)
        results.append([t.lifecycle.instance, sec_to_time(seconds)])
    return rsp(results)
def do_summary(request):
    events = get_event_names()
    interesting = []
    for e in events:
        ev = e['event']
        if ev.endswith('.start'):
            # Timing names carry no '.start'/'.end' suffix.
            interesting.append(ev[:-len('.start')])
    results = []
    results.append(["Event", "N", "Min", "Max", "Avg"])
    for name in interesting:
        timings = models.Timing.objects.filter(name=name) \
                        .exclude(Q(start_raw=None) | Q(end_raw=None))
        if not timings:
            continue
        total, _min, _max = 0.0, None, None
        num = len(timings)
        for t in timings:
            seconds = seconds_from_timing(t)
            total += seconds
            if _min is None:
                _min = seconds
            if _max is None:
                _max = seconds
            _min = min(_min, seconds)
            _max = max(_max, seconds)
        results.append([name, int(num), sec_to_time(_min),
                        sec_to_time(_max), sec_to_time(int(total/num)) ])
    return rsp(results)
def do_request(request):
request_id = request.GET['request_id']
events = models.RawData.objects.filter(request_id=request_id) \
.order_by('when', 'microseconds')
results = []
results.append(["#", "?", "When", "Deployment", "Event", "Host",
"State", "State'", "Task'"])
for e in events:
results.append([e.id, routing_key_type(e.routing_key), str(e.when),
e.deployment.name, e.event,
e.host, e.state,
e.old_state, e.old_task])
return rsp(results)
def do_show(request, event_id):
    event_id = int(event_id)
    results = []
    event = None
    try:
        event = models.RawData.objects.get(id=event_id)
    except models.RawData.DoesNotExist:
        # Unknown event id: answer with an empty result set.
        return rsp(results)
    results.append(["Key", "Value"])
    results.append(["#", event.id])
    results.append(["Deployment", event.deployment.name])
    results.append(["Category", event.routing_key])
    results.append(["Publisher", event.publisher])
    results.append(["State", event.state])
    results.append(["Event", event.event])
    results.append(["Service", event.service])
    results.append(["Host", event.host])
    results.append(["UUID", event.instance])
    results.append(["Req ID", event.request_id])
    final = [results, ]
    j = json.loads(event.json)
    final.append(json.dumps(j, indent=2))
    return rsp(final)
def do_watch(request, deployment_id=None, event_name="", since=None):
deployment_map = {}
for d in get_deployments():
deployment_map[d.id] = d
events = get_event_names()
max_event_width = max([len(event['event']) for event in events])
hosts = get_host_names()
max_host_width = max([len(host['host']) for host in hosts])
    deployment = None
    if deployment_id:
        deployment = models.Deployment.objects.get(id=deployment_id)
    base_events = models.RawData.objects.order_by('-when', '-microseconds')
    if deployment:
        base_events = base_events.filter(deployment=deployment)
    if event_name:
        base_events = base_events.filter(event=event_name)
    if since:
        since = datetime.datetime.strptime(since, "%Y-%m-%d %H:%M:%S.%f")
        base_events = base_events.filter(when__gt=since)
    events = base_events[:20]
c = [10, 1, 10, 20, max_event_width, 36]
header = ("+%s" * len(c)) + "+"
splat = header.replace("+", "|")
results = []
results.append([''.center(col, '-') for col in c])
results.append(['#'.center(c[0]), '?',
results.append([''.center(col, '-') for col in c])
last = None
for event in events:
uuid = event.instance
if not uuid:
uuid = "-"
typ = routing_key_type(event.routing_key)
last = event.when
return rsp([results, last])
def do_kpi(request):
yesterday = datetime.datetime.now() - datetime.timedelta(days=1)
events = models.RawData.objects.exclude(instance=None) \
.exclude(when__lt=yesterday) \
.filter(Q(event__endswith='.end') |
Q(event="compute.instance.update")) \
.only('event', 'host', 'request_id',
'instance', 'deployment') \
.order_by('when', 'microseconds')
events = list(events)
instance_map = {} # { uuid: [(request_id, start_event, end_event), ...] }
for e in events:
if e.event == "compute.instance.update":
if "api" in e.host:
activities = instance_map.get(e.instance, [])
activities.append((e.request_id, e, None))
instance_map[e.instance] = activities
if not e.event.endswith(".end"):
activities = instance_map.get(e.instance)
if not activities:
# We missed the api start, skip it
found = False
for index, a in enumerate(activities):
request_id, start_event, end_event = a
#if end_event is not None:
# continue
if request_id == e.request_id:
end_event = e
activities[index] = (request_id, start_event, e)
found = True
results = []
results.append(["Event", "Time", "UUID", "Deployment"])
    for uuid, activities in instance_map.iteritems():
        for request_id, start_event, end_event in activities:
            if not end_event:
                continue
            event = end_event.event[:-len(".end")]
            start = views._make_datetime_from_raw(start_event.when,
                                                  start_event.microseconds)
            end = views._make_datetime_from_raw(end_event.when,
                                                end_event.microseconds)
            diff = end - start
            results.append([event, sec_to_time(seconds_from_timedelta(
                diff.days, diff.seconds, diff.microseconds)), uuid,
                end_event.deployment.name])
    return rsp(results)
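
As a worked example of the timing arithmetic used above, the following standalone sketch mirrors `seconds_from_timedelta` and `sec_to_time` and runs them on two invented timestamps. It uses `divmod` instead of repeated division and subtraction, which is a rewrite for illustration rather than the code in this file.

```python
import datetime

SECS_PER_DAY = 60 * 60 * 24


def seconds_from_timedelta(days, seconds, usecs):
    """Collapse the three timedelta fields into float seconds."""
    return (days * SECS_PER_DAY) + seconds + usecs / 1000000.0


def sec_to_time(fseconds):
    """Format float seconds as 'Dd HH:MM:SS.ff'."""
    whole = int(fseconds)
    fraction = fseconds - whole
    days, rest = divmod(whole, SECS_PER_DAY)
    hours, rest = divmod(rest, 60 * 60)
    minutes, seconds = divmod(rest, 60)
    fraction = ("%.2f" % fraction).lstrip("0")
    return "%dd %02d:%02d:%02d%s" % (days, hours, minutes, seconds, fraction)


# Invented start/end times: 1 day, 1 hour, 2 minutes and 5.5 seconds apart.
start = datetime.datetime(2012, 10, 26, 15, 0, 50)
end = datetime.datetime(2012, 10, 27, 16, 2, 55, 500000)
diff = end - start
elapsed = seconds_from_timedelta(diff.days, diff.seconds, diff.microseconds)
formatted = sec_to_time(elapsed)  # "1d 01:02:05.50"
```

Storing days, seconds and microseconds separately matches the fields of `datetime.timedelta`, so a Timing row can be rebuilt into a duration without losing precision.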

Copyright 2012 - Dark Secret Software Inc.
All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may
not use this file except in compliance with the License. You may obtain
a copy of the License at
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License.
<html lang="en">
<link href="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8/themes/base/jquery-ui.css" rel="stylesheet" type="text/css"/>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js"></script>
<script src="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8/jquery-ui.min.js"></script>
<script type="text/javascript" src="/static/jquery.timers.js"></script>
<link href='http://fonts.googleapis.com/css?family=Kaushan+Script' rel='stylesheet' type='text/css'>
<link href='http://fonts.googleapis.com/css?family=PT+Sans&subset=latin' rel='stylesheet' type='text/css'>
<style type="text/css">
.fancy {
font-family: 'Kaushan Script';
font-style: normal;
padding-bottom: 0em;
font-size: 3em;
h1, h2 {
font-family: 'PT Sans', serif;
font-style: normal;
letter-spacing: -0.076em;
line-height: 1em;
body {
font-family:"Helvetica Neue", Arial, Helvetica, sans-serif;
a {
a:hover {
.cell-border {
border-left: 1px solid #bbbbcc;
.title {
font-weight: bold;
td {
padding-right: 1em;
.status {
border:1px #bbbbcc solid;
.status-title {
margin-left: 1em;
margin-right: 1em;
background-color: white;
font-family: 'PT Sans', serif;
.status-inner {
.std-height {
<script type="text/javascript">
{% block extra_js %}
{% endblock %}
$(document).ready(function() {
{% block extra_init_js %}
{% endblock %}
<div class='fancy'>StackTach</div>
{% block body %}
{% endblock %}

{% extends "base.html" %}
{% block extra_js %}
function details(tenant_id, column, row_id)
$("#detail").load('/' + tenant_id + '/details/' + column + '/' + row_id);
function expand(tenant_id, row_id)
$("#row_expansion_" + row_id).load('/' + tenant_id + '/expand/' + row_id);
function search_form(tenant_id)
var field = $("#field").val();
var value = $("#query").val();
var data = {'field':field, 'value':value};
$("#detail").load('/' + tenant_id + '/search/', data);
return false;
{% endblock %}
{% block extra_init_js %}
{% endblock %}
{% block body %}
<div style='float:right;'><a href='/logout'>logout</a></div>
<div class='status-title'>Recent Activity - {{state.tenant.project_name}} (TID:{{state.tenant.tenant_id}})</div>
<div id='host-box' class='status std-height'>
<div id='host_activity' class='status-inner'>
{% include "host_status.html" %}
<div class='status-title'>Commands</div>
<div class='status'>
<div class='status-inner' style='padding-bottom:0em; margin-bottom:0em; margin-top:1em'>
<form action="">
<select id='field'>
<option value='routing_key'>source
<option value='nova_tenant'>tenant
<option selected='true'>instance
<input type='text' id='query' size='60' value=''/>
<input type='submit' value='Search' onclick='return search_form({{state.tenant.tenant_id}});'/>
<div class='status-title'>Details</div>
<div class='status'>
<div id='detail' class='status-inner'>
<div>click on an item above to see more of the same type.</div>
{% endblock %}

{% extends "base.html" %}
{% block body %}
<div class='status-title'>New Tenant</div>
<div id='host-box' class='status'>
<div id='host_activity' class='status-inner'>
<form action='/new_tenant/' method='post'>{% csrf_token %}
{{ form.as_p }}
<input type="submit" value="Submit" />
{% endblock %}

<table style='font-size:1em;'>
<td class='title'></td>
<td class='title'>source</td>
<td class='title'>tenant</td>
<td class='title'>service</td>
<td class='title'>host</td>
<td class='title'>event</td>
<td class='title'>instance</td>
<td class='title'>when</td>
{% if not rows %}
<tr><td>No results</td></tr>
{% endif %}
{% for row in rows %}
<tr {% if row.highlight %}style='background-color:#FFD88F;'{% endif %} >
{% if allow_expansion %}
<a href='#' onclick='expand({{state.tenant.tenant_id}}, {{row.id}});'>[+]</a>
{% endif %}
<td><span style='{% if row.is_error %}background-color:#ffaaaa;{% endif %}'>
<a href='#' onclick='details({{state.tenant.tenant_id}}, "routing_key", {{row.id}});'>{{row.routing_key}}</a>
<td class='cell-border'>
<a href='#' onclick='details({{state.tenant.tenant_id}}, "nova_tenant", {{row.id}});'>
{% if row.nova_tenant %}{{row.nova_tenant}}{% endif %}
<td class='cell-border'><a href='#' onclick='details({{state.tenant.tenant_id}}, "service", {{row.id}});'>{{row.service}}</a></td>
<td class='cell-border'><a href='#' onclick='details({{state.tenant.tenant_id}}, "host", {{row.id}});'>{{row.host}}</a></td>
<td class='cell-border'><b><a href='#' onclick='details({{state.tenant.tenant_id}}, "event", {{row.id}});'>{{row.event}}</a></b></td>
<td class='cell-border'>
{% if row.instance %}
<a href='{{row.loggly}}' target="_blank">(L)</a>
<a href='{{row.novastats}}' target="_blank">(S)</a>
<a href='#' onclick='details({{state.tenant.tenant_id}}, "instance", {{row.id}});'>
{% endif %}
<td class='cell-border'><a href='#' onclick='details({{state.tenant.tenant_id}}, "when", {{row.id}});'>{% if show_absolute_time %}{{row.when}} (+{{row.when.microsecond}}){%else%}{{row.when|timesince:utc}} ago{%endif%}</a></td>
{% if allow_expansion %}
<td colspan=8>
<div id='row_expansion_{{row.id}}' style='font-size:1.2em'></div>
{% endif %}
{% endfor %}

{% extends "base.html" %}
{% block body %}
<div class='status-title'>About</div>
<div id='host-box' class='status'>
<div id='host_activity' class='status-inner'>
StackTach is a hosted debug/monitoring tool for OpenStack Nova
<div class='status-title'>Connecting StackTach to OpenStack</div>
<div id='instance-box' class='status'>
<div id='instance_activity' class='status-inner'>
<li>Get a <a href='/new_tenant'>StackTach Tenant ID</a>
<li>Add <pre>--notification_driver=nova.notifier.rabbit_notifier</pre> and
<li><pre>--notification_topics=notifications,monitor</pre> to your nova.conf file.
<li>Configure and run the <a target='_blank' href='https://github.com/Rackspace/StackTach'>StackTach Worker</a> somewhere in your Nova development environment.
<li>Restart Nova and visit http://[your server]/[your_tenant_id]/ to see your Nova installation in action!
{% endblock %}

from django.conf.urls.defaults import patterns, include, url
urlpatterns = patterns('',
url(r'^$', 'stacktach.views.welcome', name='welcome'),
url(r'new_tenant', 'stacktach.views.new_tenant', name='new_tenant'),
url(r'logout', 'stacktach.views.logout', name='logout'),
url(r'^(?P<tenant_id>\d+)/$', 'stacktach.views.home', name='home'),
url(r'^(?P<tenant_id>\d+)/data/$', 'stacktach.views.data',
url(r'stacky/deployments', 'stacktach.stacky_server.do_deployments'),
url(r'stacky/events', 'stacktach.stacky_server.do_events'),
url(r'stacky/hosts', 'stacktach.stacky_server.do_hosts'),
url(r'stacky/uuid', 'stacktach.stacky_server.do_uuid'),
url(r'stacky/timings/(?P<name>\w+)', 'stacktach.stacky_server.do_timings'),
url(r'stacky/summary', 'stacktach.stacky_server.do_summary'),
url(r'stacky/request', 'stacktach.stacky_server.do_request'),
url(r'stacky/watch', 'stacktach.stacky_server.do_watch'),
url(r'stacky/kpi', 'stacktach.stacky_server.do_kpi'),
url(r'^(?P<deployment_id>\d+)/$', 'stacktach.views.home', name='home'),
'stacktach.views.details', name='details'),
'stacktach.views.search', name='search'),
'stacktach.views.expand', name='expand'),
'stacktach.views.host_status', name='host_status'),
'stacktach.views.latest_raw', name='latest_raw'),
'stacktach.views.instance_status', name='instance_status'),

from django.shortcuts import render_to_response
from django import http
from django import template
from django.utils.functional import wraps
from django.views.decorators.csrf import csrf_protect
from stacktach import models
import datetime
import json
import pprint
import random
import sys
class My401(BaseException):
    pass
class HttpResponseUnauthorized(http.HttpResponse):
    status_code = 401
def _extract_states(payload):
    return {
        'state' : payload.get('state', ""),
        'old_state' : payload.get('old_state', ""),
        'old_task' : payload.get('old_task_state', "")
    }
def _monitor_message(routing_key, body):
event = body['event_type']
publisher = body['publisher_id']
parts = publisher.split('.')
request_id = body['_context_request_id']
parts = publisher.split('.')
service = parts[0]
if len(parts) > 1:
host = ".".join(parts[1:])
#logging.error("publisher=%s, host=%s" % (publisher, host))
payload = body['payload']
request_spec = payload.get('request_spec', None)
# instance UUID's seem to hide in a lot of odd places.
instance = payload.get('instance_id', None)
instance = payload.get('instance_uuid', instance)
nova_tenant = body.get('_context_project_id', None)
nova_tenant = payload.get('tenant_id', nova_tenant)
return dict(host=host, instance=instance, publisher=publisher,
service=service, event=event, nova_tenant=nova_tenant)
if not instance:
instance = payload.get('exception', {}).get('kwargs', {}).get('uuid')
if not instance:
instance = payload.get('instance', {}).get('uuid')
tenant = body.get('_context_project_id', None)
tenant = payload.get('tenant_id', tenant)
resp = dict(host=host, instance=instance, publisher=publisher,
service=service, event=event, tenant=tenant,
return resp
def _compute_update_message(routing_key, body):
publisher = None
instance = None
args = body['args']
host = args['host']
request_id = body['_context_request_id']
service = args['service_name']
event = body['method']
nova_tenant = args.get('_context_project_id', None)
return dict(host=host, instance=instance, publisher=publisher,
service=service, event=event, nova_tenant=nova_tenant)
tenant = args.get('_context_project_id', None)
resp = dict(host=host, instance=instance, publisher=publisher,
service=service, event=event, tenant=tenant,
payload = data.get('payload', {})
return resp
def _tach_message(routing_key, body):
event = body['event_type']
value = body['value']
units = body['units']
transaction_id = body['transaction_id']
return dict(event=event, value=value, units=units, transaction_id=transaction_id)
# routing_key : handler
HANDLERS = {'monitor.info':_monitor_message,
def _parse(tenant, args, json_args):
def _make_datetime_from_raw(dt, ms):
if dt is None:
return None
return datetime.datetime(day=dt.day, month=dt.month, year=dt.year,
hour=dt.hour, minute=dt.minute, second=dt.second,
def aggregate(raw):
"""Roll up the raw event into a Lifecycle object
and a bunch of Timing objects.
We can use this for summarized timing reports.
# While we hope only one lifecycle ever exists it's quite
# likely we get multiple due to the workers and threads.
lifecycle = None
lifecycles = models.Lifecycle.objects.filter(instance=raw.instance)
if len(lifecycles) > 0:
lifecycle = lifecycles[0]
if not lifecycle:
lifecycle = models.Lifecycle(instance=raw.instance)
lifecycle.last_raw = raw
lifecycle.last_state = raw.state
lifecycle.last_task_state = raw.old_task
event = raw.event
parts = event.split('.')
step = parts[-1]
name = '.'.join(parts[:-1])
if step not in ['start', 'end']:
# We are going to try to track every event pair that comes
# through, but that's not as easy as it seems since we don't
# have a unique key for each request (request_id won't work
# since the call could come multiple times via a retry loop).
# So, we're just going to look for Timing objects that have
# start_raw but no end_raw. This could give incorrect data
# when/if we get two overlapping foo.start calls (which
# *shouldn't* happen).
start = step == 'start'
timing = None
timings = models.Timing.objects.filter(name=name, lifecycle=lifecycle)
if not start:
for t in timings:
if t.end_raw is None and t.start_raw is not None:
timing = t
except models.RawData.DoesNotExist:
# Our raw data was removed.
if timing is None:
timing = models.Timing(name=name, lifecycle=lifecycle)
if start:
timing.start_raw = raw
timing.start_when = raw.when
timing.start_ms = raw.microseconds
# Erase all the other fields which may have been set
# the first time this operation was performed.
# For example, a resize that was done 3 times:
# We'll only record the last one, but track that 3 were done.
timing.end_raw = None
timing.end_when = None
timing.end_ms = 0
timing.diff_when = None
timing.diff_ms = 0
timing.end_raw = raw
timing.end_when = raw.when
timing.end_ms = raw.microseconds
start = _make_datetime_from_raw(timing.start_when, timing.start_ms)
end = _make_datetime_from_raw(timing.end_when, timing.end_ms)
# We could have missed start so watch out ...
if start and end:
diff = end - start
timing.diff_days = diff.days
timing.diff_seconds = diff.seconds
timing.diff_usecs = diff.microseconds
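The roll-up in `aggregate` rests on two small steps: splitting an event name into its operation and its `start`/`end` step, and subtracting two timestamps whose microseconds are stored in a separate field. The following is a minimal standalone sketch of both, with no Django models involved. `split_event`, `with_microseconds`, and `elapsed` are hypothetical helper names, the event name in the usage note is only an illustration, and the code is written for Python 3 although the source targets Python 2.

```python
import datetime


def split_event(event):
    """Split 'a.b.c.start' into the operation name and its final step."""
    parts = event.split('.')
    return '.'.join(parts[:-1]), parts[-1]


def with_microseconds(dt, ms):
    """Return dt with its microsecond field set to ms, or None if dt is None."""
    if dt is None:
        return None
    return dt.replace(microsecond=ms)


def elapsed(start_when, start_ms, end_when, end_ms):
    """Return (days, seconds, microseconds) between start and end.

    Returns None when either side is missing, which mirrors the
    "we could have missed start" guard in aggregate().
    """
    start = with_microseconds(start_when, start_ms)
    end = with_microseconds(end_when, end_ms)
    if start is None or end is None:
        return None
    diff = end - start
    return diff.days, diff.seconds, diff.microseconds
```

For example, `split_event('compute.instance.create.start')` yields the pair `('compute.instance.create', 'start')`, and only events whose step is `start` or `end` would go on to be paired into a timing record.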
def process_raw_data(deployment, args, json_args):
"""This is called directly by the worker to add the event to the db."""
routing_key, body = args
handler = HANDLERS.get(routing_key, None)
if handler:
@@ -82,7 +173,7 @@ def _parse(tenant, args, json_args):
if not values:
return {}
values['tenant'] = tenant
values['deployment'] = deployment
when = body['timestamp']
except KeyError:
@@ -91,7 +182,8 @@ def _parse(tenant, args, json_args):
when = datetime.datetime.strptime(when, "%Y-%m-%d %H:%M:%S.%f")
except ValueError:
when = datetime.datetime.strptime(when, "%Y-%m-%dT%H:%M:%S.%f") # Old way of doing it
# Old way of doing it
when = datetime.datetime.strptime(when, "%Y-%m-%dT%H:%M:%S.%f")
values['microseconds'] = when.microsecond
except Exception, e:
@@ -100,137 +192,61 @@ def _parse(tenant, args, json_args):
values['json'] = json_args
record = models.RawData(**values)
return values
return {}
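The timestamp handling in `process_raw_data` tries the space-separated format first and falls back to the older `T`-separated one. A minimal sketch of that fallback as a standalone function follows. `parse_when` and `TIMESTAMP_FORMATS` are hypothetical names that do not appear in the source, and the sketch raises `ValueError` when neither format matches, which may differ from what the original does in that case.

```python
import datetime

# Formats tried in order; both format strings are taken from the code above.
TIMESTAMP_FORMATS = (
    "%Y-%m-%d %H:%M:%S.%f",
    "%Y-%m-%dT%H:%M:%S.%f",  # old way of doing it
)


def parse_when(raw):
    """Parse a notification timestamp string using the first matching format."""
    for fmt in TIMESTAMP_FORMATS:
        try:
            return datetime.datetime.strptime(raw, fmt)
        except ValueError:
            continue
    raise ValueError("unrecognised timestamp: %r" % (raw,))
```

The microsecond part of the parsed value is what ends up in `values['microseconds']`, so both formats need the fractional `.%f` component to be present.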
def _post_process_raw_data(rows, state, highlight=None):
def _post_process_raw_data(rows, highlight=None):
for row in rows:
if "error" in row.routing_key:
row.is_error = True
if highlight and row.id == int(highlight):
row.highlight = True
row.when += datetime.timedelta(microseconds=row.microseconds)
novastats = state.tenant.nova_stats_template
if novastats and row.instance:
novastats = novastats.replace("[instance]", row.instance)
row.novastats = novastats
loggly = state.tenant.loggly_template
if loggly and row.instance:
loggly = loggly.replace("[instance]", row.instance)
row.loggly = loggly
class State(object):
def __init__(self):
self.version = VERSION
self.tenant = None
def __str__(self):
tenant = "?"
if self.tenant:
tenant = "'%s' - %s (%d)" % (self.tenant.project_name,
self.tenant.email, self.tenant.id)
return "[Version %s, Tenant %s]" % (self.version, tenant)
def _default_context(request, deployment_id=0):
deployment = None
if 'deployment' in request.session:
d = request.session['deployment']
if d.id == deployment_id:
deployment = d
def _reset_state(request):
state = State()
request.session['state'] = state
return state
def _get_state(request, tenant_id=None):
tenant = None
if tenant_id:
if not deployment and deployment_id:
tenant = models.Tenant.objects.get(tenant_id=tenant_id)
except models.Tenant.DoesNotExist:
raise My401()
deployment = models.Deployment.objects.get(id=deployment_id)
request.session['deployment'] = deployment
except models.Deployment.DoesNotExist:
if 'state' in request.session:
state = request.session['state']
state =_reset_state(request)
if hasattr(state, 'version') and state.version < VERSION:
state =_reset_state(request)
state.tenant = tenant
return state
def tenant_check(view):
def inner(*args, **kwargs):
return view(*args, **kwargs)
# except HttpResponseUnauthorized, e:
except My401:
return HttpResponseUnauthorized()
return inner
def _default_context(state):
context = dict(utc=datetime.datetime.utcnow(), state=state)
context = dict(utc=datetime.datetime.utcnow(),
return context
def welcome(request):
state = _reset_state(request)
return render_to_response('welcome.html', _default_context(state))
deployments = models.Deployment.objects.all().order_by('name')
context = _default_context(request)
context['deployments'] = deployments
return render_to_response('welcome.html', context)
def home(request, tenant_id):
state = _get_state(request, tenant_id)
return render_to_response('index.html', _default_context(state))
def home(request, deployment_id):
context = _default_context(request, deployment_id)
return render_to_response('index.html', context)
def logout(request):
del request.session['state']
return render_to_response('welcome.html', _default_context(None))
def new_tenant(request):
state = _get_state(request)
context = _default_context(state)
if request.method == 'POST':
form = models.TenantForm(request.POST)
if form.is_valid():
rec = models.Tenant(**form.cleaned_data)
return http.HttpResponseRedirect('/%d' % rec.tenant_id)
form = models.TenantForm()
context['form'] = form
return render_to_response('new_tenant.html', context,
def data(request, tenant_id):
state = _get_state(request, tenant_id)
raw_args = request.POST.get('args', "{}")
args = json.loads(raw_args)
c = _default_context(state)
fields = _parse(state.tenant, args, raw_args)
c['cooked_args'] = fields
return render_to_response('data.html', c)
def details(request, tenant_id, column, row_id):
state = _get_state(request, tenant_id)
c = _default_context(state)
def details(request, deployment_id, column, row_id):
deployment_id = int(deployment_id)
c = _default_context(request, deployment_id)
row = models.RawData.objects.get(pk=row_id)
value = getattr(row, column)
rows = models.RawData.objects.filter(tenant=tenant_id)
rows = models.RawData.objects.select_related()
if deployment_id:
rows = rows.filter(deployment=deployment_id)
if column != 'when':
rows = rows.filter(**{column:value})
@@ -240,17 +256,15 @@ def details(request, tenant_id, column, row_id):
rows = rows.filter(when__range=(from_time, to_time))
rows = rows.order_by('-when', '-microseconds')[:200]
_post_process_raw_data(rows, state, highlight=row_id)
_post_process_raw_data(rows, highlight=row_id)
c['rows'] = rows
c['allow_expansion'] = True
c['show_absolute_time'] = True
return render_to_response('rows.html', c)
def expand(request, tenant_id, row_id):
state = _get_state(request, tenant_id)
c = _default_context(state)
def expand(request, deployment_id, row_id):
c = _default_context(request, deployment_id)
row = models.RawData.objects.get(pk=row_id)
payload = json.loads(row.json)
pp = pprint.PrettyPrinter()
@@ -258,29 +272,31 @@ def expand(request, tenant_id, row_id):
return render_to_response('expand.html', c)
def host_status(request, tenant_id):
state = _get_state(request, tenant_id)
c = _default_context(state)
hosts = models.RawData.objects.filter(tenant=tenant_id).\
order_by('-when', '-microseconds')[:20]
_post_process_raw_data(hosts, state)
c['rows'] = hosts
def latest_raw(request, deployment_id):
"""This is the 2sec ticker that updates the Recent Activity box."""
deployment_id = int(deployment_id)
c = _default_context(request, deployment_id)
query = models.RawData.objects.select_related()
if deployment_id > 0:
query = query.filter(deployment=deployment_id)
rows = query.order_by('-when', '-microseconds')[:20]
c['rows'] = rows
return render_to_response('host_status.html', c)
def search(request, tenant_id):
state = _get_state(request, tenant_id)
c = _default_context(state)
def search(request, deployment_id):
c = _default_context(request, deployment_id)
column = request.POST.get('field', None)
value = request.POST.get('value', None)
rows = None
if column is not None and value is not None:
rows = models.RawData.objects.filter(tenant=tenant_id).\
rows = models.RawData.objects.select_related()
if deployment_id:
rows = rows.filter(deployment=deployment_id)
rows = rows.filter(**{column:value}). \
order_by('-when', '-microseconds')[:22]
_post_process_raw_data(rows, state)
c['rows'] = rows
c['allow_expansion'] = True
c['show_absolute_time'] = True

# Copyright 2012 Openstack LLC.
# All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
from multiprocessing import Process
from worker import run
from worker_conf import DEPLOYMENTS
if __name__ == '__main__':
processes = []
for deployment in DEPLOYMENTS:
if deployment.get('enabled', True):
process = Process(target=run, args=(deployment,))
process.daemon = True
for process in processes:

static/jquery.timers.js Normal file

@@ -0,0 +1,140 @@
* jQuery.timers - Timer abstractions for jQuery
* Written by Blair Mitchelmore (blair DOT mitchelmore AT gmail DOT com)
* Licensed under the WTFPL (http://sam.zoy.org/wtfpl/).
* Date: 2009/10/16
* @author Blair Mitchelmore
* @version 1.2
everyTime: function(interval, label, fn, times) {
return this.each(function() {
jQuery.timer.add(this, interval, label, fn, times);
oneTime: function(interval, label, fn) {
return this.each(function() {
jQuery.timer.add(this, interval, label, fn, 1);
stopTime: function(label, fn) {
return this.each(function() {
jQuery.timer.remove(this, label, fn);
timer: {
global: [],
guid: 1,
dataKey: "jQuery.timer",
regex: /^([0-9]+(?:\.[0-9]*)?)\s*(.*s)?$/,
powers: {
// Yeah this is major overkill...
'ms': 1,
'cs': 10,
'ds': 100,
's': 1000,
'das': 10000,
'hs': 100000,
'ks': 1000000
timeParse: function(value) {
if (value == undefined || value == null)
return null;
var result = this.regex.exec(jQuery.trim(value.toString()));
if (result[2]) {
var num = parseFloat(result[1]);
var mult = this.powers[result[2]] || 1;
return num * mult;
} else {
return value;
add: function(element, interval, label, fn, times) {
var counter = 0;
if (jQuery.isFunction(label)) {
if (!times)
times = fn;
fn = label;
label = interval;
interval = jQuery.timer.timeParse(interval);
if (typeof interval != 'number' || isNaN(interval) || interval < 0)
if (typeof times != 'number' || isNaN(times) || times < 0)
times = 0;
times = times || 0;
var timers = jQuery.data(element, this.dataKey) || jQuery.data(element, this.dataKey, {});
if (!timers[label])
timers[label] = {};
fn.timerID = fn.timerID || this.guid++;
var handler = function() {
if ((++counter > times && times !== 0) || fn.call(element, counter) === false)
jQuery.timer.remove(element, label, fn);
handler.timerID = fn.timerID;
if (!timers[label][fn.timerID])
timers[label][fn.timerID] = window.setInterval(handler,interval);
this.global.push( element );
remove: function(element, label, fn) {
var timers = jQuery.data(element, this.dataKey), ret;
if ( timers ) {
if (!label) {
for ( label in timers )
this.remove(element, label, fn);
} else if ( timers[label] ) {
if ( fn ) {
if ( fn.timerID ) {
delete timers[label][fn.timerID];
} else {
for ( var fn in timers[label] ) {
delete timers[label][fn];
for ( ret in timers[label] ) break;
if ( !ret ) {
ret = null;
delete timers[label];
for ( ret in timers ) break;
if ( !ret )
jQuery.removeData(element, this.dataKey);
jQuery(window).bind("unload", function() {
jQuery.each(jQuery.timer.global, function(index, item) {


@@ -0,0 +1,118 @@
Copyright 2012 - Dark Secret Software Inc.
All Rights Reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may
not use this file except in compliance with the License. You may obtain
a copy of the License at
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
License for the specific language governing permissions and limitations
under the License.
<html lang="en">
<link href="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8/themes/base/jquery-ui.css" rel="stylesheet" type="text/css"/>
<script src="http://ajax.googleapis.com/ajax/libs/jquery/1.5/jquery.min.js"></script>
<script src="http://ajax.googleapis.com/ajax/libs/jqueryui/1.8/jquery-ui.min.js"></script>
<script type="text/javascript" src="/static/jquery.timers.js"></script>
<link href='http://fonts.googleapis.com/css?family=Kaushan+Script' rel='stylesheet' type='text/css'>
<link href='http://fonts.googleapis.com/css?family=PT+Sans&subset=latin' rel='stylesheet' type='text/css'>
<style type="text/css">
.fancy {
font-family: 'Kaushan Script';
font-style: normal;
padding-bottom: 0em;
font-size: 3em;
h1, h2 {
font-family: 'PT Sans', serif;
font-style: normal;
letter-spacing: -0.076em;
line-height: 1em;
body {
font-family:"Helvetica Neue", Arial, Helvetica, sans-serif;
a {
a:hover {
.cell-border {
border-left: 1px solid #bbbbcc;
.title {
font-weight: bold;
td {
padding-right: 1em;
.status {
border:1px #bbbbcc solid;
.status-title {
margin-left: 1em;
margin-right: 1em;
background-color: white;
font-family: 'PT Sans', serif;
.status-inner {
.std-height {
<script type="text/javascript">
{% block extra_js %}
{% endblock %}
$(document).ready(function() {
{% block extra_init_js %}
{% endblock %}
<div class='fancy'><a href='/'>StackTach</a>
<span style='font-size: small'>v2</span>
<a href="https://github.com/rackspace/stacktach"><img style="position: absolute; top: 0; right: 0; border: 0;" src="https://s3.amazonaws.com/github/ribbons/forkme_right_green_007200.png" alt="Fork me on GitHub"></a>
{% block body %}
{% endblock %}


@@ -2,7 +2,7 @@
<script type="text/javascript">
$(document).oneTime(2000, function() {

{% extends "base.html" %}
{% block extra_js %}
function details(deployment_id, column, row_id)
$("#detail").load('/' + deployment_id + '/details/' + column +
'/' + row_id);
function expand(deployment_id, row_id)
$("#row_expansion_" + row_id).load('/' + deployment_id + '/expand/' +
function search_form(deployment_id)
var field = $("#field").val();
var value = $("#query").val();
var data = {'field':field, 'value':value};
$("#detail").load('/' + deployment_id + '/search/', data);
return false;
{% endblock %}
{% block extra_init_js %}
{% endblock %}
{% block body %}
<div class='status-title'>Recent Activity
{% if deployment %}- {{deployment.name}}{%else%}- ALL{%endif%}
<div id='host-box' class='status std-height'>
<div id='host_activity' class='status-inner'>
{% include "host_status.html" %}
<div class='status-title'>Commands</div>
<div class='status'>
<div class='status-inner' style='padding-bottom:0em; margin-bottom:0em; margin-top:1em'>
<form action="">
<select id='field'>
<option value='routing_key'>source
<option value='tenant'>tenant
<option selected='true'>instance
<input type='text' id='query' size='60' value=''/>
<input type='submit' value='Search' onclick='return search_form({{deployment_id}});'/>
<div class='status-title'>Details</div>
<div class='status'>
<div id='detail' class='status-inner'>
<div>click on an item above to see more of the same type.</div>
{% endblock %}

<script type="text/javascript">
$(document).oneTime(2000, function() {

<table style='font-size:1em;'>
<th class='title'></th>
<th class='title'>deployment</th>
<th class='title'>source</th>
<th class='title'>tenant</th>
<th class='title'>service</th>
<th class='title'>host</th>
<th class='title'>event</th>
<th class='title'>instance</th>
<th class='title'>when</th>
{% if not rows %}
<tr><td>No results</td></tr>
{% endif %}
{% for row in rows %}
<tr {% if row.highlight %}style='background-color:#FFD88F;'{% endif %} >
{% if allow_expansion %}
<a href='#' onclick='expand({{deployment_id}}, {{row.id}});'>[+]</a>
{% endif %}
<a href='/{{row.deployment.id}}'>{{row.deployment.name}}</a>
<td class='cell-border'>
<span style='{% if row.is_error %}background-color:#ffaaaa;{% endif %}'>
<a href='#'
onclick='details({{deployment_id}}, "routing_key", {{row.id}});'>
<td class='cell-border'>
<a href='#' onclick='details({{deployment_id}}, "tenant", {{row.id}});'>
{% if row.tenant %}{{row.tenant}}{% endif %}</a>
<td class='cell-border'>
<a href='#'
onclick='details({{deployment_id}}, "service", {{row.id}});'>
<td class='cell-border'>
<a href='#' onclick='details({{deployment_id}}, "host", {{row.id}});'>
<td class='cell-border'>
<a href='#' onclick='details({{deployment_id}}, "event", {{row.id}});'>
<td class='cell-border'>
{% if row.instance %}
<a href='#'
onclick='details({{deployment_id}}, "instance", {{row.id}});'>
{% endif %}
<td class='cell-border'>
<a href='#' onclick='details({{deployment_id}}, "when", {{row.id}});'>
{% if show_absolute_time %}{{row.when}} (+{{row.when.microsecond}})
{%else%}{{row.when|timesince:utc}} ago{%endif%}</a>
{% if allow_expansion %}
<td colspan=8>
<div id='row_expansion_{{row.id}}' style='font-size:1.2em'></div>
{% endif %}
{% endfor %}

@@ -0,0 +1,20 @@
{% extends "base.html" %}
{% block body %}
{% if not deployments %}
<div class='status-title'>No deployments defined</div>
Do you have the worker configured and running?
{% else %}
<div class='status-title'>Choose the Deployment to monitor</div>
<div id='instance-box' class='status'>
<div id='instance_activity' class='status-inner'>
<li><a href='/0'>All</a></li>
{% for d in deployments %}
<li><a href='/{{d.id}}'>{{d.name}}</a></li>
{% endfor %}
{% endif %}
{% endblock %}

# Copyright 2012 - Dark Secret Software Inc.
# All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# This is the worker you run in your OpenStack environment. You need
# to set TENANT_ID and URL to point to your StackTach web server.
import daemon
import json
import kombu
import kombu.connection
import kombu.entity
import kombu.mixins
import logging
import threading
import time
import urllib
import urllib2
LOG = logging.getLogger(__name__)
handler = logging.handlers.TimedRotatingFileHandler('worker.log',
when='h', interval=6, backupCount=4)
from worker_conf import *
except ImportError:
# For now we'll just grab all the fanout messages from compute to scheduler ...
#scheduler_exchange = kombu.entity.Exchange("scheduler_fanout", type="fanout",
# durable=False, auto_delete=True,
# exclusive=True)
#scheduler_queues = [
# The Queue name has to be unique or we'll end up with Round Robin
# behavior from Rabbit, even though it's a Fanout queue. In Nova the
# queues have UUIDs tacked on the end.
# kombu.Queue("scheduler.xxx", scheduler_exchange, durable=False,
# auto_delete=True),
# ]
nova_exchange = kombu.entity.Exchange("nova", type="topic", exclusive=False,
durable=True, auto_delete=False)
nova_queues = [
kombu.Queue("monitor.info", nova_exchange, durable=True, auto_delete=False,
exclusive=False, routing_key='monitor.info'),
kombu.Queue("monitor.error", nova_exchange, durable=True, auto_delete=False,
exclusive=False, routing_key='monitor.error'),
class NovaConsumer(kombu.mixins.ConsumerMixin):
def __init__(self, connection, url):
self.connection = connection
self.url = url
def get_consumers(self, Consumer, channel):
return [#Consumer(queues=scheduler_queues,
# callbacks=[self.on_scheduler]),
Consumer(queues=nova_queues, callbacks=[self.on_nova])]
def _process(self, body, message):
routing_key = message.delivery_info['routing_key']
payload = (routing_key, body)
jvalues = json.dumps(payload)
raw_data = dict(args=jvalues)
cooked_data = urllib.urlencode(raw_data)
req = urllib2.Request(self.url, cooked_data)
response = urllib2.urlopen(req)
LOG.debug("Sent %s to %s", routing_key, self.url)
#page = response.read()
#print page
except urllib2.HTTPError, e:
if e.code == 401:
LOG.debug("Unauthorized. Correct URL? %s" % self.url)
page = e.read()
def on_scheduler(self, body, message):
# Uncomment if you want periodic compute node status updates.
#self._process(body, message)
def on_nova(self, body, message):
self._process(body, message)
class Monitor(threading.Thread):
def __init__(self, deployment):
super(Monitor, self).__init__()
self.deployment = deployment
def run(self):
tenant_id = self.deployment.get('tenant_id', 1)
url = self.deployment.get('url', 'http://www.example.com')
url = "%s/%d/data/" % (url, tenant_id)
host = self.deployment.get('rabbit_host', 'localhost')
port = self.deployment.get('rabbit_port', 5672)
user_id = self.deployment.get('rabbit_userid', 'rabbit')
password = self.deployment.get('rabbit_password', 'rabbit')
virtual_host = self.deployment.get('rabbit_virtual_host', '/')
LOG.info("StackTach %s" % url)
LOG.info("Rabbit: %s %s %s %s" %
(host, port, user_id, virtual_host))
params = dict(hostname=host,
while True:
with kombu.connection.BrokerConnection(**params) as conn:
consumer = NovaConsumer(conn, url)
except Exception as e:
LOG.exception("url=%s, exception=%s. Reconnecting in 5s" % (url, e))
with daemon.DaemonContext(files_preserve=[handler.stream]):
workers = []
for deployment in DEPLOYMENTS:
LOG.info("Starting deployment: %s", deployment)
monitor = Monitor(deployment)
except Exception as e:
LOG.exception("Deployment: %s, Exception: %s" % (deployment, e))
for worker in workers:
LOG.info("Attempting to join to %s" % worker.deployment)
LOG.info("Joined to %s" % worker.deployment)

case "$1" in
echo "Starting server"
/sbin/start-stop-daemon --start --pidfile $PIDFILE --make-pidfile -b --exec $DAEMON $ARGS
echo "Stopping server"
/sbin/start-stop-daemon --stop --pidfile $PIDFILE --verbose
echo "Usage: stacktach.sh {start|stop}"
exit 1
exit 0

@@ -0,0 +1,46 @@
import json
import os
import signal
import sys
from multiprocessing import Process
POSSIBLE_TOPDIR = os.path.normpath(os.path.join(os.path.abspath(sys.argv[0]),
os.pardir, os.pardir))
if os.path.exists(os.path.join(POSSIBLE_TOPDIR, 'stacktach')):
sys.path.insert(0, POSSIBLE_TOPDIR)
import worker
processes = []
def kill_time(signal, frame):
print "dying ..."
for process in processes:
print "rose"
for process in processes:
print "bud"
if __name__ == '__main__':
config_filename = os.environ['STACKTACH_DEPLOYMENTS_FILE']
config = None
with open(config_filename, "r") as f:
config = json.load(f)
deployments = config['deployments']
for deployment in deployments:
if deployment.get('enabled', True):
process = Process(target=worker.run, args=(deployment,))
process.daemon = True
signal.signal(signal.SIGINT, kill_time)
signal.signal(signal.SIGTERM, kill_time)
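The script above loads a JSON file named by `STACKTACH_DEPLOYMENTS_FILE` and starts one process for every entry under `deployments` that is not explicitly disabled. The sketch below shows what such a file might contain and how the `enabled` filter behaves. The key names are the ones read by `worker.run` and by the loop above; the deployment names, hostnames, and the helper `enabled_deployments` are made up for illustration.

```python
import json

# Hypothetical contents of the file named by STACKTACH_DEPLOYMENTS_FILE.
SAMPLE_CONFIG = """
{
    "deployments": [
        {
            "name": "east",
            "rabbit_host": "rabbit.east.example.com",
            "rabbit_port": 5672,
            "rabbit_userid": "rabbit",
            "rabbit_password": "rabbit",
            "rabbit_virtual_host": "/"
        },
        {
            "name": "west",
            "rabbit_host": "rabbit.west.example.com",
            "enabled": false
        }
    ]
}
"""


def enabled_deployments(config):
    """Return the deployments a worker process would be started for."""
    # A missing 'enabled' key counts as enabled, as in the loop above.
    return [d for d in config['deployments'] if d.get('enabled', True)]


config = json.loads(SAMPLE_CONFIG)
names = [d['name'] for d in enabled_deployments(config)]
```

With this sample only the `east` deployment would get a worker, since `west` sets `enabled` to `false`. Keys left out of an entry fall back to the defaults in `worker.run`.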

# Copyright 2012 - Dark Secret Software Inc.
# All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# This is the worker you run in your OpenStack environment. You need
# to set TENANT_ID and URL to point to your StackTach web server.
import json
import kombu
import kombu.connection
import kombu.entity
import kombu.mixins
import logging
import logging.handlers
import time
from stacktach import models, views
LOG = logging.getLogger(__name__)
handler = logging.handlers.TimedRotatingFileHandler('worker.log',
when='h', interval=6, backupCount=4)
nova_exchange = kombu.entity.Exchange("nova", type="topic", exclusive=False,
durable=True, auto_delete=False)
nova_queues = [
kombu.Queue("monitor.info", nova_exchange, durable=True,
exclusive=False, routing_key='monitor.info'),
kombu.Queue("monitor.error", nova_exchange, durable=True,
exclusive=False, routing_key='monitor.error'),
class NovaConsumer(kombu.mixins.ConsumerMixin):
def __init__(self, connection, deployment):
self.connection = connection
self.deployment = deployment
def get_consumers(self, Consumer, channel):
return [Consumer(queues=nova_queues, callbacks=[self.on_nova])]
def _process(self, body, message):
routing_key = message.delivery_info['routing_key']
payload = (routing_key, body)
jvalues = json.dumps(payload)
args = (routing_key, json.loads(message.body))
asJson = json.dumps(args)
views.process_raw_data(self.deployment, args, asJson)
LOG.debug("Recorded %s", routing_key)
def on_nova(self, body, message):
self._process(body, message)
def run(deployment_config):
name = deployment_config['name']
host = deployment_config.get('rabbit_host', 'localhost')
port = deployment_config.get('rabbit_port', 5672)
user_id = deployment_config.get('rabbit_userid', 'rabbit')
password = deployment_config.get('rabbit_password', 'rabbit')
virtual_host = deployment_config.get('rabbit_virtual_host', '/')
deployment, new = models.get_or_create_deployment(name)
print "Starting worker for '%s'" % name
LOG.info("%s: %s %s %s %s" % (name, host, port, user_id, virtual_host))
params = dict(hostname=host,
while True:
with kombu.connection.BrokerConnection(**params) as conn:
consumer = NovaConsumer(conn, deployment)
except Exception as e:
LOG.exception("name=%s, exception=%s. Reconnecting in 5s" %
(name, e))

# Copyright 2012 Openstack LLC.
# All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
# This is a sample conf file. Use it as a guide to make your own
# My fun conf
tenant_id=1, # This is the stacktach tenant, not an openstack tenant
url='http://stacktach.my-fun-nova-deploy.com', # The url for the base of the django app
rabbit_host="", # ip/host name of the amqp server to listen to
rabbit_password="some secret password",

# Copyright 2012 Openstack LLC.
# All Rights Reserved.
# Licensed under the Apache License, Version 2.0 (the "License"); you may
# not use this file except in compliance with the License. You may obtain
# a copy of the License at
# http://www.apache.org/licenses/LICENSE-2.0
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
# License for the specific language governing permissions and limitations
# under the License.
import amqplib.client_0_8 as amqp
import json
import socket
import time
from stacktach import models, views
class NovaConsumer(object):
def __init__(self, channel, tenant_id, logger):
self.channel = channel
self.tenant = models.Tenant.objects.get(tenant_id=tenant_id)
self.logger = logger
channel.basic_consume('monitor.info', callback=self.onMessage)
channel.basic_consume('monitor.error', callback=self.onMessage)
def onMessage(self, message):
routing_key = message.delivery_info['routing_key']
args = (routing_key, json.loads(message.body))
asJson = json.dumps(args)
#from pprint import pformat
#self.logger.debug("Saving %s", pformat(args))
views._parse(self.tenant, args, asJson)
self.logger.debug("Recorded %s ", routing_key)
def run(deployment, logger):
tenant_id = deployment.get('tenant_id', 1)
host = deployment.get('rabbit_host', 'localhost')
port = deployment.get('rabbit_port', 5672)
user_id = deployment.get('rabbit_userid', 'rabbit')
password = deployment.get('rabbit_password', 'rabbit')
virtual_host = deployment.get('rabbit_virtual_host', '/')
logger.info("Rabbit: %s %s %s %s" %
(host, port, user_id, virtual_host))
while 1:
conn = amqp.Connection(host, userid=user_id, password=password, virtual_host=virtual_host)
ch = conn.channel()
ch.access_request(virtual_host, active=True, read=True)
ch.exchange_declare('nova', type='topic', durable=True, auto_delete=False)
ch.queue_declare('monitor.info', durable=True, auto_delete=False, exclusive=False)
ch.queue_declare('monitor.error', durable=True, auto_delete=False, exclusive=False)
ch.queue_bind('monitor.info', 'nova')
ch.queue_bind('monitor.error', 'nova')
consumer = NovaConsumer(ch, tenant_id, logger)
# Loop as long as the channel has callbacks registered
while ch.callbacks:
except socket.error, e:
logger.warn("Socket error: %s" % e)