From b5dd6cfa2af890b507fabe394a09809baa06f5e8 Mon Sep 17 00:00:00 2001
From: Clint Byrum
Date: Thu, 1 Oct 2015 15:17:07 -0700
Subject: [PATCH] Counter-inspection for scaling controls

A spec to add counter-inspection to the gate to prevent scaling problems.

Change-Id: I3ad971cfa33789ade733e8fb92ff51d487078638
---
 doc/source/index.rst                  |   2 +-
 specs/devstack/counter-inspection.rst | 103 ++++++++++++++++++++++++++
 2 files changed, 104 insertions(+), 1 deletion(-)
 create mode 100644 specs/devstack/counter-inspection.rst

diff --git a/doc/source/index.rst b/doc/source/index.rst
index 25cbd00..bdf86d3 100644
--- a/doc/source/index.rst
+++ b/doc/source/index.rst
@@ -20,7 +20,7 @@ DevStack Project Specifications
    :glob:
    :maxdepth: 2
 
-.. specs/devstack/*
+   specs/devstack/*
 
 Implemented Specifications
 --------------------------
diff --git a/specs/devstack/counter-inspection.rst b/specs/devstack/counter-inspection.rst
new file mode 100644
index 0000000..c6b3878
--- /dev/null
+++ b/specs/devstack/counter-inspection.rst
@@ -0,0 +1,103 @@
+..
+ This work is licensed under a Creative Commons Attribution 3.0 Unported
+ License.
+ http://creativecommons.org/licenses/by/3.0/legalcode
+
+..
+
+========================================================
+Inspect counters for data collection and gating purposes
+========================================================
+
+Problem description
+===================
+
+OpenStack projects vary greatly in their impact on the underlying
+infrastructure they rely on. That impact is hard to measure without a
+full-scale deployment, but we should be able to estimate it by
+inspecting counters already maintained by the system.
+
+Proposed change
+===============
+
+* Create a new "OpenStack QA Tools" repository to house small tools
+  written in Python for purposes such as this.
+
+  * Create a tool, ``os-collect-counters``, which collects relevant counters
+    from any backends it can reach using its own configuration and
+    outputs a JSON mapping of all counters. It can also compute the delta
+    against a previous run, showing the impact on the counters over a
+    given time window.
+
+  * Initial counters will be, at a minimum, a set of MySQL counters (such
+    as ``Innodb_bytes_written``) and published messages from the RabbitMQ
+    management interface, summarized by a scope that can be inferred
+    from each queue name.
+
+* Leverage existing subunit/statsd/graphite infrastructure to record results of
+  several tests in the devstack gate.
+
+  * For each run, the JSON from ``os-collect-counters`` will be added as an
+    attachment to the subunit stream.
+
+  * The counters in the attachment will be fed into statsd/graphite to
+    allow establishing trends.
+
+  * This will be facilitated by adding attachment storage plugins to
+    subunit2sql. The plugin used for OpenStack gate jobs will be
+    specific to OpenStack's infrastructure and will look for the specifically
+    named attachment to push into statsd/graphite.
+
+* Monitor counters for stable indicators and identify the best predictors of
+  problems.
+
+  * Once stable counters are identified, create upper bounds for
+    these counters to help prevent new changes in the system from
+    accidentally introducing an inordinate amount of cost into the tested
+    code paths.
+
+    Since there are daunting social issues around failing gate tests
+    on global collisions, warnings and bugs about said warnings are
+    likely the only reasonable outcome we can achieve. It will take a
+    considerable amount of community agreement to make these limits hard.
+
+
+Implementation
+==============
+
+A new Python repo, `os-performance-tools`_, has already been created and will
+be maintained for extracting and pushing counters into statsd/graphite.
+This will include a subunit2sql attachments plugin
+and code to output the counters as a subunit attachment.
+
+.. _`os-performance-tools`: https://review.openstack.org/#/c/244428/
+
+Assignee(s)
+-----------
+
+Primary assignee:
+
+* Clint Byrum
+
+Milestones
+----------
+
+Target Milestone for completion:
+  Mitaka-2
+
+Work Items
+----------
+
+* Create tools to emit counters from a running installation
+* Modify devstack gate job output to include counters
+* Add subunit2sql attachment plugin to subunit2sql workers
+  to push counters to infra graphite
+* Analyze data for stable counters and useful trends
+* Add upper bounds check to devstack gate
+
+Dependencies
+============
+
+References
+==========
+
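
Note on the delta and upper-bound steps the spec describes: the following is a minimal sketch, not the actual os-collect-counters or os-performance-tools code. The function names ``delta_counters`` and ``check_upper_bounds`` are hypothetical, and the JSON layout (backend name mapping to counter name mapping to a number) is an assumption made for illustration. Bounds violations only produce warnings, matching the spec's point that hard limits would need community agreement.

```python
import json
import warnings


def delta_counters(previous, current):
    """Return current minus previous for each numeric counter.

    Nested mappings are handled recursively. A counter missing from the
    previous snapshot is treated as having started at zero.
    """
    delta = {}
    for key, value in current.items():
        before = previous.get(key)
        if isinstance(value, dict):
            nested_before = before if isinstance(before, dict) else {}
            delta[key] = delta_counters(nested_before, value)
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            base = before if isinstance(before, (int, float)) else 0
            delta[key] = value - base
    return delta


def check_upper_bounds(delta, bounds):
    """Warn about each counter whose delta exceeds its bound.

    Returns the list of offending "backend.counter" names instead of
    raising, so a caller can report problems without failing a job.
    """
    exceeded = []
    for backend, limits in bounds.items():
        for name, limit in limits.items():
            observed = delta.get(backend, {}).get(name, 0)
            if observed > limit:
                warnings.warn(
                    "%s.%s grew by %s (bound %s)" % (backend, name, observed, limit)
                )
                exceeded.append("%s.%s" % (backend, name))
    return exceeded


if __name__ == "__main__":
    # Hypothetical snapshots taken before and after a test run.
    before = {"mysql": {"Innodb_bytes_written": 1000}, "rabbitmq": {"published": 40}}
    after = {"mysql": {"Innodb_bytes_written": 5096}, "rabbitmq": {"published": 65}}
    delta = delta_counters(before, after)
    print(json.dumps(delta, sort_keys=True))
    print(check_upper_bounds(delta, {"mysql": {"Innodb_bytes_written": 4000}}))
```

Treating a missing previous value as zero keeps the first run usable as a baseline; a real tool might prefer to skip such counters instead.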