RETIRED, TripleO CI Health logstash and regex queries for logs.
Wesley Hayutin d64320bff6 make tripleo-health a little easier
Change-Id: I879057e401e3a15f41d9a5dfef8b65ad21e804fe
2021-05-26 20:33:30 +02:00
.github/workflows Validate query schema (#6) 2021-01-18 14:19:55 +00:00
build WIP 2020-12-03 13:04:13 +00:00
output add ironic registration timeout 2021-05-21 18:11:52 +02:00
playbooks Make test_results folder visible in zuul 2021-05-20 16:38:11 +02:00
samples add ironic registration timeout 2021-05-21 18:11:52 +02:00
src add ironic registration timeout 2021-05-21 18:11:52 +02:00
zuul.d Move zuul jobs to system-config queue 2021-04-16 11:18:50 +01:00
.flake8 Add black, isort and flake8 2021-04-22 14:09:01 +01:00
.gitignore Adding script to convert queries to er format 2021-04-22 12:25:14 +01:00
.gitreview Bootstrap zuul and gerrit config 2021-03-04 11:52:25 +00:00
.pre-commit-config.yaml Adds a user guide for queries.yml 2021-05-10 14:41:28 +02:00
.yamllint.yaml Enable pre-commit 2021-01-12 13:27:43 +00:00
ansible.cfg Adding script to convert queries to sova format 2021-04-16 12:30:28 +01:00
bindep.txt Adding script to convert queries to sova format 2021-04-16 12:30:28 +01:00
errors.txt make tripleo-health a little easier 2021-05-26 20:33:30 +02:00
hosts Adding script to convert queries to sova format 2021-04-16 12:30:28 +01:00
LICENSE Initial commit 2020-12-03 12:50:14 +00:00
queries.yml make tripleo-health a little easier 2021-05-26 20:33:30 +02:00
README.md make tripleo-health a little easier 2021-05-26 20:33:30 +02:00
requirements.in Adding script to convert queries to er format 2021-04-22 12:25:14 +01:00
requirements.txt Adding script to convert queries to er format 2021-04-22 12:25:14 +01:00
requirements.yml Adding script to convert queries to sova format 2021-04-16 12:30:28 +01:00
sova-patterns.yml make tripleo-health a little easier 2021-05-26 20:33:30 +02:00
tox.ini Adding script to convert queries to er format 2021-04-22 12:25:14 +01:00

How to get this going quickly...

  1. Add the query to queries.yml
  2. Add an associated local sova query to sova-patterns.yml
  3. Add a blob of text from the error you would like to test against to errors.txt

Note: the following are softlinks:

  errors.txt ---> samples/errors-testing.err
  queries.yml ---> src/data/queries.yml
  sova-patterns.yml ---> src/data/sova-patterns.yml

  1. Execute tox
  2. Submit patch
  3. GREAT SUCCESS!

queries

Hosts reusable log queries which are built into a single queries.json file.

Query database structure

Queries are defined using the data model from src/model.py, which builds a JSON validation schema, making it easy to validate the file.

An example file can be seen at queries-example.yml
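As a rough illustration of what validating an entry involves, the sketch below checks one query against a few of the keys described in this README, using only the standard library. It is not the code in src/model.py: the helper name `validate_query`, the key name `category`, and the exact rules are assumptions made for the example.

```python
ALLOWED_CATEGORIES = {"infra", "code"}  # values named in this README


def _is_str_or_str_list(value):
    """True for a string or a non-empty list of strings."""
    if isinstance(value, str):
        return True
    return (
        isinstance(value, list)
        and len(value) > 0
        and all(isinstance(item, str) for item in value)
    )


def validate_query(query):
    """Return a list of problems found in one query entry (hypothetical helper)."""
    problems = []
    # id is the only key the README marks as required
    if not isinstance(query.get("id"), str) or not query.get("id"):
        problems.append("id is required and must be a non-empty string")
    # these keys accept either a single string or a list of strings
    for key in ("pattern", "regex", "tags"):
        if key in query and not _is_str_or_str_list(query[key]):
            problems.append(f"{key} must be a string or a list of strings")
    # "category" as a key name is an assumption
    if "category" in query and query["category"] not in ALLOWED_CATEGORIES:
        problems.append("category must be one of: code, infra")
    return problems
```

An empty list means the entry passed these checks; the real schema may enforce more than this.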

Both elastic-search and artcl can make use of regex searches.

A pattern is meant to be an exact string match. If multiple patterns are present, they can be converted into a regex or a logstash expression that joins them with a logical AND.

Pattern

In elastic-recheck queries there are cases where multiple entries are used as patterns, like message:foo AND message:bar. This is why a list of strings is also allowed.
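A minimal sketch of how a string or a list of strings could be turned into the message part of a logstash query. The function name `message_clause` is hypothetical, and quoting each pattern (with backslashes and double quotes escaped) is an assumption based on the quoted examples elsewhere in this README, not the project's actual converter.

```python
def message_clause(pattern):
    """Build the message part of a logstash query from a string or list of strings."""
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    quoted = []
    for text in patterns:
        # escape backslashes first, then double quotes, so the phrase stays quoted
        escaped = text.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('message:"{}"'.format(escaped))
    # multiple patterns must all match, hence AND
    return " AND ".join(quoted)
```

For example, `message_clause(["foo", "bar"])` gives `message:"foo" AND message:"bar"`.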

Categories

A query can have only one category, chosen from a fixed list of possible values; currently infra and code are allowed. Categories can be used to group found matches into sections, making them easier to read.

Tags

Tags are also used to build the logstash queries. List of known values already used inside elastic-recheck queries:

tags:
  - console
  - console.html
  - devstack-gate-setup-host.txt
  - grenade.sh.txt
  - job-output.txt
  - screen-c-api.txt
  - screen-c-bak.txt
  - screen-n-cpu.txt
  - screen-n-sch.txt
  - screen-q-agt.txt
  - syslog.txt

When the logstash query is built, OR is used between multiple tags.
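The OR-joining of tags can be sketched as follows. The function name `tags_clause` is hypothetical; the output format follows the `(tags:"console.html" OR tags:"job-output.txt")` example from the parameters table.

```python
def tags_clause(tags):
    """Build the tags part of a logstash query from a string or list of strings."""
    tag_list = [tags] if isinstance(tags, str) else list(tags)
    # any one of the tags may match, hence OR
    joined = " OR ".join('tags:"{}"'.format(tag) for tag in tag_list)
    # parentheses keep the OR group together when it is ANDed with the message part
    return "({})".format(joined)
```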

Uncovered cases:

We do not currently support exclusions like the ones below (2/93 found):

query: >-
  message:"RESULT_TIMED_OUT: [untrusted : git.openstack.org/openstack/tempest/playbooks/devstack-tempest.yaml@master]" AND
  tags:"console" AND NOT
  (build_name:"tempest-all" OR
   build_name:"tempest-slow" OR
   build_name:"tempest-slow-py3")  

query2: >-
  (message: "FAILED with status: 137" OR
  message: "FAILED with status: 143" OR
  message: "RUN END RESULT_TIMED_OUT") AND
  NOT message:"POST-RUN END RESULT_TIMED_OUT" AND
  tags: "console"  

To cover corner cases not handled by the generic format, we could have an optional logstash key that holds the full query. When it is present, we would skip building the logstash query ourselves and use it as given.
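A sketch of how such an override could work, assuming the key is literally named `logstash` as proposed above. The function name `build_logstash_query` and the fallback format are assumptions for illustration; escaping of special characters is left out for brevity.

```python
def _as_list(value):
    """Accept a single string or a list of strings."""
    return [value] if isinstance(value, str) else list(value)


def build_logstash_query(query):
    """Return the logstash query for one entry, or None if it has no pattern."""
    # an explicit logstash key wins over the generated query
    if query.get("logstash"):
        return query["logstash"]
    # without a pattern there is nothing to build
    if "pattern" not in query:
        return None
    expression = " AND ".join(
        'message:"{}"'.format(text) for text in _as_list(query["pattern"])
    )
    if "tags" in query:
        tags = " OR ".join('tags:"{}"'.format(tag) for tag in _as_list(query["tags"]))
        expression += " AND ({})".format(tags)
    return expression
```

With this shape, the two exclusion queries shown above could be stored verbatim under the `logstash` key and passed through untouched.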

Disable queries per backend

To avoid using a particular query on a particular backend we can make use of skip: ['er', 'artcl'].
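A minimal sketch of filtering queries by backend with the `skip` key. The function name `queries_for_backend` is hypothetical, and treating a missing `skip` key as "use everywhere" is an assumption.

```python
def queries_for_backend(queries, backend):
    """Keep only the queries that do not list this backend under skip."""
    return [query for query in queries if backend not in query.get("skip", [])]
```

For example, an entry with `skip: ['er']` would still be returned for `artcl` but not for `er`.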

Parameters in a query

| Key | Required | Type | Sova/ER | Description |
| --- | --- | --- | --- | --- |
| id | Required | string, unique | Sova | This id will appear in sova.log when this bug is encountered. |
| pattern | Optional | string / list of strings | ER (, Sova) | Used as is by Elastic Recheck to form logstash queries like message:foo or message:foo AND message:bar. If not present, Elastic Recheck skips this entry. |
| regex | Optional | string / list of strings | Sova | Sova searches for this regex in the log files. If not present, the pattern string is converted to a regex (escaping special characters) and used by sova. |
| tags | Optional | string / list of strings | ER | Elastic Recheck uses this to form logstash queries like AND (tags:"console.html" OR tags:"job-output.txt"). |
| url | Optional | string | ER | Launchpad bug URL. If present, the bug number and details are displayed on the ER dashboard. If not, the graph is titled Unknow/private bug. |
| suppress-graph | Optional | bool | ER | Decides whether or not to display a graph for this bug in the Elastic Recheck dashboard. |
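The regex fallback described for sova (use `regex` if given, otherwise escape `pattern`) can be sketched with Python's `re.escape`. The function name `sova_regexes` is hypothetical, and whether the project escapes patterns exactly this way is an assumption.

```python
import re


def sova_regexes(query):
    """Return the list of regexes sova would search for in one query entry."""
    if "regex" in query:
        value = query["regex"]
        return [value] if isinstance(value, str) else list(value)
    if "pattern" not in query:
        return []
    value = query["pattern"]
    patterns = [value] if isinstance(value, str) else list(value)
    # escape special characters so the pattern matches literally
    return [re.escape(text) for text in patterns]
```

Escaping matters for patterns such as `status: 137 (killed)`, where the parentheses would otherwise be read as a regex group.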

Adding a new item in queries.yml

To add a new query (optional: from a bug)

  • Add a query item in src/data/queries.yml
    • id : unique ID
    • pattern: The error message(s) to look for.
    • regex: If it is a complicated error message you can add a regex for sova to use.
    • tags: tag(s) for logstash query
    • url: Launchpad Bug URL
    • suppress-graph: To display/suppress graph for the bug in ER
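Since each id has to be unique, a quick check like the one below can catch a clash before running tox. This is only a sketch: the function name `duplicate_ids` is hypothetical, and it assumes the queries have already been loaded into a list of dictionaries.

```python
from collections import Counter


def duplicate_ids(queries):
    """Return the ids that appear more than once, in sorted order."""
    counts = Counter(query["id"] for query in queries)
    return sorted(qid for qid, count in counts.items() if count > 1)
```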

To move an existing sova regex from sova-patterns.json to queries.yml

  • Choose an item from regexes in output/sova-patterns.json. This is the file that sova currently uses. (Do not look at patterns for now)

  • If the regex still feels relevant, vote for it here https://docs.google.com/spreadsheets/d/16rqgaSSoQrYNjsI4q0YInJxrOOJL3xyP3t3D6ZpluFY/edit#gid=846893892

  • Add a query entry in src/data/queries.yml

    • id: name of the regexes item you chose.
    • pattern: If you want this error to be shown in ER dashboard add a string here corresponding to the regex
    • regex: regex of the regexes item you chose.

  • Add a sample string matching the pattern (or regex if you have a special regex) in samples/errors-testing.err. As part of the check job, a task calls sova to check this file and make sure there is a match for every regex in queries.yml.

  • Run tox. If it passes it should create files in output/elastic-recheck and update sova-pattern-generated.

  • Commit all these files.
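
The sample check in the steps above can be approximated locally before running tox. The sketch below uses Python's `re` module as a stand-in for sova, so it is not how the check job actually runs; the function name `unmatched_regexes` and the mapping of id to regex are assumptions for illustration.

```python
import re


def unmatched_regexes(regexes, sample_text):
    """Return the ids whose regex has no match anywhere in the sample text.

    regexes maps a query id to its regex string.
    """
    return [
        query_id
        for query_id, regex in regexes.items()
        if re.search(regex, sample_text) is None
    ]
```

An empty result means every regex found at least one matching line in the sample text, for example the contents of samples/errors-testing.err.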