tripleo-ci-health-queries/README.md
Wesley Hayutin d64320bff6 make tripleo-health a little easier
Change-Id: I879057e401e3a15f41d9a5dfef8b65ad21e804fe
2021-05-26 20:33:30 +02:00


# How to get this going quickly...
1. Add the query to queries.yml
2. Add an associated local sova query to sova-patterns.yml
3. Add a blob of text from the error you would like to test against to errors.txt
   Note: these files are softlinks:
   - errors.txt ---> samples/errors-testing.err
   - queries.yml ---> src/data/queries.yml
   - sova-patterns.yml ---> src/data/sova-patterns.yml
4. Execute tox
5. Submit patch
6. GREAT SUCCESS!
# queries
Hosts reusable log queries which are built into a single queries.json file.
## Query database structure
Queries are defined using the data model from src/model.py, which builds
a JSON validation schema, making it easy to validate the file.
An example file can be seen at [queries-example.yml](https://opendev.org/openstack/tripleo-ci-health-queries/raw/branch/master/src/data/queries-example.yml)
Both [elastic-search](https://www.elastic.co/guide/en/elasticsearch/reference/current/term-level-queries.html) and [artcl](https://opendev.org/openstack/ansible-role-collect-logs) can make use of `regex` searches.
A pattern is supposed to be an exact string match. If multiple patterns are
present, we could easily convert them into a regex or logstash expression that
uses logical `AND`.
### Pattern
In elastic-recheck queries there are cases where multiple entries are used as
patterns, like `message:foo AND message:bar`. This is why we also allow
a list of strings.
### Categories
A query can have only one category, chosen from a fixed list of possible
values; currently `infra` and `code` are allowed. Categories can be used to
list found matches in sections, making them easier to read.
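
As an illustration of listing matches in sections, here is a minimal sketch that groups found matches by category. The match records, their ids, and the `category` key name are assumptions made for this example; only the allowed values `infra` and `code` come from this README.

```python
from collections import defaultdict

# Allowed values, as listed in this README.
ALLOWED_CATEGORIES = ("infra", "code")

# Hypothetical match records, invented for illustration only.
matches = [
    {"id": "example-timeout", "category": "infra"},
    {"id": "example-deploy-failed", "category": "code"},
    {"id": "example-mirror-unreachable", "category": "infra"},
]


def group_by_category(found):
    """Return {category: [query ids]}, rejecting unknown categories."""
    sections = defaultdict(list)
    for match in found:
        category = match["category"]
        if category not in ALLOWED_CATEGORIES:
            raise ValueError(f"unknown category: {category!r}")
        sections[category].append(match["id"])
    return dict(sections)


for category, ids in group_by_category(matches).items():
    print(f"{category}: {', '.join(ids)}")
```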
### Tags
Tags are also used to build the logstash queries. Known values
already used inside elastic-recheck queries:
```yaml
tags:
- console
- console.html
- devstack-gate-setup-host.txt
- grenade.sh.txt
- job-output.txt
- screen-c-api.txt
- screen-c-bak.txt
- screen-n-cpu.txt
- screen-n-sch.txt
- screen-q-agt.txt
- syslog.txt
```
When the logstash query is built, `OR` is used between multiple tags.
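
The following sketch shows one way the pattern and tags of a query could be combined into a logstash expression: patterns joined with `AND`, tags joined with `OR`. The function names are hypothetical and this is not the code in this repository; quoting each message value is an assumption, and escaping of quotes inside a pattern is left out.

```python
def as_list(value):
    """Accept a single string or a list of strings (None becomes [])."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def build_logstash_query(pattern, tags=None):
    """Join patterns with AND and tags with OR (hypothetical helper)."""
    patterns = as_list(pattern)
    if not patterns:
        # Without a pattern there is nothing to query for.
        return None
    query = " AND ".join(f'message:"{p}"' for p in patterns)
    tag_clauses = [f'tags:"{t}"' for t in as_list(tags)]
    if tag_clauses:
        query = f"{query} AND ({' OR '.join(tag_clauses)})"
    return query


print(build_logstash_query(["foo", "bar"], ["console.html", "job-output.txt"]))
```

Wrapping the tag clauses in parentheses keeps the `OR` from leaking into the `AND` chain of message clauses.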
### Uncovered cases:
We do not currently support exclusions like the ones below (2/93 found):
```yaml
query: >-
  message:"RESULT_TIMED_OUT: [untrusted : git.openstack.org/openstack/tempest/playbooks/devstack-tempest.yaml@master]" AND
  tags:"console" AND NOT
  (build_name:"tempest-all" OR
  build_name:"tempest-slow" OR
  build_name:"tempest-slow-py3")
query2: >-
  (message: "FAILED with status: 137" OR
  message: "FAILED with status: 143" OR
  message: "RUN END RESULT_TIMED_OUT") AND
  NOT message:"POST-RUN END RESULT_TIMED_OUT" AND
  tags: "console"
```
To cover corner cases that the generic format does not handle,
we could have an optional `logstash` key that contains the full query. When
this key is present, we would avoid building the logstash query ourselves and
just use it.
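
A minimal sketch of how such an override could behave, assuming the proposed `logstash` key exists. The function name and the example entries are hypothetical, and the fallback handles patterns only to keep the example short.

```python
def logstash_for(query):
    """Prefer an explicit `logstash` key; otherwise build from `pattern`."""
    if "logstash" in query:
        # Hand-written query: use it verbatim, e.g. for AND NOT exclusions.
        return query["logstash"]
    pattern = query["pattern"]
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    return " AND ".join(f'message:"{p}"' for p in patterns)


# Hypothetical entries, invented for illustration.
generic = {"id": "example-generic", "pattern": ["foo", "bar"]}
custom = {
    "id": "example-custom",
    "logstash": (
        'message:"RUN END RESULT_TIMED_OUT" AND '
        'NOT message:"POST-RUN END RESULT_TIMED_OUT" AND '
        'tags:"console"'
    ),
}

print(logstash_for(generic))
print(logstash_for(custom))
```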
## Disable queries per backend
To avoid using a particular query on a particular backend we can make use of
``skip: ['er', 'artcl']``.
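
A sketch of how a backend could filter out the queries it should skip. The helper name and the sample entries are hypothetical; only the `skip` key and the backend names `er` and `artcl` come from this README.

```python
def queries_for_backend(queries, backend):
    """Drop every query whose `skip` list names the given backend."""
    return [q for q in queries if backend not in q.get("skip", [])]


# Hypothetical entries, invented for illustration.
queries = [
    {"id": "example-everywhere", "pattern": "foo"},
    {"id": "example-no-er", "pattern": "bar", "skip": ["er"]},
    {"id": "example-nowhere", "pattern": "baz", "skip": ["er", "artcl"]},
]

print([q["id"] for q in queries_for_backend(queries, "er")])
print([q["id"] for q in queries_for_backend(queries, "artcl")])
```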
## Parameters in a query
| Key | Required | Type | Sova/ER | Description |
| ------------- |:--------:| ------------------------:|-------------:|------------:|
| id | Required | string, unique | Sova | This id will appear in sova.log when this bug is encountered. |
| pattern | Optional | string / list of strings | ER (,Sova) | Used as is by Elastic Recheck to form logstash queries like ``` message:foo ``` or ``` message:foo AND message:bar ```. If not present, Elastic Recheck skips this entry. |
| regex | Optional | string / list of strings | Sova | Sova searches for this regex in the log files. If not present, the pattern string is converted to a regex (escaping special characters) and used by sova. |
| tags | Optional | string / list of strings | ER | Elastic Recheck uses this to form logstash queries like ``` AND (tags:"console.html" OR tags:"job-output.txt")``` |
| url | Optional | string | ER | Launchpad bug URL. If present, the bug number and details are displayed on the ER dashboard. If not, the graph is titled Unknown/private bug. |
| suppress-graph| Optional | bool | ER | Decides whether or not to display a graph for this bug in the Elastic Recheck dashboard. |
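
The fallback from `pattern` to `regex` described in the table can be sketched as follows. The function name is hypothetical and this is not sova's own code; it assumes Python's `re.escape` is an acceptable stand-in for "escaping special characters".

```python
import re


def sova_regexes(query):
    """Return the regexes to search for: `regex` if set, else escaped `pattern`."""
    value = query.get("regex")
    if value is not None:
        return [value] if isinstance(value, str) else list(value)
    pattern = query.get("pattern")
    if pattern is None:
        return []
    patterns = [pattern] if isinstance(pattern, str) else list(pattern)
    # Escape so characters such as "[" are matched literally.
    return [re.escape(p) for p in patterns]


# Hypothetical entry and log line, invented for illustration.
query = {"id": "example-timeout", "pattern": "RESULT_TIMED_OUT: [untrusted"}
log_line = "2021-05-26 RUN END RESULT_TIMED_OUT: [untrusted : example.yaml@master]"

regex = sova_regexes(query)[0]
print(bool(re.search(regex, log_line)))
```

Without the escaping step, the unbalanced `[` in this pattern would not compile as a regex at all.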
## Adding a new item in queries.yml
To add a new query (optional: from a bug):
- Add a query item in [src/data/queries.yml](https://opendev.org/openstack/tripleo-ci-health-queries/src/branch/master/src/data/queries.yml)
  - id: unique ID
  - pattern: The error message(s) to look for.
  - regex: If it is a complicated error message you can add a regex for sova to use.
  - tags: tag(s) for logstash query
  - url: Launchpad Bug URL
  - suppress-graph: To display/suppress graph for the bug in ER

To move an existing sova regex from sova-patterns.json to queries.yml:
- Choose an item from ``` regexes ``` in output/sova-patterns.json. This is the file that sova currently uses.
  (Do not look at patterns for now.)
- If the regex still feels relevant, vote for it here: https://docs.google.com/spreadsheets/d/16rqgaSSoQrYNjsI4q0YInJxrOOJL3xyP3t3D6ZpluFY/edit#gid=846893892
- Add a query entry in src/data/queries.yml
  - id: `name` of the regexes item you chose.
  - pattern: If you want this error to be shown in the ER dashboard, add a string here corresponding to the regex.
  - regex: `regex` of the regexes item you chose.
- Add a sample string matching the pattern (or regex if you have a special regex) in samples/errors-testing.err.
  As part of the check job, a task calls sova to check this file and makes sure there is a match for all the regexes
  in queries.yml.
- Run tox. If it passes, it should create files in output/elastic-recheck and update sova-pattern-generated.
- Commit all these files.
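
The sample check mentioned in the steps above can be approximated locally with a few lines of Python. This is only a stand-in under stated assumptions: the real check job calls sova, and the helper name, the ids, and the inline sample text here are invented for illustration.

```python
import re


def unmatched_ids(regexes, sample_text):
    """Return the ids whose regex matches nothing in the sample text."""
    return [
        name
        for name, regex in regexes.items()
        if not re.search(regex, sample_text)
    ]


# Stand-in for the contents of samples/errors-testing.err.
sample_text = """\
FAILED with status: 137
RUN END RESULT_TIMED_OUT
"""

# Hypothetical id -> regex mapping.
regexes = {
    "example-status": r"FAILED with status: \d+",
    "example-timeout": r"RESULT_TIMED_OUT",
    "example-missing": r"this error has no sample yet",
}

print(unmatched_ids(regexes, sample_text))
```

Any id reported here would need a matching sample string before the check could pass.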