Commit Graph

502 Commits (840cdd7a13136dd9746a55d886064a2e9ff9d71b)

Author SHA1 Message Date
frenzyfriday 840cdd7a13 Removes a verification step in sova molecule test
Let's remove the faulty test (_No_valid_host_was_found)
until bug #1947133 is fixed.

Related-Bug: #1947133
Change-Id: I4f3b994fd98262fb7e69ccf01734dff8fc16d913
2021-10-15 16:56:56 +02:00
Zuul dcfaffbfbd Merge "Update url for sova query source" 2021-08-31 10:46:12 +00:00
frenzyfriday 0f2adf2857 Update url for sova query source
Moving the source of sova queries as we are now generating sova queries from a single readable query file for both Elastic recheck and Sova.

Depends-On: https://review.opendev.org/c/openstack/tripleo-ci-health-queries/+/798958

Change-Id: I7a4cea605fe39fc086333d51cc75720c9b8243ad
2021-08-26 13:10:26 +00:00
Sorin Sbarnea bfef65d57a Upgrade ansible-lint/molecule runs
* bump linters
* sort newly identified rule violations
* fixes sanity by correcting missing excludes from galaxy.yml file
* fix incorrect pinning of python version in tox.ini

Change-Id: Iba0cecfc46a97d0ee56e30e48638f3a1b37bdec5
2021-07-30 11:11:22 +01:00
frenzyfriday c9358e211b Handling long file names in sova
Sova writes a file with the combined failure reasons as filename.
When sova finds multiple error matches [1] this filename can be too long. This patch limits the filename to 100 chars. The full failure reasons are still written inside the file.

Example: https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_a4a/798958/8/check/openstack-tox-py38/a4af6fe/job-output.txt

[1] https://opendev.org/openstack/tripleo-ci-health-queries/src/branch/master/samples/errors-testing.err - This happens especially here.
We add a sample string for each of the regexes/patterns being added and run sova to verify that the regexes are indeed correct.
As the number of regexes increases, the filename exceeds the limit.

Change-Id: I2682a7e2b9316b4c3cd11b546f35822c53a3c489
2021-07-05 13:18:20 +00:00
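
The truncation described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the sova module's actual code: the function names, the underscore separator, and the character sanitizing are assumptions; only the 100-character limit and the fact that the full reasons stay inside the file come from the commit message.

```python
import os

MAX_NAME_LEN = 100  # limit stated in the commit message


def failure_filename(reasons, max_len=MAX_NAME_LEN):
    """Join failure reasons into one file name, capped at max_len characters.

    The separator and sanitizing rules are hypothetical.
    """
    cleaned = (r.strip().replace(" ", "_").replace("/", "_") for r in reasons)
    return "_".join(cleaned)[:max_len]


def write_failure_file(directory, reasons):
    """Write the full, untruncated reasons into the (possibly truncated) file."""
    path = os.path.join(directory, failure_filename(reasons))
    with open(path, "w") as handle:
        handle.write("\n".join(reasons) + "\n")
    return path
```

Capping only the name keeps the file creatable on filesystems with name-length limits, while the file body still carries every matched reason.
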
Sandeep Yadav ed5c048f4f Add errors.txt in artcl_logstash_files
Remove the duplicate .txt extension from errors.txt.

Also add errors.txt to artcl_logstash_files so that logstash picks
it up.

As the size of the logstash file is a factor, adding errors.txt instead of
individual files seems the better approach, as errors.txt only contains
errors/tracebacks from different log files.

Change-Id: Ib4519943d6b0ca4607f168318dff11a6d1713796
2021-06-25 01:02:57 +00:00
Zuul 64f56ca032 Merge "Correct the tox option for skipping sdist generation" 2021-06-19 09:46:19 +00:00
Sorin Sbarnea cc6ff6f07c Refactor zuul jobs
- Avoid using content provider job and the second standalone one
- Integration jobs are now running as experimental

Change-Id: I8a2f19fd091e827458f6769612c46341259047f3
2021-06-18 15:52:49 +01:00
Jeremy Stanley 4b54d972a4 Correct the tox option for skipping sdist generation
The tox option to skip source distribution building is skipsdist,
but this seems to be often misspelled skipdist instead, which gets
silently ignored and so does not take effect. Correct it
everywhere, in hopes that new projects will finally stop copying
this mistake around.

See https://tox.readthedocs.io/en/latest/config.html#conf-skipsdist
and https://github.com/tox-dev/tox/issues/1388 for details.

Change-Id: I42016fd82d01db25b1506f07d8631f98e8da399f
2021-06-17 16:56:08 +00:00
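
Because the misspelled option is silently ignored, a small check can help spot it. The sketch below is an assumption-laden illustration (the function name and sample contents are hypothetical, and it is not part of the commit); it uses only the standard-library configparser to report sections of a tox.ini that contain `skipdist` instead of `skipsdist`.

```python
import configparser


def find_misspelled_skipsdist(tox_ini_text):
    """Return the section names that use 'skipdist' rather than 'skipsdist'."""
    # interpolation=None so that '%' or '{posargs}' in values is left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(tox_ini_text)
    return [s for s in parser.sections() if parser.has_option(s, "skipdist")]


# Hypothetical sample files, for illustration only
BROKEN = "[tox]\nskipdist = True\nenvlist = py38\n\n[testenv]\ncommands = pytest\n"
FIXED = BROKEN.replace("skipdist", "skipsdist")
```

Calling `find_misspelled_skipsdist(BROKEN)` reports the `tox` section, while the corrected text yields an empty list.
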
Zuul a9af906a30 Merge "Add container update rpms info to collect logs" 2021-06-08 01:51:59 +00:00
Wesley Hayutin 2f4efe8028 add tempest_run.log to logstash
We need to be able to see which jobs are failing
on tempest via tripleo-health.  The initial
query would be the sova query ( ... FAILED )

In the future, we could add the tempest test name
and build_status:failed to determine how often
certain tempest tests are failing.

Change-Id: Id3a20500830855da8b01bf0caae96f26671a4590
2021-06-05 16:14:40 +00:00
Ronelle Landy 79f16c5ddb Add container update rpms info to collect logs
Change-Id: I32755e0f7352852a34b944fe8dfd0c30a83acc20
2021-06-04 14:00:27 +00:00
Wesley Hayutin c3483e8830 Avoid using openstack constraints
This repository should not use openstack constraint files. This change also addresses the sanity job failure
caused by the switch of nodeset image used for testing.

Change-Id: I7f3eab6ac44f53451114dcd8841bf4300500ab11
2021-06-04 13:52:42 +00:00
Zuul 9111b1c1bd Merge "Add 2 minute timeout for repoquery" 2021-04-30 12:28:10 +00:00
Martin Kopec 1e37a62f64 Add 2 minute timeout for repoquery
We encountered an issue where repoquery took several minutes,
which led to timeouts and unfinished log collections.
To avoid that, this commit adds a timeout, which is basically
a fail-safe from the collect_logs point of view.

The patch replaces the for loop for the record_available_packages command
with a single command.

Change-Id: Ie3007414aac14db47696fca62b07e1efa4e1de16
2021-04-29 08:50:10 +00:00
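
The fail-safe idea in the commit above, giving a slow command a fixed time budget so that the rest of log collection can continue, can be sketched in Python as below. This is only an illustration under assumptions: the role itself is Ansible/shell based, the helper name is hypothetical, and the exact repoquery invocation is not shown in the commit message, so none is reproduced here. Only the two-minute limit comes from the commit title.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s=120):
    """Run cmd and return its stdout, or None if it exceeds timeout_s seconds.

    120 seconds mirrors the two-minute limit named in the commit title.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # Fail-safe: give up on this one command instead of stalling collection
        return None
    return result.stdout
```

For example, `run_with_timeout([sys.executable, "-c", "print('ok')"])` returns the command's output, whereas a command that sleeps longer than the budget returns `None` and the caller can move on.
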
Zuul 3b46f036ff Merge "add sealert diagosis of selinux errors" 2021-04-27 05:30:56 +00:00
Zuul 04302378a6 Merge "add cephadm.log to logstash for indexing" 2021-04-24 21:32:19 +00:00
Wes Hayutin b4aa76a819 add sealert diagosis of selinux errors
Change-Id: I698ffb89477a7bca29a83ad943816c0c30d0d3f5
2021-04-22 22:21:42 +00:00
Wes Hayutin ce97a99cfe dump all the shell variables from the system
We see some deploy failures due to the shell
lang being set to latin-1.
e.g. https://bugzilla.redhat.com/show_bug.cgi?id=1910416

Change-Id: I92dc70077f31432afdc62f17c024478a03b0e22a
2021-04-22 01:35:29 +00:00
Martin Kopec 8f57778228 Use centos8 stream images with molecule scenarios
There have been problems with centos7 lately with py2/py3.
Ansible's pip module used python2 instead of python3 for
creating the virtualenv, which led to dependency issues with
infrared, which should be installed only by py3.

This change also bumps the linters, removing yamllint, which is now
included in ansible-lint.

Change-Id: If117b438fd55b17ead4016aabc3feae6632e722b
2021-04-21 16:33:53 +01:00
Wes Hayutin a7397018be add cephadm.log to logstash for indexing
new ceph deployments log the install
to /var/log/ceph/cephadm.log

Change-Id: Icc852641c654294e6ac4da18d2cc608ad8dddb9b
2021-04-12 19:48:24 -06:00
Chandan Kumar (raukadah) a89834e557 Collect nested virtualization info
We run our deployments on multiple clouds, so it is good to
collect nested virt info in order to investigate
server creation related issues.

It is needed for sc10 kvm jobs, which are still in
development here:
https://review.rdoproject.org/r/q/topic:%22fix_octavia_kvm%22+(status:open%20OR%20status:merged)

Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
Change-Id: Ib9d89c9ce0e889a7ee6250f66b55c99a245d6d00
2021-03-26 16:09:30 +05:30
Sorin Sbarnea 354650bcbb Update location of sova patterns
As we moved the queries repo to gerrit, we should now point to its
new location.

Change-Id: I7d795e73473965691f5308151a4f7b724f99a2d4
2021-03-04 09:58:00 +00:00
Sorin Sbarnea 0478d5983f Move sova patterns outside the repository
Removes the internal sova-patterns.yml and uses the uri module to load the
same content as JSON from the queries repository.

Change-Id: Ic9fd06f911e8c50819b2a10eb91fb9814787b025
Story: TRIPLEOCI-287
2021-02-22 13:10:10 +00:00
Sorin Sbarnea a5ce785e24 Enable ansible-test units
- converts lonely unit test to use the official unittest format
for ansible collections.
- adds two tests to sova module
- moves sovalib into sova module as this is required in order to
  make the module compatible with both role and collection deployments.

Story: TRIPLEOCI-284
Change-Id: I6e0b2fa4a4b02fbf4133c28d29adaf0e3c16d344
2021-02-19 11:21:07 +00:00
Sorin Sbarnea 51160038a5 Bump linters
- upgrade linters
- enable black formatting so we don't waste time making flake8 happy
- all the .py files modified by this patch were modified by black itself

Change-Id: I947cf8934a57ad519242757c777b23155fcbe7f4
2021-02-17 11:08:44 +00:00
Sorin Sbarnea 0608040b24 Fix ansible-test sanity
Change-Id: I9ce4d3dbd8a9ca1c3d0610f3a07f063632c67bfc
2021-02-10 14:18:11 +00:00
Sorin Sbarnea 3f0881dfff Make rst files collection compatible
Fixes RST issues reported by ansible-test sanity. Because collections
do not use sphinx, we cannot make use of |project| substitution.

Change-Id: I4d42d25ec1c6de3eebaed8cf78c8d7a648e1ba5d
Related: https://review.opendev.org/c/openstack/ansible-role-collect-logs/+/773725
2021-02-03 17:37:14 +00:00
Sorin Sbarnea 1c71e5098f Transform artcl into a collection
Because Ansible's official testing tools (ansible-test) cannot be used
without a collection, we change the code layout to make it conformant.

WARNING: The role is no longer considered to be named
"ansible-role-collect-logs" but "collect_logs" instead, with a
temporary alias called "collect-logs".

Checklist:
- [x] ansible-test sanity checks run (do not need to pass)
- [x] zuul is still able to use the role
- [x] infrared is still able to use the role
- [x] molecule tests are running and passing
- [x] tripleo-ci jobs still collect the files

One symlink is still needed for infrared until related patch lands:
https://review.gerrithub.io/c/redhat-openstack/infrared/+/508861

Change-Id: Ib87622797a284d837ee579d9cccec0ed73306626
Story: TRIPLEOCI-305
2021-01-29 11:34:14 +00:00
Zuul dd904e9518 Merge "Collect all *rc files" 2021-01-28 23:36:24 +00:00
Zuul 14740ecab6 Merge "Find correct python interpreter" 2021-01-27 00:20:00 +00:00
Zuul 1366515526 Merge "Fix roles_path" 2021-01-26 12:26:59 +00:00
Jesse Pretorius (odyssey4me) 439d0890b1 Collect all *rc files
If the stack's name is not 'overcloud' then its rc file is not
collected. Rather than relying on a hard-coded stackrc,
overcloudrc, etc we can use a wildcard to catch them all. This
might also catch a few other files, but that's a small price to
pay.

Change-Id: I4892e3d5ff73eda3cc92406cecaf7611e6b6c304
2021-01-26 12:18:38 +01:00
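
The wildcard approach from the commit above can be illustrated with a short sketch. The role itself collects files through its own file lists; the Python below, including the function names, the sample file names, and the temporary directory, is purely hypothetical and only shows how a `*rc` pattern picks up rc files for any stack name.

```python
import glob
import os
import tempfile


def find_rc_files(directory):
    """Return every path in directory whose name ends in 'rc'."""
    return sorted(glob.glob(os.path.join(directory, "*rc")))


def demo():
    """Create a few sample files and show which ones the wildcard catches."""
    names = ("stackrc", "overcloudrc", "mystackrc", "notes.txt", ".bashrc")
    with tempfile.TemporaryDirectory() as tmp:
        for name in names:
            open(os.path.join(tmp, name), "w").close()
        return [os.path.basename(p) for p in find_rc_files(tmp)]
```

In this sketch `mystackrc` is matched even though no stack is named that way in any hard-coded list, and hidden files such as `.bashrc` are skipped because glob's `*` does not match a leading dot.
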
Zuul 3d873672ef Merge "Also collect journal errors and send to logstash" 2021-01-24 15:33:50 +00:00
Zuul 18cf7eab4a Merge "Revert "Disable voting of broken job"" 2021-01-23 13:42:24 +00:00
Sorin Sbârnea 0548c38671 Revert "Disable voting of broken job"
This reverts commit 9733aaaf2b.

Reason for revert: Test code must run and pass.

Change-Id: I8da10ef160e21fcc3ae18d07e63d91b8b840a688
2021-01-22 15:07:44 +00:00
Sorin Sbarnea 0ab1c4196e Reinstate functional testing
None of our molecule tests were running but the job was executed.

This includes a fail-safe guard that should prevent such accidents
in the future. This assures the expected number of tests passed.

Change-Id: I6e1f80a7ecea65f42479376ce1a53c519f56bd68
2021-01-22 12:38:53 +00:00
Jesse Pretorius (odyssey4me) ff01bc786d Find correct python interpreter
If /usr/bin/python is not available when trying to collect
the logs, then none are collected. In this patch we ensure
that alternative python interpreters are considered when
attempting the log collection. The interpreters are
considered in order of preference, with the system python
being the last and the default when the host is unreachable.

This resolves the issue when infrared has an interpreter
set in the inventory, but it's the wrong one due to the
host recently being upgraded from RHEL7 to RHEL8.

Change-Id: I62ec9b5ed1978806c82a973bb901d527c731d3a9
2021-01-22 11:49:10 +00:00
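
The interpreter selection described in the commit above, trying candidates in order of preference and falling back to the system python, might look like the following sketch. The candidate names, their order, and the function name are assumptions made for illustration; the actual role implements this in Ansible, not in Python.

```python
import shutil

# Hypothetical preference order; the real list is not given in the commit message
CANDIDATES = ("python3", "python2", "python")
DEFAULT_INTERPRETER = "/usr/bin/python"  # system python, used as the last resort


def pick_interpreter(candidates=CANDIDATES, which=shutil.which,
                     default=DEFAULT_INTERPRETER):
    """Return the first candidate found on PATH, else the default."""
    for name in candidates:
        path = which(name)
        if path:
            return path
    return default
```

Passing `which` as a parameter keeps the lookup easy to exercise without depending on what is installed on the machine running it.
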
Bogdan Dobrelya 33d4c58669 Also collect journal errors and send to logstash
Change-Id: I8e058a9c0292259748f758e3a04b6326992528b9
Signed-off-by: Bogdan Dobrelya <bdobreli@redhat.com>
2021-01-21 17:47:21 +01:00
Sagi Shnaidman 503fa8250f Add pattern to packages download failure
Change-Id: Id2e48021c8c92facba829c38205bee6d35aa55b5
2021-01-19 13:19:53 +02:00
Sagi Shnaidman e5d9714bb3 Fix pattern message for tempest
Change-Id: I641f86639906e1be107703aea0f039ef3f8242e8
2021-01-18 10:08:38 +02:00
yatinkarel f8e33c105b Fix roles_path
Follow up of https://review.opendev.org/768309.

With [1], collect-logs gets installed in share/ansible/roles,
so include that in roles_path.

[1] https://review.opendev.org/c/openstack/ansible-role-collect-logs/+/762980

Change-Id: I0e3461318457a804d0c8ac616f861d0a54953487
2021-01-14 19:16:51 +05:30
Sagi Shnaidman b3ea8622a2 Add tempest pattern and standardize it
Change-Id: I53f906233be80d5ce84e337c0c92b409fa8e652a
2021-01-13 17:02:58 +02:00
Sagi Shnaidman 28fc4f0659 Add another packages failure pattern
Change-Id: Id6c892bfcea48dad839d449d5b5ac226264e0aed
2021-01-08 00:55:24 +02:00
Sagi Shnaidman 29be5e2c22 Add pattern for proxy failure
Change-Id: Icb0dfd17d5f885728d1ab042b59ef4a2fd0a3821
2021-01-08 00:25:13 +02:00
Zuul 0b5f5f9593 Merge "Changing bash executable" 2021-01-07 00:08:48 +00:00
Sagi Shnaidman be7cd38a7b Add conflict packages pattern
Change-Id: I4100d1712a92cd797d1d4525798745b2c3ddfe33
2021-01-06 11:09:35 +02:00
yatinkarel eff595b074 Remove cap for sphinx
Since we use the requirements from upper-constraints,
don't cap sphinx at 1.3, which is too old a version;
u-c has been beyond that since pike.

Related-Bug: #1908054
Change-Id: If17755178dcf8f0505f58b3d68db404e878a958d
2020-12-30 13:12:46 +05:30
Chandan Kumar (raukadah) 37392fc9b1 Collect dnf module related infos
For dnf module specific issues, knowing which module is getting disabled
and which are getting enabled gives better insight when debugging
podman or package related issues.

Change-Id: Iddc80d30b060b78185b8f4b9b119abc2bbd58ff8
Signed-off-by: Chandan Kumar (raukadah) <chkumar@redhat.com>
2020-12-28 16:21:59 +05:30
Zuul 2eea37c360 Merge "Enable verbose output for the openstack cli commands" 2020-12-26 03:15:31 +00:00