<!DOCTYPE HTML>
<html lang="en-US">
<head>
<title>README for Quickstart Logs</title>
</head>
<body>
<style>
.collapsible {
background-color: #777;
color: white;
cursor: pointer;
padding: 18px;
width: 100%;
border: none;
text-align: left;
outline: none;
font-size: 15px;
}
.active, .collapsible:hover {
background-color: #555;
}
.content {
padding: 0 18px;
max-height: 0;
overflow: hidden;
transition: max-height 0.2s ease-out;
background-color: #f1f1f1;
}
</style>
<h3>Links to important log files</h3>
<ul>
<li><a href='undercloud/var/log/extra/errors.txt.gz'>undercloud/var/log/extra/errors.txt.gz</a>
- the concatenation of all the errors on any node in a single file</li>
<li><a href='undercloud/home/zuul/'>undercloud/home/zuul/</a>
- the source and log output of all templated shell scripts</li>
<li><a href='undercloud/var/log/'>undercloud/var/log/</a> -
directories and files copied from /var/log on the undercloud.
If other overcloud/subnodes exist, similar $node/var/log
directories will also exist in these logs.</li>
<li><a href='undercloud/var/log/extra/'>undercloud/var/log/extra/</a> -
extra system details, such as the package list and CPU info, gathered from the
undercloud. A similar directory is available on each node.</li>
<li>/var/log/messages is captured in the journal files in /var/log/extra on each node
</li>
<li><a href='undercloud/var/lib/mistral'>undercloud/var/lib/mistral</a>
- output of all ansible used by config-download to drive the overcloud deployment</li>
</ul>
<h3>Links to yum repo and rpm info</h3>
<ul>
<li><a href='delorean_logs'>delorean_logs</a>
- the source code change is built into an rpm and the build logs are here.
Note: if content-providers are used, the rpm build is done on the content-provider.</li>
<li><a href='etc/yum.repos.d/gating.repo'>gating.repo</a>
- the yum repo where the gerrit change is built and provided, i.e. the content-provider</li>
<li><a href='undercloud/etc/yum.repos.d/'>undercloud/etc/yum.repos.d/</a>
- link to the yum.repos.d directory </li>
<li><a href='undercloud/etc/yum.repos.d/delorean.repo'>undercloud/etc/yum.repos.d/delorean.repo</a>
- link to the delorean.repo (the build)</li>
<li><a href='undercloud/var/log/extra/rpm-list.txt.gz'>undercloud/var/log/extra/rpm-list.txt.gz</a>
- rpms installed on the undercloud; container rpms can be found in extra/&lt;podman|docker&gt;/containers/$container/info.log</li>
<li><a href='undercloud/etc/dnf/modules.d/'>undercloud/etc/dnf/modules.d/</a>
- link to yum module information </li>
</ul>
<h3>Links to container log files</h3>
<ul>
<li><a href='undercloud/var/log/containers/'>undercloud/var/log/containers/</a>
- the system logs for each container</li>
<li><a href='undercloud/var/log/extra/podman/'>undercloud/var/log/extra/podman/</a>
- the podman container setup configuration and setup logs</li>
<li><a href='undercloud/var/log/extra/docker/'>undercloud/var/log/extra/docker/</a>
- the docker container setup configuration and setup logs</li>
<li><a href='undercloud/var/log/tripleo-container-image-prepare.log'>undercloud/var/log/tripleo-container-image-prepare.log</a>
- the container download, container update and provision log </li>
</ul>
<h3>Links to tempest results</h3>
<ul>
<li><a href='undercloud/var/log/tempest/'>undercloud/var/log/tempest/</a>
- Tempest run results logs </li>
<li><a href='undercloud/home/zuul/tempest/etc/'>undercloud/home/zuul/tempest/etc/</a>
- tempest.conf and tempest blacklist and whitelist tests </li>
<li><a href='stackviz/#/testrepository.subunit'>stackviz</a> - stackviz tempest test results</li>
</ul>
<h3>Links to puppet logs</h3>
<ul>
<li><a href='undercloud/var/lib/config-data/'>undercloud/var/lib/config-data/</a>
- the configuration files for each openstack service</li>
<li><a href='undercloud/etc/puppet/hieradata/'>undercloud/etc/puppet/hieradata/</a>
- the hieradata of the deployment, service-configs, net and vip info</li>
<li><a href='undercloud/var/lib/container-puppet'>undercloud/var/lib/container-puppet</a>
- puppet config for each container</li>
</ul>
<h3>SELinux Alerts</h3>
<ul>
<li><a href='undercloud/var/log/extra/selinux_denials_detail.txt'>selinux_denials_detail</a></li>
<li>Available on each node under /var/log/extra/selinux_denials_detail</li>
</ul>
<button class="collapsible">How to recreate this job</button>
<div class="content">
<p> Please refer to the <a href="README-reproducer.html">recreation
instructions</a>
</p>
</div>
<button class="collapsible">Additional logs for OVB jobs, a.k.a. OpenStack Virtual Baremetal</button>
<div class="content">
<p>
Note: These logs are only available in jobs that use OVB.
</p>
<ul>
<li>
<a href='https://openstack-virtual-baremetal.readthedocs.io/en/latest/'>OpenStack Virtual Baremetal Documentation</a>
</li>
<li><a href='baremetal_0-console.log'>baremetal boot log</a>
- every overcloud node should have an associated boot log, e.g. baremetal_1, baremetal_2, etc.</li>
<li><a href='bmc-console.log'>baremetal controller log, a.k.a. BMC</a></li>
<li><a href='overcloud-controller-0'>logs collected from the overcloud-controller-[0-2] nodes</a></li>
<li><a href='overcloud-novacompute-0'>logs collected from the overcloud-novacompute-[0-1] nodes</a></li>
</ul>
</div>
<button class="collapsible">How to figure out what went wrong?</button>
<div class="content">
<p>Check the base directory for a file called <b>*failure_reason*</b> for
automatic failure detection. If no known error has been found, the file will
be named "No_failure_reason_found".</p>
<p>Most of the undercloud and overcloud deployment log files can
be found in <a href='undercloud/home/zuul'>undercloud/home/zuul</a>.</p>
<p><b>Tracebacks and other errors are collected in the following log per node:</b></p>
<ul>
<li><a href='undercloud/var/log/extra/errors.txt.gz'>undercloud/var/log/extra/errors.txt.gz</a>
- the concatenation of all the errors on any node in a single file</li>
</ul>
<p>Next, check the console log and search for <b>PLAY RECAP</b>. A job sometimes
has multiple Ansible runs; usually the last one is the relevant one.
<br>If no <b>PLAY RECAP</b> text is found, that usually means an infra failure
occurred before Quickstart could even start.
<br>
If this is a different Ansible error, that could mean either an infra
problem (often has <b>UNREACHABLE</b> in the line) or a bug in Quickstart.
</p>
<p>
Ask on <i>#tripleo</i> to get help or open a bug on
<a href='https://bugs.launchpad.net/tripleo/+filebug'>Launchpad</a>. Add the
"ci" tag if it's a CI issue and "quickstart" if you suspect that the bug is in
Quickstart itself.</p>
<p>Finally, try rechecking or asking on <i>#tripleo</i>.</p>
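<p>As a rough aid for the <b>PLAY RECAP</b> check described in this section, the
sketch below finds the last recap in a console log and lists the per-host counters.
The function name <code>lastPlayRecap</code> is hypothetical and not part of Quickstart,
and the host-line pattern assumes Ansible's usual recap layout
(<code>host : ok=N changed=N unreachable=N failed=N ...</code>), possibly preceded by a
console timestamp prefix.</p>
<pre>
```javascript
// Hypothetical helper (not part of Quickstart): summarize the last
// PLAY RECAP found in the text of a console log.
// Returns null when no PLAY RECAP is present, otherwise an array of
// per-host counters parsed from the lines that follow the last recap header.
function lastPlayRecap(logText) {
  const lines = logText.split("\n");

  // Remember the index of the last line that mentions PLAY RECAP.
  let start = -1;
  for (let i = 0; i !== lines.length; i++) {
    if (lines[i].includes("PLAY RECAP")) {
      start = i;
    }
  }
  if (start === -1) {
    return null;
  }

  // Assumed recap layout: "host : ok=N changed=N unreachable=N failed=N ..."
  const hostLine = /(\S+)\s*:\s*ok=(\d+).*?unreachable=(\d+)\s+failed=(\d+)/;
  const hosts = [];
  for (let i = start + 1; i !== lines.length; i++) {
    const m = hostLine.exec(lines[i]);
    if (!m) {
      break; // the first non-matching line ends the recap block
    }
    hosts.push({
      host: m[1],
      ok: Number(m[2]),
      unreachable: Number(m[3]),
      failed: Number(m[4]),
    });
  }
  return hosts;
}
```
</pre>
<p>A <code>null</code> result corresponds to the "no PLAY RECAP" case above. Otherwise,
hosts whose <code>failed</code> or <code>unreachable</code> counter is non-zero are a
reasonable place to start reading.</p>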
</div>
<button class="collapsible">Variables used in the job run</button>
<div class="content">
<p>The logs contain files showing variables used in the job run.</p>
<ul>
<li><a href='undercloud/var/log/extra/dump_variables_vars.json.txt.gz'>undercloud/var/log/extra/dump_variables_vars.json.txt.gz</a>
- contains the variables used in running the actual test</li>
<li><a href='releases.sh'>releases.sh</a>
- the output of the script setting release-related variables</li>
<li><a href='playbook_executions.log'>playbook_executions.log</a>
- prints out the complete commands, with all expanded arguments,
to run the playbooks</li>
</ul>
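<p>To look up a few variables without reading the whole dump, something like the
Node.js sketch below may help. It assumes the dump is a gzip-compressed JSON object,
which the file name suggests but this page does not confirm. The helper name
<code>pickVariables</code> and the variable name in the usage note are hypothetical.</p>
<pre>
```javascript
const zlib = require("zlib");

// Hypothetical helper: decompress a gzip-compressed JSON dump (assumed
// format) and return only the requested variable names.
// Names that are absent from the dump map to undefined.
function pickVariables(gzippedBuffer, names) {
  const vars = JSON.parse(zlib.gunzipSync(gzippedBuffer).toString("utf8"));
  const picked = {};
  for (const name of names) {
    picked[name] = vars[name];
  }
  return picked;
}
```
</pre>
<p>For example, after downloading the file, the buffer from
<code>fs.readFileSync("dump_variables_vars.json.txt.gz")</code> could be passed in
together with a list such as <code>["release"]</code>, where "release" is only an
illustrative variable name.</p>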
</div>
<button class="collapsible">Additional tools to help</button>
<div class="content">
<p> Upstream OpenStack Health, Elastic Search and Kibana
</p>
<ul>
<li><a href='http://status.openstack.org/elastic-recheck/'>http://status.openstack.org/elastic-recheck/</a>
- A tool to track the impact of known bugs in OpenStack CI</li>
<li><a href='http://logstash.openstack.org/#/dashboard/file/logstash.json'>http://logstash.openstack.org</a>
- search and filter the details in the log files from OpenStack CI</li>
<li><a href='http://status.openstack.org/openstack-health/#/?searchProject=tripleo'>OpenStack Health</a>
- upstream job results by project</li>
</ul>
<p> Tools that will help you spot a trend in TripleO CI
</p>
<ul>
<li><a href='http://dashboard-ci.tripleo.org/d/jobs/jobs-exploration?orgId=1&var-influxdb_filter=job_name%7C%3D%7Ctripleo-ci-centos-7-containers-multinode'>Job Exploration</a>
- check the job history across upstream clouds</li>
<li><a href='http://zuul.openstack.org/builds?job_name=tripleo-ci-centos-7-containers-multinode%09'>zuul job filter</a>
- zuul's job filter per job</li>
<li><a href='http://cistatus.tripleo.org/'>http://cistatus.tripleo.org</a>
- Overall check job status</li>
<li><a href='http://cistatus.tripleo.org/gates/'>http://cistatus.tripleo.org/gates/</a>
- Overall gate job status</li>
</ul>
<p>
Tools to compare one job to another.
</p>
<ul>
<li><a href='https://github.com/sshnaidm/jcomparison'>jcomparison</a>
- A tool to compare results from one job to another</li>
<li><a href='https://pypi.org/project/logreduce/'>log reduce</a>
- A tool that uses AI features to reduce the noise in logs and present only what is needed for debug</li>
</ul>
</div>
<button class="collapsible">Dry run option</button>
<div class="content">
<p>As a debugging step, a job can be run manually with '-dryrun'
appended to the job name. When the "playbook dry run" option is invoked,
the playbooks will not execute and collect logs will not run. Certain
log files will still be produced in the logs directory for inspection,
including 'toci_env_args_output.log', which contains the environment
variables used in the job, and playbook_executions.log. This option
helps with debugging and with testing the testing scripts themselves.</p>
</div>
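<script>
// Toggle each collapsible section: clicking a button expands or collapses
// the .content element that immediately follows it.
var coll = document.getElementsByClassName("collapsible");
var i;
for (i = 0; i < coll.length; i++) {
coll[i].addEventListener("click", function() {
this.classList.toggle("active");
var content = this.nextElementSibling;
if (content.style.maxHeight){
// Currently expanded: clear the inline value so the CSS max-height of 0 applies.
content.style.maxHeight = null;
} else {
// Currently collapsed: expand to the full height of the content.
content.style.maxHeight = content.scrollHeight + "px";
}
});
}
</script>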
</body>
</html>