There were some jobs recently that showed an unexpected processor count.
Add some data to help debug this.
Change-Id: I587a492d1aa94b0886c7e9a2260a3e2eb384e788
Newer ansible-lint finds "when" or "become" statements that are at the
end of blocks. Ordering these before the block seems like a very
logical thing to do, as we read from top-to-bottom so it's good to see
if the block will execute or not.
This is a no-op, and just moves the places the newer linter found.
Change-Id: If4d1dc4343ea2575c64510e1829c3fe02d6c273f
This is pretty trivial, but consistency is probably better in this
regard and it does guide you to writing a sentence that is human
parsable, which is the point of it.
Change-Id: Iaab9bb6aec0ad0f1d3cae10364c1f1b37d02801e
This commit in Ansible:
9142be2f6c
now allows Python modules to specify their interpreter with the shebang.
We expect our roles to use the discovered python interpreter on remote
nodes, and on the executor, we need them to use the virtualenv. Removing
the specific shebang accomplishes this under Ansible 6, and has no effect
under older versions of Ansible.
Without this, for example, the log upload roles would not have access to
their cloud libraries.
Also update our ansible/cli check in our module files. Many of our modules
can be run from the command line for ease of testing, but the check that we
perform to determine if the module is being invoked from the command line
or Ansible fails on Ansible 5. Update it to a check that should work in
all 4 versions of Ansible that Zuul uses.
Change-Id: I4e6e85156459cca032e6c3e1d8a9284be919ccca
This fixes a number of places where we do not have spaces between
filters. I think that this is a reasonable rule for readability (I
also think it probably was enforced, but maybe later versions got
better at detecting it?).
These are detected by a later version of Ansible lint; this change
should have no operational change to any roles but prepares us to
update in a follow-on change.
Change-Id: I07e1a109b87adce86f483d14d7e02fcecb8313d5
Change Iba195e7c5cec372c6ba4daf7059da5b6fb6740ec implemented
collection of output for `df -i` (inode counts) and `df -m`
(megabytes data) in validate-host, but did not add them to the
report file template. Correct this oversight so that the collected
information will be included in that file.
Change-Id: I8c2c4a90f18394a04fde84355a89a15bf5aa66b4
Include calls to `df -i` (inode counts) and `df -m` (megabytes data)
in validate-host, to aid in troubleshooting build failures where the
builds start out with too little free space. This way the initial
capacity and utilization of all available filesystems will be
recorded with other basic node diagnostic data.
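As a rough illustration of the data being collected, the following
Python sketch runs both commands and keeps their output. The function
name, the dictionary keys and the injectable `run` parameter are
assumptions made for this example; the role itself may gather the
output differently.

```python
import subprocess


def collect_df(run=subprocess.check_output):
    """Return the text output of `df -i` and `df -m`.

    `run` is injectable (an assumption for this sketch) so the
    function can be exercised without touching the real system.
    """
    reports = {}
    for label, flag in (("inodes", "-i"), ("megabytes", "-m")):
        # check_output returns bytes; decode for inclusion in a report.
        reports[label] = run(["df", flag]).decode("utf-8", "replace")
    return reports
```

Calling `collect_df()` with no arguments runs the real `df` binary on
the node.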
Change-Id: Iba195e7c5cec372c6ba4daf7059da5b6fb6740ec
Make it possible for a site to demand that the validate-host role
finds IPv4 and/or IPv6 routes, making one or both explicitly
mandatory, instead of the default behavior of succeeding as long as
at least one is available. This allows a site to, for example,
discard nodes during a pre playbook if they lack IPv4 connectivity.
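The decision described above can be sketched in Python as follows.
The function and parameter names are hypothetical and are not the
role's actual variable names.

```python
def routes_ok(has_ipv4, has_ipv6, require_ipv4=False, require_ipv6=False):
    """Decide whether the discovered routes satisfy the site policy.

    All names here are hypothetical, chosen only for this sketch.
    """
    if require_ipv4 and not has_ipv4:
        return False
    if require_ipv6 and not has_ipv6:
        return False
    # Default behavior: succeed as long as at least one is available.
    return has_ipv4 or has_ipv6
```

With both requirements left unset, a node with only IPv6 passes; with
`require_ipv4=True` the same node is rejected.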
Change-Id: Icaa82212468a659a3756ed51cac442de33065b55
- replaces ignore with a warn, which displays issue without affecting
the linting outcome (allowing gradual fixing)
- bumps linter to enable the warn_list feature
- fixes a set of issues, others will be fixed in follow-up
Change-Id: I7d6f8c156b06f68f681943e88860930968e7c9f9
Adds yamllint to the linters with a minimal configuration, some
rules are disabled to allow us to fix them in follow-ups, if
we agree on them.
Fixes invalid YAML file containing characters inside block.
Fixes a few minor linting issues.
Change-Id: I936fe2c997597972d884c5fc62655d28e8aaf8c5
Inventory hostnames like "abc/123" are valid in both ansible and zuul
but this role breaks since it uses the hostname as part of a path. This
sanitizes the hostname from "abc/123" to "abc_123".
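A minimal Python sketch of the substitution, assuming only the "/"
character needs replacing; `sanitize_hostname` is a hypothetical
helper name, and the role may perform the replacement elsewhere (for
example with a template filter).

```python
def sanitize_hostname(inventory_hostname):
    """Make an inventory hostname safe to use as one path component.

    Hypothetical helper: only the path separator is replaced here.
    """
    return inventory_hostname.replace("/", "_")


print(sanitize_hostname("abc/123"))  # prints abc_123
```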
Change-Id: Ic89d595b6f004b5ca4805f1af8387e8ba56564aa
The argument here is an integer "limit", not the exception.
I think that we only notice this on Python 3 because of exception
chaining. It causes a real failure though because the exception
handler that is meant to fall into "pass" raises another exception
when ipv6 doesn't work.
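For reference, a small sketch of the intended usage:
traceback.format_exc() picks up the exception currently being handled
on its own, and its optional first argument is an integer limit on
the number of stack entries. The `probe` wrapper is hypothetical and
exists only for this illustration.

```python
import traceback


def probe(action):
    """Run action() and return a formatted traceback if it raises."""
    try:
        action()
    except Exception:
        # No argument needed: the active exception is used implicitly.
        # Passing the exception object here would be read as "limit".
        return traceback.format_exc()
    return None
```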
Change-Id: I0908a0a3dbb2356caabbffd062379751a0b61c41
This is because it is not python2.6 compatible, where supported
versions of ansible still allow for python2.6.
Change-Id: Ie1b3a30e1d6b5206ba81558a34937071a951ce15
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
Split the network testing component of the validate-host role into a
separate task, so it can be retried a couple of times in case
something is a bit slow about bringing up external networking. Add
failure collection of unbound logs if they appear to be in some common
locations (such as will be there on infra nodes).
Change-Id: Id12f1ba064fa2e5f75b9a5cfba76d238d23d3f57
Add a testenv:py27 environment that overrides basepython to 2.7.
Unfortunately implicit namespace packages are a Python3 thing [1] so
we have to scatter a few __init__.py's around for the test loader
under python2 to be able to find the unit test directories.
Update documentation to mention this.
Needed-By: https://review.openstack.org/592768
[1] https://www.python.org/dev/peps/pep-0420/
Change-Id: I9a653666e8a083fb7f3fbb92589fe0467a41e6e6
Add a unit testing framework for python roles. Thanks to Matthew
Treinish for the suggestion of how to perform discovery (and much of the
code which is copied from Tempest).
Change-Id: Iec95dd1026a41614def57c65c3faa0516a682a5a
These traceroutes currently fail in a very opaque way. We are
occasionally seeing Fedora nodes fail in here (which is odd, because
obviously networking is up enough for zuul to connect) and don't have
much to go on.
When an exception is caught, add the output, return code and basic
traceback to the return attributes, and include them in the failure
case.
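One way this could look, as a hedged sketch rather than the module's
actual code: the function name and the keys of the returned
dictionary are assumptions for illustration.

```python
import subprocess
import traceback


def run_and_report(command):
    """Run a command and record diagnostics if it exits non-zero.

    The returned keys are hypothetical names chosen for this sketch.
    """
    ret = {"failed": False}
    try:
        out = subprocess.check_output(command, stderr=subprocess.STDOUT)
        ret["output"] = out.decode("utf-8", "replace")
    except subprocess.CalledProcessError as e:
        # Keep everything that helps explain the failure later.
        ret["failed"] = True
        ret["output"] = e.output.decode("utf-8", "replace")
        ret["return_code"] = e.returncode
        ret["traceback"] = traceback.format_exc()
    return ret
```

The caller can then include `output`, `return_code` and `traceback`
in whatever failure message it reports.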
Depends-On: https://review.openstack.org/563702
Change-Id: I047bf2b1daa22a5b6bfc12b3f42b108975097409
We remove the privileged inventory file copy from validate-host so that
validate-host can be tested more easily. Users should use log-inventory
in their base jobs to get this functionality.
Depends-On: https://review.openstack.org/563789
Change-Id: I8cd1748395abfe868f1716a9b4394850f962d436
The zuul_debug_info library calls traceroute, which is in /usr/sbin
and not in /sbin on SUSE (and those two are not linked to each other).
Also capture the OSError that occurs when the binary isn't there.
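A sketch of the idea, assuming a simple search over the two
directories named above; `find_binary` and `run_tool` are
hypothetical helpers and the library's real lookup may differ.

```python
import os
import subprocess

# Assumed search order; the commit only names these two directories.
CANDIDATE_DIRS = ("/usr/sbin", "/sbin")


def find_binary(name, dirs=CANDIDATE_DIRS):
    """Return the first existing path for name, else the bare name."""
    for directory in dirs:
        path = os.path.join(directory, name)
        if os.path.exists(path):
            return path
    return name


def run_tool(name, args, dirs=CANDIDATE_DIRS):
    """Run a tool, reporting a missing binary instead of crashing."""
    command = [find_binary(name, dirs)] + list(args)
    try:
        output = subprocess.check_output(command)
    except OSError as e:
        # Raised when the binary does not exist or cannot be executed.
        # A non-zero exit (CalledProcessError) is not handled here.
        return {"ok": False, "error": str(e)}
    return {"ok": True, "output": output.decode("utf-8", "replace")}
```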
Change-Id: Ic5e31a417415f830d7697abfbb2ae71f2ae20935
Change the default parameters to the role to be zuul site variables.
Because of variable precedence, having these not be site variables means
someone could override them in a job. Since one of the actions is to
read and log the contents of a file, we likely don't want to give people
the ability to do that with an arbitrary file.
The traceroute host isn't as important to be a site variable, but it's
still not actually something that jobs should override - it's a feature
of the deployment.
Both variables work if they are not set, so deployers should still be
able to use this role without defining site-variables. But it should be
made clear to them that if they want those features they really should
define the locations in a site-variable and not in a normal job
variable.
configure-mirror similarly allows in-job override, but maybe that's ok
for now and leaving the site-variable value as a default is fine?
Finally, add a new zuul_site_image_manifest_files list, so that we can
specify more than one file to read. Set the defaults of it to be the
files that the dib nodepool elements emit. We'll also look into pushing
those manifest files up a level into dib so that expecting nodepool
nodes to have them is even more reasonable.
Change-Id: I632a32fdfac4bfe57eb269ac8e183fb8df34d48f
We are collecting information about the running zuul system in
validate-host. Move the inventory collection here with the rest of
it. Also, put it in zuul-info. It's debugging info about the ansible
layer.
Change-Id: I5dded8f3545e725cbc11c1eae194857fe9623ab1
We didn't have ansible-lint set up properly; as a result our roles
weren't actually linted properly.
Fix variable linting issues and ignore ANSIBLE0012.
Change-Id: I07aa940245e700c9f08df0f1920720f0ed9d3de0
Signed-off-by: Paul Belanger <pabelanger@redhat.com>
hostvars can potentially leak secrets. setup doesn't, and records what
we're interested in, which is the information ansible knows about the
remote host.
Change-Id: Ice585cb3beddf4e3ecc1e692ecf4e7da8c5754b8