Document what it takes to be a check/gate test.
This is my recollection of the consensus from some infra folk on a train late at night: it's probably wrong, but I wanted something I can point contributors at. Change-Id: Ic1ad99335ce41481995322f0ee5daadb08a09c2a
This commit is contained in:
parent
c53349587b
commit
dd9a25646f
@ -29,6 +29,7 @@ Contents:
|
||||
:maxdepth: 2
|
||||
|
||||
project
|
||||
test-infra-requirements
|
||||
sysadmin
|
||||
systems
|
||||
|
||||
|
74
doc/source/test-infra-requirements.rst
Normal file
74
doc/source/test-infra-requirements.rst
Normal file
@ -0,0 +1,74 @@
|
||||
Test infrastructure Requirements
|
||||
################################
|
||||
|
||||
Overview
|
||||
========
|
||||
|
||||
There are multiple different ways that tests can be run. Each has different
|
||||
trade-offs between cost, reliability and test coverage.
|
||||
|
||||
The primary goal for OpenStack test infrastructure is to deliver highly
|
||||
reliable testing: with 500 patches successfully getting through the OpenStack
|
||||
gate on a peak day, even short service interruptions have a significant impact
|
||||
on project velocity.
|
||||
|
||||
The same velocity makes it extremely risky to disable tests: once disabled a
|
||||
test is likely to bitrot quickly, making re-enabling such tests hard.
|
||||
|
||||
This gives the following principle:
|
||||
|
||||
* Test runs that can stop a patch landing must be highly available - there must
|
||||
be at least two distinct places the test can be run, with no shared failure
|
||||
domains other than things that the infra team itself is responsible for.
|
||||
|
||||
Test run styles
|
||||
===============
|
||||
|
||||
Experimental
|
||||
------------
|
||||
|
||||
Experimental jobs have low reliability requirements: they are run by hand on
|
||||
developer request, typically as part of bringing up a new job definition.
|
||||
Failures in experimental jobs are not the responsibility of openstack-infra,
|
||||
though they will offer best-effort assistance to developers.
|
||||
|
||||
Silent
|
||||
------
|
||||
|
||||
Silent jobs are jobs that are not yet ready to vote on code changes. They might
|
||||
not be ready because of known failures, a lack of redundancy in the
|
||||
infrastructure or some other reason. In all other regards they are the same
|
||||
as Check jobs, which means we find out about the test reliability and can
|
||||
accurately assess whether the job is ready to promote to Check status.
|
||||
|
||||
Third party
|
||||
-----------
|
||||
|
||||
Third party test jobs are able to vote on code changes (+/- 1 only). These jobs
|
||||
are run by third parties on code pushes, but are not able to prevent code
|
||||
landing. (Developers of projects usually take negative votes from third party
|
||||
systems seriously however). Third party test jobs cannot be gates, and cannot
|
||||
set the '+2 verified' flag on review.
|
||||
|
||||
Check
|
||||
-----
|
||||
|
||||
Check jobs are used to verify each patch pushed to Gerrit. Like a third party
|
||||
test job they run against a single pushed patch, rather than the proposed
|
||||
merged state of the repository. A failure reported by a check job will prevent
|
||||
the patch being approved. As such check jobs have to run in a highly available
|
||||
environment with only infra controlled components permitted to have shared
|
||||
failure domains.
|
||||
|
||||
Gate
|
||||
----
|
||||
|
||||
Gate jobs are used to detect failures in patches after they are approved. They
|
||||
run against the state the OpenStack projects will have if the code is merged
|
||||
(rather than the state of the pushed code). This allows detection of semantic
|
||||
conflicts cross-patch (and the usual state for OpenStack is that multiple
|
||||
patches are going through the gate at once, so this is crucial). Failures in
|
||||
the gate both prevent the patch landing and cause all the pending patches after
|
||||
it to be retested. Gate jobs, like check jobs, have to run in a highly
|
||||
available environment with only infra controlled components permitted to have
|
||||
shared failure domains.
|
Loading…
Reference in New Issue
Block a user