zuul/zuul/executor
James E. Blair 61822ec737 Use a transaction for BuildCompletedEvent
There is a race condition where an executor may crash and leave
a stuck build.  We can avoid that by performing the following two
actions in a transaction:

* Update the build request state to COMPLETED
* Submit the BuildCompletedEvent to the event queue

The race condition occurs when the build request is marked as completed
but no BuildCompletedEvent arrives.  In that case, Zuul sees the
completed build request and assumes that the event will be forthcoming;
therefore the build request itself is not considered lost.  Such a
build request can then only be removed by a buildset reset.

By including these operations in a transaction, only the following
states are possible if the executor crashes:

* It crashes before the build is complete: the build is declared lost
  and restarted.
* It crashes after the build is complete: the request is already
  COMPLETED and the event is already queued, so the scheduler is
  unaffected by the crash.
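
The two-step transaction described above can be sketched as follows.
This is a minimal illustration and not the code in server.py: it assumes
`client` behaves like a connected kazoo `KazooClient`, and the function
name, the paths, and the JSON layout are all hypothetical.

```python
import json


def complete_build(client, request_path, request, event_root, event):
    """Mark a build request COMPLETED and queue its event atomically.

    `client` is assumed to behave like a connected kazoo KazooClient;
    the paths and the JSON layout are hypothetical.
    """
    completed = dict(request, state="completed")

    tr = client.transaction()
    # Operation 1: update the build request znode.
    tr.set_data(request_path, json.dumps(completed).encode("utf8"))
    # Operation 2: add the event as a sequence znode under the queue root.
    tr.create(event_root + "/", json.dumps(event).encode("utf8"),
              sequence=True)

    # ZooKeeper applies both operations or neither of them.
    results = tr.commit()
    failures = [r for r in results if isinstance(r, Exception)]
    if failures:
        raise RuntimeError("transaction rolled back: %r" % (failures,))
    return results
```

Because both operations are committed together, a crash before
`commit()` leaves the request in its previous state, and a crash after
it leaves both the completed request and the queued event in place.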

Transactions are limited to 1MB just like any other ZK network operation,
and the result data can be large, but we already put that in a side-channel
if it exceeds a certain size, so only the actual event znode and request
znode need to be involved in the transaction.
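
The size handling described above might look roughly like this.  It is
a sketch only: the threshold value, the `offload` callback, and the
`result_ref` key are hypothetical and are not taken from Zuul's code.

```python
import json

# Hypothetical threshold, kept well below the 1MB transaction limit.
MAX_INLINE_BYTES = 512 * 1024


def build_event_payload(result, offload):
    """Return the bytes to store in the event znode.

    `offload` is a hypothetical callable that stores bytes in a side
    channel and returns a reference to them.
    """
    data = json.dumps(result).encode("utf8")
    if len(data) <= MAX_INLINE_BYTES:
        # Small results travel inside the event itself.
        return json.dumps({"result": result}).encode("utf8")
    # Large results are written outside the transaction; the event
    # carries only a reference, so the transaction stays small.
    ref = offload(data)
    return json.dumps({"result_ref": ref}).encode("utf8")
```

The offload step has to happen before the transaction is committed, so
that the reference in the event always points at data that exists.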

Change-Id: Ibedf2c5db825fb444f652b60e1c6f2c7aadc6950
2022-03-14 09:11:18 -07:00
sensors      Enable starting executors in paused mode   2019-11-04 13:13:38 +01:00
__init__.py  Rename zuul-launcher to zuul-executor      2017-03-15 12:21:24 -04:00
client.py    Implement job freezing API in zuul-web     2021-11-10 09:25:49 +01:00
common.py    Load job from pipeline state on executors  2021-11-23 15:16:32 -08:00
server.py    Use a transaction for BuildCompletedEvent  2022-03-14 09:11:18 -07:00