zuul/zuul/driver/gerrit
James E. Blair 81fe0a50d1 Stop using submitted-together for submitWholeTopic
During development of the recent change to reduce the number of
queries sent to Gerrit during circular dependency event storms,
it was noted that the emulated submit-whole-topic behavior in Zuul
need not be subject to the same behavior, because it only acts after
a change is enqueued into a pipeline.  When a change is enqueued,
it can query Gerrit once to obtain all changes in the topic, and then
add them to the cycle at once.  It does not do that today -- in fact,
it will query Gerrit once for each change in the cycle just like the
Gerrit driver.  However, we can cache the results of that query so
that when dealing with a cycle, we only perform the query once.
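
As a rough illustration of the caching idea described above (a minimal
sketch, not Zuul's actual code: the TopicQueryCache class, its method
names, and the query_fn callable are all hypothetical), a per-cycle
cache could look like this:

```python
from typing import Callable, Dict, List


class TopicQueryCache:
    """Hypothetical per-cycle cache of topic query results.

    query_fn stands in for whatever call actually asks Gerrit for the
    changes in a topic; it is an assumption made for illustration.
    """

    def __init__(self, query_fn: Callable[[str], List[str]]) -> None:
        self._query_fn = query_fn
        self._results: Dict[str, List[str]] = {}
        self.query_count = 0  # number of queries actually sent

    def changes_in_topic(self, topic: str) -> List[str]:
        # Only the first lookup for a topic reaches Gerrit; later
        # lookups while processing the same cycle reuse the result.
        if topic not in self._results:
            self.query_count += 1
            self._results[topic] = self._query_fn(topic)
        return self._results[topic]


# Usage: three changes in one cycle ask for the same topic.
cache = TopicQueryCache(lambda topic: ["1001", "1002", "1003"])
for _ in range(3):
    members = cache.changes_in_topic("my-topic")
```

The cache would be discarded once the cycle has been processed, so a
later enqueue would query Gerrit again.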

Rather than handling submitWholeTopic changes directly in the gerrit
driver as we do now, let's remove that code and just use the
"emulated" path for both cases.  We will automatically enable the
"emulated" path if the server has submitWholeTopic enabled, so the
user-visible functionality is the same.  Moreover, this fits better
with our desire to handle dependencies in the pipeline manager as
much as possible.

This means that if a user uploads 100 changes, we will query
Gerrit 4 times for each change; the four queries being:

* The change info itself
* Related changes (git parent/child)
* Files list
* Reverse commit dependencies (Depends-On that point to this change)

And that is it until a change is enqueued.  Since there is a built-in
delay in the Gerrit driver, at least 10 seconds should elapse between
the first change in a cycle being uploaded and Zuul enqueuing that change
into a pipeline.  Assuming that all the changes are able to be uploaded
within that window (or if the topic is being created by updating change
topics), then only one more query should need to be performed: to get
the list of changes in the topic on enqueue.

In this fast case, the total queries are:

  queries = 5*count

  100 changes gives 500 queries

If changes are updated outside of that 10 second window, more queries
will happen as items are removed from the pipeline and re-added as their
dependency cycle changes, but that is no different than today.  It is
also activity on a human timescale, and so less likely to impact Zuul's
performance.  However, extra queries may be performed due to the
following:

When the scheduler forwards a change event to pipelines, it updates the
change's dependencies first in order to decide if it is relevant to
changes already in the pipeline.  That will cause a topic query to be
performed.  Then, once the pipeline manager runs, it will update the
dependencies of all the changes in the queue item, performing the query
again; but that query will be cached for the rest of the cycle.  This
means that when changes are added slowly to the pipeline, we will perform
two queries for each change, one when forwarding the event to the pipeline,
and one for the cycle in the pipeline.

That means the total queries are:

  queries = 4*count + 2*count - 1

  100 changes gives 599 queries
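
The two totals given in this message can be checked with a short
calculation.  This is only a worked restatement of the formulas above;
the function names are made up for illustration:

```python
def fast_case_queries(count: int) -> int:
    """Total from the fast case above: queries = 5*count."""
    return 5 * count


def slow_case_queries(count: int) -> int:
    """Total from the slow case above: queries = 4*count + 2*count - 1."""
    return 4 * count + 2 * count - 1


fast = fast_case_queries(100)   # 500
slow = slow_case_queries(100)   # 599
```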

This change retains the implementation and testing of the submitted-together
fake gerrit API endpoint, even though it is no longer used, for completeness
in case we find we want to use it again in the future.

One of the tests for max-dependencies inside the gerrit driver is updated
because without using the submitted-together endpoint, the driver no
longer recursively follows all git dependencies, so a series of depends-on
footers is used to achieve the same effect.  Keep in mind that there is
a separate pipeline max-dependencies option which will still protect
pipelines from having too many dependencies, no matter the source.

The check to exit early from processing the dependency graph is removed
because it behaves erroneously in the case where a change is enqueued into
a pipeline with no dependencies and then another change is added to its
topic.  This bug was likely masked by extra queries and updates performed
in other circumstances.  It is now covered by tests.

The isChangeRelevantToPipeline method is also corrected.  It only effectively
checked direct dependencies; the topic checking was erroneous and actually
checked whether the change being added was its own dependency (oddly: yes!
in the case of emulated topic dependencies, which is also corrected in this
change).  It now correctly checks whether dependencies are in the pipeline.

Change-Id: I20c7a8f6f1b8a869e163a1524d96fe53ef20a291
2024-06-10 18:13:45 -07:00
__init__.py Interface to get a driver's trigger event class 2021-03-18 09:23:49 +01:00
auth.py gerrit: Fix 'form' auth 2022-04-25 09:30:48 -05:00
gcloudauth.py Add gcloud_service auth option for Gerrit driver 2020-01-30 08:09:00 -08:00
gerritconnection.py Stop using submitted-together for submitWholeTopic 2024-06-10 18:13:45 -07:00
gerriteventawskinesis.py Add AWS Kinesis support 2023-07-25 11:04:19 -07:00
gerriteventchecks.py Refactor Gerrit driver event sources 2023-07-13 14:02:46 -07:00
gerriteventgcloudpubsub.py Add gcloud pubsub support to Gerrit driver 2023-08-02 14:50:28 -07:00
gerriteventkafka.py Add Kafka support to Gerrit 2023-07-15 14:41:23 -07:00
gerriteventssh.py Refactor Gerrit driver event sources 2023-07-13 14:02:46 -07:00
gerritmodel.py gerrit: Add approval-change trigger 2024-05-03 15:39:46 -07:00
gerritreporter.py Make the test change database serializable 2024-05-14 10:53:20 -07:00
gerritsource.py Stop using submitted-together for submitWholeTopic 2024-06-10 18:13:45 -07:00
gerrittrigger.py gerrit: Add approval-change trigger 2024-05-03 15:39:46 -07:00