The current system uses an ExternalId cache that is serialized and
written to disk (if configured). The cache holds an entire generation
of all external IDs, keyed by the SHA1 of `refs/meta/external-ids`.
This is roughly `Cache<ObjectId, List<ExternalId>>`.
Prior to this commit, on servers where an update did not originate
(other masters or slaves), the cache loader re-read all external IDs
from Git every time it was called. On googlesource.com, these
regenerations can take up to 60 seconds. During that time, the Gerrit
instance comes to a grinding halt because many code paths, including
authentication, depend on a value of this cache being present.
This commit rewrites the loader to compute the new state
differentially: it starts from a previously cached state and applies
the modifications obtained from a Git diff.
Given the SHA1 (tip in refs/meta/external-ids) that is requested, the
logic first tries to find a state that we have cached by walking Git
history. This is best-effort and we allow at most 10 commits to be
walked.
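The bounded lookup described above can be sketched as follows. This is
a simplified model, not the code in this commit: commits are plain
strings, history is a hypothetical child-to-first-parent map, and the
names CachedStateFinder, findCachedAncestor and MAX_HISTORY_LOOKBACK
are made up for illustration. It also assumes that the requested tip
itself counts as one of the 10 inspected commits.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical model: commits are strings, history is a child -> first-parent map. */
final class CachedStateFinder {
  /** Mirrors the "at most 10 commits" limit from the description. */
  static final int MAX_HISTORY_LOOKBACK = 10;

  /**
   * Walks first-parent history from the requested tip and returns the first
   * commit for which a cached state exists, or empty if none is found within
   * the lookback limit (best effort).
   */
  static Optional<String> findCachedAncestor(
      String requestedTip, Map<String, String> firstParent, Set<String> cachedKeys) {
    String current = requestedTip;
    for (int walked = 0; walked < MAX_HISTORY_LOOKBACK && current != null; walked++) {
      if (cachedKeys.contains(current)) {
        return Optional.of(current);
      }
      // Returns null once the root commit is passed, which ends the loop.
      current = firstParent.get(current);
    }
    return Optional.empty();
  }
}
```

When the result is empty, a caller would presumably fall back to
loading all external IDs from scratch, as the old loader did.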
Once a prior state is found, we use that state's SHA1 to do a tree diff
between it and the requested state. The new state is then generated by
applying the mutations from that diff to the prior state.
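As a rough illustration of deriving the new state from the prior one,
the sketch below models a cached state as a map from note blob ID to
external ID content. The class DifferentialState, the method apply and
the string-based representation are assumptions made for this example.
A modified external ID is modeled as the removal of its old blob plus
the addition of its new blob.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical model: a state maps note blob IDs to external ID content. */
final class DifferentialState {
  /**
   * Builds the new state from the previous one without re-reading unchanged
   * entries: drop the blobs the diff reports as removed, then add the new ones.
   */
  static Map<String, String> apply(
      Map<String, String> previous, Set<String> removedBlobs, Map<String, String> addedBlobs) {
    // Copy so that the previously cached state stays untouched.
    Map<String, String> next = new HashMap<>(previous);
    next.keySet().removeAll(removedBlobs);
    next.putAll(addedBlobs);
    return Collections.unmodifiableMap(next);
  }
}
```

The cost of this step depends on the size of the diff plus one copy of
the prior state, rather than on re-reading every external ID from Git.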
JGit's DiffFormatter is smart in that it only traverses trees that have
changed and doesn't load any file content, which ensures that we
perform only the minimal number of Git operations. This matters because
NoteMap (the storage format of external IDs on disk) shards pretty
aggressively and we don't want to load all trees when applying only
deltas.
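To illustrate why skipping unchanged trees matters for a sharded
layout, here is a small self-contained model. It does not use JGit and
is not how DiffFormatter is implemented; ShardedTreeDiff and Node are
hypothetical stand-ins in which every object carries an ID and two
subtrees with equal IDs are assumed to have equal content.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical model of a tree diff that never descends into unchanged subtrees. */
final class ShardedTreeDiff {
  /** Stand-in for a Git object: blobs have no children map, trees do. */
  static final class Node {
    final String id;
    final Map<String, Node> children; // null for blobs

    private Node(String id, Map<String, Node> children) {
      this.id = id;
      this.children = children;
    }

    static Node blob(String id) {
      return new Node(id, null);
    }

    static Node tree(String id, Map<String, Node> children) {
      return new Node(id, new TreeMap<>(children));
    }
  }

  int treesVisited;
  final List<String> added = new ArrayList<>();
  final List<String> removed = new ArrayList<>();

  /** Records added and removed blob IDs; blob content is never inspected. */
  void diff(Node oldNode, Node newNode) {
    if (oldNode != null && newNode != null && oldNode.id.equals(newNode.id)) {
      return; // Same ID means same content: skip the whole subtree.
    }
    if (oldNode != null && oldNode.children == null) {
      removed.add(oldNode.id);
      oldNode = null;
    }
    if (newNode != null && newNode.children == null) {
      added.add(newNode.id);
      newNode = null;
    }
    if (oldNode == null && newNode == null) {
      return;
    }
    treesVisited++;
    Map<String, Node> oldChildren =
        oldNode == null ? Collections.<String, Node>emptyMap() : oldNode.children;
    Map<String, Node> newChildren =
        newNode == null ? Collections.<String, Node>emptyMap() : newNode.children;
    TreeSet<String> names = new TreeSet<>(oldChildren.keySet());
    names.addAll(newChildren.keySet());
    for (String name : names) {
      diff(oldChildren.get(name), newChildren.get(name));
    }
  }
}
```

With two shards under the root and one note added to the second shard,
this model visits only the root and the changed shard and reports a
single added blob ID.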
Once the (tree) diff is computed, we read the newly added external IDs
using an ObjectReader.
There is currently a broader discussion about whether the primary
storage format of external IDs should be changed (I87119506ec04).
This commit doesn't answer or interfere with that discussion. However,
whether that redesign is required will, among other things, depend on
the impact of this commit on the problems that I87119506ec04 outlines.
We hope that this commit already mitigates a large chunk of the slow
loading issues. If there is still a need after this commit, we will use
microbenchmarking and look more closely at how the collections are
handled.
Change-Id: I0e67d3538e2ad17812598a1523e78fd71a7bd88a