Data on googlesource.com suggests that we spend a significant amount of
time loading accounts from NoteDb. This is true for all Gerrit
installations, but especially for distributed setups or setups that
restart often.
This commit serializes the AccountCache using established mechanisms.
To do that, we decompose AccountState - the entity that we currently
cache - into smaller chunks that can be cached individually:
1) External IDs + user name (cached in ExternalIdCache)
2) CachedAccountDetails (newly cached)
3) Gerrit's default settings (we start caching this in a follow-up
change)
CachedAccountDetails - a new class representing all information stored
under the user's ref (refs/users/<sharded-id>) - is now cached in the
'accounts' cache instead of AccountState. AccountState is constructed
on request from sources 1-3 and is not cached itself, as it is
just a plain wrapper around other state that we already cache.
This has the following advantages:
1) CachedAccountDetails contains only details from
refs/users/<sharded-id>.
This lets us use the SHA1 of that ref as the cache key and start
serializing the cache to eliminate the cold start penalty as well as
the router assignment change penalty (for distributed setups).
It also means that we no longer have to do invalidation ourselves.
2) When the server's default preferences change, we don't have to
invalidate all accounts anymore. This is a shortcoming of the
current approach.
3) The projected speed improvements from persisting the cache let us
remove the logic to load accounts in parallel.
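
As a rough illustration of advantage 1, the sketch below keys a cache on
the account ID plus the SHA1 that the user's ref points to. All names here
(RefKeyedCacheSketch, Key, get) are hypothetical and are not the classes
touched by this change; the cached value is a plain String standing in for
CachedAccountDetails. Because a new ref SHA1 yields a new key, stale
entries are simply never looked up again, so no explicit invalidation is
needed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical sketch; not Gerrit's actual cache implementation.
final class RefKeyedCacheSketch {
  /** Cache key: account ID plus the SHA1 of refs/users/<sharded-id>. */
  static final class Key {
    final int accountId;
    final String refSha1;

    Key(int accountId, String refSha1) {
      this.accountId = accountId;
      this.refSha1 = refSha1;
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Key)) {
        return false;
      }
      Key other = (Key) o;
      return accountId == other.accountId && refSha1.equals(other.refSha1);
    }

    @Override
    public int hashCode() {
      return Objects.hash(accountId, refSha1);
    }
  }

  private final Map<Key, String> cache = new HashMap<>();

  // Counts how often the loader ran, i.e. how often we had a cache miss.
  int loads = 0;

  /** Returns the cached value for the current ref state, loading on a miss. */
  String get(int accountId, String currentRefSha1, Function<Key, String> loader) {
    Key key = new Key(accountId, currentRefSha1);
    return cache.computeIfAbsent(
        key,
        k -> {
          loads++;
          return loader.apply(k);
        });
  }
}
```

A lookup with an unchanged SHA1 is a hit; once the ref moves, the lookup
uses a different key and triggers a fresh load.
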
The new approach also means that:
1) We now need to get the SHA1 from refs/users/<sharded-id> for
every account that we look up. Data suggests that this is not an
issue for latency as ref lookups are cheap. We retain the method
in AccountCacheImpl that allows the caller to load a
Set<AccountState> so that in cases where we want many accounts
(change queries, ...) we have to open All-Users only once.
If we discover that - against our assumptions - this is a
bottleneck, we can add a small in-memory cache for AccountState.
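
The batch lookup described in point 1 can be sketched as follows. This is
only an illustration under assumptions: FakeRepo is a hypothetical
stand-in for an opened All-Users repository, and resolveSha1s is an
invented helper, not a method in AccountCacheImpl. The point is that the
repository is opened once and every ref lookup for the batch happens
against that single handle.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; not Gerrit's actual AccountCacheImpl.
final class BatchLookupSketch {
  /** Stand-in for an opened All-Users repository; counts how often it is opened. */
  static final class FakeRepo implements AutoCloseable {
    static int opens = 0;
    private final Map<Integer, String> refs;

    FakeRepo(Map<Integer, String> refs) {
      this.refs = refs;
      opens++;
    }

    /** Returns the SHA1 of the account's user ref, or null if there is none. */
    String sha1Of(int accountId) {
      return refs.get(accountId);
    }

    @Override
    public void close() {}
  }

  /** Resolves the ref SHA1 of every requested account with a single open. */
  static Map<Integer, String> resolveSha1s(
      Map<Integer, String> refs, Set<Integer> accountIds) {
    Map<Integer, String> result = new LinkedHashMap<>();
    try (FakeRepo repo = new FakeRepo(refs)) {
      for (int id : accountIds) {
        String sha1 = repo.sha1Of(id);
        if (sha1 != null) {
          result.put(id, sha1); // accounts without a user ref are skipped
        }
      }
    }
    return result;
  }
}
```

The resolved SHA1s could then serve as cache keys for the individual
account lookups.
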
Related prework:
The new approach shows that the way we handle user preferences is
suboptimal, because:
1) We pipe API data types through to the storage
2) We overlay defaults directly in the storage
3) We use reflection to get/set fields.
I considered and prototyped a rewrite of this and initially thought I
could get it done before serializing the account cache. However, it turned
out to be significantly more work and the impact of that work (besides
being a much desired cleanup) is rather low. So I decided to get the
cache serialized independently.
Change-Id: I61ae57802f37c62ee9e3552e4a0f19fe3d8d762b