gerrit/gerrit-oauth
Edwin Kempin 744d2b8967 Migrate external IDs to NoteDb (part 1)
In NoteDb external IDs are stored in the All-Users repository in a Git
Notes branch called refs/meta/external-ids where the sha1 of the
external ID is used as note name. Each note content is a Git config
file that contains an external ID. It has exactly one externalId
subsection with an accountId and optionally email and password:

  [externalId "username:jdoe"]
     accountId = 1003407
     email = jdoe@example.com
     password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7

Storing the external IDs in a Git Notes branch with using the sha1 of
the external ID as note name ensures that external IDs are unique and
are only assigned to a single account. If it is tried to assign the
same external ID concurrently to different accounts, only one Git
update succeeds while the other Git updates fail with LOCK_FAILURE.
This means assigning external IDs is also safe in a multimaster setup
if a consensus algorithm for updating Git refs is implemented (which
is needed for multimaster in any case). Alternatively it was
considered to store the external IDs per account as Git config file in
the refs/users/<sharded-id> user branches in the All-Users repository
(see abandoned change 9f9f07ef). This approach was given up because in
race conditions it allowed to assign the same external ID to different
accounts by updating different branches in Git.

To support a live migration on a multi-master Gerrit installation, the
migration of external IDs from ReviewDb to NoteDb is done in 2 steps:

- part 1 (this change):
  * always write to both backends (ReviewDb and NoteDb)
  * always read external IDs from ReviewDb
  * upgraded instances write to both backends, old instances only
    write to ReviewDb
  * after upgrading all instances (all still read from ReviewDb)
    run a batch to copy all external IDs from the ReviewDb to NoteDb
- part 2 (next change):
  * bump the database schema version
  * migrate the external IDs from ReviewDb to NoteDb (for single instance
    Gerrit servers)
  * read external IDs from NoteDb
  * delete the database table

With this change reading external IDs from NoteDb is not implemented
yet. This is because the storage format of external IDs in NoteDb
doesn't support efficient lookup of external IDs by account and this
problem is only addressed in the follow-up change (it adds a cache for
external IDs, but this cache uses the revision of the notes branch as
key, and hence can be only implemented once the external IDs are fully
migrated to NoteDb and storing external IDs in ReviewDb is dropped).

The ExternalIdsUpdate class implements updating of external IDs in
both NoteDb and ReviewDb. It provides various methods to update
external IDs (e.g. insert, upsert, delete, replace). For NoteDb each
method invocation leads to one commit in the Git notes branch.
ExternalIdsUpdate has two factories, User and Server. This allows to
record either the calling user or the Gerrit server identity as
committer for an update of the external Ids.

External IDs are now represented by a new AutoValue class called
ExternalId. This class replaces the usage of the old gwtorm entity
AccountExternalId class. For ExternalId scheme names are the same as for
AccountExternalId but no longer include the trailing ':'.

The class ExternalIdsOnInit makes it possible to update external IDs
during the init phase. This is required for inserting external IDs for
the initial admin user which is created by InitAdminUser. We need a
special class for this since not all dependencies of ExternalIdsUpdate
are available during init.

The class ExternalIdsBatchUpdate allows to do batch updates to
external IDs. For NoteDb all updates will result in a single commit to
the refs/meta/external-ids Git notes branch.

LocalUsernamesToLowerCase is now always converting the usernames in a
single thread only. This allows us to get a single commit for the
username convertion in NoteDb (this would not be possible if workers
do updates in parallel). Since LocalUsernamesToLowerCase is rather
light-weight being able to parallelize work is not really needed and
removing the workers simplifies the code significantly.

To protect the refs/meta/external-ids Git notes branch in the All-Users
repository read access for this ref is only allowed to users that have
the 'Access Database' global capability assigned. In addition
there is a commit validator that disallows updating the
refs/meta/external-ids branch by push. This is to prevent that the
external IDs in NoteDb diverge from the external IDs in ReviewDb while
the migration to NoteDb is not fully done yet.

Change-Id: Ic9bd5791e84ee8d332ccb1f709970b59ee66b308
Signed-off-by: Edwin Kempin <ekempin@google.com>
2017-02-28 09:09:39 +01:00
..
2017-02-28 09:09:39 +01:00