f7f400c9bf
External IDs are also used to store the account username and email addresses, but the introduction gives the impression that it is only for external IDs such as LDAP or OAuth. Change-Id: Ie31163ecc3337f75677b99313abcaa110a8b3cbf
430 lines
17 KiB
Plaintext
430 lines
17 KiB
Plaintext
= Gerrit Code Review - Accounts
|
||
|
||
== Overview
|
||
|
||
Starting from 2.15 Gerrit accounts are fully stored in
|
||
link:note-db.html[NoteDb].
|
||
|
||
The account data consists of a sequence number (account ID), account
|
||
properties (full name, preferred email, registration date, status,
|
||
inactive flag), preferences (general, diff and edit preferences),
|
||
project watches, SSH keys, external IDs, starred changes and reviewed
|
||
flags.
|
||
|
||
Most account data is stored in a special link:#all-users[All-Users]
|
||
repository, which has one branch per user. Within the user branch there
|
||
are Git config files for the link:#account-properties[
|
||
account properties], the link:#preferences[account preferences] and the
|
||
link:#project-watches[project watches]. In addition there is an
|
||
`authorized_keys` file for the link:#ssh-keys[SSH keys] that follows
|
||
the standard OpenSSH file format.
|
||
|
||
The account data in the user branch is versioned and the Git history of
|
||
this branch serves as an audit log.
|
||
|
||
The link:#external-ids[external IDs] are stored as Git Notes inside the
|
||
`All-Users` repository in the `refs/meta/external-ids` notes branch.
|
||
Storing all external IDs in a notes branch ensures that each external
|
||
ID is only used once.
|
||
|
||
The link:#starred-changes[starred changes] are represented as
|
||
independent refs in the `All-Users` repository. They are not stored in
|
||
the user branch, since this data doesn't need versioning.
|
||
|
||
The link:#reviewed-flags[reviewed flags] are not stored in Git, but are
|
||
persisted in a database table. This is because there is a high volume
|
||
of reviewed flags and storing them in Git would be inefficient.
|
||
|
||
Since accessing the account data in Git is not fast enough for account
|
||
queries, e.g. when suggesting reviewers, Gerrit has a
|
||
link:#account-index[secondary index for accounts].
|
||
|
||
[[all-users]]
|
||
== `All-Users` repository
|
||
|
||
The `All-Users` repository is a special repository that only contains
|
||
user-specific information. It contains one branch per user. The user
|
||
branch is formatted as `refs/users/CD/ABCD`, where `CD/ABCD` is the
|
||
link:access-control.html#sharded-user-id[sharded account ID], e.g. the
|
||
user branch for account `1000856` is `refs/users/56/1000856`. The
|
||
account IDs in the user refs are sharded so that there is a good
|
||
distribution of the Git data in the storage system.
|
||
|
||
A user branch must exist for each account, as it represents the
|
||
account. The files in the user branch are all optional. This means
|
||
having a user branch with a tree that is completely empty is also a
|
||
valid account definition.
|
||
|
||
Updates to the user branch are done through the
|
||
link:rest-api-accounts.html[Gerrit REST API], but users can also
|
||
manually fetch their user branch and push changes back to Gerrit. On
|
||
push the user data is evaluated and invalid user data is rejected.
|
||
|
||
To hide the implementation detail of the sharded account ID in the ref
|
||
name Gerrit offers a magic `refs/users/self` ref that is automatically
|
||
resolved to the user branch of the calling user. The user can then use
|
||
this ref to fetch from and push to the own user branch. E.g. if user
|
||
`1000856` pushes to `refs/users/self`, the branch
|
||
`refs/users/56/1000856` is updated. In Gerrit `self` is an established
|
||
term to refer to the calling user (e.g. in change queries). This is why
|
||
the magic ref for the own user branch is called `refs/users/self`.
|
||
|
||
A user branch should only be readable and writeable by the user to whom
|
||
the account belongs. To assign permissions on the user branches the
|
||
normal branch permission system is used. In the permission system the
|
||
user branches are specified as `refs/users/${shardeduserid}`. The
|
||
`${shardeduserid}` variable is resolved to the sharded account ID. This
|
||
variable is used to assign default access rights on all user branches
|
||
that apply only to the owning user. The following permissions are set
|
||
by default when a Gerrit site is newly installed or upgraded to a
|
||
version which supports user branches:
|
||
|
||
.All-Users project.config
|
||
----
|
||
[access "refs/users/${shardeduserid}"]
|
||
exclusiveGroupPermissions = read push submit
|
||
read = group Registered Users
|
||
push = group Registered Users
|
||
label-Code-Review = -2..+2 group Registered Users
|
||
submit = group Registered Users
|
||
----
|
||
|
||
The user branch contains several files with account data which are
|
||
described link:#account-data-in-user-branch[below].
|
||
|
||
In addition to the user branches the `All-Users` repository also
|
||
contains a branch for the link:#external-ids[external IDs] and special
|
||
refs for the link:#starred-changes[starred changes].
|
||
|
||
Also the next available value of the link:#account-sequence[account
|
||
sequence] is stored in the `All-Users` repository.
|
||
|
||
[[account-index]]
|
||
== Account Index
|
||
|
||
There are several situations in which Gerrit needs to query accounts,
|
||
e.g.:
|
||
|
||
* For sending email notifications to project watchers.
|
||
* For reviewer suggestions.
|
||
|
||
Accessing the account data in Git is not fast enough for account
|
||
queries, since it requires accessing all user branches and parsing
|
||
all files in each of them. To overcome this Gerrit has a secondary
|
||
index for accounts. The account index is either based on
|
||
link:config-gerrit.html#index.type[Lucene or Elasticsearch].
|
||
|
||
Via the link:rest-api-accounts.html#query-account[Query Account] REST
|
||
endpoint link:user-search-accounts.html[generic account queries] are
|
||
supported.
|
||
|
||
Accounts are automatically reindexed on any update. The
|
||
link:rest-api-accounts.html#index-account[Index Account] REST endpoint
|
||
allows to reindex an account manually. In addition the
|
||
link:pgm-reindex.html[reindex] program can be used to reindex all
|
||
accounts offline.
|
||
|
||
[[account-data-in-user-branch]]
|
||
== Account Data in User Branch
|
||
|
||
A user branch contains several Git config files with the account data:
|
||
|
||
* `account.config`:
|
||
+
|
||
Stores the link:#account-properties[account properties].
|
||
|
||
* `preferences.config`:
|
||
+
|
||
Stores the link:#preferences[user preferences] of the account.
|
||
|
||
* `watch.config`:
|
||
+
|
||
Stores the link:#project-watches[project watches] of the account.
|
||
|
||
In addition it contains an
|
||
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
|
||
authorized_keys] file with the link:#ssh-keys[SSH keys] of the account.
|
||
|
||
[[account-properties]]
|
||
=== Account Properties
|
||
|
||
The account properties are stored in the user branch in the
|
||
`account.config` file:
|
||
|
||
----
|
||
[account]
|
||
fullName = John Doe
|
||
preferredEmail = john.doe@example.com
|
||
status = OOO
|
||
active = false
|
||
----
|
||
|
||
For active accounts the `active` parameter can be omitted.
|
||
|
||
The registration date is not contained in the `account.config` file but
|
||
is derived from the timestamp of the first commit on the user branch.
|
||
|
||
When users update their account properties by pushing to the user
|
||
branch, it is verified that the preferred email exists in the external
|
||
IDs.
|
||
|
||
Users are not allowed to flip the active value themselves; only
|
||
administrators and users with the
|
||
link:access-control.html#capability_modifyAccount[Modify Account]
|
||
global capability are allowed to change it.
|
||
|
||
Since all data in the `account.config` file is optional the
|
||
`account.config` file may be absent from some user branches.
|
||
|
||
[[preferences]]
|
||
=== Preferences
|
||
|
||
The account properties are stored in the user branch in the
|
||
`preferences.config` file. There are separate sections for
|
||
link:intro-user.html#preferences[general],
|
||
link:user-review-ui.html#diff-preferences[diff] and edit preferences:
|
||
|
||
----
|
||
[general]
|
||
showSiteHeader = false
|
||
[diff]
|
||
hideTopMenu = true
|
||
[edit]
|
||
lineLength = 80
|
||
----
|
||
|
||
The parameter names match the names that are used in the preferences REST API:
|
||
|
||
* link:rest-api-accounts.html#preferences-info[General Preferences]
|
||
* link:rest-api-accounts.html#diff-preferences-info[Diff Preferences]
|
||
* link:rest-api-accounts.html#edit-preferences-info[Edit Preferences]
|
||
|
||
If the value for a preference is the same as the default value for this
|
||
preference, it can be omitted in the `preferences.config` file.
|
||
|
||
Defaults for preferences that apply for all accounts can be configured
|
||
in the `refs/users/default` branch in the `All-Users` repository.
|
||
|
||
[[project-watches]]
|
||
=== Project Watches
|
||
|
||
Users can configure watches on projects to receive email notifications
|
||
for changes of that project.
|
||
|
||
A watch configuration consists of the project name and an optional
|
||
filter query. If a filter query is specified, email notifications will
|
||
be sent only for changes of that project that match this query.
|
||
|
||
In addition, each watch configuration can contain a list of
|
||
notification types that determine for which events email notifications
|
||
should be sent. E.g. a user can configure that email notifications
|
||
should only be sent if a new patch set is uploaded and when the change
|
||
gets submitted, but not on other events.
|
||
|
||
Project watches are stored in a `watch.config` file in the user branch:
|
||
|
||
----
|
||
[project "foo"]
|
||
notify = * [ALL_COMMENTS]
|
||
notify = branch:master [ALL_COMMENTS, NEW_PATCHSETS]
|
||
notify = branch:master owner:self [SUBMITTED_CHANGES]
|
||
----
|
||
|
||
The `watch.config` file has one project section for all project watches
|
||
of a project. The project name is used as subsection name and the
|
||
filters with the notification types, that decide for which events email
|
||
notifications should be sent, are represented as `notify` values in the
|
||
subsection. A `notify` value is formatted as
|
||
"<filter> [<comma-separated-list-of-notification-types>]". The
|
||
supported notification types are described in the
|
||
link:user-notify.html#notify.name.type[Email Notifications documentation].
|
||
|
||
For a change event, a notification will be sent if any `notify` value
|
||
of the corresponding project has both a filter that matches the change
|
||
and a notification type that matches the event.
|
||
|
||
In order to send email notifications on change events, Gerrit needs to
|
||
find all accounts that watch the corresponding project. To make this
|
||
lookup fast the secondary account index is used. The account index
|
||
contains a repeated field that stores the projects that are being
|
||
watched by an account. After the accounts that watch the project have
|
||
been retrieved from the index, the complete watch configuration is
|
||
available from the account cache and Gerrit can check if any watch
|
||
matches the change and the event.
|
||
|
||
[[ssh-keys]]
|
||
=== SSH Keys
|
||
|
||
SSH keys are stored in the user branch in an `authorized_keys` file,
|
||
which is the
|
||
link:https://en.wikibooks.org/wiki/OpenSSH/Client_Configuration_Files#.7E.2F.ssh.2Fauthorized_keys[
|
||
standard OpenSSH file format] for storing SSH keys:
|
||
|
||
----
|
||
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCgug5VyMXQGnem2H1KVC4/HcRcD4zzBqSuJBRWVonSSoz3RoAZ7bWXCVVGwchtXwUURD689wFYdiPecOrWOUgeeyRq754YWRhU+W28vf8IZixgjCmiBhaL2gt3wff6pP+NXJpTSA4aeWE5DfNK5tZlxlSxqkKOS8JRSUeNQov5Tw== john.doe@example.com
|
||
# DELETED
|
||
# INVALID ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQDm5yP7FmEoqzQRDyskX+9+N0q9GrvZeh5RG52EUpE4ms/Ujm3ewV1LoGzc/lYKJAIbdcZQNJ9+06EfWZaIRA3oOwAPe1eCnX+aLr8E6Tw2gDMQOGc5e9HfyXpC2pDvzauoZNYqLALOG3y/1xjo7IH8GYRS2B7zO/Mf9DdCcCKSfw== john.doe@example.com
|
||
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAAAgQCaS7RHEcZ/zjl9hkWkqnm29RNr2OQ/TZ5jk2qBVMH3BgzPsTsEs+7ag9tfD8OCj+vOcwm626mQBZoR2e3niHa/9gnHBHFtOrGfzKbpRjTWtiOZbB9HF+rqMVD+Dawo/oicX/dDg7VAgOFSPothe6RMhbgWf84UcK5aQd5eP5y+tQ== john.doe@example.com
|
||
----
|
||
|
||
When the SSH API is used, Gerrit needs an efficient way to lookup SSH
|
||
keys by username. Since the username can be easily resolved to an
|
||
account ID (via the account cache), accessing the SSH keys in the user
|
||
branch is fast.
|
||
|
||
To identify SSH keys in the REST API Gerrit uses
|
||
link:rest-api-accounts.html#ssh-key-id[sequence numbers per account].
|
||
This is why the order of the keys in the `authorized_keys` file is
|
||
used to determine the sequence numbers of the keys (the sequence
|
||
numbers start at 1).
|
||
|
||
To keep the sequence numbers intact when a key is deleted, a
|
||
'# DELETED' line is inserted at the position where the key was deleted.
|
||
|
||
Invalid keys are marked with the prefix '# INVALID'.
|
||
|
||
[[external-ids]]
|
||
== External IDs
|
||
|
||
External IDs are used to link identities, such as the username and email
|
||
addresses, and external identies such as an LDAP account or an OAUTH
|
||
identity, to an account in Gerrit.
|
||
|
||
External IDs are stored as Git Notes in the `All-Users` repository. The
|
||
name of the notes branch is `refs/meta/external-ids`.
|
||
|
||
As note key the SHA1 of the external ID key is used, for example the key
|
||
for the external ID `username:jdoe` is `e0b751ae90ef039f320e097d7d212f490e933706`.
|
||
This ensures that an external ID is used only once (e.g. an external ID can
|
||
never be assigned to multiple accounts at a point in time).
|
||
|
||
[IMPORTANT]
|
||
If the external ID key is changed manually you must adapt the note key
|
||
to the new SHA1, otherwise the external ID becomes inconsistent and is
|
||
ignored by Gerrit.
|
||
|
||
The note content is a Git config file:
|
||
|
||
----
|
||
[externalId "username:jdoe"]
|
||
accountId = 1003407
|
||
email = jdoe@example.com
|
||
password = bcrypt:4:LCbmSBDivK/hhGVQMfkDpA==:XcWn0pKYSVU/UJgOvhidkEtmqCp6oKB7
|
||
----
|
||
|
||
The config file has one `externalId` section. The external ID key, which
|
||
consists of scheme and ID in the format '<scheme>:<id>', is used as
|
||
subsection name.
|
||
|
||
The `accountId` field is mandatory. The `email` and `password` fields
|
||
are optional.
|
||
|
||
The external IDs are maintained by Gerrit. This means users are not
|
||
allowed to manually edit their external IDs. Only users with the
|
||
link:access-control.html#capability_accessDatabase[Access Database]
|
||
global capability can push updates to the `refs/meta/external-ids`
|
||
branch. However Gerrit rejects pushes if:
|
||
|
||
* any external ID config file cannot be parsed
|
||
* if a note key does not match the SHA of the external ID key in the
|
||
note content
|
||
* external IDs for non-existing accounts are contained
|
||
* invalid emails are contained
|
||
* any email is not unique (the same email is assigned to multiple
|
||
accounts)
|
||
* hashed passwords of external IDs with scheme `username` cannot be
|
||
decoded
|
||
|
||
[[starred-changes]]
|
||
== Starred Changes
|
||
|
||
link:dev-stars.html[Starred changes] allow users to mark changes as
|
||
favorites and receive email notifications for them.
|
||
|
||
Each starred change is a tuple of an account ID, a change ID and a
|
||
label.
|
||
|
||
To keep track of a change that is starred by an account, Gerrit creates
|
||
a `refs/starred-changes/YY/XXXX/ZZZZZZZ` ref in the `All-Users`
|
||
repository, where `YY/XXXX` is the sharded numeric change ID and
|
||
`ZZZZZZZ` is the account ID.
|
||
|
||
A starred-changes ref points to a blob that contains the list of labels
|
||
that the account set on the change. The label list is stored as UTF-8
|
||
text with one label per line.
|
||
|
||
Since JGit has explicit optimizations for looking up refs by prefix
|
||
when the prefix ends with '/', this ref format is optimized to find
|
||
starred changes by change ID. Finding starred changes by change ID is
|
||
e.g. needed when a change is updated so that all users that have
|
||
the link:dev-stars.html#default-star[default star] on the change can be
|
||
notified by email.
|
||
|
||
Gerrit also needs an efficient way to find all changes that were
|
||
starred by an account, e.g. to provide results for the
|
||
link:user-search.html#is-starred[is:starred] query operator. With the
|
||
ref format as described above the lookup of starred changes by account
|
||
ID is expensive, as this requires a scan of the full
|
||
`refs/starred-changes/*` namespace. To overcome this the users that
|
||
have starred a change are stored in the change index together with the
|
||
star labels.
|
||
|
||
[[reviewed-flags]]
|
||
== Reviewed Flags
|
||
|
||
When reviewing a patch set in the Gerrit UI, the reviewer can mark
|
||
files in the patch set as reviewed. These markers are called ‘Reviewed
|
||
Flags’ and are private to the user. A reviewed flag is a tuple of patch
|
||
set ID, file and account ID.
|
||
|
||
Each user can have many thousands of reviewed flags and over time the
|
||
number can grow without bounds.
|
||
|
||
The high amount of reviewed flags makes a storage in Git unsuitable
|
||
because each update requires opening the repository and committing a
|
||
change, which is a high overhead for flipping a bit. Therefore the
|
||
reviewed flags are stored in a database table. By default they are
|
||
stored in a local H2 database, but there is an extension point that
|
||
allows to plug in alternate implementations for storing the reviewed
|
||
flags. To replace the storage for reviewed flags a plugin needs to
|
||
implement the link:dev-plugins.html#account-patch-review-store[
|
||
AccountPatchReviewStore] interface. E.g. to support a multi-master
|
||
setup where reviewed flags should be replicated between the master
|
||
nodes one could implement a store for the reviewed flags that is
|
||
based on MySQL with replication.
|
||
|
||
[[account-sequence]]
|
||
== Account Sequence
|
||
|
||
The next available account sequence number is stored as UTF-8 text in a
|
||
blob pointed to by the `refs/sequences/accounts` ref in the `All-Users`
|
||
repository.
|
||
|
||
Multiple processes share the same sequence by incrementing the counter
|
||
using normal git ref updates. To amortize the cost of these ref
|
||
updates, processes increment the counter by a larger number and hand
|
||
out numbers from that range in memory until they run out. The size of
|
||
the account ID batch that each process retrieves at once is controlled
|
||
by the link:config-gerrit.html#notedb.accounts.sequenceBatchSize[
|
||
notedb.accounts.sequenceBatchSize] parameter in the `gerrit.config`
|
||
file.
|
||
|
||
[[replication]]
|
||
== Replication
|
||
|
||
To replicate account data the following branches from the `All-Users`
|
||
repository must be replicated:
|
||
|
||
* `refs/users/*` (user branches)
|
||
* `refs/meta/external-ids` (external IDs)
|
||
* `refs/starred-changes/*` (star labels)
|
||
* `refs/sequences/accounts` (account sequence numbers, not needed for Gerrit
|
||
slaves)
|
||
|
||
GERRIT
|
||
------
|
||
Part of link:index.html[Gerrit Code Review]
|
||
|
||
SEARCHBOX
|
||
---------
|