fd6ddff55c
Change-Id: I812bcedb07d8986d938b47d0a70bc1e89c7f05d6
180 lines
8.5 KiB
Plaintext
180 lines
8.5 KiB
Plaintext
= Gerrit Code Review - NoteDb Backend
|
|
|
|
NoteDb is the next generation of Gerrit storage backend, which replaces the
|
|
traditional SQL backend for change and account metadata with storing data in the
|
|
same repository as code changes.
|
|
|
|
.Advantages
|
|
- *Simplicity*: All data is stored in one location in the site directory, rather
|
|
than being split between the site directory and a possibly external database
|
|
server.
|
|
- *Consistency*: Replication and backups can use a snapshot of the Git
|
|
repository refs, which will include both the branch and patch set refs, and
|
|
the change metadata that points to them.
|
|
- *Auditability*: Rather than storing mutable rows in a database, modifications
|
|
to changes are stored as a sequence of Git commits, automatically preserving
|
|
history of the metadata. +
|
|
There are no strict guarantees, and meta refs may be rewritten, but the
|
|
default assumption is that all operations are logged.
|
|
- *Extensibility*: Plugin developers can add new fields to metadata without the
|
|
core database schema having to know about them.
|
|
- *New features*: Enables simple federation between Gerrit servers, as well as
|
|
offline code review and interoperation with other tools.
|
|
|
|
== Current Status
|
|
|
|
- Storing change metadata is fully implemented in master, and is live on the
|
|
servers behind `googlesource.com`. In other words, if you use
|
|
link:https://gerrit-review.googlesource.com/[gerrit-review], you're already
|
|
using NoteDb. +
|
|
- Storing some account data, e.g. user preferences, is implemented in releases
|
|
back to 2.13.
|
|
- Storing the rest of account data is a work in progress.
|
|
- Storing group data is a work in progress.
|
|
|
|
To use a configuration similar to the current state of `googlesource.com`, paste
|
|
the following config snippet in your `gerrit.config`:
|
|
|
|
----
|
|
[noteDb "changes"]
|
|
write = true
|
|
read = true
|
|
primaryStorage = NOTE_DB
|
|
disableReviewDb = true
|
|
----
|
|
|
|
This is the configuration that will be produced if you enable experimental
|
|
NoteDb support on a new site with `init`.
|
|
|
|
`googlesource.com` additionally uses `fuseUpdates = true`, because its repo
|
|
backend supports atomic multi-ref transactions. Native JGit doesn't, so setting
|
|
this option on a dev server would fail.
|
|
|
|
For an example NoteDb change, poke around at this one:
|
|
----
|
|
git fetch https://gerrit.googlesource.com/gerrit refs/changes/70/98070/meta \
|
|
&& git log -p FETCH_HEAD
|
|
----
|
|
|
|
== Configuration
|
|
|
|
Account and group data is migrated to NoteDb automatically using the normal
|
|
schema upgrade process during updates. The remainder of this section details the
|
|
configuration options that control migration of the change data, which is mostly
|
|
but not fully implemented.
|
|
|
|
Change migration state is configured in `gerrit.config` with options like
|
|
`noteDb.changes.*`. These options are undocumented outside of this file, and the
|
|
general approach has been to add one new option for each phase of the migration.
|
|
Assume that each config option in the following list requires all of the
|
|
previous options, unless otherwise noted.
|
|
|
|
- `noteDb.changes.write=true`: During a ReviewDb write, the state of the change
|
|
in NoteDb is written to the `note_db_state` field in the `Change` entity.
|
|
After the ReviewDb write, this state is written into NoteDb, resulting in
|
|
effectively double the time for write operations. NoteDb write errors are
|
|
dropped on the floor, and no attempt is made to read from ReviewDb or correct
|
|
errors (without additional configuration, below). +
|
|
This state allows for a rolling update in a multi-master setting, where some
|
|
servers can start reading from NoteDb, but older servers are still reading
|
|
only from ReviewDb.
|
|
- `noteDb.changes.read=true`: Change data is written
|
|
to and read from NoteDb, but ReviewDb is still the source of truth. During
|
|
reads, first read the change from ReviewDb, and compare its `note_db_state`
|
|
with what is in NoteDb. If it doesn't match, immediately "auto-rebuild" the
|
|
change, copying data from ReviewDb to NoteDb and returning the result.
|
|
- `noteDb.changes.primaryStorage=NOTE_DB`: New changes are written only to
|
|
NoteDb, but changes whose primary storage is ReviewDb are still supported.
|
|
Continues to read from ReviewDb first as in the previous stage, but if the
|
|
change is not in ReviewDb, falls back to reading from NoteDb. +
|
|
Migration of existing changes is described in the link:#migration[Migration]
|
|
section below. +
|
|
Due to an implementation detail, writes to Changes or related tables still
|
|
result in write calls to the database layer, but they are inside a transaction
|
|
that is always rolled back.
|
|
- `noteDb.changes.disableReviewDb=true`: All access to Changes or related tables
|
|
is disabled; reads return no results, and writes are no-ops. Assumes the state
|
|
of all changes in NoteDb is accurate, and so is only safe once all changes are
|
|
NoteDb primary. Otherwise, reading changes only from NoteDb might result in
|
|
inaccurate results, and writing to NoteDb would compound the problem. +
|
|
Thus it is up to an admin of a previously-ReviewDb site to ensure
|
|
MigratePrimaryStorage has been run for all changes. Note that the current
|
|
implementation of the `migrate-to-note-db` program does not do this. +
|
|
In this phase, it would be possible to delete the Changes tables out from
|
|
under a running server with no effect.
|
|
- `noteDb.changes.fuseUpdates=true`: Code and meta updates within a single
|
|
repository are fused into a single atomic `BatchRefUpdate`, so they either
|
|
all succeed or all fail. This mode is only possible on a backend that
|
|
supports atomic ref updates, which notably excludes the default file repository
|
|
backend.
|
|
|
|
[[migration]]
|
|
== Migration
|
|
|
|
Once configuration options are set, migration to NoteDb is primarily
|
|
accomplished by running the `migrate-to-note-db` program. Currently, this program
|
|
bulk copies ReviewDb data into NoteDb, but leaves primary storage of these
|
|
changes in ReviewDb, so the site is runnable with
|
|
`noteDb.changes.{write,read}=true`, but ReviewDb is still required.
|
|
|
|
Eventually, `migrate-to-note-db` will set primary storage to NoteDb for all
|
|
changes by default, so a site will be able to stop using ReviewDb for changes
|
|
immediately after a successful run.
|
|
|
|
There is code in `PrimaryStorageMigrator.java` to migrate individual changes
|
|
from NoteDb primary to ReviewDb primary. This code is not intended to be used
|
|
except in the event of a critical bug in NoteDb primary changes in production.
|
|
It will likely never be used by `migrate-to-note-db`, and in fact it's not
|
|
recommended to run `migrate-to-note-db` until the code is stable enough that the
|
|
reverse migration won't be necessary.
|
|
|
|
=== Zero-Downtime Multi-Master Migration
|
|
|
|
Single-master Gerrit sites can use `migrate-to-note-db` on an offline site to
|
|
rebuild NoteDb, but this doesn't work in a zero-downtime environment like
|
|
googlesource.com.
|
|
|
|
Here, the migration process looks like:
|
|
|
|
- Turn on `noteDb.changes.write=true` to start writing to NoteDb.
|
|
- Run a parallel link:https://research.google.com/pubs/pub35650.html[FlumeJava]
|
|
pipeline to write NoteDb data for all changes, and update all `note_db_state`
|
|
fields. (Sorry, this implementation is entirely closed-source.)
|
|
- Turn on `noteDb.changes.read=true` to start reading from NoteDb.
|
|
- Turn on `noteDb.changes.primaryStorage=NOTE_DB` to start writing new changes
|
|
to NoteDb only.
|
|
- Run a Flume to migrate all existing changes to NoteDb primary. (Also
|
|
closed-source, but basically just a wrapper around `PrimaryStorageMigrator`.)
|
|
- Turn off access to ReviewDb changes tables.
|
|
|
|
== Configuration
|
|
|
|
This section describes general configuration for the NoteDb backend that is not
|
|
directly related to the migration process. These options will continue to be
|
|
supported after the migration is complete and the migration-related options are
|
|
obsolete.
|
|
|
|
[[noteDb.retryMaxWait]]noteDb.retryMaxWait::
|
|
+
|
|
Maximum time to wait between attempts to retry update operations when one
|
|
attempt fails due to contention (aka lock failure) on the underlying ref
|
|
storage. Operations are retried with exponential backoff, plus some random
|
|
jitter, until the interval reaches this limit. After that, retries continue to
|
|
occur after a fixed timeout (plus jitter), up to
|
|
link:#noteDb.retryTimeout[`noteDb.retryTimeout`].
|
|
+
|
|
Only applicable when `noteDb.changes.fuseUpdates=true`.
|
|
+
|
|
Defaults to 5 seconds; unit suffixes are supported, and assumes milliseconds if
|
|
not specified.
|
|
|
|
[[noteDb.retryTimeout]]noteDb.retryTimeout::
|
|
+
|
|
Total timeout for retrying update operations when one attempt fails due to
|
|
contention (aka lock failure) on the underlying ref storage.
|
|
+
|
|
Only applicable when `noteDb.changes.fuseUpdates=true`.
|
|
+
|
|
Defaults to 20 seconds; unit suffixes are supported, and assumes milliseconds if
|
|
not specified.
|