= Gerrit Code Review - NoteDb Backend
NoteDb is the next generation of Gerrit storage backend. It replaces the
traditional SQL backend for change, account and group metadata by storing that
data in the same repository as the code changes.
.Advantages
- *Simplicity*: All data is stored in one location in the site directory, rather
than being split between the site directory and a possibly external database
server.
- *Consistency*: Replication and backups can use a snapshot of the Git
repository refs, which will include both the branch and patch set refs, and
the change metadata that points to them.
- *Auditability*: Rather than storing mutable rows in a database, modifications
to changes are stored as a sequence of Git commits, automatically preserving
history of the metadata. +
There are no strict guarantees, and meta refs may be rewritten, but the
default assumption is that all operations are logged.
- *Extensibility*: Plugin developers can add new fields to metadata without the
core database schema having to know about them.
- *New features*: Enables simple federation between Gerrit servers, as well as
offline code review and interoperation with other tools.
== Current Status
- Storing change metadata is fully implemented in the 2.15 release, and is the
default for new sites.
- Admins may use an link:#offline-migration[offline] or
link:#online-migration[online] tool to migrate change data in an existing
site from ReviewDb.
- Storing link:config-accounts.html[account data] is fully implemented in the
2.15 release. Account data is migrated automatically during the upgrade
process by running `gerrit.war init`.
- Storing link:config-groups.html[group metadata] is fully implemented
in the 2.16 release. Group data is migrated automatically during
the upgrade process by running `gerrit.war init`.
- Account, group and change metadata on the servers behind `googlesource.com` is fully
migrated to NoteDb. In other words, if you use
link:https://gerrit-review.googlesource.com/[gerrit-review], you're already
using NoteDb.
- NoteDb is the only database format supported by Gerrit 3.0. The change data
migration tools are only included in Gerrit 2.15 and 2.16; they are not
available in 3.0.
For an example NoteDb change, poke around at this one:
----
git fetch https://gerrit.googlesource.com/gerrit refs/changes/70/98070/meta \
&& git log -p FETCH_HEAD
----
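The ref fetched above follows Gerrit's change ref layout: a shard directory made
of the last two digits of the change number, then the full change number, then
`meta`. A minimal sketch of deriving the name in shell; `meta_ref` is a
hypothetical helper used for illustration, not a Gerrit command:
----
# meta_ref is a hypothetical helper, not part of Gerrit.
# Prints the NoteDb meta ref for a numeric change number.
meta_ref() {
  change=$1
  # The shard is the change number modulo 100, zero-padded to two digits.
  printf 'refs/changes/%02d/%d/meta\n' "$((change % 100))" "$change"
}

meta_ref 98070   # prints refs/changes/70/98070/meta
----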
[[migration]]
== Migration
Migrating change metadata can take a long time for large sites, so
administrators can choose whether to do the migration offline or online,
depending on their available resources and tolerance for downtime.
Only change metadata requires manual steps to migrate it from ReviewDb; account
and group data is migrated automatically by `gerrit.war init`.
[[online-migration]]
=== Online
Note that online migration is only available in 2.x. To do the online migration
from 2.14.x or 2.15.x to 3.0, it is necessary to first upgrade to 2.16.x.
To start the online migration, set the `noteDb.changes.autoMigrate` option in
`gerrit.config` and restart Gerrit:
----
[noteDb "changes"]
autoMigrate = true
----
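Because `gerrit.config` uses Git's config file format, the same setting can be
written with `git config` instead of a text editor. A minimal sketch: it writes
to a scratch file so it can be tried safely, and on a real site `cfg` would
point at `gerrit.config` in the site's `etc` directory (path assumed from the
standard site layout):
----
# Scratch file standing in for /path/to/site/etc/gerrit.config.
cfg=$(mktemp)

# Writes a [noteDb "changes"] section containing autoMigrate = true.
git config -f "$cfg" noteDb.changes.autoMigrate true

# Read the value back to confirm it was stored.
git config -f "$cfg" --bool noteDb.changes.autoMigrate   # prints true
----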
Alternatively, pass the `--migrate-to-note-db` flag to
`gerrit.war daemon`:
----
java -jar gerrit.war daemon -d /path/to/site --migrate-to-note-db
----
Both ways of starting the online migration are equivalent. Once started, it is
safe to restart the server at any time; the migration will pick up where it left
off. Migration progress will be reported to the Gerrit logs.
*Advantages*
* No downtime required.
*Disadvantages*
* Only available in 2.x; not available in Gerrit 3.0.
* Much slower than offline; uses only a single thread, to leave resources
available for serving traffic.
* Performance may be degraded, particularly of updates; data needs to be written
to both ReviewDb and NoteDb while the migration is in progress.
[[offline-migration]]
=== Offline
To run the offline migration, run the `migrate-to-note-db` program:
----
java -jar gerrit.war migrate-to-note-db -d /path/to/site
----
Once started, it is safe to cancel and restart the migration process, or to
switch to the online process.
[NOTE]
Migration requires a heap size comparable to running a Gerrit server. If you
normally run `gerrit.war daemon` with an `-Xmx` flag, pass that to the migration
tool as well.
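As with any JVM option, the heap flag goes before `-jar`. The sketch below only
prints the command so it can be reviewed before running; `migration_cmd` is a
hypothetical helper, and `8g` is a placeholder for whatever value your daemon
normally runs with:
----
# migration_cmd is a hypothetical helper; it prints the command and does not run it.
migration_cmd() {
  site=$1
  heap=$2   # the same value you pass as -Xmx to `gerrit.war daemon`
  printf 'java -Xmx%s -jar gerrit.war migrate-to-note-db -d %s\n' "$heap" "$site"
}

migration_cmd /path/to/site 8g
# prints java -Xmx8g -jar gerrit.war migrate-to-note-db -d /path/to/site
----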
*Advantages*
* Much faster than online; can use all available CPUs, since no live traffic
needs to be served.
* No degraded performance of live servers due to writing data to two locations.
*Disadvantages*
* Available in Gerrit 2.15 and 2.16 only.
* May require substantial downtime; takes about twice as long as an
link:pgm-reindex.html[offline reindex]. (In fact, one of the migration steps is a
full reindex, so it can't possibly take less time.)
[[trial-migration]]
==== Trial mode
The migration tool also supports "trial mode", where changes are
migrated to NoteDb and read from NoteDb at runtime, but their primary storage
location is still ReviewDb, and data is kept in sync between the two locations.
To run the migration in trial mode, add `--trial` to `migrate-to-note-db` or
`daemon`:
----
java -jar gerrit.war migrate-to-note-db --trial -d /path/to/site
# OR
java -jar gerrit.war daemon -d /path/to/site --migrate-to-note-db --trial
----
Or, set `noteDb.changes.trial=true` in `gerrit.config`.
There are several use cases for trial mode:
* Help test early releases of the migration tool for bugs with lower risk.
* Try out new NoteDb-only features like
link:rest-api-changes.html#get-hashtags[hashtags] without running the full
migration.
To continue with the full migration after running the trial migration, use
either the online or offline migration steps as normal. To revert to
ReviewDb-only, remove `noteDb.changes.read` and `noteDb.changes.write` from
`notedb.config` and restart Gerrit.
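The revert step can also be done with `git config --unset` rather than by hand.
A minimal sketch against a scratch copy of `notedb.config`; the starting
contents are an assumption about what a trial migration leaves behind, and on a
real site the file lives in the site's `etc` directory:
----
# Scratch file standing in for /path/to/site/etc/notedb.config.
cfg=$(mktemp)
git config -f "$cfg" noteDb.changes.write true
git config -f "$cfg" noteDb.changes.read true

# Remove both options, as described above; restart Gerrit afterwards.
git config -f "$cfg" --unset noteDb.changes.read
git config -f "$cfg" --unset noteDb.changes.write

# --get-regexp exits non-zero with no output once the options are gone.
git config -f "$cfg" --get-regexp '^notedb\.changes\.' ||
  echo "no noteDb.changes options left"
----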
== Configuration
The migration process works by setting a configuration option in `notedb.config`
for each step in the process, then performing the corresponding data migration.
Config options are read from `notedb.config` first, falling back to
`gerrit.config`. If editing config manually, you may edit either file, but the
migration process itself only touches `notedb.config`. This means if your
`gerrit.config` is managed with Puppet or a similar tool, it can overwrite
`gerrit.config` without affecting the migration process. You should not manage
`notedb.config` with Puppet, but you may copy values back into `gerrit.config`
and delete `notedb.config` at some later point after completing the migration.
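To check which value is in effect without editing anything, the lookup order
described above can be imitated with `git config`. This is a sketch of the
precedence only, and Gerrit's own loader may differ in details; `notedb_option`
is a hypothetical helper, and the scratch directory stands in for the site's
`etc` directory:
----
# notedb_option is a hypothetical helper: it reads notedb.config first,
# then falls back to gerrit.config.
notedb_option() {
  dir=$1
  key=$2
  git config -f "$dir/notedb.config" --get "$key" 2>/dev/null ||
    git config -f "$dir/gerrit.config" --get "$key"
}

# Scratch directory standing in for /path/to/site/etc.
etc=$(mktemp -d)
git config -f "$etc/gerrit.config" noteDb.changes.write false
git config -f "$etc/gerrit.config" noteDb.changes.read false
git config -f "$etc/notedb.config" noteDb.changes.write true

notedb_option "$etc" noteDb.changes.write   # prints true (from notedb.config)
notedb_option "$etc" noteDb.changes.read    # prints false (from gerrit.config)
----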
In general, users should not set the options described below manually; this
section serves primarily as a reference.
- `noteDb.changes.write=true`: During a ReviewDb write, the state of the change
in NoteDb is written to the `note_db_state` field in the `Change` entity.
After the ReviewDb write, this state is written into NoteDb, resulting in
effectively double the time for write operations. NoteDb write errors are
dropped on the floor, and no attempt is made to read from ReviewDb or correct
errors (without additional configuration, below).
- `noteDb.changes.read=true`: Change data is written
to and read from NoteDb, but ReviewDb is still the source of truth. During
reads, first read the change from ReviewDb, and compare its `note_db_state`
with what is in NoteDb. If it doesn't match, immediately "auto-rebuild" the
change, copying data from ReviewDb to NoteDb and returning the result.
- `noteDb.changes.primaryStorage=NOTE_DB`: New changes are written only to
NoteDb, but changes whose primary storage is ReviewDb are still supported.
Continues to read from ReviewDb first as in the previous stage, but if the
change is not in ReviewDb, falls back to reading from NoteDb. +
Migration of existing changes is described in the link:#migration[Migration]
section above. +
Due to an implementation detail, writes to Changes or related tables still
result in write calls to the database layer, but they are inside a transaction
that is always rolled back.
- `noteDb.changes.disableReviewDb=true`: All access to Changes or related tables
is disabled; reads return no results, and writes are no-ops. Assumes the state
of all changes in NoteDb is accurate, and so is only safe once all changes are
NoteDb primary. Otherwise, reading changes only from NoteDb might result in
inaccurate results, and writing to NoteDb would compound the problem.
== NoteDb to ReviewDb rollback
To roll back from NoteDb to ReviewDb, all the meta refs and the
sequence ref need to be removed.
The link:https://gerrit.googlesource.com/gerrit/+/refs/heads/master/contrib/remove-notedb-refs.sh[remove-notedb-refs.sh]
script has been written to automate this process.
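Before removing anything, it can help to see which refs are involved. The
sketch below is not the script itself: it only lists change meta refs in a
single repository with `git for-each-ref`, and it builds a scratch repository
so it can be tried safely. The ref names reuse the example change shown earlier
in this document:
----
# Scratch bare repository standing in for one project repository.
repo=$(mktemp -d)
git init -q --bare "$repo"

# Create an empty commit for the example refs to point at.
tree=$(git -C "$repo" mktree </dev/null)
commit=$(GIT_AUTHOR_NAME=x GIT_AUTHOR_EMAIL=x@example.com \
  GIT_COMMITTER_NAME=x GIT_COMMITTER_EMAIL=x@example.com \
  git -C "$repo" commit-tree "$tree" -m example)
git -C "$repo" update-ref refs/changes/70/98070/1 "$commit"
git -C "$repo" update-ref refs/changes/70/98070/meta "$commit"

# List only the NoteDb meta refs; patch set refs such as .../1 do not match.
git -C "$repo" for-each-ref --format='%(refname)' 'refs/changes/*/*/meta'
# prints refs/changes/70/98070/meta
----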