These classes do not depend on any Gerrit server functionality, and
could even be used to define an index without depending on the
gerrit-server package. This allows for a clearer separation of BUILD
rules; the QueryParser and antlr targets don't escape the gerrit-index
package.
The general layout thus far is to put index definition code in
com.google.gerrit.index, and query-related code (predicates, etc.) in
com.google.gerrit.index.query.
The gerrit-index package is still of limited utility on its own, because
QueryProcessor and InternalQuery still live in the server package, and
untangling their dependencies will still be a bit more work.
Change-Id: I3c4616d08ecf19d5ccd1b9b91b3fd0b1fcedd901
This change extends the existing UpdateUI to make it possible
to read strings from users during the schema upgrade process.
One usage of this change is that we can allow administrators to
decide the target change status (work-in-progress or private)
during the migration of draft changes.
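As a rough sketch, the new prompt could look like this; the method name
and signature shown here are assumptions for illustration, not the exact
API added by this change:

  // Hypothetical shape of the new UpdateUI prompt; real name/signature
  // may differ.
  public interface UpdateUI {
    void message(String message);

    // New: ask the administrator a question and read a free-form answer,
    // e.g. the target status (work-in-progress or private) for migrated
    // draft changes.
    String readString(String defaultValue, String question);
  }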
Change-Id: I8f7f09618e2bc76129e7fbb9613aa5b90a5e1558
Even after removing one of the factory methods, there were still 4
assisted-injected constructors, which all need to take 17 identical
arguments. Adding more has been painful for a while.
Separate out an AssistedFactory class that has only a single method, so
we can keep ChangeData to a single constructor. Convert Factory to a
hand-written class that delegates to AssistedFactory as appropriate.
This refactoring made it clear that project can never be null. We can
thus remove OrmException from the project() method, which ripples
outwards.
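The resulting pattern looks roughly like this; names and arguments are
heavily simplified for illustration (the real constructor takes many
more injected dependencies):

  import com.google.gerrit.reviewdb.client.Change;
  import com.google.gerrit.reviewdb.client.Project;
  import com.google.inject.Inject;
  import com.google.inject.assistedinject.Assisted;

  // Illustrative sketch: one assisted-injected constructor behind a
  // single-method AssistedFactory, wrapped by a hand-written Factory.
  public class ChangeData {
    interface AssistedFactory {
      ChangeData create(Project.NameKey project, Change.Id id);
    }

    public static class Factory {
      private final AssistedFactory assistedFactory;

      @Inject
      Factory(AssistedFactory assistedFactory) {
        this.assistedFactory = assistedFactory;
      }

      public ChangeData create(Project.NameKey project, Change.Id id) {
        return assistedFactory.create(project, id);
      }

      public ChangeData create(Change c) {
        return assistedFactory.create(c.getProject(), c.getId());
      }
    }

    @Inject
    ChangeData(@Assisted Project.NameKey project, @Assisted Change.Id id) {
      // the single constructor; all creation paths funnel through here
    }
  }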
Change-Id: Id053561ee1e1d8a79b2ce9be501bd69834932ba7
After naively moving the classes, there were almost no incoming
references from the rest of the server packages into the new
server.receive package. This means with only a little more work, it was
possible to create a new java_library target containing just the srcs
in this new package. This is a modest step in the direction of breaking
up the giant //gerrit-server:server package, which will improve compile
times when making modifications that don't change the interface.
Change-Id: I449018a4933a999c688611142dc7ed9c18b4c828
With change I1c24da1378 there is a new Emails class that allows looking
up accounts by email. To find accounts by email it gets external IDs by
email from the ExternalIdCache and extracts the account IDs from the
external IDs. This is exactly what AccountByEmailCacheImpl.Loader was
doing. In addition the Emails class does an index lookup to also find
accounts by preferred email (see commit message of change I1c24da1378
for an explanation of why this is needed).
Change I991d21b1ac removed all usages of AccountByEmailCache by using
the Emails class instead. Hence the AccountByEmailCache can be removed
now.
Change-Id: I3a4279f5abda7ff3f03268258bb1755ce528f0d4
Signed-off-by: Edwin Kempin <ekempin@google.com>
Moving the logic to determine if a commit is reachable into
CommitsCollection pulls most of it out of the legacy ProjectControl
and RefControl.
Improve the heads or tags code slightly by determining the size of the
two collections and presizing the map for that count. This avoids an
intermediate ArrayList copy.
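For illustration, with assumed variable names, the presizing amounts to
something like:

  // Presize for the combined ref count instead of first copying both
  // collections into an intermediate ArrayList.
  Map<String, Ref> byName =
      Maps.newHashMapWithExpectedSize(heads.size() + tags.size());
  for (Ref r : Iterables.concat(heads, tags)) {
    byName.put(r.getName(), r);
  }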
Rename the related tests into CommitsCollectionTest.
Change-Id: I4c4a7624a4b50335509034a968c31da90e981795
Now with atomic support for traditional on-disk repos, and finer-grained
control over reflogs.
RefUpdate.Result also grew some new error values, so extend existing
switch statements to cover them.
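Concretely, switches over RefUpdate.Result gain cases along these lines;
the exact set of new enum values is an assumption based on the JGit
version being pulled in:

  switch (result) {
    case NEW:
    case FORCED:
    case FAST_FORWARD:
    case NO_CHANGE:
      break; // success
    case REJECTED_MISSING_OBJECT: // assumed new in this JGit version
    case REJECTED_OTHER_REASON:   // assumed new in this JGit version
    default:
      throw new IOException("ref update failed: " + result);
  }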
Change-Id: If685ed7f34d965e82cf11fcf59dd832394f2bb4a
Initializing a new site failed with:
fatal: 1) Error injecting constructor, java.lang.IllegalStateException: gerrit.basePath must be configured
fatal: at com.google.gerrit.server.git.LocalDiskRepositoryManager.<init>(LocalDiskRepositoryManager.java:118)
fatal: at com.google.gerrit.server.git.LocalDiskRepositoryManager.class(LocalDiskRepositoryManager.java:55)
fatal: while locating com.google.gerrit.server.git.LocalDiskRepositoryManager
fatal: while locating com.google.gerrit.server.git.GitRepositoryManager
fatal: Caused by: java.lang.IllegalStateException: gerrit.basePath must be configured
fatal: at com.google.gerrit.server.git.LocalDiskRepositoryManager.<init>(LocalDiskRepositoryManager.java:121)
fatal: at com.google.gerrit.server.git.LocalDiskRepositoryManager$$FastClassByGuice$$341d02c5.newInstance(<generated>)
....
We can't use LocalDiskRepositoryManager since gerrit.basePath may not be
set when it is created.
Change-Id: I368ad324141ea74a82c3406bc2895e89bc55b743
Signed-off-by: Edwin Kempin <ekempin@google.com>
Accounts have been migrated to NoteDb, hence also the account sequence
should be moved to NoteDb. In NoteDb the current account sequence number
is stored as UTF-8 text in a blob pointed to by the
'refs/sequences/accounts' ref in the 'All-Users' repository. Multiple
processes share the same sequence by incrementing the counter using
normal git ref updates. To amortize the cost of these ref updates,
processes can increment the counter by a larger number and hand out
numbers from that range in memory until they run out. The size of the
account ID batch that each process retrieves at once is controlled by
the 'notedb.accounts.sequenceBatchSize' configuration parameter in
'gerrit.config'. By default the value is 1 since it's unlikely that a
process ever creates more than one account.
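For example, a site that creates accounts in bulk could raise the batch
size in gerrit.config (the value below is illustrative):

  [noteDb "accounts"]
    sequenceBatchSize = 20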
This follows the example of storing the change sequence in NoteDb. A
difference is that the account sequence is stored in the 'All-Users'
repository while the change sequence is stored in the 'All-Projects'
repository. Storing the account sequence in the 'All-Users' repository
makes more sense since this repository already contains all of the other
account data.
Injecting the Sequences class that provides new sequence numbers
requires request scope. There are two places outside of request scope
where new account sequence numbers are required: AccountManager and
AccountCreator (only used for tests). These classes need to create a
request context to get an account sequence number. For AccountManager
the request scope is only created when a new account is created and not
when an existing account is authenticated.
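A rough sketch of what opening such a request context looks like,
assuming the OneOffRequestContext utility; the sequence method name is
an assumption:

  // Illustrative: open a request context only around account creation.
  try (ManualRequestContext ctx = oneOffRequestContext.open()) {
    // request-scoped classes such as Sequences are injectable here
    int nextId = sequences.nextAccountId(); // method name assumed
    // ... create the new account with nextId ...
  }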
Since there is an init step that creates an initial admin user we must
make the account sequence available during the init phase. For this the
class SequencesOnInit is added, which can only generate account IDs and
depends only on classes that are available during the init phase. For
this class the account ID batch size is hard-coded to 1, since init only
creates a single account and we don't want to waste account IDs when a
new Gerrit server is initialized.
This change also contains a schema migration that ensures that the
account sequence is created in NoteDb.
To support a live migration on a multi-master Gerrit installation, there
is a configuration parameter ('notedb.accounts.readSequenceFromNoteDb')
that controls whether account sequence numbers are read from NoteDb or
ReviewDb. By default the value for this parameter is `false` so account
sequence numbers are read from ReviewDb.
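Once all nodes are known to keep the NoteDb sequence initialized, the
switch is flipped in gerrit.config:

  [noteDb "accounts"]
    readSequenceFromNoteDb = true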
If account sequence numbers are read from ReviewDb the sequence numbers
in NoteDb will be kept in sync. This is achieved by writing the next
available sequence number to NoteDb whenever a sequence number from
ReviewDb is retrieved. If writing to NoteDb fails, an exception is
raised to the caller and the sequence number that was retrieved from
ReviewDb is not used. Writing to NoteDb is retried several times so that
the caller only gets an exception if writing to NoteDb fails
permanently.
For the case where two threads try to update the sequence number in
NoteDb concurrently we must make sure that the value in NoteDb is never
decreased. E.g.:
1. Thread 1 retrieves account ID 14 from ReviewDb
2. Thread 2 retrieves account ID 15 from ReviewDb
3. Thread 2 writes the next available account ID 16 to NoteDb
4. Thread 1 tries to write the next available account ID 15 to NoteDb
but fails since Thread 2 updated the value concurrently.
5. Thread 1 finds that it doesn't need to update the account ID in
NoteDb anymore since Thread 2 already updated the account ID to a
higher value
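In outline the write path therefore looks like this; the helper methods
are hypothetical, not the actual implementation:

  // Hypothetical sketch: only ever move the NoteDb counter forward.
  private void storeSequenceNumberInNoteDb(int next) throws IOException {
    for (int attempt = 0; attempt < MAX_RETRIES; attempt++) {
      int current = readCounterFromNoteDb();         // hypothetical helper
      if (current >= next) {
        return; // another thread already wrote an equal or higher value
      }
      if (tryCompareAndSwapCounter(current, next)) { // hypothetical helper
        return; // ref update succeeded
      }
      // lock failure: another thread updated the ref concurrently; retry
    }
    throw new IOException("failed to update account sequence in NoteDb");
  }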
This means at any point in time it is safe to switch to reading account
IDs from NoteDb. However once this switch is done it is not possible to
switch back to reading account IDs from ReviewDb, since ReviewDb will be
out of sync as soon as the first account ID was retrieved from NoteDb.
The migration on a multi-master Gerrit installation will be done with
the following steps:
1. rollout this change to all nodes:
- account sequence numbers are read from ReviewDb
- the sequence numbers in NoteDb are kept in sync
2. wait some time until we are sure that we don't need to roll back to a
release that doesn't contain this change
3. run an offline migration to ensure that the account sequence number
in NoteDb is initialized on all nodes
4. set 'notedb.accounts.readSequenceFromNoteDb' to true so that account
sequence numbers are now read from NoteDb (this setting cannot be
reverted since the account sequence in ReviewDb will be outdated once
account IDs are retrieved from NoteDb)
After this is done a follow-up change can remove the handling for
'notedb.accounts.readSequenceFromNoteDb' so that account IDs are now
always retrieved from NoteDb.
Change-Id: I023d2de643ed0c15197c09fa19105cc2acb5091e
Signed-off-by: Edwin Kempin <ekempin@google.com>
It used to be that ConfigNotesMigration was the only kind of
NotesMigration in a real server, but it was always immutable, while
TestNotesMigration was the main kind of migration in acceptance tests,
which was mutable. However, now that we support modifying
ConfigNotesMigration at runtime as part of the online NoteDb migration
process, TestNotesMigration is no longer strictly necessary, and
continuing to support it is becoming more trouble than it's worth.
One major problem was that only TestNotesMigration was being populated
via NoteDbMode, and the NoteDbMode was not reflected in the
ConfigNotesMigration at all, so callers that were depending on
ConfigNotesMigration directly would not know about the NoteDb migration
state from the GERRIT_NOTEDB env var in tests.
We could have fixed this (and other) problems directly, but there is a
better solution: get rid of the test implementation entirely, and use
the same implementation of NotesMigration in tests as in a running
server.
The class hierarchy now contains only two classes: NotesMigration and
MutableNotesMigration. Most callers just care about inspecting the
state, so they can inject a NotesMigration. The few callers (migration,
tests) that care about mutating the state at runtime can inject/create
MutableNotesMigrations instead. As an implementation detail, the actual
NotesMigration instance continues to be mutable, containing a reference
to the Snapshot, but the base class does not contain any public methods
to mutate the state. We then ensure with Guice that there is only one
actual NotesMigration instance (the MutableNotesMigration), and callers
just may or may not have access to the mutation methods depending on
what they chose to inject.
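A minimal sketch of the Guice wiring that gives this behavior (module
name and binding details are assumptions):

  import com.google.inject.AbstractModule;
  import com.google.inject.Singleton;

  // Illustrative module: one MutableNotesMigration instance, also
  // exposed through the read-only NotesMigration base type.
  public class NotesMigrationModule extends AbstractModule {
    @Override
    protected void configure() {
      bind(MutableNotesMigration.class).in(Singleton.class);
      bind(NotesMigration.class).to(MutableNotesMigration.class);
    }
  }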
Ensuring this gets set up correctly in tests requires a bit of tweaking.
* Since the NotesMigration is populated in the @UseLocalDisk case from
reading gerrit.config on disk, we need to prepopulate gerrit.config
with the right config values at startup time.
* Since MutableNotesMigration is not in the testutil package, it can't
have its own setFromEnv() method that depends on NoteDbMode.
Instead, construct MutableNotesMigrations from the test env by using
a static factory method in NoteDbMode.
Change-Id: If06db3d025cf3e3c9fe464989d5f38a22ce70b56
When an initial admin user is created by the InitAdminUser init step we
only created a user branch in the All-Users repository, but we forgot to
write the account.config file to it.
Change-Id: I09cfc058ab3aaeaa2fe2adf38c779f6f672eef18
Signed-off-by: Edwin Kempin <ekempin@google.com>
LocalUsernamesToLowerCase is changing the case of external IDs in the
"gerrit" scheme. Since the external IDs are stored as fields in the
account index the corresponding accounts must be reindexed.
LocalUsernamesToLowerCase as a site program doesn't have the account
index available and hence can't do the reindexing itself (at least not
without blowing up the Guice injector stack). Instead invoke the reindex
program to reindex the accounts. This is the same approach that was
taken for the MigrateToNoteDb program which was added in master. This
will reindex all accounts and also means that the
LocalUsernamesToLowerCase program cannot run in parallel to the Gerrit
server. This should be okay since running LocalUsernamesToLowerCase is a
one-time effort when you want to configure case-insensitive login for
Gerrit.
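As before, the program is run offline against the site, for example
(the site path is illustrative):

  java -jar gerrit.war LocalUsernamesToLowerCase -d /path/to/gerrit_site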
Change-Id: I6f2804ece996b22ec834aaebf209dac3b5b89415
Signed-off-by: Edwin Kempin <ekempin@google.com>
The class no longer controls capabilities. It now only provides
limits over server resources consumed during a request.
Change-Id: I70408bd5dda68b05502c4ece989b60f55793a8dd
* changes:
AccountsUpdate: Rename atomicUpdate to update
AccountsUpdate: Rename update method to replace
Always update accounts atomically
Migrate accounts to NoteDb (part 2)
Disallow updates to account.config by direct push or submit
Migrate accounts to NoteDb (part 1)
Let AccountsUpdate#insert create the Account instance
AccountsUpdate: Remove upsert method
Setting noteDb.changes.autoMigrate=true is equivalent to passing
--migrate-to-note-db. This allows admins to trigger migration without
passing a flag, which might be difficult to wedge into their deploy
scripts.
Also, NoteDb migration might take a long time; so long that a server
might need to be restarted before it finishes (for reasons unrelated to
the migration). Set noteDb.changes.autoMigrate when the migration
process starts, and unset it when it finishes. This saves admins the
trouble of passing --migrate-to-note-db on subsequent restarts, so
migration can be more set-and-forget.
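In other words, the trigger can live in gerrit.config instead of a flag:

  [noteDb "changes"]
    autoMigrate = true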
Change-Id: Ifa67cc7865ad6659f40f8cec4b3e69f78d4c0702
Requires some refactoring of AbstractVersionManager and OnlineReindexer
to allow the test to provide a listener that is injected into Daemon.
Use a simple OnlineUpgradeListener interface, which may end up being
useful for things other than tests.
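Something along these lines; the exact method names and parameters shown
are assumptions rather than the final interface:

  // Hypothetical shape of the listener; real signatures may differ.
  public interface OnlineUpgradeListener {
    void onStart(String indexName, int oldVersion, int newVersion);

    void onSuccess(String indexName, int oldVersion, int newVersion);

    void onFailure(String indexName, int oldVersion, int newVersion);
  }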
In addition, factor out a separate LifecycleListener for starting the
online upgrade process. This is not immediately necessary, but will be
used in the near future for the NoteDb migration to hook into. In fact,
this change started life as this minor refactoring, at which point I
realized we probably need tests to make sure I don't break it.
Change-Id: Ifcbcac689cf14137784a250f025df149c90f22ef
Always write account updates to both backends, ReviewDb and NoteDb.
In NoteDb accounts are represented as user branches in the All-Users
repository. Optionally a user branch can contain an 'account.config' file
that stores account properties, such as full name, preferred email,
status and the active flag. The timestamp of the first commit on a user
branch denotes the registration date. The initial commit on the user
branch may be empty (since having an 'account.config' is optional).
The 'account.config' file is a git config file that has one 'account'
section with the properties of the account:
[account]
active = false
fullName = John Doe
preferredEmail = john.doe@foo.com
status = Overloaded with reviews
All keys are optional. This means 'account.config' may not exist on the
user branch if no properties are set.
If no value for 'active' is specified, by default the account is
considered as active.
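Since 'account.config' is an ordinary git config file, reading it boils
down to standard Config parsing with defaults; roughly (the parsing code
is illustrative):

  // Illustrative: parse account.config with JGit's Config class.
  Config cfg = new Config();
  cfg.fromText(accountConfigText);
  boolean active = cfg.getBoolean("account", null, "active", true); // default: active
  String fullName = cfg.getString("account", null, "fullName");
  String preferredEmail = cfg.getString("account", null, "preferredEmail");
  String status = cfg.getString("account", null, "status");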
AccountsUpdate now sends RefUpdatedEvents when an account is
updated. ReindexAfterRefUpdate receives the events and takes care to
evict the updated accounts from the account cache, which in turn
triggers reindex of the accounts. This is why AccountsUpdate no longer
needs to evict the updated accounts itself from the account cache. Since
AccountsUpdate doesn't reindex accounts on its own anymore the
ServerNoReindex factory can be removed.
To support a live migration on a multi-master Gerrit installation, the
migration of accounts from ReviewDb to NoteDb is done in 3 steps:
- part 1 (this change):
* always write to both backends (ReviewDb and NoteDb)
* always read accounts from ReviewDb
* upgraded instances write to both backends, old instances only
write to ReviewDb
* after upgrading all instances (all still read from ReviewDb)
run a batch to copy all accounts from the ReviewDb to NoteDb
- part 2 (next change):
* bump the database schema version
* migrate the accounts from ReviewDb to NoteDb (for single instance
Gerrit servers)
* config option to control whether accounts are read from ReviewDb or
NoteDb
- part 3:
* remove config option to control whether accounts are read from
ReviewDb or NoteDb and always read from NoteDb
* delete the database table
Change-Id: I2e0b13feb3465e086b49b2de2439a56696b5fba9
Signed-off-by: Edwin Kempin <ekempin@google.com>
This avoids using API surface of ScheduledThreadPoolExecutor, making
it possible for createQueue to return ScheduledExecutorService instead.
Change-Id: I4f44e45d663d89b1c45ee2fc1d0d29831fd5bebd
There are two reasons to support this. One is for an offline upgrade to
NoteDb at a released Gerrit version, where we fully expect there to be
index schema changes in addition to the NoteDb migration. Just kicking
off Reindex immediately after MigrateToNoteDb saves the user an extra
manual invocation. That is implemented in this change.
The other scenario is an online NoteDb migration in conjunction with an
online schema upgrade, where we don't want to increase contention by
running the migration and reindex concurrently. Implementing this will
require managing more subtle interactions between LifecycleListeners,
and is not implemented in this change.
We literally invoke Reindex's main method rather than trying to do
something smarter like reusing the injector stack from MigrateToNoteDb,
because Reindex has a custom set of modules for managing the index
versions. At that point, we could either factor out a variant of
Reindex#run that takes some arguments, or just do a tiny amount of
string manipulation. I went with the latter.
Change-Id: Idf8d0513a08db8bb9e7efda1cf2a12e0579b2fee
This option exists to handle a race condition that can only occur when
there are concurrent writes to the same document. This doesn't apply
during offline reindex, and turning it off will speed things up.
It is not especially feasible to pass this option into ChangeIndexer in
code, so we have to use a config value for it. Remove the "test" prefix
from the config name, and document it.
Change-Id: Iecc12c3ab0f068f24063c358ce50a40b25362511
Align the startup of Gerrit in a standalone Jetty container
with the WebAppInitializer, and load the user-provided
Guice modules in the sysInjector instead of the DbInjector.
Allows overriding some of the default bindings of Gerrit
(e.g. repository manager or permissions backend) with custom-made
alternate implementations.
Change-Id: Ib4553ebaa5c00de911269056be14869332a60a62
Remove the tests for migrating an empty site; these were just
placeholders left over from before we were able to start and stop a
server in the same test.
Add two tests for the full migration, with and without a live index
version.
Change-Id: Ic5087e36135a51afcb5f3f30347e7eeba01e3df6
When running in offline mode, we can avoid the use of a sequence number
gap, because we don't have the risk of other threads racing to create
changes as we swap in the new sequence. Make the default sequence gap in
online mode 1000 rather than 0; this is the number we used for
googlesource.com without incident.
Change-Id: Iea89f3bffd0e549f540eb6c20f30990d485d2093
With Hana 2.0 the database port number can no longer be computed from
the instance number. Hence remove the config parameter "instance" and
add the config parameter "port" to allow explicit configuration of the
database port. Add config parameter "database" to enable configuring the
name of a specific database if it is part of a multi-database
environment (MDC).
See
https://help.sap.com/viewer/0eec0d68141541d1b07893a39944924e/2.0.00/en-US/ff15928cf5594d78b841fbbe649f04b4.html
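An illustrative gerrit.config database section with the new parameters
(all values are examples only):

  [database]
    type = hana
    hostname = hana.example.com
    port = 39015
    database = GERRIT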
Change-Id: I8e770a2ecf18f681b57589ddc3e161ee14a95d37
Drop the capabilities reference from all user objects. Most global
capabilities can be checked with the PermissionBackend.
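For example, a check that used to go through CapabilityControl now looks
roughly like this (illustrative; exact call sites vary):

  // Illustrative global capability check via PermissionBackend.
  try {
    permissionBackend.user(user).check(GlobalPermission.ADMINISTRATE_SERVER);
    // caller is an administrator
  } catch (AuthException denied) {
    // caller lacks the capability
  } catch (PermissionBackendException e) {
    // permission evaluation itself failed
  }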
QoS, query limits, and emailing reviewers still require the capability
object. Bundle its factory into the call sites that need it.
Continue caching the CapabilityControl in an opaque property on the
CurrentUser, and also in the DefaultPermissionBackend.WithUserImpl.
Both of these sites reduce evaluations for critical properties like
"administrateServer".
Change-Id: I5aae8200e0a579ac1295a3fb7005703fd39d2696
md5 and sha1 are deprecated in Guava 22. Where possible, replace
them with the recommended alternatives.
In cases where we need to keep using sha1 for compatibility, add
a warning suppression with comment.
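The changes mostly look like this; which replacement is appropriate
depends on whether the hash value is persisted or exchanged with other
systems (input is a String; UTF_8 is StandardCharsets.UTF_8):

  // Before (deprecated in Guava 22):
  HashCode h1 = Hashing.md5().hashString(input, UTF_8);

  // After, where nothing depends on the exact algorithm:
  HashCode h2 = Hashing.murmur3_128().hashString(input, UTF_8);

  // Where sha1 must be kept for compatibility:
  @SuppressWarnings("deprecation") // SHA-1 needed for compatibility
  HashCode h3 = Hashing.sha1().hashString(input, UTF_8);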
Change-Id: I3f80d52ea4e299c2a7c86fbda25457814bff750c
GerritServer was using SshMode to determine whether to enable/disable
SSH in the Daemon, without consulting the annotations. This case is easy
to fix, because we have the Description available. In other places,
namely AccountCreator, it may be more difficult, because the Description
is long gone. Bind a boolean instead.
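Roughly, in the test injector, where the annotation name is an
assumption and sshEnabled is computed from SshMode and the test
Description before the injector is created:

  bind(Boolean.class)
      .annotatedWith(SshEnabled.class) // assumed binding annotation
      .toInstance(sshEnabled);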
In my unscientific testing, this shaves ~7% off test runtime of
bazel build --build_tests_only ... && bazel test ...
Before: Elapsed time: 261.317s, Critical Path: 210.19s
After: Elapsed time: 247.257s, Critical Path: 191.53s
Change-Id: I6bb4a86fe366148fe5ca9df374657abffe566417
Leaving LifecycleManagers running can prevent JVM shutdown, because
listeners never get a chance to shut down running non-daemon threads.
MigrateToNoteDb wasn't shutting down its LifecycleManagers at all, much
less in the case of uncaught exceptions. Make sure managers are stopped
both in the case of uncaught exceptions in MigrateToNoteDb, and by using
RuntimeShutdown to catch System.exit.
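The fix is essentially the usual try/finally shape (simplified; the real
program manages several injector stacks):

  // Simplified sketch of stopping managers even on uncaught exceptions.
  LifecycleManager dbManager = new LifecycleManager();
  LifecycleManager sysManager = new LifecycleManager();
  try {
    dbManager.start();
    sysManager.start();
    runMigration(); // placeholder for the actual migration work
  } finally {
    sysManager.stop();
    dbManager.stop();
  }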
Change-Id: Iea6de2ff704b327d2db0702386a08b23627345af
The basic process is a loop over the possible NotesMigrationStates,
starting with the current state in gerrit.config, where each step runs
some process and updates the config to the next state. This works by
directly editing gerrit.config on disk after the migration work
completes, which means that migrations are easily resumable.
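Conceptually the loop looks like this; the helper methods are
placeholders for the per-state work, and the terminal state name is an
assumption:

  // Simplified sketch, not the actual implementation.
  NotesMigrationState state = loadStateFromGerritConfig();
  while (state != NotesMigrationState.NOTE_DB) { // terminal state assumed
    NotesMigrationState next = runOneStep(state); // hypothetical helper
    saveStateToGerritConfig(next);                // persisting makes it resumable
    state = next;
  }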
In theory, this process could be customized by telling it to stop at any
of the intermediate states, reexecute certain phases where that makes
sense, etc. However, that would result in significant configuration
complexity, and we don't want to burden admins with too many flags. That
said, based on our experience manually executing the migration steps on
googlesource.com, there are two flags we want to support.
First is the idea of a "trial mode": admins can try turning on NoteDb to
see if performance, resource usage, or any other behavior is acceptable,
but leave ReviewDb the source of truth. This terminates the migration in
NotesMigrationState.READ_WRITE_NO_SEQUENCE.
Second, we can force rebuilding of some or all changes, as long as we're
sure that ReviewDb is still the source of truth. This is primarily
useful for developers and debugging issues with the rebuild process
(which, having migrated googlesource.com, we're fairly confident in, but
we're prepared to be surprised).
This implementation is just a skeleton: the loop handles all the
currently-supported states, but many transitions throw
UnsupportedOperationException. It also notably does not work properly in
a running server, since it only updates gerrit.config and not the
NotesMigration singleton.
Change-Id: Ic83071c794bcddc6076306215c2e445fedffbc93
We already have several configuration options and we're going to add
more. Use the normal Builder pattern to separate option-setting from
running, and make all fields in SiteRebuilder final. The downside is
that, because of Guice, we need to create the builder with a
siteRebuilderBuilderProvider and manually copy arguments to the
SiteRebuilder constructor. Oh well, what can you do. (And we are later
planning on renaming this class so at least it won't be a
RebuilderBuilder for long.)
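In outline the result looks like this; names and options are simplified
for illustration:

  // Illustrative shape only.
  public class SiteRebuilder {
    public static class Builder {
      private final SomeDep dep; // Guice-provided dependency (hypothetical)
      private OutputStream progressOut = System.err;

      @Inject
      Builder(SomeDep dep) {
        this.dep = dep;
      }

      public Builder setProgressOut(OutputStream out) {
        this.progressOut = out;
        return this;
      }

      public SiteRebuilder build() {
        // manually copy collected options into the constructor
        return new SiteRebuilder(dep, progressOut);
      }
    }

    private final SomeDep dep;
    private final OutputStream progressOut;

    private SiteRebuilder(SomeDep dep, OutputStream progressOut) {
      this.dep = dep;
      this.progressOut = progressOut;
    }
  }

Callers then inject a Provider of the Builder, set options, and call
build().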
Add another SiteIndexer-like option, the stream for progress output.
Change the standalone program to write to stderr instead of stdout, also
for consistency with SiteIndexer. Ensure that all output goes either
to the progress monitor or the log, rather than writing directly to
stdout.
Document the supported options.
Change-Id: Iceaf75cd94d68bb751e97df1769c41a1e97228aa
I get this warning from Debian lintian:
13:48:17 W: gerrit: init.d-script-does-not-source-init-functions etc/init.d/gerrit
13:48:17 N:
13:48:17 N: The /etc/init.d script does not source /lib/lsb/init-functions. The
13:48:17 N: systemd package provides /lib/lsb/init-functions.d/40-systemd to
13:48:17 N: redirect /etc/init.d/$script calls to systemctl.
13:48:17 N:
13:48:17 N: Please add a line like this to your /etc/init.d script:
13:48:17 N:
13:48:17 N: . /lib/lsb/init-functions
13:48:17 N:
13:48:17 N: Severity: normal, Certainty: certain
13:48:17 N:
13:48:17 N: Check: systemd, Type: binary
See https://gerrit.wikimedia.org/r/#/c/343297/
Bug: Issue 6379
Change-Id: I7d4223ab96f70a2fd6eed53a70b06957a3edc7f3
CORS preflight for POST, PUT, DELETE makes every mutation operation
require 2 round trips with the server. This can increase latency for
any application running on a different origin.
There is a workaround available in modern browsers: use POST with
Content-Type: text/plain. This does not require CORS preflight, as
servers should already be using XSRF protection strategies.
Unfortunately this is incompatible with the current REST API, as many
operations require PUT or DELETE methods, and a Content-Type of
application/json. Allow the requester to select a different method via
the query parameter '$m' and a different Content-Type via the query
parameter '$ct' in the URL, and handle the request as if it had been
sent with those values.
Using this style of request still requires the user session to be
valid for access. Accept identity through the query parameters as
'access_token'.
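An illustrative request of this style (path, change number, and token
are examples only):

  POST /changes/1234/topic?$m=PUT&$ct=application%2Fjson&access_token=<token> HTTP/1.1
  Content-Type: text/plain

  {"topic": "my-topic"}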
The XSRF token isn't necessary in this type of request as only
permitted websites would be allowed to read cookie content to obtain
the GerritAccount cookie value and include it in the URL.
Change-Id: Ic7bc5ad2e57eef27b0d2e13523be78e8a2d0a65c