AccountResolver has a mishmash of methods with different resolution
semantics, different disambiguation semantics, and different return
types.
A short, non-exhaustive list of illustrative examples of confusing
things about this class:
* find(String) returns null if there is not exactly one match;
parse(String) throws UnprocessableEntityException.
* The difference between parse(String) and parseId(String) doesn't have
anything to do with IDs, it's that one of them respects account
visibility and the other doesn't. (Pop quiz: which one is which?)
* findAll(String) and findAllByNameOrEmail(String) don't in fact find
*all* matching accounts: some paths short-circuit. For example, if
the input is a number, it will return one result if the number is an
existing account, and zero results otherwise, even though *all*
accounts strictly speaking might include users whose full name
happens to be a numeric string.
* find(String) filters out inactive users, but findAll doesn't.
These issues at best lead to head-scratching and at worst to subtle but
potentially serious bugs like failing to respect account visibility.
What we really need is a more intentionally designed interface. This
change starts a new implementation, temporarily named AccountResolver2;
once it reaches parity with AccountResolver and all callers have been
migrated, it will be renamed.
The main benefit of the new implementation is that it separates the
_searching_ semantics from the _result_ semantics. For a given input, we
always produce a single Result object, which encapsulates zero or more
results. The Result class has several view methods which allow callers
to convert an arbitrarily sized list of AccountStates to another type
that they may find more useful, like a set of Account.Ids or a single,
unique account. This also provides us a single place to produce useful
exception messages when an account is not unique.
The other major change from AccountResolver is that the new
implementation consistently filters out invisible and inactive users.
We'll explain these one at a time.
First, AccountResolver2 never returns accounts that are not visible to
the calling user. The purpose of this class is to parse
end-user-provided strings to accounts in various contexts (REST API
URLs, REST API inputs, search queries). The search semantics are
intentionally somewhat fuzzy specifically because the input is
user-provided. Since it's user-provided, we can safely assume that there
is an end user on the other end for whom we can check visibility. It's
true that some Gerrit code needs to bypass account visibility, but that
code shouldn't use the AccountResolver interface; it should use a
lower-level interface like InternalAccountQuery.
Next, AccountResolver2 almost always filters out inactive users, except
in one specific case: when the input is a number. This reflects our
experience providing support for customers who are very confused when
their REST API operations fail because they specified some account
string that they thought was unique but that actually matched some
inactive accounts. The idea is that in normal usage, users should never
need to care about inactive accounts--it should be as if they don't
exist. In certain administrative contexts, it is necessary to care about
them, for example an admin re-enabling an account. In these contexts, an
admin can figure out the unambiguous account ID, for example by
searching with [-is:active]. This is a slightly backwards-incompatible
change (although the documentation was never clear on what type of
filtering could occur), but based on our experience, the expected
reduction in user confusion and consequent support costs makes the
change worth it. We should also be able to provide a useful exception
message listing inactive accounts, now that we have a common place to
put the logic.
Taken together, the filtering semantics mean that an end user can't
distinguish between the case where AccountResolver2 produces no results
and where it only produced inactive or invisible results. This is
considered a feature.
In terms of implementation, switch to a more declarative style of
defining the search semantics, based around the concept of a single
Searcher matching a single type of input. Searchers are then iterated
over in a loop, separating the definition of lookup semantics from the
precedence and short-circuiting behavior. This implementation allows us
to preserve the previous search implementation quite closely, while
making it crystal-clear what are the list of supported formats, and
where the intent is to short-circuit and where not. It additionally
makes it possible to write small tests of just the looping logic,
without having to build up all the dependencies required to do actual
account resolution.
Change-Id: I08e64c647378a9275e425b68c39122984ee37e77