gerrit

Author	SHA1	Message	Date
Dave Borowitz	86d070f83e	Eliminate potential NPEs with ChangeData.getChange() Callers were supposed to be calling hasChange() to determine whether getChange() would return null, but none were. Remove this error-prone pair of methods and just use change(), which includes lazy loading. There are a few places where we need to catch an additional OrmException but these are easy enough to fix. Change-Id: I23335e362715f59e2c093ffec88427739ff0cffc	2014-01-03 09:10:45 -08:00
Dave Borowitz	569f2516db	Use Lucene's QueryBuilder for building full-text queries FuzzyQuery was a mistake; it uses edit distance to find terms in the index close to the provided search term. This produces bizarre results for queries like "message:1234". Instead, use Lucene's QueryBuilder with an analyzer to convert a full-text search word/phrase into a phrase query. Add some tests for full-text matching behavior on numbers, which should hopefully not be too dependent on specific Lucene behavior. Coincidentally, a copy-paste error in the byMessageExact test prevented this poor behavior from showing up in tests sooner. Change-Id: I384f74f1455d0433433a27f880204ac8ecbf93da	2013-12-30 08:50:25 -08:00
Dave Borowitz	7547233449	Stop passing ReviewDb, etc. to ChangeData methods Many methods that are intended for lazy initialization take a ReviewDb, and some take other helper objects such as GitRepositoryManager. This achieves the intended lazy-loading behavior but requires a lot of injection of dependencies into callers, and may become untenable as lazy loading of ChangeData requires more helpers. Instead, since after all this is Java, construct ChangeData with a Factory. This is subtle because it is used from a lot of places that need control over which ReviewDb handle to pass in, so the factory methods still need to take a db. ChangeData objects are still intended (i.e. safe) for use only by one thread, so having a single ReviewDb instance stored for the lifetime of the ChangeData should be fine. We just need to be careful about which we pass in, particularly in a place like ChangeIndexer. As a side effect, clean up some other injection, particularly in the predicate class hierarchy. Change-Id: I52e1eb2a76788c12dd95767e89095ab80df7e1cc	2013-12-20 15:54:18 -08:00
Dave Borowitz	7a2d8d4e61	Fix last line of license headers This was due to a copy/paste error into my editor's snippet library, and it managed to sneak into 20 classes without anybody noticing. Change-Id: Ifeea9ab5737acbed1b93519b65709f1c0f48684b	2013-10-14 17:40:17 +00:00
Dave Borowitz	2ae0171598	Log and skip indexed fields that produce an error Change-Id: I40b8598b3e857faec7269bc380ac5ecbb78e2df7	2013-10-07 17:35:33 -07:00
Dave Borowitz	fc1bba54fd	Don't return futures from ChangeIndex methods In I1b3c5ba0 ChangeIndex.replace was modified to return a future, but ChangeIndexer.Task was not changed to call those futures before returning. In retrospect this was a mistake anyway. The only path by which these methods are currently called is from ChangeIndexerImpl, which turns around and wraps them in a Callable and submits them to the same executor, tying up twice as many threads as necessary. Similarly, to avoid additional fanout in ChangeIndexerImpl, we should not write to multiple index versions in parallel, so remove the TODO to try that. The current behavior slows down individual writes for the relatively short period of an online version upgrade, but that is acceptable given the improved thread utilization overall. Change-Id: I5fff470214ecd69a261dec1c10b9f7bdd0a1907e	2013-10-02 12:00:42 -07:00
David Pursehouse	71c198d662	Remove unnecessary @SuppressWarnings("deprecation") in SolrChangeIndex Change-Id: Ie14819c8df33bcf76ed48ff6c4ffc389b1e217e7	2013-09-26 11:27:51 +09:00
Dave Borowitz	2e0f89d814	Index full sortkey field in secondary index Using the sort key of the last element for pagination only works as long as every sort key is unique. This is true for the general definition of this field in the Change object, which includes the unique change ID at the end of the hex string. Previously, we were incorrectly truncating the change ID off, resulting in many changes in the same sort key bucket and thus broken pagination. Having two different definitions of sort key in the same running server makes it a bit ugly to handle SortKeyPredicates, since the definition of min/max value is now schema dependent, but at least we can keep the same field name. Don't @Deprecate the new SORTKEY field. We were originally hoping to remove this field and depend only on the UPDATED field (with the change ID as tiebreaker), but as long as this field is used for pagination, we have to keep it around. This is because we can't be sure a secondary index will be able to express a query like "(field X, field Y) > (N, M)" to allow us to restart a query in the middle of an UPDATED bucket. We may still decide to scrap the current pagination system but that will probably not happen until we kill the SQL index code. Change-Id: Icb760dbacd01939e5e4936ef87165b6dddcacdc0	2013-09-24 12:48:21 -07:00
Dave Borowitz	8830bb53a4	Return last N results by sortkey when using sortkey_after In the SQL index implementation, sortkey_after queries are always rewritten to use one of the ChangeAccess.*Prev() methods, which order results by sort key ascending. Update the Lucene implementations to match this sort order. Specify in the ChangeIndex javadoc that this sort order is required in such cases, since other code in QueryProcessor depends on this assumption. Change-Id: Ie35d2c6bd07963db6d33145a83493d12c6a4289d	2013-09-24 12:48:21 -07:00
Dave Borowitz	76e7b42490	Push limit term into secondary index queries Change-Id: Id4ec5fe0375229ff8b82ec46fbe300c78098dbbe	2013-08-09 01:14:53 +00:00
Shawn Pearce	eb7193b169	Fix incorrect reviewedb dependency in solr This should depend on -server not -client. Change-Id: I44880c5ee03bbb9e1739c6365ae8cefbcc3cee6d	2013-07-30 00:46:15 -07:00
Dave Borowitz	2361cefe17	Factor out batch indexing logic into a ChangeBatchIndexer Change-Id: Ief166fadc61f76a5f4ffdb879322e9883040bdd6	2013-07-08 13:22:54 -07:00
Dave Borowitz	d103b2b61c	Scan index directory to determine Lucene versions Mark index versions as "ready" when they are fully indexed and the server is not running (i.e. from Reindex). By default, search from the most recent ready index version, and write to both the most recent ready version and (if different) the most recent known version. At server startup, mark all versions except those we are about to start writing to as not ready. Change-Id: Icf42a3eb27b0445899300d60941cd701a8072d41	2013-07-08 13:22:54 -07:00
Dave Borowitz	97a01dbb05	Replace Reindex --dry-run with --output The line of what was a "dry run" was a bit arbitrary, doing all the Gerrit processing (including potential side effects like writing to persistent caches) but not all the index-specific processing, which could still be useful for evaluation or debugging purposes. Instead, provide an output base so administrators can test the indexing behavior without operating on production index data. For Lucene, this flag is interpreted as a path to the index directory; for Solr, it is a prefix added to the collection names. Change-Id: Ic4808d1e3467a780a818ac7d1c89de13610da630	2013-07-08 13:22:54 -07:00
Dave Borowitz	1d2a998298	Bind index versions dynamically We want to support a single index version for reading and multiple index versions for writing. Wrap these in an IndexCollection with getSearchIndex and getWriteIndexes methods. Still bind a single implementation at startup. Change-Id: Ibc4dbbeba064cde68f2585126d7708db5a2b3410	2013-07-08 13:22:54 -07:00
Dave Borowitz	0bd69febcb	Rewrite schema versioning in preparation for multiple versions Store indexes in versioned directories, and don't consider open/closed to have separate versions. Extract a simple base Schema class with a version and a list of fields, which is analogous to the DB's SchemaVersion. Construct indexes using specific versions. Change-Id: I11e8e3a78cd62b1a01aa122bfd747ed5295fe057	2013-07-08 13:22:54 -07:00
Shawn Pearce	9c4720405b	Don't run SQL rewrites when secondary index is enabled If the secondary index is enabled on a server the index handles all queries. There is no reason to run the SQL rewrites or to leave predicates in the predicate tree without being wrapped by the IndexedChangeQuery predicate. Change-Id: I18defed2a5a6003b9ae6ba227e29864b5c912e7b	2013-06-28 11:14:44 -07:00
Ahaan Ugale	404c8246bc	Add secondary index implementation using SolrCloud SolrCloud can be used instead of Lucene by adding "type = SOLR" under [index] and "url = <zookeeper-url>" under [index "solr"] in gerrit.config. Change-Id: I0ff8579c5e23c58b16f3605bc20eba4e80fb40fc	2013-06-27 13:52:39 -07:00

18 Commits