gerrit

Author	SHA1	Message	Date
Dave Borowitz	b82fbcb584	IndexConfig: Configure maxLimit from Config An index implementation may not have an inherent upper bound to the limit, but administrators may nonetheless want to restrict the maximum query size at the server level for performance reasons. Change-Id: Ic1f506e9857413ec860fe97785bff9e76b817db0	2015-04-23 11:30:13 -07:00
David Pursehouse	805a2da731	Move QueryBuilder.badFieldType to FieldType class This allows the method to be used without having a dependency on the gerrit-lucene module, which in turn has a dependency on a specific version of Lucene. Change-Id: I57e92c71260a51327ed706bd65f8f9e384e15666	2015-04-23 15:01:26 +09:00
Dave Borowitz	62e67bede7	SitePaths: Convert site_path itself to Path Also convert all remaining references that expect to resolve site_path to a File, notably LocalDiskRepositoryManager. Change-Id: I5ef6f6e3d7b3fe0418a470b079b4cc04bc9be4f6	2015-03-03 11:48:05 +09:00
Dave Borowitz	6b37799da1	SitePaths: Convert index_dir to Path Change-Id: I3e8234845bbde6aab9bf4f7bda3944187539cffb	2015-03-03 11:19:22 +09:00
Dave Borowitz	b4de292f8a	Add index configuration to specify maximum possible limit Implementations may not support returning up to Integer.MAX_VALUE results, so allow this to be specified per implementation. This is distinct from the configured per-user limits, as it represents a fundamental limitation of the underlying index backend rather than an administrative decision. Encapsulate this in its own class that can be injected into QueryProcessor, and use this in the limit computation. Change-Id: I7d1093a2aa3800ff5437c6b1c78b12d73b17db05	2015-01-20 11:04:08 -08:00
Dave Borowitz	90ee0cb3f2	Remove all references to deleted change fields We had lots of hard-coded logic to do slightly different things depending on which version of a particular field was present in a schema. For every field that has now been completely removed, we can remove references including these bits of extra logic. The difference is especially striking in the sortkey query path. Note that queries containing sortkeys have already been failing since servers upgraded to schema version 8 (2.9), which no longer uses this field. Change-Id: Ib1fbc9f3aa45bf068b7869ca3aada40dbe4cb1bb	2014-12-19 12:21:08 -08:00
Dave Borowitz	21485c8115	Simplify ChangeIndex by removing delete(ChangeData) Since indexing is generally done after any modification operations, using a deleted ChangeData when indexing a deleted change is just asking for trouble; most ChangeData methods (reasonably) assume that the change exists. Instead, just use the form of the method that passes in the ID. Change the method that takes an integer ID to take a Change.Id object. In almost all cases we already have one of these handy, and this is more consistent with public interfaces elsewhere in Gerrit. A fortunate side effect is that async deletes should happen slightly faster, since we no longer need to open a DB connection and produce a ChangeData in the delete codepath. Change-Id: I353522b52188de11353ca26a7ba90a2ca462e404	2014-11-12 10:35:02 -08:00
Dave Borowitz	552293b20d	Merge branch 'stable-2.10' * stable-2.10: Remove uneeded dependency in ChangesCollection Delete a change from the index when it is not in the DB Change-Id: I02a4773dd581c29704a7cc45c5225258d657f836	2014-11-12 10:13:32 -08:00
Olga Grinberg	903be04b02	Delete a change from the index when it is not in the DB If for some reason the secondary index is out of date, i.e. the change was deleted from the database but wasn't deleted from the secondary index, it was impossible to re-index (remove) that change. Add logic to automatically remove the change from the secondary index if this change is requested through the Change REST API endpoint and doesn't exist in the database. If a user click on search result from a stale change, he will a get a 404 page and the change will be removed from the index. To fix the problem without opening the change page, run a command like: curl --user <user name> -X POST http://<host>:<port>/a/changes/<change id>/index Issue: 2996 Change-Id: I1db5373e31585e99c5f45e05274d86d69b4f24e6	2014-11-12 17:03:41 +00:00
Dave Borowitz	8b42ec5bd5	Turn on many more Eclipse warnings, and fix them - Warn on empty statements, e.g. "for (;;);". These may be typos and are easily replaced by "for (;;) {}" which is more explicit. - Warn on field hiding. This allows cleanup of many acceptance test members, at the cost of a couple of renames and the occasional suppression (when the field is in a public nested enum that shadows a public constant). - Warn on unnecessary casts. - Warn on unused declared thrown exceptions. In addition to reducing method signature length and number of imports, this also eliminated some impossible catch blocks. - Warn on missing @Override annotations. - Warn on unused parameters. This is likely the most controversial, as a few relatively common patterns require unused parameters in a way that Eclipse can't ignore. However, it also resulted in cleanup of a lot of unnecessary injections and method parameters, so I think the cost was worth it. Change-Id: I7224be8b1c798613a127c88507e8cce400679e5d	2014-10-29 15:00:17 -07:00
Saša Živkov	1c48a7ee1c	Revert "Use javax.inject.Inject instead of com.google.inject.Inject" This reverts commit 5db3bba3e2d85662bacedc5d2f215fff7d8d2805. I was too fast to submit 5db3bba. Besides forgetting to submit the referenced changes in some submodules some people reported issues with the interception of the javax.inject.Inject in some JEE containers. Change-Id: I931ad329d2e7be6f6ce804b8395489a021c8240b	2014-09-30 10:53:50 +02:00
Dave Borowitz	5db3bba3e2	Use javax.inject.Inject instead of com.google.inject.Inject The Guice team is discouraging use of its specific annotation where possible, to increase JSR330 compliance. Leave optional injection alone for now, which needs to be manually replaced with OptionalBinder. Change-Id: I4f53a518ba6f36fd67af12f3540dc44cbad07ff8	2014-09-29 16:36:18 -07:00
Dave Borowitz	90f9447e18	Solr: Remove deprecated Analyzer constructor call The default changed to use the current version, with an optional call to setVersion() if the version should be changed. In this case it should not be. Change-Id: Id87515185ccf7d640ad80dd2972e0d13617df8a3	2014-09-05 15:29:41 -07:00
David Pursehouse	7bf7e49621	Remove unused insert method from secondary index interface The `insert` method is not used anywhere. The `replace` method is used instead. Change-Id: Ie47253149cacde9a0824e4dee4d80313acd33188	2014-08-12 23:52:06 +00:00
Shawn Pearce	4d42596868	Merge changes Iac79af54,Ief41eccc,Iae52ade9,I0bd9eabb,I2ef5b9e1 * changes: Solr index requires SolrCloud rather than plain Solr Handle exceptions during reindexing InitIndex: Allow user to configure the Solr index URL Solr index URL should be configured under the main "index" section Improve information message about rebuilding the index	2014-02-13 17:02:23 +00:00
Dave Borowitz	86caf9ec23	Support pagination using offsets instead of sortkey We cannot guarantee that secondary index implementations (particularly the one used by googlesource.com) can efficiently paginate based on the sortkey, and in particular can both sort and reverse sort on the same field. This particular performance issue notwithstanding, searching is now generally fast enough that it is feasible just to skip the first N results when doing pagination. Add an option S= to QueryChanges to support starting at a nonzero offset. Note that we still have to fetch n+S results from the index in order to do visibility filtering, since if we skipped at the index layer we wouldn't know how many of the skipped elements would have matched later filtering. Drop the sortkey token suffix from the legacy anchor parser; there is no reliable way to convert it to an offset, and it's unlikely that users have permalinks to specific sortkey values. On the server side, remove the sortkey field from the current index version, and use pagination by offset instead of sortkey in the new version only. Continue to support sortkey queries against old index versions, to support online reindexing while clients have an older JS version. Change-Id: I6a82965db02c4d534e2107ca6ec91217085124d6	2014-02-12 13:51:26 -08:00
David Pursehouse	ac88c36c5d	Solr index URL should be configured under the main "index" section Configuration of the Solr index URL must currently be done in a subsection, as follows: [index] type = SOLR [index "solr"] url = 127.0.0.1 Subsections are also being used for configuration of specific indexes like the "changes_open" and "changes_closed" subsections that are used for Lucene indexes, and are documented as "index.name.xyz". Having the Solr URL configration under a similar subsection may cause confusion, and it would be more intuitive to configure the URL under the main index section, like: [index] type = SOLR url = 127.0.0.1 Subsections are being used for configuration of specific indexes like the "changes_open" and "changes_closed" settings that can be specified for Lucene indexes which are documented as "index.name.xyz" Also add missing documentation of the URL parameter. Change-Id: I0bd9eabbdbfa29241688c738aa5d213e726c90b0	2014-02-06 14:05:00 +09:00
Dave Borowitz	86d070f83e	Eliminate potential NPEs with ChangeData.getChange() Callers were supposed to be calling hasChange() to determine whether getChange() would return null, but none were. Remove this error-prone pair of methods and just use change(), which includes lazy loading. There are a few places where we need to catch an additional OrmException but these are easy enough to fix. Change-Id: I23335e362715f59e2c093ffec88427739ff0cffc	2014-01-03 09:10:45 -08:00
Dave Borowitz	569f2516db	Use Lucene's QueryBuilder for building full-text queries FuzzyQuery was a mistake; it uses edit distance to find terms in the index close to the provided search term. This produces bizarre results for queries like "message:1234". Instead, use Lucene's QueryBuilder with an analyzer to convert a full-text search word/phrase into a phrase query. Add some tests for full-text matching behavior on numbers, which should hopefully not be too dependent on specific Lucene behavior. Coincidentally, a copy-paste error in the byMessageExact test prevented this poor behavior from showing up in tests sooner. Change-Id: I384f74f1455d0433433a27f880204ac8ecbf93da	2013-12-30 08:50:25 -08:00
Dave Borowitz	7547233449	Stop passing ReviewDb, etc. to ChangeData methods Many methods that are intended for lazy initialization take a ReviewDb, and some take other helper objects such as GitRepositoryManager. This achieves the intended lazy-loading behavior but requires a lot of injection of dependencies into callers, and may become untenable as lazy loading of ChangeData requires more helpers. Instead, since after all this is Java, construct ChangeData with a Factory. This is subtle because it is used from a lot of places that need control over which ReviewDb handle to pass in, so the factory methods still need to take a db. ChangeData objects are still intended (i.e. safe) for use only by one thread, so having a single ReviewDb instance stored for the lifetime of the ChangeData should be fine. We just need to be careful about which we pass in, particularly in a place like ChangeIndexer. As a side effect, clean up some other injection, particularly in the predicate class hierarchy. Change-Id: I52e1eb2a76788c12dd95767e89095ab80df7e1cc	2013-12-20 15:54:18 -08:00
Dave Borowitz	7a2d8d4e61	Fix last line of license headers This was due to a copy/paste error into my editor's snippet library, and it managed to sneak into 20 classes without anybody noticing. Change-Id: Ifeea9ab5737acbed1b93519b65709f1c0f48684b	2013-10-14 17:40:17 +00:00
Dave Borowitz	2ae0171598	Log and skip indexed fields that produce an error Change-Id: I40b8598b3e857faec7269bc380ac5ecbb78e2df7	2013-10-07 17:35:33 -07:00
Dave Borowitz	fc1bba54fd	Don't return futures from ChangeIndex methods In I1b3c5ba0 ChangeIndex.replace was modified to return a future, but ChangeIndexer.Task was not changed to call those futures before returning. In retrospect this was a mistake anyway. The only path by which these methods are currently called is from ChangeIndexerImpl, which turns around and wraps them in a Callable and submits them to the same executor, tying up twice as many threads as necessary. Similarly, to avoid additional fanout in ChangeIndexerImpl, we should not write to multiple index versions in parallel, so remove the TODO to try that. The current behavior slows down individual writes for the relatively short period of an online version upgrade, but that is acceptable given the improved thread utilization overall. Change-Id: I5fff470214ecd69a261dec1c10b9f7bdd0a1907e	2013-10-02 12:00:42 -07:00
David Pursehouse	71c198d662	Remove unnecessary @SuppressWarnings("deprecation") in SolrChangeIndex Change-Id: Ie14819c8df33bcf76ed48ff6c4ffc389b1e217e7	2013-09-26 11:27:51 +09:00
Dave Borowitz	2e0f89d814	Index full sortkey field in secondary index Using the sort key of the last element for pagination only works as long as every sort key is unique. This is true for the general definition of this field in the Change object, which includes the unique change ID at the end of the hex string. Previously, we were incorrectly truncating the change ID off, resulting in many changes in the same sort key bucket and thus broken pagination. Having two different definitions of sort key in the same running server makes it a bit ugly to handle SortKeyPredicates, since the definition of min/max value is now schema dependent, but at least we can keep the same field name. Don't @Deprecate the new SORTKEY field. We were originally hoping to remove this field and depend only on the UPDATED field (with the change ID as tiebreaker), but as long as this field is used for pagination, we have to keep it around. This is because we can't be sure a secondary index will be able to express a query like "(field X, field Y) > (N, M)" to allow us to restart a query in the middle of an UPDATED bucket. We may still decide to scrap the current pagination system but that will probably not happen until we kill the SQL index code. Change-Id: Icb760dbacd01939e5e4936ef87165b6dddcacdc0	2013-09-24 12:48:21 -07:00
Dave Borowitz	8830bb53a4	Return last N results by sortkey when using sortkey_after In the SQL index implementation, sortkey_after queries are always rewritten to use one of the ChangeAccess.*Prev() methods, which order results by sort key ascending. Update the Lucene implementations to match this sort order. Specify in the ChangeIndex javadoc that this sort order is required in such cases, since other code in QueryProcessor depends on this assumption. Change-Id: Ie35d2c6bd07963db6d33145a83493d12c6a4289d	2013-09-24 12:48:21 -07:00
Dave Borowitz	76e7b42490	Push limit term into secondary index queries Change-Id: Id4ec5fe0375229ff8b82ec46fbe300c78098dbbe	2013-08-09 01:14:53 +00:00
Shawn Pearce	eb7193b169	Fix incorrect reviewedb dependency in solr This should depend on -server not -client. Change-Id: I44880c5ee03bbb9e1739c6365ae8cefbcc3cee6d	2013-07-30 00:46:15 -07:00
Dave Borowitz	2361cefe17	Factor out batch indexing logic into a ChangeBatchIndexer Change-Id: Ief166fadc61f76a5f4ffdb879322e9883040bdd6	2013-07-08 13:22:54 -07:00
Dave Borowitz	d103b2b61c	Scan index directory to determine Lucene versions Mark index versions as "ready" when they are fully indexed and the server is not running (i.e. from Reindex). By default, search from the most recent ready index version, and write to both the most recent ready version and (if different) the most recent known version. At server startup, mark all versions except those we are about to start writing to as not ready. Change-Id: Icf42a3eb27b0445899300d60941cd701a8072d41	2013-07-08 13:22:54 -07:00
Dave Borowitz	97a01dbb05	Replace Reindex --dry-run with --output The line of what was a "dry run" was a bit arbitrary, doing all the Gerrit processing (including potential side effects like writing to persistent caches) but not all the index-specific processing, which could still be useful for evaluation or debugging purposes. Instead, provide an output base so administrators can test the indexing behavior without operating on production index data. For Lucene, this flag is interpreted as a path to the index directory; for Solr, it is a prefix added to the collection names. Change-Id: Ic4808d1e3467a780a818ac7d1c89de13610da630	2013-07-08 13:22:54 -07:00
Dave Borowitz	1d2a998298	Bind index versions dynamically We want to support a single index version for reading and multiple index versions for writing. Wrap these in an IndexCollection with getSearchIndex and getWriteIndexes methods. Still bind a single implementation at startup. Change-Id: Ibc4dbbeba064cde68f2585126d7708db5a2b3410	2013-07-08 13:22:54 -07:00
Dave Borowitz	0bd69febcb	Rewrite schema versioning in preparation for multiple versions Store indexes in versioned directories, and don't consider open/closed to have separate versions. Extract a simple base Schema class with a version and a list of fields, which is analogous to the DB's SchemaVersion. Construct indexes using specific versions. Change-Id: I11e8e3a78cd62b1a01aa122bfd747ed5295fe057	2013-07-08 13:22:54 -07:00
Shawn Pearce	9c4720405b	Don't run SQL rewrites when secondary index is enabled If the secondary index is enabled on a server the index handles all queries. There is no reason to run the SQL rewrites or to leave predicates in the predicate tree without being wrapped by the IndexedChangeQuery predicate. Change-Id: I18defed2a5a6003b9ae6ba227e29864b5c912e7b	2013-06-28 11:14:44 -07:00
Ahaan Ugale	404c8246bc	Add secondary index implementation using SolrCloud SolrCloud can be used instead of Lucene by adding "type = SOLR" under [index] and "url = <zookeeper-url>" under [index "solr"] in gerrit.config. Change-Id: I0ff8579c5e23c58b16f3605bc20eba4e80fb40fc	2013-06-27 13:52:39 -07:00

35 Commits