4d295284e2
When Gerrit is shutdown we must wait for all currently running index threads to finish because otherwise these index events are lost and the index becomes stale. There are 2 executors for indexing, one for interactive indexing and one for batch indexing. A first attempt to fix this was done by change I7bda130585, but it broke the site after online reindexing (see issue 4618) and hence it was reverted by change I6ec90eb733. Change Id332ec0215 fixed the issue partly. In addition to the Index#close() method which closes the index it added a Index#stop() method that stops the (interactive) index executor and awaits the termination of all running index threads. The Index#stop() method is only called when Gerrit is shutdown, while Index#close() is also invoked when online reindexing is done. There are several issues with this fix: 1. It only cared about the interactive index executor, but not about the batch index executor. 2. The IndexCollection#stop() method which is called on Gerrit shutdown missed to call Index#stop() on the write index if it was the same Index instance as the read index. 3. Although documented developers were confused about Index having both a close() and a stop() method and there was a danger of calling stop() accidentally and thus break all further indexing. 4. Closing the interactive index executor when stopping the change index assumed that the inactive index executor is only used by the change index. This assumption is currently true, but only because the other indexes (account, group, project) missed to use this executor. 5. The interactive index executor was only stopped when Lucene was used, but not when ElasticSearch was used. Due to 1. and 2. all query tests (e.g. LuceneQueryChangesTest) were leaking threads. Each test case accumulated more running (interactive and batch) index threads. These threads consumed memory and in the end this possibly resulted in an OutOfMemoryError (only for running LuceneQueryChangesTest on Mac, see issue 7611). To fix this IndexModule adds a new LifecycleListener that is responsible for shutting down both index executors: - Both index executors are now shutdown (interactive and batch). - The stop() method of LifecycleListeners is only invoked when Gerrit is shutdown and it's unlikely to be triggered during normal Gerrit runtime (e.g. when online reindexing is done). - There is exactly one place that shuts down the index executors (as opposed to n indexes trying to take care of it). This makes sure that the executors can be shared across indexes. - The stop() method from Index is removed so that there is less confusion about stop() vs. close(). One important thing is that the executors are shutdown before closing the indexes. This ensures that all writes have been performed in-memory before the final flush to disk is done. This means that the LifecycleListener to stop the executors must be registered after the LifecycleListeners that close the indexes (on Gerrit start the LifecycleListeners are invoked in the order in which they are registered, on shutdown of Gerrit the order is reversed). IndexModule also allows to provide external executors for the interactive and batch indexing (instead of creating new executors). In this case IndexModule doesn't register a LifecycleListener to shutdown the executors since they are externally managed. Providing external executors is only used by Google for their multi-tenant setup. In this setup the executors are global singletons in the process that are shared by many sites and shutting down one site must not shut down the global executors. This fixes the OutOfMemoryError that was observed on Mac when running the LuceneQueryChangesTest (see issue 7611). Bug: Issue 7611 Change-Id: I093fb19b56aba90fbb7579f0e59da435c4c1b702 Signed-off-by: Edwin Kempin <ekempin@google.com> |
||
---|---|---|
.. | ||
src | ||
BUILD |