In WorkQueue, explicitly cancel Runnables that are Futures.

In LuceneChangeIndex, we schedule the future by calling (essentially)

  MoreExecutors.listeningDecorator(threadPool).submit()

this returns a TrustedListenableFutureTask, a Future implemented by
Guava, and we wait on this one.

The implementation passes this off to ScheduledThreadPoolExecutor for
running. This interprets it as a Runnable in
AbstractExecutorService#submit(), and a Runnable has no call surface
for cancellation. This means that the Guava future is never canceled
when the corresponding ScheduledFutureTask is canceled.
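
The lost cancellation described above can be reproduced without Guava.
The following is only a sketch under assumptions: a plain
java.util.concurrent.FutureTask stands in for Guava's
TrustedListenableFutureTask, and the class and method names are
hypothetical. It cancels the wrapper returned by submit() and checks
whether the wrapped future noticed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.ScheduledThreadPoolExecutor;

class LostCancellationDemo {
  /** Returns {outerCancelled, innerCancelled} after cancelling only the pool's wrapper. */
  static boolean[] run() throws Exception {
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
    CountDownLatch release = new CountDownLatch(1);
    try {
      // Occupy the single worker so that the next task stays queued.
      pool.submit(() -> {
        release.await();
        return null;
      });

      // Assumption: FutureTask plays the role of the Guava future
      // that the caller is waiting on.
      FutureTask<String> inner = new FutureTask<>(() -> "indexed");

      // submit(Runnable) sees 'inner' only as a Runnable and wraps it
      // in the pool's own future.
      Future<?> outer = pool.submit(inner);

      // Cancel the pool's wrapper, as happens to pending work.
      outer.cancel(false);

      return new boolean[] {outer.isCancelled(), inner.isCancelled()};
    } finally {
      release.countDown();
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) throws Exception {
    boolean[] r = run();
    System.out.println("outer cancelled: " + r[0]); // true
    System.out.println("inner cancelled: " + r[1]); // false
  }
}
```

A caller blocked in inner.get() would wait forever here, since the
inner future is neither run nor canceled.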

Server#stop shuts down all thread pools. Since the pools are created
with

    setExecuteExistingDelayedTasksAfterShutdownPolicy(false)

all pending work is canceled.
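
The effect of that policy can be illustrated in isolation. This is a
minimal sketch, not Gerrit's code: the class name is hypothetical, and a
one-hour delay is an assumption chosen so that the task is certainly
still pending when the pool is shut down.

```java
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class ShutdownCancelsPendingDemo {
  /** Returns whether a still-pending delayed task is cancelled by shutdown(). */
  static boolean pendingTaskCancelledOnShutdown() {
    ScheduledThreadPoolExecutor pool = new ScheduledThreadPoolExecutor(1);
    pool.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);

    // The task is not due for an hour, so it is still queued below.
    ScheduledFuture<?> pending = pool.schedule(() -> {}, 1, TimeUnit.HOURS);

    // With the policy set to false, shutdown() cancels queued delayed tasks.
    pool.shutdown();
    return pending.isCancelled();
  }

  public static void main(String[] args) {
    System.out.println("pending task cancelled: " + pendingTaskCancelledOnShutdown());
  }
}
```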

The problem would trigger in the following circumstances:

 * For tests that schedule two or more ref updates at the end of the
   test. Since the interactive pool has size 1, that could cause a
   piece of work to be delayed.

 * Executors are shut down in creation order, which is random. It
   would only trigger if the interactive pool was shut down before
   the batch pool.

The problem could be reliably reproduced by building with Bazel,
setting shard_count=30 on
//gerrit-acceptance-tests/src/test/java/com/google/gerrit/acceptance/rest/project:rest_project,
and running shard 10 (which exhibited the problem) 50-way parallel on
a 12 HT-core system.

Things to note:

* If we have to use ListenableFutures, then it would be nice if we
  could use an Executor that actually works well with Guava.

* Server#stop discards pending work. In particular, work scheduled by
  ReindexAfterUpdate can be discarded, potentially leaving the index
  inconsistent.

* ReindexAfterUpdate runs in the batch executor, but then schedules
  its search work on the interactive executor, which is gratuitously
  parallel.

* WorkQueue.Executor is a lot of cognitive overhead for providing a
  list of processes. Can't administrators just run jstack?

* A randomized creation order for thread pools causes randomized
  shutdown order, making problems harder to reproduce.

Bug: Issue 4466
Change-Id: I55c3b85c66433de7ee9e037fc243abe705080bbc
Han-Wen Nienhuys 2016-10-13 17:48:42 +02:00 committed by Jonathan Nieder
parent 10b05e7285
commit 405a8f53d3

@@ -38,6 +38,7 @@ import java.util.concurrent.CopyOnWriteArrayList;
 import java.util.concurrent.Delayed;
 import java.util.concurrent.ExecutionException;
 import java.util.concurrent.Executors;
+import java.util.concurrent.Future;
 import java.util.concurrent.RunnableScheduledFuture;
 import java.util.concurrent.ScheduledThreadPoolExecutor;
 import java.util.concurrent.ThreadFactory;
@@ -356,6 +357,15 @@ public class WorkQueue {
             ((CanceledWhileRunning) runnable).setCanceledWhileRunning();
           }
         }
+        if (runnable instanceof Future<?>) {
+          // Creating new futures eventually passes through
+          // AbstractExecutorService#schedule, which will convert the Guava
+          // Future to a Runnable, thereby making it impossible for the
+          // cancellation to propagate from ScheduledThreadPool's task back to
+          // the Guava future, so kludge it here.
+          ((Future<?>) runnable).cancel(mayInterruptIfRunning);
+        }
         executor.remove(this);
         executor.purge();
         return true;
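
The idea behind the change can be sketched as a standalone program. This
is a hedged illustration, not the WorkQueue implementation:
ForwardingTask and ForwardingPool are hypothetical names, a plain
FutureTask stands in for the Guava future, and the remove()/purge()
bookkeeping from the real cancel() is left out. The wrapper is installed
through ScheduledThreadPoolExecutor#decorateTask and forwards cancel()
to the wrapped Runnable when that Runnable is itself a Future.

```java
import java.util.concurrent.Delayed;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;
import java.util.concurrent.RunnableScheduledFuture;
import java.util.concurrent.ScheduledThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ForwardingCancelDemo {
  /** Hypothetical wrapper: delegates to the pool's task, forwards cancel(). */
  static final class ForwardingTask<V> implements RunnableScheduledFuture<V> {
    private final Runnable runnable;
    private final RunnableScheduledFuture<V> task;

    ForwardingTask(Runnable runnable, RunnableScheduledFuture<V> task) {
      this.runnable = runnable;
      this.task = task;
    }

    @Override
    public boolean cancel(boolean mayInterruptIfRunning) {
      if (!task.cancel(mayInterruptIfRunning)) {
        return false;
      }
      // Propagate the cancellation to the future hidden inside the Runnable.
      if (runnable instanceof Future<?>) {
        ((Future<?>) runnable).cancel(mayInterruptIfRunning);
      }
      return true;
    }

    @Override public void run() { task.run(); }
    @Override public boolean isPeriodic() { return task.isPeriodic(); }
    @Override public long getDelay(TimeUnit unit) { return task.getDelay(unit); }
    @Override public int compareTo(Delayed o) { return task.compareTo(o); }
    @Override public boolean isCancelled() { return task.isCancelled(); }
    @Override public boolean isDone() { return task.isDone(); }

    @Override
    public V get() throws InterruptedException, ExecutionException {
      return task.get();
    }

    @Override
    public V get(long timeout, TimeUnit unit)
        throws InterruptedException, ExecutionException, TimeoutException {
      return task.get(timeout, unit);
    }
  }

  /** Hypothetical pool that wraps every Runnable task in a ForwardingTask. */
  static final class ForwardingPool extends ScheduledThreadPoolExecutor {
    ForwardingPool(int size) {
      super(size);
    }

    @Override
    protected <V> RunnableScheduledFuture<V> decorateTask(
        Runnable runnable, RunnableScheduledFuture<V> task) {
      return new ForwardingTask<>(runnable, task);
    }
  }

  /** Returns whether the inner future is cancelled when shutdown() drops pending work. */
  static boolean innerCancelledAfterShutdown() {
    ForwardingPool pool = new ForwardingPool(1);
    pool.setExecuteExistingDelayedTasksAfterShutdownPolicy(false);

    // Assumption: FutureTask stands in for the Guava future.
    FutureTask<String> inner = new FutureTask<>(() -> "indexed");
    pool.schedule(inner, 1, TimeUnit.HOURS);

    pool.shutdown();
    return inner.isCancelled();
  }

  public static void main(String[] args) {
    System.out.println("inner cancelled: " + innerCancelledAfterShutdown());
  }
}
```

With the forwarding in place, a caller waiting on the inner future gets
a CancellationException instead of blocking indefinitely.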