Fix GitOverHttpServletIT flakiness
The races here are non-obvious. First, in general, HTTP audit event dispatching looks synchronous in GitOverHttpServlet, but is actually async since it happens after writing data over an HTTP connection. There's no way to know whether the thread on the other end has read the data and continued executing before #doFilter returns. More specifically in the test environment, GitOverHttpServletIT performs some Git-over-HTTP operations during setup, namely cloning the repo into an InMemoryRepository. We don't know whether the audit events generated by this operation are processed before or after the @Test method starts executing. As a result, we can't use the cleaner alternative of having a test set up its own AuditListener binding, because the DynamicSet addition may or may not be visible when GitOverHttpServlet calls the audit listener. Fundamentally, we can't avoid waiting an arbitrarily long amount of time for the audit events to appear, because the JVM provides no time-bounded guarantee about when the HTTP thread will continue executing after the test thread has finished reading the HTTP response. So ultimately, this change boils down to increasing the timeout we wait for audit events to appear. (Also note that in the pessimal JVM thread scheduling case, we are also subject to timeouts in HTTP communication, so even if we somehow got rid of timeouts for audit event processing it's not like we would be free from timeouts.) That said, this change also improves the code to be more reliable and easy to reason about in several ways. First, use BlockingQueue over wait/notify, which is a better abstraction for pumping messages between threads. Second, we now keep track in GitOverHttpServlet of how many requests were started, and this AtomicLong increment operation happens before any HTTP response data is written. This means that we know exactly how many audit events to expect from the BlockingQueue, and callers don't have to either specify the expected count in advance. This reduces the risk of a caller miscounting and waiting for an event that will never arrive. It also means that test methods can easily drain an arbitrary number of events from the setup code with a single call at the top of the method. The remainder of events are generated from entirely within the method. Now that we more reliably dequeue all events, it's clear that we actually do expect two HTTP requests in both the upload-pack and receive-pack cases. The previous code expecting 1 event in the upload-pack case was wrong and only passed due to flakiness. It's possible that the timeouts may still need to be adjusted to work in other environments (slow laptops, CI). In the worst case, we can use BlockingQueue#take(), which has no timeout at all. As a side effect, move FakeGroupAuditService to the acceptance package, since it needs a new dependency that we don't really need to add to the testing package. Like other classes in the acceptance package, it's used during acceptance test setup, not any smaller tests. Considered the following alternatives to introducing a new counter in GitOverHttpServlet: * Changing the GroupAuditService interface to explicitly record the start of requests. Getting the interface right is nontrivial and would require changes to multiple implementations, with no benefit outside of tests. * Hooking into existing HTTP request metrics, which doesn't currently work due to the bug identified by Ic7b38a01. Verified that this reduces flakiness to <0.5% on my workstation with: $ bazel test --runs_per_test=200 --test_filter='GitOverHttpServletIT#.*AuditEventLog' //javatests/com/google/gerrit/acceptance/git:git Change-Id: Idd57cc7f5e0b6504403b10bd3676e78d9f1cd792
This commit is contained in:
		| @@ -1,62 +0,0 @@ | ||||
| // Copyright (C) 2018 The Android Open Source Project | ||||
| // | ||||
| // Licensed under the Apache License, Version 2.0 (the "License"); | ||||
| // you may not use this file except in compliance with the License. | ||||
| // You may obtain a copy of the License at | ||||
| // | ||||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||||
| // | ||||
| // Unless required by applicable law or agreed to in writing, software | ||||
| // distributed under the License is distributed on an "AS IS" BASIS, | ||||
| // WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||||
| // See the License for the specific language governing permissions and | ||||
| // limitations under the License. | ||||
|  | ||||
| package com.google.gerrit.testing; | ||||
|  | ||||
| import com.google.gerrit.extensions.registration.DynamicSet; | ||||
| import com.google.gerrit.server.AuditEvent; | ||||
| import com.google.gerrit.server.audit.AuditListener; | ||||
| import com.google.gerrit.server.audit.AuditService; | ||||
| import com.google.gerrit.server.audit.group.GroupAuditListener; | ||||
| import com.google.gerrit.server.group.GroupAuditService; | ||||
| import com.google.gerrit.server.plugincontext.PluginSetContext; | ||||
| import com.google.inject.AbstractModule; | ||||
| import com.google.inject.Inject; | ||||
| import com.google.inject.Singleton; | ||||
| import java.util.ArrayList; | ||||
| import java.util.List; | ||||
|  | ||||
| @Singleton | ||||
| public class FakeGroupAuditService extends AuditService { | ||||
|   public final List<AuditEvent> auditEvents = new ArrayList<>(); | ||||
|  | ||||
|   public static class Module extends AbstractModule { | ||||
|     @Override | ||||
|     public void configure() { | ||||
|       DynamicSet.setOf(binder(), GroupAuditListener.class); | ||||
|       DynamicSet.setOf(binder(), AuditListener.class); | ||||
|       bind(GroupAuditService.class).to(FakeGroupAuditService.class); | ||||
|     } | ||||
|   } | ||||
|  | ||||
|   @Inject | ||||
|   FakeGroupAuditService( | ||||
|       PluginSetContext<AuditListener> auditListeners, | ||||
|       PluginSetContext<GroupAuditListener> groupAuditListeners) { | ||||
|     super(auditListeners, groupAuditListeners); | ||||
|   } | ||||
|  | ||||
|   public void clearEvents() { | ||||
|     auditEvents.clear(); | ||||
|   } | ||||
|  | ||||
|   @Override | ||||
|   public void dispatch(AuditEvent action) { | ||||
|     super.dispatch(action); | ||||
|     synchronized (auditEvents) { | ||||
|       auditEvents.add(action); | ||||
|       auditEvents.notifyAll(); | ||||
|     } | ||||
|   } | ||||
| } | ||||
		Reference in New Issue
	
	Block a user
	 Dave Borowitz
					Dave Borowitz