2019-12-13 10:42:46 +01:00
|
|
|
:linkattrs:
|
2018-08-10 08:49:32 +02:00
|
|
|
= Request Tracing
|
|
|
|
|
2019-08-28 15:45:21 +02:00
|
|
|
[[on-demand]]
|
|
|
|
== On-demand Request Tracing
|
|
|
|
|
2018-08-10 08:49:32 +02:00
|
|
|
Gerrit supports on-demand tracing of single requests that results in
|
|
|
|
additional logs with debug information that are written to the
|
|
|
|
`error_log`. The logs that correspond to a traced request are
|
|
|
|
associated with a unique trace ID. This trace ID is returned with the
|
|
|
|
response and can be used by an administrator to find the matching log
|
|
|
|
entries.
|
|
|
|
|
|
|
|
How tracing is enabled and how the trace ID is returned depends on the
|
|
|
|
request type:
|
|
|
|
|
|
|
|
* REST API: For REST calls tracing can be enabled by setting the
|
2018-09-10 10:06:53 +02:00
|
|
|
`trace` request parameter or the `X-Gerrit-Trace` header, the trace
|
|
|
|
ID is returned as `X-Gerrit-Trace` header. More information about
|
|
|
|
this can be found in the link:rest-api.html#tracing[Request Tracing]
|
|
|
|
section of the link:rest-api.html[REST API documentation].
|
2018-08-10 08:49:32 +02:00
|
|
|
* SSH API: For SSH calls tracing can be enabled by setting the
|
|
|
|
`--trace` option. More information about this can be found in
|
|
|
|
the link:cmd-index.html#trace[Trace] section of the
|
|
|
|
link:cmd-index.html[SSH command documentation].
|
2018-12-13 09:03:37 +01:00
|
|
|
* Git Push (requires usage of git protocol v2): For Git pushes tracing
|
|
|
|
can be enabled by setting the `trace` push option, the trace ID is
|
|
|
|
returned in the command output. More information about this can be
|
|
|
|
found in the link:user-upload.html#trace[Trace] section of the
|
|
|
|
link:user-upload.html[upload documentation].
|
|
|
|
* Git Clone/Fetch/Ls-Remote (requires usage of git protocol v2): For
|
|
|
|
Git clone/fetch/ls-remote tracing can be enabled by setting the
|
|
|
|
`trace` server option. Use '-o trace=<TRACE-ID>' for `git fetch` and
|
|
|
|
`git ls-remote`, and '--server-option trace=<TRACE-ID>' for
|
|
|
|
`git clone`. If the `trace` server option is set without a value
|
|
|
|
(without trace ID) a trace ID is generated but the generated trace ID
|
|
|
|
is not returned to the client (hence a trace ID should always be
|
|
|
|
set).
|
2018-08-10 08:49:32 +02:00
|
|
|
|
2018-08-29 22:52:54 +02:00
|
|
|
When request tracing is enabled it is possible to provide an ID that
|
|
|
|
should be used as trace ID. If a trace ID is not provided a trace ID is
|
|
|
|
automatically generated. The trace ID must be provided to the support
|
2019-08-28 15:39:56 +02:00
|
|
|
team (administrator of the server) so that they can find the trace.
|
2018-08-29 22:52:54 +02:00
|
|
|
|
|
|
|
When doing traces it is recommended to specify the ID of the issue
|
|
|
|
that is being investigated as trace ID so that the traces of the issue
|
|
|
|
can be found more easily. When the issue ID is used as trace ID there
|
|
|
|
is no need to find the generated trace ID and report it in the issue.
|
|
|
|
|
2018-08-29 08:45:51 +02:00
|
|
|
Since tracing consumes additional server resources tracing should only
|
|
|
|
be enabled for single requests if there is a concrete need for
|
|
|
|
debugging. In particular bots should never enable tracing for all their
|
|
|
|
requests by default.
|
|
|
|
|
2019-08-28 17:07:17 +02:00
|
|
|
[[auto-retry]]
|
|
|
|
== Automatic Request Tracing
|
|
|
|
|
|
|
|
Gerrit can be link:config-gerrit.html#retry.retryWithTraceOnFailure[
|
|
|
|
configured] to automatically retry requests on non-recoverable failures
|
|
|
|
with tracing enabled. This allows to automatically captures traces of
|
|
|
|
these failures for further analysis by the Gerrit administrators.
|
|
|
|
|
|
|
|
The auto-retry on failure behaves the same way as if the calling user
|
|
|
|
would retry the failed operation with tracing enabled.
|
|
|
|
|
|
|
|
It is expected that the auto-retry fails with the same exception that
|
|
|
|
triggered the auto-retry, however this is not guaranteed:
|
|
|
|
|
|
|
|
* Not all Gerrit operations are fully atomic and it can happen that
|
|
|
|
some parts of the operation have been successfully performed before
|
|
|
|
the failure happened. In this case the auto-retry may fail with a
|
|
|
|
different exception.
|
|
|
|
* Some exceptions may mistakenly be considered as non-recoverable and
|
|
|
|
the auto-retry actually succeeds.
|
|
|
|
|
|
|
|
[[auto-retry-succeeded]]
|
|
|
|
If an auto-retry succeeds you may consider filing this as
|
|
|
|
link:https://bugs.chromium.org/p/gerrit/issues/entry?template=GoogleSource+Issue[
|
2019-12-10 18:01:31 +01:00
|
|
|
Gerrit issue,role=external,window=_blank] so that the Gerrit developers can fix this and treat this
|
2019-08-28 17:07:17 +02:00
|
|
|
exception as recoverable.
|
|
|
|
|
|
|
|
The trace IDs for auto-retries are generated and start with
|
2019-09-24 08:52:35 +02:00
|
|
|
`retry-on-failure-`. For REST requests they are returned to the client
|
|
|
|
as `X-Gerrit-Trace` header.
|
2019-08-28 17:07:17 +02:00
|
|
|
|
|
|
|
The best way to search for auto-retries in logs is to do a grep by
|
|
|
|
`AutoRetry`. For each auto-retry that happened this should match 1 or 2
|
|
|
|
log entries:
|
|
|
|
|
2019-12-18 08:41:21 +01:00
|
|
|
* one `FINE` log entry with the exception that triggered the auto-retry
|
2019-08-28 17:07:17 +02:00
|
|
|
* one `FINE` log entry with the exception that happened on auto-retry
|
|
|
|
(if this log entry is not present the operation succeeded on
|
|
|
|
auto-retry)
|
|
|
|
|
|
|
|
To inspect single auto-retry occurrences in detail you can do a
|
|
|
|
link:#find-trace[grep by the trace ID]. The trace ID is part of the log
|
|
|
|
entries which have been found by the previous grep (watch out for
|
|
|
|
something like: `retry-on-failure-1534166888910-3985dfba`).
|
|
|
|
|
|
|
|
[TIP]
|
|
|
|
Auto-retrying on failures is only supported by some of the REST
|
|
|
|
endpoints (change REST endpoints that perform updates).
|
|
|
|
|
|
|
|
[[auto-retry-metrics]]
|
|
|
|
=== Metrics
|
|
|
|
|
|
|
|
If auto-retry is link:config-gerrit.html#retry.retryWithTraceOnFailure[
|
|
|
|
enabled] the following metrics are reported:
|
|
|
|
|
|
|
|
* `action/auto_retry_count`: Number of automatic retries with tracing
|
|
|
|
* `action/failures_on_auto_retry_count`: Number of failures on auto retry
|
|
|
|
|
|
|
|
By comparing the values of these counters one can see how often the
|
|
|
|
auto-retry succeeds. As explained link:#auto-retry-succeeded[above] if
|
|
|
|
auto-retries succeed that's an issue with Gerrit that you may want to
|
|
|
|
report.
|
|
|
|
|
|
|
|
[[find-trace]]
|
2018-08-10 08:49:32 +02:00
|
|
|
== Find log entries for a trace ID
|
|
|
|
|
|
|
|
If tracing is enabled all log messages that correspond to the traced
|
|
|
|
request have a `TRACE_ID` tag set, e.g.:
|
|
|
|
|
|
|
|
----
|
|
|
|
[2018-08-13 15:28:08,913] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : Received REST request: GET /a/accounts/self (parameters: [trace]) [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ]
|
|
|
|
[2018-08-13 15:28:08,914] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : Calling user: admin [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ]
|
|
|
|
[2018-08-13 15:28:08,942] [HTTP-76] TRACE com.google.gerrit.httpd.restapi.RestApiServlet : REST call succeeded: 200 [CONTEXT forced=true TRACE_ID="1534166888910-3985dfba" ]
|
|
|
|
----
|
|
|
|
|
|
|
|
By doing a grep with the trace ID over the error log the log entries
|
|
|
|
that correspond to the request can be found.
|
2018-08-29 09:56:16 +02:00
|
|
|
|
2019-08-28 15:39:56 +02:00
|
|
|
[TIP]
|
|
|
|
Usually only server administrators have access to the logs.
|
|
|
|
|
2018-08-29 09:56:16 +02:00
|
|
|
== Which information is captured in a trace?
|
|
|
|
|
|
|
|
* request details
|
|
|
|
** REST API: request URL, request parameter names, calling user,
|
|
|
|
response code, response body on errors
|
|
|
|
** SSH API: parameter names
|
|
|
|
** Git API: push options, magic branch parameter names
|
|
|
|
* cache misses, cache evictions
|
|
|
|
* reads from NoteDb, writes to NoteDb
|
|
|
|
* reads of meta data files, writes of meta data files
|
|
|
|
* index queries (with parameters and matches)
|
|
|
|
* reindex events
|
|
|
|
* permission checks (e.g. which rule is responsible for a deny)
|
2018-08-29 10:31:33 +02:00
|
|
|
* timer metrics
|
2018-08-29 09:56:16 +02:00
|
|
|
* all other logs
|