= Gerrit Code Review - Metrics

Metrics about Gerrit's internal state can be sent to external monitoring systems
via plugins. See the link:dev-plugins.html#metrics[plugin documentation] for
details of plugin implementations.

== Metrics

The following metrics are reported.

=== General

* `build/label`: Version of Gerrit server software.
* `events`: Triggered events.

=== Actions

* `action/retry_attempt_count`: Number of retry attempts made
by RetryHelper to execute an action (0 == single attempt, no retry)
* `action/retry_timeout_count`: Number of action executions of RetryHelper
that ultimately timed out
* `action/auto_retry_count`: Number of automatic retries with tracing
* `action/failures_on_auto_retry_count`: Number of failures on auto retry

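Automatic retries are intended only for failures that are considered
non-recoverable, so each automatic retry is expected to fail again and
`action/failures_on_auto_retry_count` should equal `action/auto_retry_count`.
A difference suggests that some exceptions are treated as non-recoverable
although they are actually recoverable. The following is a minimal sketch of
that comparison, assuming both counter values have already been read from a
monitoring system. The function name and the sample values are hypothetical.

```python
def unexpected_auto_retry_successes(auto_retry_count, failures_on_auto_retry_count):
    """Return how many automatic retries did not fail again.

    Both arguments are hypothetical snapshots of the counters
    action/auto_retry_count and action/failures_on_auto_retry_count.
    """
    if auto_retry_count < 0 or failures_on_auto_retry_count < 0:
        raise ValueError("counter values must be non-negative")
    # Zero means every auto retry failed again, as expected.
    return auto_retry_count - failures_on_auto_retry_count


# Sample values, made up for illustration.
print(unexpected_auto_retry_successes(12, 12))
print(unexpected_auto_retry_successes(12, 9))
```
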
=== Pushes

* `receivecommits/changes`: histogram of number of changes processed
in a single upload, split up by update type (change created/updated,
change autoclosed).
* `receivecommits/latency`: latency per change for processing a push,
split up by update type (create+replace, and autoclose)
* `receivecommits/push_latency`: total latency for processing a push,
split up by update type (create+replace, autoclose, normal)
* `receivecommits/timeout`: number of timeouts during push processing.

=== Process

* `proc/birth_timestamp`: Time at which the Gerrit process started.
* `proc/uptime`: Uptime of the Gerrit process.
* `proc/cpu/usage`: CPU time used by the Gerrit process.
* `proc/num_open_fds`: Number of open file descriptors.
* `proc/jvm/memory/heap_committed`: Amount of memory guaranteed for user objects.
* `proc/jvm/memory/heap_used`: Amount of memory holding user objects.
* `proc/jvm/memory/non_heap_committed`: Amount of memory guaranteed for classes,
etc.
* `proc/jvm/memory/non_heap_used`: Amount of memory holding classes, etc.
* `proc/jvm/memory/object_pending_finalization_count`: Approximate number of
objects needing finalization.
* `proc/jvm/gc/count`: Number of GCs.
* `proc/jvm/gc/time`: Approximate accumulated GC elapsed time.
* `proc/jvm/thread/num_live`: Current live thread count.

=== Caches

* `caches/memory_cached`: Memory entries.
* `caches/memory_hit_ratio`: Memory hit ratio.
* `caches/memory_eviction_count`: Memory eviction count.
* `caches/disk_cached`: Disk entries used by persistent cache.
* `caches/disk_hit_ratio`: Disk hit ratio for persistent cache.

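As a rough illustration of what a hit ratio expresses, the sketch below
computes the share of cache lookups that were answered from the cache. It
assumes the ratio is reported as a percentage and that a cache with no lookups
reports zero. This document does not state how Gerrit computes the value, so
the function and its inputs are hypothetical.

```python
def hit_ratio_percent(hits, misses):
    """Return the percentage of lookups served from the cache.

    hits and misses are hypothetical lookup counts, not values
    exposed by the metrics listed above.
    """
    total = hits + misses
    if total == 0:
        # No lookups yet: report 0 rather than dividing by zero (assumption).
        return 0.0
    return 100.0 * hits / total


print(hit_ratio_percent(75, 25))
```
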
=== Change

* `change/submit_rule_evaluation`: Latency for evaluating submit rules on a change.
* `change/submit_type_evaluation`: Latency for evaluating the submit type on a change.

=== HTTP

* `http/server/error_count`: Rate of REST API error responses.
* `http/server/success_count`: Rate of REST API success responses.
* `http/server/rest_api/count`: Rate of REST API calls by view.
* `http/server/rest_api/change_id_type`: Rate of REST API calls by change ID type.
* `http/server/rest_api/error_count`: Rate of REST API errors by view.
* `http/server/rest_api/server_latency`: REST API call latency by view.
* `http/server/rest_api/response_bytes`: Size of REST API response on network
(may be gzip compressed) by view.
* `http/server/rest_api/change_json/to_change_info_latency`: Latency for
toChangeInfo invocations in ChangeJson.
* `http/server/rest_api/change_json/to_change_infos_latency`: Latency for
toChangeInfos invocations in ChangeJson.
* `http/server/rest_api/change_json/format_query_results_latency`: Latency for
formatQueryResults invocations in ChangeJson.
* `http/server/rest_api/ui_actions/latency`: Latency for RestView#getDescription calls.

=== Query

* `query/query_latency`: Successful query latency, accumulated over the life
of the process.

=== Core Queues

The following queues support metrics:

* default `WorkQueue`
* index batch
* index interactive
* receive commits
* send email
* ssh batch worker
* ssh command start
* ssh interactive worker
* ssh stream worker

Each queue provides the following metrics:

* `queue/<queue_name>/pool_size`: Current number of threads in the pool
* `queue/<queue_name>/max_pool_size`: Maximum allowed number of threads in the pool
* `queue/<queue_name>/active_threads`: Number of threads that are actively executing tasks
* `queue/<queue_name>/scheduled_tasks`: Number of scheduled tasks in the queue
* `queue/<queue_name>/total_scheduled_tasks_count`: Total number of tasks that have been scheduled
* `queue/<queue_name>/total_completed_tasks_count`: Total number of tasks that have completed execution

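The per-queue metric names follow one pattern, so they can be generated for
any queue. The sketch below expands the `<queue_name>` placeholder into the six
metric names listed above. How Gerrit derives the `<queue_name>` token from a
queue's display name is not specified here, so `example_queue` is a
hypothetical value.

```python
# Suffixes taken from the per-queue metric list above.
QUEUE_METRIC_SUFFIXES = (
    "pool_size",
    "max_pool_size",
    "active_threads",
    "scheduled_tasks",
    "total_scheduled_tasks_count",
    "total_completed_tasks_count",
)


def queue_metric_names(queue_name):
    """Return the metric names reported for one queue."""
    return ["queue/{}/{}".format(queue_name, suffix)
            for suffix in QUEUE_METRIC_SUFFIXES]


# "example_queue" is a made-up queue name token.
for name in queue_metric_names("example_queue"):
    print(name)
```
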
=== SSH sessions

* `sshd/sessions/connected`: Number of currently connected SSH sessions.
* `sshd/sessions/created`: Rate of new SSH sessions.
* `sshd/sessions/authentication_failures`: Rate of SSH authentication failures.

=== Topics

* `topic/cross_project_submit`: number of cross-project topic submissions.
* `topic/cross_project_submit_completed`: number of cross-project
topic submissions that concluded successfully.

=== JGit

* `jgit/block_cache/cache_used`: Bytes of memory retained in JGit block cache.
* `jgit/block_cache/open_files`: File handles held open by JGit block cache.

=== Git

* `git/upload-pack/request_count`: Total number of git-upload-pack requests.
* `git/upload-pack/phase_counting`: Time spent in the 'Counting...' phase.
* `git/upload-pack/phase_compressing`: Time spent in the 'Compressing...' phase.
* `git/upload-pack/phase_writing`: Time spent transferring bytes to client.
* `git/upload-pack/pack_bytes`: Distribution of sizes of packs sent to clients.

=== BatchUpdate

* `batch_update/execute_change_ops`: BatchUpdate change update latency,
excluding reindexing

=== NoteDb

* `notedb/update_latency`: NoteDb update latency by table.
* `notedb/stage_update_latency`: Latency for staging updates to NoteDb by table.
* `notedb/read_latency`: NoteDb read latency by table.
* `notedb/parse_latency`: NoteDb parse latency by table.
* `notedb/external_id_cache_load_count`: Total number of times the external ID
cache loader was called.
* `notedb/external_id_partial_read_latency`: Latency for generating a new external ID
cache state from a prior state.
* `notedb/external_id_update_count`: Total number of external ID updates.
* `notedb/read_all_external_ids_latency`: Latency for reading all
external IDs from NoteDb.

=== Permissions

* `permissions/project_state/computation_latency`: Latency to compute current access
sections on a project by traversing its parents.
* `permissions/permission_collection/filter_latency`: Latency to filter access sections
by user and ref.
* `permissions/ref_filter/full_filter_count`: Rate of full ref filter operations
* `permissions/ref_filter/skip_filter_count`: Rate of ref filter operations where
we skip full evaluation because the user can read all refs

=== Reviewer Suggestion

* `reviewer_suggestion/query_accounts`: Latency for querying accounts for
reviewer suggestion.
* `reviewer_suggestion/recommend_accounts`: Latency for recommending accounts
for reviewer suggestion.
* `reviewer_suggestion/load_accounts`: Latency for loading accounts for
reviewer suggestion.
* `reviewer_suggestion/query_groups`: Latency for querying groups for reviewer
suggestion.

=== Repo Sequences

* `sequence/next_id_latency`: Latency of requesting IDs from repo sequences.

=== Plugin

* `plugin/latency`: Latency for plugin invocation.
* `plugin/error_count`: Number of plugin errors.

=== Group

* `group/guess_relevant_groups_latency`: Latency for guessing relevant groups.

=== Replication Plugin

* `plugins/replication/replication_latency`: Time spent pushing to remote
destination.
* `plugins/replication/replication_delay`: Time spent waiting before pushing to
remote destination.
* `plugins/replication/replication_retries`: Number of retries when pushing to
remote destination.

=== License

* `license/cla_check_count`: Total number of CLA check requests.

GERRIT
------
Part of link:index.html[Gerrit Code Review]

SEARCHBOX
---------