gerrit/Documentation/metrics.txt
Edwin Kempin 8e07cf65d8 Add metric to measure latency for guessing relevant groups
Guessing groups is invoked from some group backends to find the groups
of a user (e.g. from LdapGroupBackend). The groups of the calling user
are needed for the change etag computation. Hence if guessing groups is
slow this has direct impact on the speed of the change etag computation
which makes it interesting to know how much latency we spend on guessing
relevant groups.

Change-Id: Ibae879a702af007f1a326d01d15553487926bc4a
Signed-off-by: Edwin Kempin <ekempin@google.com>
2018-10-25 16:25:25 +02:00

= Gerrit Code Review - Metrics
Metrics about Gerrit's internal state can be sent to external monitoring systems
via plugins. See the link:dev-plugins.html#metrics[plugin documentation] for
details of plugin implementations.
== Metrics
The following metrics are reported.
=== General
* `build/label`: Version of Gerrit server software.
* `events`: Triggered events.
=== Actions
* `action/retry_attempt_counts`: Distribution of the number of attempts made
by RetryHelper to execute an action (1 == single attempt, no retry).
* `action/retry_timeout_count`: Number of action executions by RetryHelper
that ultimately timed out.
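
To illustrate what an attempt-count distribution looks like, here is a minimal Python sketch. It is not Gerrit's RetryHelper; `run_with_retries`, `make_flaky` and the `max_attempts` parameter are hypothetical names introduced only for this example. A value of 1 means the action succeeded on the first attempt, with no retry.

```python
from collections import Counter


def run_with_retries(action, max_attempts=3):
    """Call `action` until it succeeds; return (result, attempts_made).

    Hypothetical helper for illustration only, not Gerrit's RetryHelper.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return action(), attempt
        except RuntimeError:
            # Give up and propagate the error once the attempts are exhausted.
            if attempt == max_attempts:
                raise


def make_flaky(failures):
    """Build a hypothetical action that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def action():
        if state["left"] > 0:
            state["left"] -= 1
            raise RuntimeError("transient failure")
        return "ok"

    return action


# Record how many attempts each action needed (1 == single attempt, no retry).
attempt_counts = Counter()
for failures in (0, 0, 1, 2):
    _, attempts = run_with_retries(make_flaky(failures))
    attempt_counts[attempts] += 1
```

In this sketch two actions succeed immediately, one needs a single retry and one needs two retries, so the distribution has entries for 1, 2 and 3 attempts.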
=== Pushes
* `receivecommits/changes`: Histogram of the number of changes processed
in a single upload, split up by update type (new change created,
existing change updated, change autoclosed).
* `receivecommits/latency`: Latency per change for processing a push,
split up by update type (create+replace, and autoclose).
* `receivecommits/timeout`: Number of timeouts during push processing.
=== Process
* `proc/birth_timestamp`: Time at which the Gerrit process started.
* `proc/uptime`: Uptime of the Gerrit process.
* `proc/cpu/usage`: CPU time used by the Gerrit process.
* `proc/num_open_fds`: Number of open file descriptors.
* `proc/jvm/memory/heap_committed`: Amount of memory guaranteed for user objects.
* `proc/jvm/memory/heap_used`: Amount of memory holding user objects.
* `proc/jvm/memory/non_heap_committed`: Amount of memory guaranteed for classes,
etc.
* `proc/jvm/memory/non_heap_used`: Amount of memory holding classes, etc.
* `proc/jvm/memory/object_pending_finalization_count`: Approximate number of
objects needing finalization.
* `proc/jvm/gc/count`: Number of GCs.
* `proc/jvm/gc/time`: Approximate accumulated GC elapsed time.
* `proc/jvm/thread/num_live`: Current live thread count.
=== Caches
* `caches/memory_cached`: Memory entries.
* `caches/memory_hit_ratio`: Memory hit ratio.
* `caches/memory_eviction_count`: Memory eviction count.
* `caches/disk_cached`: Disk entries used by persistent cache.
* `caches/disk_hit_ratio`: Disk hit ratio for persistent cache.
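
As a rough guide to reading the hit ratio metrics, a hit ratio is conventionally the share of lookups served from the cache. The Python sketch below shows that generic calculation; the function name `hit_ratio` is hypothetical, and whether Gerrit reports the value as a fraction or as a percentage is not stated in this document, so treat the scale as an assumption.

```python
def hit_ratio(hits, misses):
    """Return the fraction of lookups that were cache hits (0.0 to 1.0).

    Generic definition for illustration; the scale Gerrit uses is an assumption.
    """
    total = hits + misses
    # An unused cache has no meaningful ratio; report 0.0 rather than dividing by zero.
    return 0.0 if total == 0 else hits / total


# 90 hits and 10 misses: nine out of ten lookups were served from the cache.
example_ratio = hit_ratio(90, 10)
```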
=== HTTP
* `http/server/error_count`: Rate of REST API error responses.
* `http/server/success_count`: Rate of REST API success responses.
* `http/server/rest_api/count`: Rate of REST API calls by view.
* `http/server/rest_api/change_id_type`: Rate of REST API calls by change ID type.
* `http/server/rest_api/error_count`: Rate of REST API error responses by view.
* `http/server/rest_api/server_latency`: REST API call latency by view.
* `http/server/rest_api/response_bytes`: Size of REST API response on network
(may be gzip compressed) by view.
* `http/server/rest_api/change_json/to_change_info_latency`: Latency for
toChangeInfo invocations in ChangeJson.
* `http/server/rest_api/change_json/to_change_infos_latency`: Latency for
toChangeInfos invocations in ChangeJson.
* `http/server/rest_api/change_json/format_query_results_latency`: Latency for
formatQueryResults invocations in ChangeJson.
* `http/server/rest_api/ui_actions/latency`: Latency for RestView#getDescription calls.
=== Query
* `query/query_latency`: Successful query latency, accumulated over the life
of the process.
=== Core Queues
The following queues support metrics:
* default `WorkQueue`
* index batch
* index interactive
* receive commits
* send email
* ssh batch worker
* ssh command start
* ssh interactive worker
* ssh stream worker
Each queue provides the following metrics:
* `queue/<queue_name>/pool_size`: Current number of threads in the pool.
* `queue/<queue_name>/max_pool_size`: Maximum allowed number of threads in the pool.
* `queue/<queue_name>/active_threads`: Number of threads that are actively executing tasks.
* `queue/<queue_name>/scheduled_tasks`: Number of scheduled tasks in the queue.
* `queue/<queue_name>/total_scheduled_tasks_count`: Total number of tasks that have been scheduled.
* `queue/<queue_name>/total_completed_tasks_count`: Total number of tasks that have completed execution.
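
The `queue/<queue_name>/...` pattern can be expanded per queue when configuring a dashboard or alert. The Python sketch below does this for a placeholder name. `example_queue` is hypothetical: this document does not specify how the queue names listed above are spelled inside metric names, so check the names your monitoring system actually receives.

```python
# Metric name suffixes taken from the list of per-queue metrics.
QUEUE_METRIC_SUFFIXES = [
    "pool_size",
    "max_pool_size",
    "active_threads",
    "scheduled_tasks",
    "total_scheduled_tasks_count",
    "total_completed_tasks_count",
]


def queue_metric_names(queue_name):
    """Expand the queue/<queue_name>/<metric> pattern for one queue."""
    return [f"queue/{queue_name}/{suffix}" for suffix in QUEUE_METRIC_SUFFIXES]


# "example_queue" is a placeholder, not a real Gerrit queue name.
names = queue_metric_names("example_queue")
```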
=== SSH sessions
* `sshd/sessions/connected`: Number of currently connected SSH sessions.
* `sshd/sessions/created`: Rate of new SSH sessions.
* `sshd/sessions/authentication_failures`: Rate of SSH authentication failures.
=== SQL connections
* `sql/connection_pool/connections`: SQL database connections.
=== Topics
* `topic/cross_project_submit`: Number of cross-project topic submissions.
* `topic/cross_project_submit_completed`: Number of cross-project
topic submissions that concluded successfully.
=== JGit
* `jgit/block_cache/cache_used`: Bytes of memory retained in JGit block cache.
* `jgit/block_cache/open_files`: File handles held open by JGit block cache.
=== Git
* `git/upload-pack/request_count`: Total number of git-upload-pack requests.
* `git/upload-pack/phase_counting`: Time spent in the 'Counting...' phase.
* `git/upload-pack/phase_compressing`: Time spent in the 'Compressing...' phase.
* `git/upload-pack/phase_writing`: Time spent transferring bytes to client.
* `git/upload-pack/pack_bytes`: Distribution of sizes of packs sent to clients.
=== BatchUpdate
* `batch_update/execute_change_ops`: BatchUpdate change update latency,
excluding reindexing.
=== NoteDb
* `notedb/update_latency`: NoteDb update latency by table.
* `notedb/stage_update_latency`: Latency for staging updates to NoteDb by table.
* `notedb/read_latency`: NoteDb read latency by table.
* `notedb/parse_latency`: NoteDb parse latency by table.
* `notedb/auto_rebuild_latency`: NoteDb auto-rebuilding latency by table.
* `notedb/auto_rebuild_failure_count`: NoteDb auto-rebuilding attempts that
failed by table.
* `notedb/external_id_update_count`: Total number of external ID updates.
* `notedb/read_all_external_ids_latency`: Latency for reading all
external IDs from NoteDb.
=== Permissions
* `permissions/project_state/computation_latency`: Latency to compute current access
sections on a project by traversing its parents.
* `permissions/permission_collection/filter_latency`: Latency to filter access sections
by user and ref.
* `permissions/ref_filter/full_filter_count`: Rate of full ref filter operations.
* `permissions/ref_filter/skip_filter_count`: Rate of ref filter operations where
full evaluation is skipped because the user can read all refs.
=== Reviewer Suggestion
* `reviewer_suggestion/query_accounts`: Latency for querying accounts for
reviewer suggestion.
* `reviewer_suggestion/recommend_accounts`: Latency for recommending accounts
for reviewer suggestion.
* `reviewer_suggestion/load_accounts`: Latency for loading accounts for
reviewer suggestion.
* `reviewer_suggestion/query_groups`: Latency for querying groups for reviewer
suggestion.
=== Repo Sequences
* `sequence/next_id_latency`: Latency of requesting IDs from repo sequences.
=== Plugin
* `plugin/latency`: Latency for plugin invocation.
* `plugin/error_count`: Number of plugin errors.
=== Group
* `group/guess_relevant_groups_latency`: Latency for guessing relevant groups.
=== Replication Plugin
* `plugins/replication/replication_latency`: Time spent pushing to remote
destination.
* `plugins/replication/replication_delay`: Time spent waiting before pushing to
remote destination.
* `plugins/replication/replication_retries`: Number of retries when pushing to
remote destination.
=== License
* `license/cla_check_count`: Total number of CLA check requests.
GERRIT
------
Part of link:index.html[Gerrit Code Review]
SEARCHBOX
---------