gerrit/Documentation/metrics.txt
Patrick Hiesel f71bafe885 Instrument two possibly long-running access computations
We are investigating what it would take to remove PerThreadCache and
refactor the rest of the default permission backend.

One potential cause of slowness and the success of PerThreadCache are
two access computations, one in ProjectCache, one on
PermissionCollection.

This commit instruments these to validate our assumption. The metrics
will be removed once we have data.

Change-Id: I5f39ae7113d657bbb69b693367aaa0930cf21900
2018-09-25 07:48:48 +00:00

182 lines
6.5 KiB
Plaintext

= Gerrit Code Review - Metrics
Metrics about Gerrit's internal state can be sent to external monitoring systems
via plugins. See the link:dev-plugins.html#metrics[plugin documentation] for
details of plugin implementations.
== Metrics
The following metrics are reported.
=== General
* `build/label`: Version of Gerrit server software.
* `events`: Triggered events.
=== Actions
* `action/retry_attempt_counts`: Distribution of number of attempts made
by RetryHelper to execute an action (1 == single attempt, no retry)
* `action/retry_timeout_count`: Number of action executions of RetryHelper
that ultimately timed out
=== Process
* `proc/birth_timestamp`: Time at which the Gerrit process started.
* `proc/uptime`: Uptime of the Gerrit process.
* `proc/cpu/usage`: CPU time used by the Gerrit process.
* `proc/num_open_fds`: Number of open file descriptors.
* `proc/jvm/memory/heap_committed`: Amount of memory guaranteed for user objects.
* `proc/jvm/memory/heap_used`: Amount of memory holding user objects.
* `proc/jvm/memory/non_heap_committed`: Amount of memory guaranteed for classes,
etc.
* `proc/jvm/memory/non_heap_used`: Amount of memory holding classes, etc.
* `proc/jvm/memory/object_pending_finalization_count`: Approximate number of
objects needing finalization.
* `proc/jvm/gc/count`: Number of GCs.
* `proc/jvm/gc/time`: Approximate accumulated GC elapsed time.
* `proc/jvm/thread/num_live`: Current live thread count.
=== Caches
* `caches/memory_cached`: Memory entries.
* `caches/memory_hit_ratio`: Memory hit ratio.
* `caches/memory_eviction_count`: Memory eviction count.
* `caches/disk_cached`: Disk entries used by persistent cache.
* `caches/disk_hit_ratio`: Disk hit ratio for persistent cache.
=== HTTP
* `http/server/error_count`: Rate of REST API error responses.
* `http/server/success_count`: Rate of REST API success responses.
* `http/server/rest_api/count`: Rate of REST API calls by view.
* `http/server/rest_api/change_id_type`: Rate of REST API calls by change ID type.
* `http/server/rest_api/error_count`: Rate of REST API calls by view.
* `http/server/rest_api/server_latency`: REST API call latency by view.
* `http/server/rest_api/response_bytes`: Size of REST API response on network
(may be gzip compressed) by view.
* `http/server/rest_api/change_json/to_change_info_latency`: Latency for
toChangeInfo invocations in ChangeJson.
* `http/server/rest_api/change_json/to_change_infos_latency`: Latency for
toChangeInfos invocations in ChangeJson.
* `http/server/rest_api/change_json/format_query_results_latency`: Latency for
formatQueryResults invocations in ChangeJson.
* `http/server/rest_api/ui_actions/latency`: Latency for RestView#getDescription calls.
=== Query
* `query/query_latency`: Successful query latency, accumulated over the life
of the process.
=== Core Queues
The following queues support metrics:
* default `WorkQueue`
* index batch
* index interactive
* receive commits
* send email
* ssh batch worker
* ssh command start
* ssh interactive worker
* ssh stream worker
Each queue provides the following metrics:
* `queue/<queue_name>/pool_size`: Current number of threads in the pool
* `queue/<queue_name>/max_pool_size`: Maximum allowed number of threads in the pool
* `queue/<queue_name>/active_threads`: Number of threads that are actively executing tasks
* `queue/<queue_name>/scheduled_tasks`: Number of scheduled tasks in the queue
* `queue/<queue_name>/total_scheduled_tasks_count`: Total number of tasks that have been scheduled
* `queue/<queue_name>/total_completed_tasks_count`: Total number of tasks that have completed execution
=== SSH sessions
* `sshd/sessions/connected`: Number of currently connected SSH sessions.
* `sshd/sessions/created`: Rate of new SSH sessions.
* `sshd/sessions/authentication_failures`: Rate of SSH authentication failures.
=== SQL connections
* `sql/connection_pool/connections`: SQL database connections.
=== Topics
* `topic/cross_project_submit`: number of cross-project topic submissions.
* `topic/cross_project_submit_completed`: number of cross-project
topic submissions that concluded successfully.
=== JGit
* `jgit/block_cache/cache_used`: Bytes of memory retained in JGit block cache.
* `jgit/block_cache/open_files`: File handles held open by JGit block cache.
=== Git
* `git/upload-pack/request_count`: Total number of git-upload-pack requests.
* `git/upload-pack/phase_counting`: Time spent in the 'Counting...' phase.
* `git/upload-pack/phase_compressing`: Time spent in the 'Compressing...' phase.
* `git/upload-pack/phase_writing`: Time spent transferring bytes to client.
* `git/upload-pack/pack_bytes`: Distribution of sizes of packs sent to clients.
=== BatchUpdate
* `batch_update/execute_change_ops`: BatchUpdate change update latency,
excluding reindexing
=== NoteDb
* `notedb/update_latency`: NoteDb update latency by table.
* `notedb/stage_update_latency`: Latency for staging updates to NoteDb by table.
* `notedb/read_latency`: NoteDb read latency by table.
* `notedb/parse_latency`: NoteDb parse latency by table.
* `notedb/auto_rebuild_latency`: NoteDb auto-rebuilding latency by table.
* `notedb/auto_rebuild_failure_count`: NoteDb auto-rebuilding attempts that
failed by table.
* `notedb/external_id_update_count`: Total number of external ID updates.
* `notedb/read_all_external_ids_latency`: Latency for reading all
external ID's from NoteDb.
=== Permissions
* `permissions/project_state/computation_latency`: Latency to compute current access
sections on a project by traversing it's parents.
* `permissions/permission_collection/filter_latency`: Latency to filter access sections
by user and ref.
=== Reviewer Suggestion
* `reviewer_suggestion/query_accounts`: Latency for querying accounts for
reviewer suggestion.
* `reviewer_suggestion/recommend_accounts`: Latency for recommending accounts
for reviewer suggestion.
* `reviewer_suggestion/load_accounts`: Latency for loading accounts for
reviewer suggestion.
* `reviewer_suggestion/query_groups`: Latency for querying groups for reviewer
suggestion.
=== Repo Sequences
* `sequence/next_id_latency`: Latency of requesting IDs from repo sequences.
=== Replication Plugin
* `plugins/replication/replication_latency`: Time spent pushing to remote
destination.
* `plugins/replication/replication_delay`: Time spent waiting before pushing to
remote destination.
* `plugins/replication/replication_retries`: Number of retries when pushing to
remote destination.
=== License
* `license/cla_check_count`: Total number of CLA check requests.
GERRIT
------
Part of link:index.html[Gerrit Code Review]
SEARCHBOX
---------