monitoring/monitor-tools/scripts
Jim Gauld 34c2ef7865 Enhance schedtop with blocked_max, disk waiters, and watch commands
The 'schedtop' monitoring tool is used to do engineering
analysis of process scheduling, disk IO, and latency.

This enhances the schedtop monitoring tool with:
- additional fields "bmax" latency and "D" disk-sleep tasks
- command-line options to watch specific tasks and a mechanism to
  trigger sysrq

The following new fields are reported:
- "bmax", in milliseconds, corresponds to the Linux scheduler
  statistic "blocked_max". This represents the involuntary wait due
  to scheduling and IO.
- "D:<n>", the current number of disk-sleep "D" tasks.
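As an illustration of the "D:<n>" field, the following is a minimal Python sketch of counting disk-sleep tasks from /proc/<pid>/stat lines. It is not schedtop's implementation; the helper names and the sample lines are hypothetical. It relies only on the documented layout of the stat file, where the state letter follows the parenthesized 'comm' field.

```python
def task_state(stat_line: str) -> str:
    """Return the one-letter task state from a /proc/<pid>/stat line."""
    # 'comm' is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the LAST ')' instead of splitting the whole line on spaces.
    after_comm = stat_line[stat_line.rindex(")") + 1:]
    return after_comm.split()[0]


def count_disk_sleepers(stat_lines) -> int:
    """Count tasks in uninterruptible disk sleep (state 'D')."""
    return sum(1 for line in stat_lines if task_state(line) == "D")


# Made-up sample lines, truncated after the first few fields.
sample = [
    "101 (jbd2/sda1-8) D 2 0 0",
    "202 (etcd) S 1 202 202",
    "303 (my (odd) name) D 1 303 303",
]
print(f"D:{count_disk_sleepers(sample)}")  # prints "D:2"
```

On a live system the lines would come from reading each /proc/<pid>/task/<tid>/stat file; the sample list stands in for that here.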

The following command-line options are added to watch specific
processes and, optionally, to trigger a sysrq (i.e., force a crash
dump) when the trigger delay threshold, in milliseconds, is reached.
[--watch-cmd=tid1,cmd1,cmd2,...] [--watch-only] [--watch-quiet]
[--trig-delay=time]

The --watch-cmd option takes a list of task IDs or patterns; patterns
are matched against the process name ('comm' field).
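A rough Python sketch of how such a watch list could be applied is shown here. It is an assumption, not schedtop's code: it treats all-digit entries as task IDs and every other entry as a regular expression searched within 'comm'. The function name is_watched is hypothetical.

```python
import re


def is_watched(tid: int, comm: str, watch_list) -> bool:
    """Return True if the task matches any entry in the watch list."""
    for entry in watch_list:
        if entry.isdigit():
            # Assumption: purely numeric entries are task IDs.
            if int(entry) == tid:
                return True
        elif re.search(entry, comm):
            # Assumption: other entries are patterns searched within 'comm'.
            return True
    return False


watch = "1234,jbd2,etcd".split(",")
print(is_watched(1234, "bash", watch))        # True (tid match)
print(is_watched(77, "jbd2/sda1-8", watch))   # True (pattern match)
print(is_watched(88, "containerd", watch))    # False
```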

The --watch-only option watches and displays only the subset of
tasks discovered at tool startup. This dramatically reduces the
tool's CPU overhead.

The --watch-quiet option displays no sample output after tool
startup; the only output occurs when --trig-delay is exceeded.

The --trig-delay=time option triggers a sysrq to force a crash dump
when the "bmax" delay of any watched process exceeds the trigger
delay time, in milliseconds.
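The threshold check can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the function names and the sample values are hypothetical, and "exceeds" is read as a strict comparison. Writing 'c' to /proc/sysrq-trigger is the standard Linux way to force a crash, so trigger_crash is defined but deliberately never called.

```python
def tasks_over_threshold(bmax_ms_by_task: dict, trig_delay_ms: float) -> list:
    """Return the names of watched tasks whose bmax exceeds the threshold."""
    return sorted(
        name for name, bmax in bmax_ms_by_task.items() if bmax > trig_delay_ms
    )


def trigger_crash(path: str = "/proc/sysrq-trigger") -> None:
    """Write 'c' to the sysrq trigger file (requires root).

    On a real system this crashes the kernel immediately.
    """
    with open(path, "w") as f:
        f.write("c")


# Made-up bmax samples in milliseconds.
samples = {"etcd": 12.5, "jbd2/sda1-8": 10250.0, "containerd": 10000.0}
offenders = tasks_over_threshold(samples, 10000)
print(offenders)  # ['jbd2/sda1-8']
# A monitor would call trigger_crash() here if offenders is non-empty.
```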

Example: collect 1 minute of data, monitor all tasks,
         reset scheduler high-water-mark statistics

schedtop \
--period=60 --reset-hwm

Example: collect 1 minute of data, watch specific tasks

schedtop \
--period=60 --reset-hwm \
--watch-cmd=jbd2,kube-apiserver,etcd,forward-journald,containerd \
--watch-only

Example: watch specific tasks and trigger sysrq when any of the
         watched commands exceeds 10000ms delay (10 seconds)

schedtop \
--period=36000 --reset-hwm \
--watch-cmd=jbd2,kube-apiserver,etcd,forward-journald,containerd \
--watch-only \
--trig-delay=10000

Testcases:
PASS: Collect standard tool output, verified new bmax and D fields
PASS: Verify --watch-cmd will detect the specified commands or tids
PASS: Verify --watch-only will only display watched commands
PASS: Verify --trig-delay will generate a sysrq
PASS: Verify comm field is limited to 15 characters wide

Closes-Bug: 1927772

Signed-off-by: Jim Gauld <james.gauld@windriver.com>
Change-Id: I5368aac66b24608f5eab366cd929be4c0d4a1f76
2021-12-01 09:34:03 -05:00
LICENSE Relocate monitor-tools to stx-integ/tools/monitor-tools 2018-08-01 12:28:38 -04:00
memtop Relocate monitor-tools to stx-integ/tools/monitor-tools 2018-08-01 12:28:38 -04:00
occtop Relocate monitor-tools to stx-integ/tools/monitor-tools 2018-08-01 12:28:38 -04:00
schedtop Enhance schedtop with blocked_max, disk waiters, and watch commands 2021-12-01 09:34:03 -05:00