kernel/kernel-rt/debian/patches/series
Jiping Ma 19a00692b9 sched/fair: Block delayed tasks on throttled hierarchy during dequeue
Dequeuing a fair task on a throttled hierarchy returns early on
encountering a throttled cfs_rq since the throttle path has already
dequeued the hierarchy above and has adjusted the h_nr_* accounting till
the root cfs_rq.

wait_task_inactive() is special as it expects the delayed task on
throttled hierarchy to reach the blocked state on dequeue but since
__block_task() is never called, task_on_rq_queued() continues to return
true. Furthermore, since the task is now off the hierarchy, the pick
never reaches it to fully block the task even after unthrottle leading
to wait_task_inactive() looping endlessly.

Remedy this by calling __block_task() if a delayed task is being
dequeued on a throttled hierarchy.
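
The failure mode and the remedy described above can be illustrated with a
toy model. This is a minimal sketch, not kernel code: the Task class, the
"fixed" flag, and the bounded max_spins loop (standing in for the endless
loop) are assumptions made for illustration, and the function names only
mirror the kernel functions mentioned in this message.

```python
from dataclasses import dataclass


@dataclass
class Task:
    on_rq: bool = True          # stands in for task_on_rq_queued()
    sched_delayed: bool = True  # task is in the delayed-dequeue state


def block_task(task):
    """Toy stand-in for __block_task(): the task stops counting as queued."""
    task.on_rq = False
    task.sched_delayed = False


def dequeue_on_throttled(task, fixed):
    """Model a dequeue that hits a throttled cfs_rq and returns early."""
    if fixed and task.sched_delayed:
        # The remedy: block the delayed task before taking the early return.
        block_task(task)
    # Early return: nothing else happens on a throttled hierarchy.


def wait_task_inactive(task, fixed, max_spins=1000):
    """Return the number of polls until the task is blocked, else None."""
    for spin in range(1, max_spins + 1):
        if task.sched_delayed:
            dequeue_on_throttled(task, fixed)
        if not task.on_rq:
            return spin
    return None  # hypothetical cap; the real loop would never terminate


print(wait_task_inactive(Task(), fixed=True))   # 1
print(wait_task_inactive(Task(), fixed=False))  # None
```

Without the remedy the modelled task never leaves the queued state, so the
waiter exhausts its spin budget; with it, the first dequeue blocks the task.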

The system frequently hangs with the kernel warning
"!se->on_rq
WARNING: CPU: 40 PID: 49956 at kernel/sched/fair.c:704
update_entity_lag+0x7c/0x90", even though we had already included the
patch from
https://lore.kernel.org/all/tencent_3177343A3163451463643E434C61911B4208@qq.com/

The patch at
https://lore.kernel.org/all/tencent_3177343A3163451463643E434C61911B4208@qq.com/
has been superseded by
https://lore.kernel.org/all/20251015060359.34722-1-kprateek.nayak@amd.com/.
For details, refer to the discussion thread at
https://lore.kernel.org/all/b7d9da0c-ee50-173b-3dc9-adcddc64e156@gmail.com/

We therefore remove the original patch
0015-sched-fair-Fix-DELAY_DEQUEUE-issue-related-to-cgroup.patch and
cherry-pick the upstream commit
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=e67e3e738f08
to fix the issue. That commit has been merged into the v6.12 stable branch.
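
The patch swap in the series file can be sanity-checked with a short
script. This is a minimal sketch under stated assumptions: the helper
names patch_names and check_swap are hypothetical, the sample text is a
shortened stand-in for the real series file, and each entry is assumed to
be a bare patch path (per-patch options after the name are not handled).

```python
def patch_names(series_text):
    """Return the patch entries, skipping blank lines and '#' comments."""
    names = []
    for line in series_text.splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            names.append(entry)
    return names


def check_swap(series_text, removed, added):
    """True if 'removed' is absent and 'added' is listed exactly once."""
    names = patch_names(series_text)
    return removed not in names and names.count(added) == 1


REMOVED = "0015-sched-fair-Fix-DELAY_DEQUEUE-issue-related-to-cgroup.patch"
ADDED = "0015-sched-fair-Block-delayed-tasks-on-throttled-hierarch.patch"

# Shortened sample; the real series file lists many more patches.
sample = "\n".join([
    "0014-sched-fair-Make-the-BW-replenish-timer-expire-in-har.patch",
    "zl3073x-backport/0006-dpll-add-phase_offset_monitor_get-set-callback-ops.patch",
    ADDED,
])

print(check_swap(sample, REMOVED, ADDED))  # True
```

Running the same check against the full series file would confirm that the
old 0015 patch is gone and the new one appears exactly once.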

We encountered the following two failure scenarios.

1.
 !se->on_rq
 WARNING: CPU: 1 PID: 17007 at kernel/sched/fair.c:704 update_entity_lag+0x7c/0x90
 ......
Call Trace:
  <TASK>
  dequeue_entity+0x95/0x600
  dequeue_entities+0xc9/0x590
  dequeue_task_fair+0xd5/0x1f0
  ? sched_clock+0xc/0x30
  detach_task+0x36/0x60
  sched_balance_rq+0x77f/0xe70
  sched_balance_newidle+0x1c8/0x430
  pick_next_task_fair+0x2e/0x3c0
  __schedule+0x269/0xbb0
  ? hrtimer_start_range_ns+0x2e1/0x460
  schedule+0x23/0xf0
  do_nanosleep+0x65/0x150
  hrtimer_nanosleep+0x7a/0xf0
  ? __pfx_hrtimer_wakeup+0x10/0x10
  __x64_sys_nanosleep+0xac/0xe0
  do_syscall_64+0x77/0x180
  ? sched_clock+0xc/0x30
  ? sched_clock_cpu+0xd/0x190
  ? raw_spin_rq_lock_nested+0x11/0x20
  ? sched_balance_newidle+0x398/0x430
  ? __update_idle_core+0x5d/0xb0
  ? finish_task_switch.isra.0+0x97/0x2d0
  ? __schedule+0x473/0xbb0
  ? schedule+0x23/0xf0
  ? do_nanosleep+0x6d/0x150
  ? hrtimer_nanosleep+0x7a/0xf0
  ? __pfx_hrtimer_wakeup+0x10/0x10
  ? __ct_user_enter+0x25/0xd0
  ? syscall_exit_to_user_mode+0x100/0x1d0
  ? do_syscall_64+0x83/0x180
  ? syscall_exit_to_user_mode+0x100/0x1d0
  ? do_syscall_64+0x83/0x180
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

2.
!se->on_rq
WARNING: CPU: 2 PID: 3769150 at kernel/sched/fair.c:704 update_entity_lag+0x7c/0x90

kernel BUG at kernel/sched/rt.c:1035!

Call trace:
  pick_task_fair+0x68/0x150
  pick_next_task_fair+0x30/0x3b8
  __schedule+0x180/0xb98
  preempt_schedule+0x48/0x60
  rt_mutex_slowunlock+0x298/0x340
  rt_spin_unlock+0x84/0xa0
  page_vma_mapped_walk+0x1c8/0x478
  folio_referenced_one+0xdc/0x490
  rmap_walk_file+0x11c/0x200
  folio_referenced+0x160/0x1e8
  shrink_folio_list+0x5c4/0xc60
  shrink_lruvec+0x5f8/0xb88
  shrink_node+0x308/0x940
  do_try_to_free_pages+0xd4/0x540
  try_to_free_mem_cgroup_pages+0x12c/0x2c0

Verification:
 - ISO builds succeed for both rt and std.
 - Stress tests ran for more than one week.

Closes-Bug: 2129733

Change-Id: Icc259280f6f7ab8ebdc5100ba0c163b69039a874
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
2025-10-24 02:56:20 +00:00


0001-Notification-of-death-of-arbitrary-processes.patch
0002-Affine-the-kernel-threads-irqs-and-workqueues-with-k.patch
0003-Revert-sched-idle-Move-quiet_vmstate-into-the-NOHZ-c.patch
0004-intel-iommu-allow-ignoring-Ethernet-device-RMRR-with.patch
0005-turn-off-write-same-in-smartqpi-driver.patch
0006-Allow-dmar-quirks-for-broken-bioses.patch
0007-Port-negative-dentries-limit-feature-from-3.10.patch
0008-tools-Fix-the-build-errors.patch
0009-net-export-the-symbol-for-netdev_rx_queue_restart.patch
0010-Make-kernel-start-eth-devices-at-offset.patch
0011-cpufreq-intel_pstate-Update-Balance-performance-EPP-.patch
0012-intel_idle-add-Granite-Rapids-Xeon-D-support.patch
0013-tools-power-turbostat-Add-initial-support-for-Granit.patch
0014-sched-fair-Make-the-BW-replenish-timer-expire-in-har.patch
zl3073x-upstream/0001-dt-bindings-dpll-Add-DPLL-device-and-pin.patch
zl3073x-upstream/0002-dt-bindings-dpll-Add-support-for-Microchip-Azurite-c.patch
zl3073x-upstream/0003-devlink-Add-support-for-u64-parameters.patch
zl3073x-upstream/0004-devlink-Add-new-clock_id-generic-device-param.patch
zl3073x-upstream/0005-dpll-Add-basic-Microchip-ZL3073x-support.patch
zl3073x-upstream/0006-dpll-zl3073x-Fetch-invariants-during-probe.patch
zl3073x-upstream/0007-dpll-zl3073x-Read-DPLL-types-and-pin-properties-from.patch
zl3073x-upstream/0008-dpll-zl3073x-Register-DPLL-devices-and-pins.patch
zl3073x-upstream/0009-dpll-zl3073x-Implement-input-pin-selection-in-manual.patch
zl3073x-upstream/0010-dpll-zl3073x-Add-support-to-get-set-priority-on-inpu.patch
zl3073x-upstream/0011-dpll-zl3073x-Implement-input-pin-state-setting-in-au.patch
zl3073x-upstream/0012-dpll-zl3073x-Add-support-to-get-set-frequency-on-pin.patch
zl3073x-upstream/0013-dpll-zl3073x-Add-support-to-get-set-esync-on-pins.patch
zl3073x-upstream/0014-dpll-zl3073x-Add-support-to-get-phase-offset-on-conn.patch
zl3073x-upstream/0015-dpll-zl3073x-Implement-phase-offset-monitor-feature.patch
zl3073x-upstream/0016-dpll-zl3073x-Add-support-to-adjust-phase.patch
zl3073x-upstream/0017-dpll-zl3073x-Add-support-to-get-fractional-frequency.patch
zl3073x-upstream/0018-dt-bindings-dpll-Add-clock-ID-property.patch
zl3073x-upstream/0019-dpll-zl3073x-Initialize-clock-ID-from-device-propert.patch
zl3073x-custom/0001-zl3073x-Export-clock_id-via-module-parameter.patch
zl3073x-backport/0001-dpll-zl3073x-Use-kthread_create_worker.patch
zl3073x-backport/0002-zl3073x-replace-string-literal-namespace.patch
zl3073x-backport/0003-devlink-introduce-devlink_nl_put_u64.patch
zl3073x-backport/0004-dpll-zl3073x-Fix-missing-header-build-error-on-older.patch
zl3073x-backport/0005-dpll-add-phase-offset-monitor-feature-to-netlink-spe.patch
zl3073x-backport/0006-dpll-add-phase_offset_monitor_get-set-callback-ops.patch
0015-sched-fair-Block-delayed-tasks-on-throttled-hierarch.patch