Jiping Ma d7fc25a377 kernel-rt: beware of __put_task_struct() calling context
Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping
locks. Therefore, it can't be called from a non-preemptible context.

Instead of calling __put_task_struct() directly, we defer it using
call_rcu(). A workqueue would be a more natural approach, but because
PREEMPT_RT does not allow dynamic memory allocation from atomic context,
the code would become more complex because we would need to put the
work_struct instance in the task_struct and initialize it when we
allocate a new task_struct.
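
The deferral idea above can be sketched in plain userspace C. This is
only a model under stated assumptions, not the kernel patch: struct
cb_head, struct task, defer_call(), run_deferred() and task_put() are
hypothetical stand-ins for rcu_head, task_struct, call_rcu(), the point
where RCU callbacks run, and put_task_struct(). It shows why embedding
the callback head in the object means no memory has to be allocated at
the moment the final reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the kernel's struct rcu_head. */
struct cb_head {
    struct cb_head *next;
    void (*func)(struct cb_head *head);
};

/* Hypothetical stand-in for task_struct. */
struct task {
    int usage;           /* reference count */
    struct cb_head rcu;  /* embedded, so deferring needs no allocation */
};

static struct cb_head *pending; /* callbacks waiting for a safe context */
static int freed_count;         /* bookkeeping for this sketch only */

/* Models call_rcu(): queue the callback instead of running it now. */
static void defer_call(struct cb_head *head, void (*func)(struct cb_head *))
{
    head->func = func;
    head->next = pending;
    pending = head;
}

/* Models the later, preemptible point where queued callbacks run. */
static void run_deferred(void)
{
    while (pending) {
        struct cb_head *head = pending;

        pending = head->next;
        head->func(head);
    }
}

/* Models __put_task_struct(): the release that may take sleeping locks. */
static void task_release(struct task *t)
{
    assert(t->usage == 0);
    freed_count++;
    free(t);
}

/* Callback wrapper: recover the task from its embedded callback head. */
static void task_release_cb(struct cb_head *head)
{
    struct task *t = (struct task *)((char *)head - offsetof(struct task, rcu));

    task_release(t);
}

/* Drop one reference; defer the release when the caller must not sleep. */
static void task_put(struct task *t, bool atomic_ctx)
{
    if (--t->usage > 0)
        return;
    if (atomic_ctx)
        defer_call(&t->rcu, task_release_cb);
    else
        task_release(t);
}

/* Allocate a task holding a single reference. */
static struct task *task_new(void)
{
    struct task *t = calloc(1, sizeof(*t));

    if (t)
        t->usage = 1;
    return t;
}
```

In this model, a final task_put(t, true) only links the embedded head
into the pending list; the release itself runs later from
run_deferred(), where sleeping would be allowed.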

We hit the same panic five times: __put_task_struct() is called while
the task holds a lock, which triggers the kernel BUG_ON. The call trace
is shown below.

We also need to cherry-pick the following commits, because the
necessary context is missing from 5.10.18x; for example,
DEFINE_WAIT_OVERRIDE_MAP is not defined there.

* commit 5f2962401c6e
  ("locking/lockdep: Exclude local_lock_t from IRQ inversions")
* commit 175b1a60e880
  ("locking/lockdep: Clean up check_redundant() a bit")
* commit bc2dd71b2836
  ("locking/lockdep: Add a skip() function to __bfs()")
* commit a1014fbc83e7
  ("lib/debugobjects: fix stat count and optimize debug_objects_mem_init")
* commit 9edf5518db25
  ("debugobject: Prevent init race with static objects")
* commit 9079ff34a1ac
  ("debugobject: Ensure pool refill")
* commit 0cce06ba859a
  ("debugobjects,locking: Annotate debug_object_fill_pool() wait type
   violation")

kernel BUG at kernel/locking/rtmutex.c:1331!
invalid opcode: 0000 [#1] PREEMPT_RT SMP NOPTI
......
Call Trace:
 rt_spin_lock_slowlock_locked+0xb2/0x2a0
 ? update_load_avg+0x80/0x690
 rt_spin_lock_slowlock+0x50/0x80
 ? update_load_avg+0x80/0x690
 rt_spin_lock+0x2a/0x30
 free_unref_page+0xc5/0x280
 __vunmap+0x17f/0x240
 put_task_stack+0xc6/0x130
 __put_task_struct+0x3d/0x180
 rt_mutex_adjust_prio_chain+0x365/0x7b0
 task_blocks_on_rt_mutex+0x1eb/0x370
 rt_spin_lock_slowlock_locked+0xb2/0x2a0
 rt_spin_lock_slowlock+0x50/0x80
 rt_spin_lock+0x2a/0x30
 free_unref_page_list+0x128/0x5e0
 release_pages+0x2b4/0x320
 tlb_flush_mmu+0x44/0x150
 tlb_finish_mmu+0x3c/0x70
 zap_page_range+0x12a/0x170
 ? find_vma+0x16/0x70
 do_madvise+0x99d/0xba0
 ? do_epoll_wait+0xa2/0xe0
 ? __x64_sys_madvise+0x26/0x30
 __x64_sys_madvise+0x26/0x30
 do_syscall_64+0x33/0x40
 entry_SYSCALL_64_after_hwframe+0x44/0xa9

Verification:
- build-pkgs; build-iso; install and boot up on aio-sx lab.
- Cannot reproduce the issue after running the stress-ng tests for
  almost 24 hours.
  while true; do sudo stress-ng --sched rr --mmapfork 23 -t 20; done
  while true; do sudo stress-ng --sched fifo --mmapfork 23 -t 20; done

Closes-Bug: 2031597
Signed-off-by: Jiping Ma <jiping.ma2@windriver.com>
Change-Id: If022441d61492eaec88eede8603a6cb052af99d1
2023-08-17 01:57:36 -04:00
Description
StarlingX Linux kernel
13 MiB
Languages
Python 46.8%
Shell 26.5%
Makefile 23.9%
POV-Ray SDL 1.7%
Perl 1.1%