Troubleshooting an unexpected Linux reboot: approach and process

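The day before yesterday, one of a customer's servers rebooted on its own. After being notified I investigated the server, and this post records how that investigation went. After logging in to the rebooted server, the first thing to confirm is when the reboot happened.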

journalctl --list-boots
0 46204d44f10a4fe5b256742bb30f18fb Tue 2026-01-13 11:58:55 CST—Thu 2026-01-13 14:36:07 CST
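
When several machines need checking, the boot start time can be pulled out of that listing with a small filter. The sketch below is only an illustration: `boot_start` is a hypothetical helper name, and it assumes the column layout shown above (index, boot ID, weekday, date, time). Other systemd versions may print the listing differently.

```shell
# Print the start date and time of one boot from `journalctl --list-boots` output.
# Assumes the columns: IDX  BOOT_ID  WEEKDAY  DATE  TIME  ...
boot_start() {
    # usage: journalctl --list-boots | boot_start [IDX]   (IDX defaults to 0)
    awk -v idx="${1:-0}" '$1 == idx { print $4, $5 }'
}
```

On a live system it would be fed with `journalctl --list-boots | boot_start 0`. `who -b` and `last -x reboot shutdown` give independent views of the same event and can be used to cross-check the time.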

The journalctl output shows that the current boot started at 2026-01-13 11:58:55. With that timestamp as the starting clue, the main files to check are:

/var/log/messages
/var/log/kern.log
/var/log/syslog
/var/log/auth.log

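In practice you can use that timestamp to jump straight to the matching position in these logs, or search them for error keywords directly with a command such as: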

cat /var/log/kern.log | grep -i "panic\|critical\|error"
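
For the first option, jumping to the time window, a filter on the syslog timestamp may be more convenient than paging through the file. This is a minimal sketch: `log_window` is a hypothetical helper, and it assumes the traditional `Mon DD HH:MM:SS` prefix (no year) and a window that does not cross midnight.

```shell
# Print the lines of a syslog-style file that fall inside a time window on one day.
log_window() {
    # usage: log_window FILE MON DAY FROM TO
    # e.g.   log_window /var/log/syslog Jan 13 11:45:00 11:59:00
    awk -v mon="$2" -v day="$3" -v from="$4" -v to="$5" '
        # Compare the day numerically (syslog pads single digits) and the
        # HH:MM:SS field as a string, which sorts correctly for fixed-width times.
        $1 == mon && $2 + 0 == day + 0 && ($3 "") >= (from "") && ($3 "") <= (to "")
    ' "$1"
}
```

Piping the result into the keyword grep narrows the search to the minutes just before the reboot.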

That grep on kern.log did not turn up anything useful for me, so I kept going and checked whether any scheduled tasks looked abnormal.

/var/log/cron # cron execution log
cat /etc/crontab # system-wide crontab
ls -la /etc/cron.* # list all cron directories
crontab -l # crontab of the current user
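
Cron execution records can also be summarized instead of read line by line. The sketch below is an assumption-laden illustration: `cron_summary` is a hypothetical helper, and it assumes the `CRON[pid]: (user) CMD (...)` line format that cron writes to syslog. On Debian and Ubuntu these lines usually land in /var/log/syslog rather than /var/log/cron.

```shell
# Count how often each cron job appears in syslog-style input (files or stdin).
cron_summary() {
    awk '
        / CRON\[/ {
            i = index($0, "]: ")          # end of the "CRON[pid]: " prefix
            if (i == 0) next
            n[substr($0, i + 3)]++        # key: "(user) CMD (command)"
        }
        END { for (job in n) printf "%d %s\n", n[job], job }
    ' "$@" | sort -k1,1nr -k2
}
```

Because it reads what cron actually ran, it also shows jobs that belong to users other than the one running `crontab -l`.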

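Nothing was printed, so no scheduled tasks were found.
Next, confirm how the reboot happened. An orderly shutdown normally leaves the usual shutdown sequence in the logs, and the following command shows whether someone triggered a shutdown or reboot manually: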

cat /var/log/auth.log | grep -i "shutdown\|reboot"

journalctl can also be used to look into the shutdown:

journalctl --list-boots -o short-precise

It is also worth checking for power-related anomalies:

journalctl | grep -i "power\|acpi\|battery"

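None of these checks found any errors either, so I went back to the boot time identified earlier and looked directly at the log entries just before it.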
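
When such a window contains kernel hung-task warnings, a per-task summary is easier to scan than the full call traces. The following is a rough sketch: `hung_task_summary` is a hypothetical helper, and it assumes the kernel's `INFO: task NAME:PID blocked for more than N seconds.` wording and task names without spaces.

```shell
# Summarize kernel hung-task warnings per task: number of warnings and longest block time.
hung_task_summary() {
    awk '
        /INFO: task .* blocked for more than [0-9]+ seconds/ {
            # The task identifier (name:pid) is the field right after "task".
            for (i = 1; i < NF; i++) if ($i == "task") { name = $(i + 1); break }
            secs = $(NF - 1) + 0          # second-to-last field is the duration
            count[name]++
            if (secs > max[name]) max[name] = secs
        }
        END {
            for (name in count)
                printf "%s warnings=%d max_blocked=%ds\n", name, count[name], max[name]
        }
    ' "$@" | sort
}
```

It can be run as `hung_task_summary /var/log/kern.log` or fed from a pipe. The raw entries from this server are shown next.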

Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.800868] INFO: task libuv-worker:2586075 blocked for more than 120 seconds.
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801156] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801405] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801658] task:libuv-worker state:D stack: 0 pid:2586075 ppid:2586045 flags:0x00000002
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801660] Call Trace:
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801661] <TASK>
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801662] __schedule+0x24e/0x590
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801665] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801667] schedule+0x69/0x110
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801669] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801671] ? wake_up_q+0x90/0x90
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801673] __lookup_slow+0x5f/0x150
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801676] walk_component+0x145/0x1c0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801678] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801679] ? path_init+0x2c0/0x3f0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801681] path_lookupat+0x3e/0x1c0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801682] ? default_wake_function+0x1a/0x40
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801684] filename_lookup+0xcf/0x1d0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801685] ? __check_object_size+0x1d/0x30
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801687] ? strncpy_from_user+0x44/0x150
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801689] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801691] user_path_at_empty+0x3f/0x60
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801692] vfs_statx+0x7a/0x130
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801693] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801696] do_statx+0x45/0x80
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801697] ? switch_fpu_return+0x4e/0xc0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801701] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801703] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801705] ? do_syscall_64+0x69/0xc0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801706] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801708] __x64_sys_statx+0x1f/0x30
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801710] do_syscall_64+0x59/0xc0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801711] ? do_syscall_64+0x69/0xc0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801712] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801713] RIP: 0033:0x7f4f91b388fd
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801715] RSP: 002b:00007f4f697bba98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801716] RAX: ffffffffffffffda RBX: 00007f4f697bc458 RCX: 00007f4f91b388fd
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801717] RDX: 0000000000000000 RSI: 00000000085299b0 RDI: 00000000ffffff9c
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801717] RBP: 00007f4f697bbab0 R08: 00007f4f697bbac0 R09: 00007f4f697bbb40
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801718] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801719] R13: 0000000008834ef0 R14: 0000000008834fd0 R15: 0000000008834fd0
Jan 13 11:46:02 k8s-worker30-gpu kernel: [433319.801720] </TASK>
Jan 13 11:47:01 k8s-worker30-gpu CRON[2981371]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:47:01 k8s-worker30-gpu CRON[2981372]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:48:01 k8s-worker30-gpu CRON[2983804]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:48:01 k8s-worker30-gpu CRON[2983805]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.627585] INFO: task libuv-worker:2586074 blocked for more than 241 seconds.
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628101] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628553] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628948] task:libuv-worker state:D stack: 0 pid:2586074 ppid:2586045 flags:0x00004002
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628953] Call Trace:
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628956] <TASK>
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628960] __schedule+0x24e/0x590
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628969] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628975] schedule+0x69/0x110
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628977] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628980] ? wake_up_q+0x90/0x90
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628985] __lookup_slow+0x5f/0x150
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628989] walk_component+0x145/0x1c0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628992] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628993] ? path_init+0x2c0/0x3f0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628996] path_lookupat+0x3e/0x1c0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628997] filename_lookup+0xcf/0x1d0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.628999] ? __check_object_size+0x1d/0x30
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629003] ? strncpy_from_user+0x44/0x150
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629007] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629009] user_path_at_empty+0x3f/0x60
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629010] vfs_statx+0x7a/0x130
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629013] do_statx+0x45/0x80
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629015] ? futex_wake+0x7c/0x190
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629019] ? do_futex+0x162/0x1f0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629021] __x64_sys_statx+0x1f/0x30
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629023] do_syscall_64+0x59/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629026] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629030] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629032] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629034] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629036] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629037] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629039] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629041] ? __x64_sys_write+0x19/0x20
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629043] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629044] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629046] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629047] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629049] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629051] RIP: 0033:0x7f4f91b388fd
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629055] RSP: 002b:00007f4f69fbca98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629057] RAX: ffffffffffffffda RBX: 00007f4f69fbd458 RCX: 00007f4f91b388fd
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629059] RDX: 0000000000000000 RSI: 00000000084c6d80 RDI: 00000000ffffff9c
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629060] RBP: 00007f4f69fbcab0 R08: 00007f4f69fbcac0 R09: 0000000000000001
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629061] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629062] R13: 0000000008afda20 R14: 0000000008afdb00 R15: 0000000008afdb00
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629064] </TASK>
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629065] INFO: task libuv-worker:2586075 blocked for more than 241 seconds.
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629448] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.629797] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630145] task:libuv-worker state:D stack: 0 pid:2586075 ppid:2586045 flags:0x00000002
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630148] Call Trace:
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630150] <TASK>
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630151] __schedule+0x24e/0x590
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630154] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630157] schedule+0x69/0x110
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630159] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630161] ? wake_up_q+0x90/0x90
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630164] __lookup_slow+0x5f/0x150
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630167] walk_component+0x145/0x1c0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630170] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630171] ? path_init+0x2c0/0x3f0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630173] path_lookupat+0x3e/0x1c0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630175] ? default_wake_function+0x1a/0x40
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630177] filename_lookup+0xcf/0x1d0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630179] ? __check_object_size+0x1d/0x30
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630182] ? strncpy_from_user+0x44/0x150
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630184] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630186] user_path_at_empty+0x3f/0x60
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630187] vfs_statx+0x7a/0x130
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630189] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630192] do_statx+0x45/0x80
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630193] ? switch_fpu_return+0x4e/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630197] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630199] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630201] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630203] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630205] __x64_sys_statx+0x1f/0x30
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630207] do_syscall_64+0x59/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630209] ? do_syscall_64+0x69/0xc0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630210] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630212] RIP: 0033:0x7f4f91b388fd
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630213] RSP: 002b:00007f4f697bba98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630215] RAX: ffffffffffffffda RBX: 00007f4f697bc458 RCX: 00007f4f91b388fd
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630216] RDX: 0000000000000000 RSI: 00000000085299b0 RDI: 00000000ffffff9c
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630217] RBP: 00007f4f697bbab0 R08: 00007f4f697bbac0 R09: 00007f4f697bbb40
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630218] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630218] R13: 0000000008834ef0 R14: 0000000008834fd0 R15: 0000000008834fd0
Jan 13 11:48:03 k8s-worker30-gpu kernel: [433440.630220] </TASK>
Jan 13 11:49:01 k8s-worker30-gpu CRON[2985977]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:49:01 k8s-worker30-gpu CRON[2985978]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:50:01 k8s-worker30-gpu CRON[2988131]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:50:01 k8s-worker30-gpu CRON[2988132]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.457771] INFO: task libuv-worker:2586074 blocked for more than 362 seconds.
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.458253] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.458675] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459047] task:libuv-worker state:D stack: 0 pid:2586074 ppid:2586045 flags:0x00004002
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459053] Call Trace:
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459056] <TASK>
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459060] __schedule+0x24e/0x590
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459068] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459075] schedule+0x69/0x110
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459077] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459079] ? wake_up_q+0x90/0x90
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459084] __lookup_slow+0x5f/0x150
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459088] walk_component+0x145/0x1c0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459090] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459091] ? path_init+0x2c0/0x3f0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459094] path_lookupat+0x3e/0x1c0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459095] filename_lookup+0xcf/0x1d0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459096] ? __check_object_size+0x1d/0x30
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459100] ? strncpy_from_user+0x44/0x150
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459104] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459106] user_path_at_empty+0x3f/0x60
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459108] vfs_statx+0x7a/0x130
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459111] do_statx+0x45/0x80
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459114] ? futex_wake+0x7c/0x190
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459118] ? do_futex+0x162/0x1f0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459120] __x64_sys_statx+0x1f/0x30
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459121] do_syscall_64+0x59/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459125] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459128] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459130] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459133] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459135] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459136] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459138] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459140] ? __x64_sys_write+0x19/0x20
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459142] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459144] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459145] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459147] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459148] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459151] RIP: 0033:0x7f4f91b388fd
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459154] RSP: 002b:00007f4f69fbca98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459157] RAX: ffffffffffffffda RBX: 00007f4f69fbd458 RCX: 00007f4f91b388fd
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459158] RDX: 0000000000000000 RSI: 00000000084c6d80 RDI: 00000000ffffff9c
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459159] RBP: 00007f4f69fbcab0 R08: 00007f4f69fbcac0 R09: 0000000000000001
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459160] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459161] R13: 0000000008afda20 R14: 0000000008afdb00 R15: 0000000008afdb00
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459163] </TASK>
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459164] INFO: task libuv-worker:2586075 blocked for more than 362 seconds.
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459515] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.459803] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460126] task:libuv-worker state:D stack: 0 pid:2586075 ppid:2586045 flags:0x00000002
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460130] Call Trace:
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460130] <TASK>
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460132] __schedule+0x24e/0x590
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460134] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460137] schedule+0x69/0x110
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460138] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460140] ? wake_up_q+0x90/0x90
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460143] __lookup_slow+0x5f/0x150
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460145] walk_component+0x145/0x1c0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460147] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460148] ? path_init+0x2c0/0x3f0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460150] path_lookupat+0x3e/0x1c0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460151] ? default_wake_function+0x1a/0x40
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460154] filename_lookup+0xcf/0x1d0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460156] ? __check_object_size+0x1d/0x30
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460158] ? strncpy_from_user+0x44/0x150
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460160] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460162] user_path_at_empty+0x3f/0x60
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460164] vfs_statx+0x7a/0x130
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460165] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460168] do_statx+0x45/0x80
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460169] ? switch_fpu_return+0x4e/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460173] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460175] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460177] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460179] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460181] __x64_sys_statx+0x1f/0x30
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460182] do_syscall_64+0x59/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460184] ? do_syscall_64+0x69/0xc0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460185] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460187] RIP: 0033:0x7f4f91b388fd
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460188] RSP: 002b:00007f4f697bba98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460189] RAX: ffffffffffffffda RBX: 00007f4f697bc458 RCX: 00007f4f91b388fd
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460191] RDX: 0000000000000000 RSI: 00000000085299b0 RDI: 00000000ffffff9c
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460192] RBP: 00007f4f697bbab0 R08: 00007f4f697bbac0 R09: 00007f4f697bbb40
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460192] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460193] R13: 0000000008834ef0 R14: 0000000008834fd0 R15: 0000000008834fd0
Jan 13 11:50:04 k8s-worker30-gpu kernel: [433561.460194] </TASK>
Jan 13 11:51:01 k8s-worker30-gpu CRON[2990427]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:51:01 k8s-worker30-gpu CRON[2990428]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:52:01 k8s-worker30-gpu CRON[2992794]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:52:01 k8s-worker30-gpu CRON[2992795]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.288011] INFO: task libuv-worker:2586074 blocked for more than 483 seconds.
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.288509] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.288853] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289222] task:libuv-worker state:D stack: 0 pid:2586074 ppid:2586045 flags:0x00004002
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289227] Call Trace:
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289230] <TASK>
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289234] __schedule+0x24e/0x590
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289243] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289248] schedule+0x69/0x110
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289250] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289252] ? wake_up_q+0x90/0x90
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289257] __lookup_slow+0x5f/0x150
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289261] walk_component+0x145/0x1c0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289263] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289265] ? path_init+0x2c0/0x3f0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289267] path_lookupat+0x3e/0x1c0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289269] filename_lookup+0xcf/0x1d0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289270] ? __check_object_size+0x1d/0x30
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289274] ? strncpy_from_user+0x44/0x150
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289278] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289280] user_path_at_empty+0x3f/0x60
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289282] vfs_statx+0x7a/0x130
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289284] do_statx+0x45/0x80
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289286] ? futex_wake+0x7c/0x190
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289290] ? do_futex+0x162/0x1f0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289292] __x64_sys_statx+0x1f/0x30
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289294] do_syscall_64+0x59/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289298] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289303] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289304] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289307] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289308] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289310] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289311] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289313] ? __x64_sys_write+0x19/0x20
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289315] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289316] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289318] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289319] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289321] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289323] RIP: 0033:0x7f4f91b388fd
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289327] RSP: 002b:00007f4f69fbca98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289329] RAX: ffffffffffffffda RBX: 00007f4f69fbd458 RCX: 00007f4f91b388fd
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289330] RDX: 0000000000000000 RSI: 00000000084c6d80 RDI: 00000000ffffff9c
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289331] RBP: 00007f4f69fbcab0 R08: 00007f4f69fbcac0 R09: 0000000000000001
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289332] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289333] R13: 0000000008afda20 R14: 0000000008afdb00 R15: 0000000008afdb00
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289335] </TASK>
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289336] INFO: task libuv-worker:2586075 blocked for more than 483 seconds.
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.289692] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290036] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290384] task:libuv-worker state:D stack: 0 pid:2586075 ppid:2586045 flags:0x00000002
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290387] Call Trace:
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290388] <TASK>
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290390] __schedule+0x24e/0x590
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290393] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290395] schedule+0x69/0x110
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290397] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290400] ? wake_up_q+0x90/0x90
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290402] __lookup_slow+0x5f/0x150
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290405] walk_component+0x145/0x1c0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290407] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290409] ? path_init+0x2c0/0x3f0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290411] path_lookupat+0x3e/0x1c0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290412] ? default_wake_function+0x1a/0x40
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290414] filename_lookup+0xcf/0x1d0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290416] ? __check_object_size+0x1d/0x30
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290418] ? strncpy_from_user+0x44/0x150
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290420] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290422] user_path_at_empty+0x3f/0x60
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290423] vfs_statx+0x7a/0x130
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290425] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290427] do_statx+0x45/0x80
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290429] ? switch_fpu_return+0x4e/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290433] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290435] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290437] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290439] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290441] __x64_sys_statx+0x1f/0x30
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290443] do_syscall_64+0x59/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290444] ? do_syscall_64+0x69/0xc0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290446] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290447] RIP: 0033:0x7f4f91b388fd
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290448] RSP: 002b:00007f4f697bba98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290450] RAX: ffffffffffffffda RBX: 00007f4f697bc458 RCX: 00007f4f91b388fd
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290451] RDX: 0000000000000000 RSI: 00000000085299b0 RDI: 00000000ffffff9c
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290452] RBP: 00007f4f697bbab0 R08: 00007f4f697bbac0 R09: 00007f4f697bbb40
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290452] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290453] R13: 0000000008834ef0 R14: 0000000008834fd0 R15: 0000000008834fd0
Jan 13 11:52:05 k8s-worker30-gpu kernel: [433682.290455] </TASK>
Jan 13 11:53:01 k8s-worker30-gpu CRON[2994967]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:53:01 k8s-worker30-gpu CRON[2994968]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:54:01 k8s-worker30-gpu CRON[2997202]: (root) CMD (/clistorage/tools/log/log_ha.sh &> /dev/null)
Jan 13 11:54:01 k8s-worker30-gpu CRON[2997203]: (root) CMD (/clistorage/bin/mgcd_ha_client.sh /clistorage/bin storage &> /dev/null)
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.123174] INFO: task libuv-worker:2586074 blocked for more than 604 seconds.
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.123617] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.123934] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124213] task:libuv-worker state:D stack: 0 pid:2586074 ppid:2586045 flags:0x00004002
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124218] Call Trace:
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124220] <TASK>
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124224] __schedule+0x24e/0x590
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124232] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124237] schedule+0x69/0x110
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124239] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124241] ? wake_up_q+0x90/0x90
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124246] __lookup_slow+0x5f/0x150
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124250] walk_component+0x145/0x1c0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124253] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124254] ? path_init+0x2c0/0x3f0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124256] path_lookupat+0x3e/0x1c0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124257] filename_lookup+0xcf/0x1d0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124259] ? __check_object_size+0x1d/0x30
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124263] ? strncpy_from_user+0x44/0x150
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124267] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124269] user_path_at_empty+0x3f/0x60
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124270] vfs_statx+0x7a/0x130
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124272] do_statx+0x45/0x80
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124273] ? futex_wake+0x7c/0x190
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124278] ? do_futex+0x162/0x1f0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124279] __x64_sys_statx+0x1f/0x30
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124281] do_syscall_64+0x59/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124285] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124290] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124292] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124294] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124295] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124296] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124297] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124299] ? __x64_sys_write+0x19/0x20
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124301] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124302] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124303] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124305] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124306] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124308] RIP: 0033:0x7f4f91b388fd
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124311] RSP: 002b:00007f4f69fbca98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124313] RAX: ffffffffffffffda RBX: 00007f4f69fbd458 RCX: 00007f4f91b388fd
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124314] RDX: 0000000000000000 RSI: 00000000084c6d80 RDI: 00000000ffffff9c
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124315] RBP: 00007f4f69fbcab0 R08: 00007f4f69fbcac0 R09: 0000000000000001
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124316] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124317] R13: 0000000008afda20 R14: 0000000008afdb00 R15: 0000000008afdb00
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124319] </TASK>
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124320] INFO: task libuv-worker:2586075 blocked for more than 604 seconds.
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124595] Tainted: P OE K 5.15.0-94-generic #104-Ubuntu
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.124854] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125106] task:libuv-worker state:D stack: 0 pid:2586075 ppid:2586045 flags:0x00000002
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125108] Call Trace:
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125109] <TASK>
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125111] __schedule+0x24e/0x590
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125113] ? __d_lookup_rcu+0x143/0x180
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125116] schedule+0x69/0x110
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125117] d_alloc_parallel+0x3fb/0x4b0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125119] ? wake_up_q+0x90/0x90
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125122] __lookup_slow+0x5f/0x150
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125124] walk_component+0x145/0x1c0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125126] link_path_walk.part.0.constprop.0+0x23f/0x3a0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125128] ? path_init+0x2c0/0x3f0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125130] path_lookupat+0x3e/0x1c0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125130] ? default_wake_function+0x1a/0x40
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125133] filename_lookup+0xcf/0x1d0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125134] ? __check_object_size+0x1d/0x30
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125136] ? strncpy_from_user+0x44/0x150
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125138] ? getname_flags.part.0+0x4c/0x1b0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125139] user_path_at_empty+0x3f/0x60
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125141] vfs_statx+0x7a/0x130
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125142] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125144] do_statx+0x45/0x80
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125146] ? switch_fpu_return+0x4e/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125150] ? exit_to_user_mode_prepare+0x96/0xb0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125152] ? syscall_exit_to_user_mode+0x35/0x50
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125154] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125155] ? exit_to_user_mode_prepare+0x37/0xb0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125157] __x64_sys_statx+0x1f/0x30
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125159] do_syscall_64+0x59/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125160] ? do_syscall_64+0x69/0xc0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125161] entry_SYSCALL_64_after_hwframe+0x62/0xcc
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125162] RIP: 0033:0x7f4f91b388fd
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125163] RSP: 002b:00007f4f697bba98 EFLAGS: 00000206 ORIG_RAX: 000000000000014c
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125165] RAX: ffffffffffffffda RBX: 00007f4f697bc458 RCX: 00007f4f91b388fd
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125165] RDX: 0000000000000000 RSI: 00000000085299b0 RDI: 00000000ffffff9c
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125166] RBP: 00007f4f697bbab0 R08: 00007f4f697bbac0 R09: 00007f4f697bbb40
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125167] R10: 0000000000000fff R11: 0000000000000206 R12: 0000000000000000
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125167] R13: 0000000008834ef0 R14: 0000000008834fd0 R15: 0000000008834fd0
Jan 13 11:54:05 k8s-worker30-gpu kernel: [433803.125169] </TASK>
Jan 13 11:59:39 k8s-worker30-gpu systemd-modules-load[2436]: Inserted module 'overlay'
Jan 13 11:59:39 k8s-worker30-gpu kernel: [ 0.000000] Linux version 5.15.0-94-generic (buildd@lcy02-amd64-096) (gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #104-Ubuntu SMP Tue Jan 9 15:25:40 UTC 2024 (Ubuntu 5.15.0-94.104-generic 5.15.136)
Jan 13 11:59:39 k8s-worker30-gpu systemd-modules-load[2436]: Inserted module 'br_netfilter'
Jan 13 11:59:39 k8s-worker30-gpu kernel: [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-5.15.0-94-generic root=/dev/mapper/vg_sys-lv_root ro cgroup_enable=memory swapaccount=0 systemd.unified_cgroup_hierarchy=0 quiet splash selinux=0 vt.handoff=7
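
An excerpt this long is hard to read by eye, so a small filter can help summarize which tasks the hung-task detector reported and how long each had been blocked. The following is only a minimal sketch: the function name `summarize_hung_tasks` is made up for this illustration, and it assumes the log uses the `INFO: task NAME:PID blocked for more than N seconds.` format seen above, with no spaces in the task name.

```shell
# Hypothetical helper: summarize hung-task reports from kernel log text on stdin.
# Prints one line per task (name:pid) with the number of reports and the
# largest "blocked for more than N seconds" value seen for that task.
summarize_hung_tasks() {
  awk '
    /INFO: task .* blocked for more than [0-9]+ seconds/ {
      for (i = 1; i < NF; i++) {
        if ($i == "task") task = $(i + 1)       # e.g. libuv-worker:2586075
        if ($i == "than") secs = $(i + 1) + 0   # e.g. 604
      }
      count[task]++
      if (secs > max[task]) max[task] = secs
    }
    END {
      for (t in count)
        printf "%s reports=%d max_blocked_secs=%d\n", t, count[t], max[t]
    }
  ' | sort
}
```

On a system that writes kernel messages to `/var/log/kern.log`, it could be run as `summarize_hung_tasks < /var/log/kern.log`.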

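The `Inserted module 'overlay'` line marks the point where the system had already rebooted. After a reboot you can normally spot the `Linux version 5.15.0-94-generic` banner line, and here there is also a gap of about five minutes between it and the last log entry before it (11:54:05 to 11:59:39). Scrolling up, the `libuv-worker` tasks are blocked in `d_alloc_parallel`, first for more than 120 seconds and, by 11:54:05, for more than 604 seconds. A task usually ends up in this state because it is waiting on I/O and cannot be woken, and every one of these traces is stuck inside a `statx` system call. Because this machine uses shared storage, we strongly suspect that a storage response timeout produced the D-state processes. The traces also all stall in the path lookup below `filename_lookup` (`path_lookupat`, `walk_component`, `__lookup_slow`), which supports that suspicion. Our working conclusion is that the processes were blocked on storage, an oops occurred on the storage path, and that triggered a kernel panic. The next step is to look up the relevant kernel parameters with the following commands: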

sysctl -a | grep panic_on_oops
kernel.panic_on_oops = 1
sysctl -a | grep kernel.panic
kernel.panic = 10 # the key value: seconds to wait after a panic before rebooting
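
Note that `sysctl -a | grep kernel.panic` prints every key matching that pattern, not just `kernel.panic`. Reading the specific keys directly gives less noise. The sketch below does that for the panic-related settings discussed in this post; the function name `show_panic_sysctls` is hypothetical, and the optional directory argument is an assumption added only so the function can be tried against test data instead of the live `/proc/sys/kernel`.

```shell
# Hypothetical helper: print the panic-related kernel sysctls discussed here.
# The optional first argument is the directory to read from; it defaults to
# /proc/sys/kernel, where each sysctl is exposed as a small text file.
show_panic_sysctls() {
  dir="${1:-/proc/sys/kernel}"
  for key in panic panic_on_oops hung_task_panic softlockup_panic; do
    if [ -r "$dir/$key" ]; then
      printf 'kernel.%s = %s\n' "$key" "$(cat "$dir/$key")"
    else
      # Some keys only exist when the matching kernel feature is built in.
      printf 'kernel.%s = (not available)\n' "$key"
    fi
  done
}
```

Calling `show_panic_sysctls` with no argument reads the running kernel's values.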

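Our reading of the sequence is this: a hung task occurred, the kernel detected the problem and raised an oops, and because `panic_on_oops` is set to 1 the oops escalated to a panic; with `kernel.panic = 10`, the machine then rebooted 10 seconds later. Note that a hung-task warning by itself is not an oops. Whether a hung task leads directly to a panic is controlled by the separate `kernel.hung_task_panic` setting, so both are worth checking.
Next we adjust the kernel parameters: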

echo 0 > /proc/sys/kernel/panic_on_oops

The change above is temporary and is lost at the next reboot. To make it persistent, write a configuration file:

sudo tee /etc/sysctl.d/99-kernel-panic.conf << 'EOF'
# Do not panic on an oops
kernel.panic_on_oops = 0
# Do not panic on a hung task
kernel.hung_task_panic = 0
# Lengthen the hung task detection timeout (seconds)
kernel.hung_task_timeout_secs = 300
# Raise the maximum number of hung task warnings reported
kernel.hung_task_warnings = 50
# Do not panic on a soft lockup
kernel.softlockup_panic = 0
EOF

Run the command above on every node, then run the following command to apply the settings:

sudo sysctl -p /etc/sysctl.d/99-kernel-panic.conf
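
When the same file is rolled out to many nodes, it can be useful to confirm that the running values really match it. The following is a rough sketch under stated assumptions: `check_sysctl_conf` is a made-up name, it only understands simple `key = value` lines plus `#` or `;` comments, and the optional second argument (the root directory to read from, `/proc/sys` by default) exists only so it can be tested without touching a live system.

```shell
# Hypothetical helper: compare each "key = value" line of a sysctl config file
# with the value currently visible under /proc/sys.
# Usage: check_sysctl_conf FILE [PROC_SYS_ROOT]
# Returns 0 if every key matches, 1 if at least one differs or is missing.
check_sysctl_conf() {
  conf="$1"
  root="${2:-/proc/sys}"
  mismatch=0
  while IFS='=' read -r key want || [ -n "$key" ]; do
    # Trim whitespace around the key and the wanted value.
    key=$(printf '%s' "$key" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    want=$(printf '%s' "$want" | sed 's/^[[:space:]]*//; s/[[:space:]]*$//')
    # Skip blank lines and comments.
    case "$key" in ''|'#'*|';'*) continue ;; esac
    # kernel.panic_on_oops -> <root>/kernel/panic_on_oops
    file="$root/$(printf '%s' "$key" | tr '.' '/')"
    if [ -r "$file" ]; then
      have=$(cat "$file")
    else
      have='(not available)'
    fi
    if [ "$have" = "$want" ]; then
      printf 'OK       %s = %s\n' "$key" "$have"
    else
      printf 'MISMATCH %s: want %s, have %s\n' "$key" "$want" "$have"
      mismatch=1
    fi
  done < "$conf"
  return "$mismatch"
}
```

For example, `check_sysctl_conf /etc/sysctl.d/99-kernel-panic.conf` prints one line per setting and exits non-zero if any of them did not take effect.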

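With these changes in place, a D-state process will no longer trigger a system reboot, but the underlying storage problem still needs to be fixed. In our day-to-day work, D-state processes are most often caused by storage problems. The D state means the process is in uninterruptible sleep. It typically occurs while a process waits for a resource or event, most often I/O, that is not immediately available.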
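
To see whether D-state processes are piling up before the hung-task detector fires, the process list can be filtered on the state column. This is a minimal sketch: `filter_d_state` is a hypothetical name, and it assumes the input has a header line and the process state in the first column, as produced by the `ps` invocation in the note after the code.

```shell
# Hypothetical helper: read ps-style output on stdin and keep the header line
# plus every process whose state column (first field) starts with "D",
# i.e. uninterruptible sleep.
filter_d_state() {
  awk 'NR == 1 || $1 ~ /^D/'
}
```

With procps `ps`, one way to use it is `ps -eo stat,pid,ppid,wchan:32,comm | filter_d_state`, which also shows the kernel function each blocked task is sleeping in.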



Copyright notice: unless otherwise noted, all articles on this site are original.

When reposting, please credit the source: https://sulao.cn/post/1160
