The server has started hanging again. It ran for 37 days without a single complaint, and now it is back to falling into a kernel panic several times a day.
Sometimes I manage to catch this in dmesg:
mptscsih: ioc0: attempting task abort! (sc=ffff88022eadaf00)
sd 4:0:0:0: [sda] CDB: cdb[0]=0x28: 28 00 04 d8 eb 06 00 00 28 00
mptscsih: ioc0: WARNING - TaskMgmt type=1: IOC Not operational (0xffffffff)!
mptscsih: ioc0: WARNING - Issuing HardReset from mptscsih_IssueTaskMgmt!!
mptbase: ioc0: Initiating recovery
mptbase: ioc0: WARNING - Unexpected doorbell active!
INFO: task pdflush:267 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
pdflush D ffff880164968d90 0 267 2 0x00000000
ffff88022ea13650 0000000000000046 ffffffff810824f1 00011220ffffffff
ffff8801784104d0 0000000000010c80 000000000000c8e8 ffff88022f8bf080
ffff88022e76c4c0 ffff88022f8bf320 000000012ea135f0 0000000000000002
Call Trace:
[<ffffffff810824f1>] ? 0xffffffff810824f1
[<ffffffff81147ba1>] ? 0xffffffff81147ba1
[<ffffffff810548c2>] ? 0xffffffff810548c2
[<ffffffff81148e15>] 0xffffffff81148e15
[<ffffffff81054a10>] ? 0xffffffff81054a10
[<ffffffff8113d906>] ? 0xffffffff8113d906
[<ffffffff81148f9c>] 0xffffffff81148f9c
[<ffffffff8113aec7>] 0xffffffff8113aec7
[<ffffffff8113fdaa>] 0xffffffff8113fdaa
[<ffffffff81142d55>] 0xffffffff81142d55
[<ffffffff81138786>] ? 0xffffffff81138786
[<ffffffff8113a1a2>] 0xffffffff8113a1a2
[<ffffffff81192294>] ? 0xffffffff81192294
[<ffffffff8138b30c>] ? 0xffffffff8138b30c
[<ffffffff8138b3dc>] ? 0xffffffff8138b3dc
[<ffffffff81125519>] 0xffffffff81125519
[<ffffffff8108920e>] ? 0xffffffff8108920e
[<ffffffff81125a05>] 0xffffffff81125a05
[<ffffffff81088d32>] ? 0xffffffff81088d32
[<ffffffff81086fc1>] ? 0xffffffff81086fc1
[<ffffffff81126670>] ? 0xffffffff81126670
[<ffffffff811263fd>] 0xffffffff811263fd
[<ffffffff810548df>] ? 0xffffffff810548df
[<ffffffff81087348>] 0xffffffff81087348
[<ffffffff810ca6eb>] 0xffffffff810ca6eb
[<ffffffff810345ca>] ? 0xffffffff810345ca
[<ffffffff811a3eb0>] ? 0xffffffff811a3eb0
[<ffffffff810caeb1>] 0xffffffff810caeb1
[<ffffffff810cb16e>] 0xffffffff810cb16e
[<ffffffff810874fc>] 0xffffffff810874fc
[<ffffffff81088240>] ? 0xffffffff81088240
[<ffffffff81088346>] 0xffffffff81088346
[<ffffffff81087440>] ? 0xffffffff81087440
[<ffffffff81054616>] 0xffffffff81054616
[<ffffffff8100cd7a>] 0xffffffff8100cd7a
[<ffffffff81054570>] ? 0xffffffff81054570
[<ffffffff8100cd70>] ? 0xffffffff8100cd70
INFO: task lighttpd:26588 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
lighttpd D ffff88012e411ca8 0 26588 26426 0x00000000
ffff88012e411c18 0000000000000086 ffff88012e411b68 ffffffff811a5817
ffff88017f5a0600 0000000000010c80 000000000000c8e8 ffff8801cd0fddc0
ffff88018ad23e80 ffff8801cd0fe060 000000002f604000 ffff88022e580120
Call Trace:
[<ffffffff811a5817>] ? 0xffffffff811a5817
[<ffffffff8105c3cc>] ? 0xffffffff8105c3cc
[<ffffffff813896b3>] 0xffffffff813896b3
[<ffffffff810801ad>] 0xffffffff810801ad
[<ffffffff81389a52>] 0xffffffff81389a52
[<ffffffff81080180>] ? 0xffffffff81080180
[<ffffffff81080164>] 0xffffffff81080164
[<ffffffff81054a10>] ? 0xffffffff81054a10
[<ffffffff8107ff8b>] ? 0xffffffff8107ff8b
[<ffffffff810802e8>] 0xffffffff810802e8
[<ffffffff81080937>] 0xffffffff81080937
[<ffffffff810861a7>] ? 0xffffffff810861a7
[<ffffffff81093973>] 0xffffffff81093973
[<ffffffff810955d9>] 0xffffffff810955d9
[<ffffffff8102b14f>] 0xffffffff8102b14f
[<ffffffff8138ba6f>] 0xffffffff8138ba6f
INFO: task lighttpd:26590 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
lighttpd D ffff88020c581b18 0 26590 26426 0x00000000
ffff88020c581a88 0000000000000082 ffff88020c5819d8 ffffffff811a5817
ffff88022eadaf00 0000000000010c80 000000000000c8e8 ffff8801cd0f9900
ffff8801cd0fcb00 ffff8801cd0f9ba0 0000000200000038 ffff88022e580120
Call Trace:
[<ffffffff811a5817>] ? 0xffffffff811a5817
[<ffffffff8105c3cc>] ? 0xffffffff8105c3cc
[<ffffffff813896b3>] 0xffffffff813896b3
[<ffffffff810801ad>] 0xffffffff810801ad
[<ffffffff81389a52>] 0xffffffff81389a52
[<ffffffff81080180>] ? 0xffffffff81080180
[<ffffffff81080164>] 0xffffffff81080164
[<ffffffff81054a10>] ? 0xffffffff81054a10
[<ffffffff8107ff8b>] ? 0xffffffff8107ff8b
[<ffffffff810ce5c7>] 0xffffffff810ce5c7
[<ffffffff810bf4a1>] ? 0xffffffff810bf4a1
[<ffffffff810b475f>] ? 0xffffffff810b475f
[<ffffffff810bde8c>] ? 0xffffffff810bde8c
[<ffffffff812e6a60>] ? 0xffffffff812e6a60
[<ffffffff810c4d54>] ? 0xffffffff810c4d54
[<ffffffff810cd0e0>] ? 0xffffffff810cd0e0
[<ffffffff810ce789>] 0xffffffff810ce789
[<ffffffff810cc9f7>] 0xffffffff810cc9f7
[<ffffffff810cd2fc>] 0xffffffff810cd2fc
[<ffffffff810cc960>] ? 0xffffffff810cc960
[<ffffffff810cd442>] 0xffffffff810cd442
[<ffffffff810aba17>] 0xffffffff810aba17
[<ffffffff810abb25>] 0xffffffff810abb25
[<ffffffff810b9e18>] ? 0xffffffff810b9e18
[<ffffffff8100bd6b>] 0xffffffff8100bd6b
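For anyone who wants to tally which processes get stuck, here is a minimal sketch that pulls the blocked-task lines out of a dmesg dump like the one above. It assumes the exact "INFO: task NAME:PID blocked for more than N seconds" wording shown in this log; the function name find_hung_tasks is made up for illustration, and task names containing spaces would not match.

```python
import re

# Matches the hung-task report line as it appears in the log above, e.g.
# "INFO: task pdflush:267 blocked for more than 120 seconds."
# Assumption: the task name contains no whitespace.
HUNG_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def find_hung_tasks(dmesg_text):
    """Return a list of (task_name, pid, seconds) tuples found in dmesg text."""
    return [
        (m.group(1), int(m.group(2)), int(m.group(3)))
        for m in HUNG_RE.finditer(dmesg_text)
    ]


if __name__ == "__main__":
    # Three report lines copied from the log above.
    sample = (
        "INFO: task pdflush:267 blocked for more than 120 seconds.\n"
        "INFO: task lighttpd:26588 blocked for more than 120 seconds.\n"
        "INFO: task lighttpd:26590 blocked for more than 120 seconds.\n"
    )
    for name, pid, secs in find_hung_tasks(sample):
        print(f"{name} (pid {pid}) blocked for more than {secs}s")
```

Feeding it the saved output of dmesg should list every process the kernel reported as blocked, which makes it easier to see whether it is always the same ones waiting on disk I/O.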
So here is what I'm curious about:
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Will that just remove the message while the system keeps hanging? :) And what if I go the other way and increase the default 120 seconds there?
What is this parameter actually responsible for? I can't seem to google anything on it...