Software router on an Intel S5000 board, Xeon 2.66 GHz, OS: Fedora 6, 5 network cards. The specific "culprit" (eth5) is a dual-port gigabit Intel PCI-X card based on the 82546 chip.
Driver: e1000 7.1.9-k4-NAPI
The last entries in /var/log/messages before it died:
Jun 21 12:31:14 router kernel: NETDEV WATCHDOG: eth5: transmit timed out
Jun 21 12:31:17 router kernel: BUG: soft lockup detected on CPU#3!
Jun 21 12:31:17 router kernel: [<c04051db>] dump_trace+0x69/0x1af
Jun 21 12:31:17 router kernel: [<c0405339>] show_trace_log_lvl+0x18/0x2c
Jun 21 12:31:17 router kernel: [<c04058ed>] show_trace+0xf/0x11
Jun 21 12:31:17 router kernel: [<c04059ea>] dump_stack+0x15/0x17
Jun 21 12:31:17 router kernel: [<c044d9b5>] softlockup_tick+0xad/0xc4
Jun 21 12:31:17 router kernel: [<c042e596>] update_process_times+0x39/0x5c
Jun 21 12:31:17 router kernel: [<c0418914>] smp_apic_timer_interrupt+0x5c/0x64
Jun 21 12:31:17 router kernel: [<c0404ad3>] apic_timer_interrupt+0x1f/0x24
Jun 21 12:31:17 router kernel: DWARF2 unwinder stuck at apic_timer_interrupt+0x1f/0x24
Jun 21 12:31:17 router kernel: Leftover inexact backtrace:
Jun 21 12:31:17 router kernel: [<c0613f91>] _spin_unlock_irqrestore+0xa/0xc
Jun 21 12:31:17 router kernel: [<f893ac2b>] e1000_update_stats+0x6c3/0x6ca [e1000]
Jun 21 12:31:17 router kernel: [<f893d7fa>] e1000_watchdog+0x0/0x5dc [e1000]
Jun 21 12:31:17 router kernel: [<f893dc29>] e1000_watchdog+0x42f/0x5dc [e1000]
Jun 21 12:31:17 router kernel: [<c042e374>] __mod_timer+0x9e/0xa8
Jun 21 12:31:17 router kernel: [<c05bf7aa>] neigh_timer_handler+0x24e/0x26c
Jun 21 12:31:17 router kernel: [<f893d7fa>] e1000_watchdog+0x0/0x5dc [e1000]
Jun 21 12:31:17 router kernel: [<c042e4f6>] run_timer_softirq+0x105/0x16c
Jun 21 12:31:17 router kernel: [<c04299fe>] __do_softirq+0x5a/0xbb
Jun 21 12:31:17 router kernel: [<c0406932>] do_softirq+0x55/0xaf
Jun 21 12:31:17 router kernel: [<c0404ad3>] apic_timer_interrupt+0x1f/0x24
Jun 21 12:31:17 router kernel: [<f8941277>] e1000_get_hw_eeprom_semaphore+0xb5/0xde [e1000]
Jun 21 12:31:17 router kernel: [<f8941649>] e1000_swfw_sync_acquire+0xe6/0xf7 [e1000]
Jun 21 12:31:17 router kernel: [<c05d007b>] rt_intern_hash+0x10a/0x323
Jun 21 12:31:17 router kernel: [<f89412c0>] e1000_swfw_sync_release+0x20/0x42 [e1000]
Jun 21 12:31:17 router kernel: [<f8941768>] e1000_write_kmrn_reg+0x5e/0x67 [e1000]
Jun 21 12:31:17 router kernel: [<f89435cf>] e1000_get_speed_and_duplex+0xec/0x2d6 [e1000]
Jun 21 12:31:17 router kernel: [<c04e8872>] copy_to_user+0x40/0x56
Jun 21 12:31:17 router kernel: [<f8947d2d>] e1000_get_settings+0x96/0xd2 [e1000]
Jun 21 12:31:17 router kernel: [<c05bad09>] dev_ethtool+0xd2/0xa59
Jun 21 12:31:17 router kernel: [<c049d2d6>] proc_alloc_inode+0x3e/0x63
Jun 21 12:31:17 router kernel: [<c0457a10>] get_page_from_freelist+0x2ae/0x318
Jun 21 12:31:17 router kernel: [<c0457ae7>] __alloc_pages+0x6d/0x2aa
Jun 21 12:31:17 router kernel: [<f89e1c07>] vlan_dev_ioctl+0x7b/0xa7 [8021q]
Jun 21 12:31:17 router kernel: [<f89e1b8c>] vlan_dev_ioctl+0x0/0xa7 [8021q]
Jun 21 12:31:17 router kernel: [<c05bb67a>] dev_ethtool+0xa43/0xa59
Jun 21 12:31:17 router kernel: [<c0483130>] is_subdir+0x34/0x44
Jun 21 12:31:17 router kernel: [<c0614d95>] do_page_fault+0x0/0x4db
Jun 21 12:31:17 router kernel: [<c046b78a>] cache_alloc_refill+0x16c/0x46c
Jun 21 12:31:17 router kernel: [<c04e7b9d>] vsnprintf+0x459/0x495
Jun 21 12:31:17 router kernel: [<c05af38e>] sock_ioctl+0x0/0x1bf
Jun 21 12:31:17 router last message repeated 2 times
Jun 21 12:31:17 router kernel: [<c05b9f47>] dev_ioctl+0x2fd/0x46b
Jun 21 12:31:17 router kernel: [<c05f5c52>] inet_sock_destruct+0x175/0x1bf
Jun 21 12:31:17 router kernel: [<c0613f00>] _write_lock_bh+0x8/0x10
Jun 21 12:31:17 router kernel: [<c05af529>] sock_ioctl+0x19b/0x1bf
Jun 21 12:31:17 router kernel: [<c05af38e>] sock_ioctl+0x0/0x1bf
Jun 21 12:31:17 router kernel: [<c047ef37>] do_ioctl+0x1f/0x62
Jun 21 12:31:17 router kernel: [<c047f1c4>] vfs_ioctl+0x24a/0x25c
Jun 21 12:31:17 router kernel: [<c047f222>] sys_ioctl+0x4c/0x66
Jun 21 12:31:17 router kernel: [<c0404013>] syscall_call+0x7/0xb
Jun 21 12:31:17 router kernel: =======================
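To make a trace like the one above easier to read, the frames can be grouped by the kernel module they belong to. Below is a minimal sketch in Python, assuming the syslog line format shown in this post; the names `FRAME_RE`, `parse_frames`, `symbols_by_module` and `SAMPLE` are made up for illustration. Keep in mind that everything after "Leftover inexact backtrace" is a best-effort scan of the stack, so some of the listed functions may be stale entries rather than real callers.

```python
import re

# One frame looks like: [<f893ac2b>] e1000_update_stats+0x6c3/0x6ca [e1000]
# The trailing [module] part is absent for symbols built into the kernel.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<symbol>[^\s+]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<module>[^\]]+)\])?"
)


def parse_frames(log_text):
    """Return a list of (symbol, module) tuples, in the order they appear."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.search(line)
        if match:
            # Frames without a [module] tag are labelled "kernel" here.
            frames.append((match.group("symbol"), match.group("module") or "kernel"))
    return frames


def symbols_by_module(frames):
    """Group unique symbols by module, keeping first-seen order."""
    grouped = {}
    for symbol, module in frames:
        symbols = grouped.setdefault(module, [])
        if symbol not in symbols:
            symbols.append(symbol)
    return grouped


# A few lines copied from the log in this post.
SAMPLE = """\
Jun 21 12:31:17 router kernel: [<c0613f91>] _spin_unlock_irqrestore+0xa/0xc
Jun 21 12:31:17 router kernel: [<f893ac2b>] e1000_update_stats+0x6c3/0x6ca [e1000]
Jun 21 12:31:17 router kernel: [<f893d7fa>] e1000_watchdog+0x0/0x5dc [e1000]
Jun 21 12:31:17 router kernel: [<f893dc29>] e1000_watchdog+0x42f/0x5dc [e1000]
Jun 21 12:31:17 router kernel: [<f89e1c07>] vlan_dev_ioctl+0x7b/0xa7 [8021q]
"""

if __name__ == "__main__":
    for module, symbols in symbols_by_module(parse_frames(SAMPLE)).items():
        print(module, symbols)
```

Feeding the full excerpt from /var/log/messages into `parse_frames` instead of `SAMPLE` gives a quick list of which e1000 and 8021q functions show up around the lockup.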
eth5 carries 6 VLANs; traffic is around 100-120 Mbit/s at 15-17 kpps, peaking at ~250 Mbit/s and ~30 kpps.
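As a quick sanity check on these figures, the average packet size can be estimated by dividing throughput by packet rate. The sketch below assumes 1 Mbit = 10^6 bits and pairs the low and high ends of each range, which is only an approximation; the function name `avg_packet_bytes` is made up for illustration.

```python
def avg_packet_bytes(mbit_per_s, kpps):
    """Average packet size in bytes for a given throughput and packet rate."""
    bytes_per_second = mbit_per_s * 1_000_000 / 8
    packets_per_second = kpps * 1_000
    return bytes_per_second / packets_per_second


if __name__ == "__main__":
    # (Mbit/s, kpps) pairs taken from the traffic figures in the post.
    for mbit, kpps in [(100, 15), (120, 17), (250, 30)]:
        print(f"{mbit} Mbit/s at {kpps} kpps -> {avg_packet_bytes(mbit, kpps):.0f} bytes/packet")
```

Under these assumptions the average works out to roughly 830-1040 bytes per packet.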
Question: where and how should I look for the cause, and how can it be fixed?