LINUX.ORG.RU

RAID hangs



Hi all.
One of my RAID arrays hangs after a couple of days of operation.
Once it hangs, it can be neither unmounted nor stopped.
debugreiserfs says a bad-block check is needed.
The bad-block check says the hard drive is fine.
What should I do?

anonymous

Here is part of dmesg:

2) is smallest!.
raid0: checking sdd1 ... contained as device 1
raid0: zone->nb_dev: 2, size: 490223104
raid0: current zone offset: 245111552
raid0: done.
raid0 : md_size is 490223104 blocks.
raid0 : conf->smallest->size is 490223104 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: updating md1 RAID superblock on device
md: sdd1 [events: 000000ca]<6>(write) sdd1's sb offset: 245111616
md: sdc1 [events: 000000ca]<6>(write) sdc1's sb offset: 245111616
md: ... autorun DONE.
[events: 0000002c]
[events: 0000002c]
md: autorun ...
md: considering hdc1 ...
md: adding hdc1 ...
md: adding hdb1 ...
md: created md2
md: bind<hdb1,1>
md: bind<hdc1,2>
md: running: <hdc1><hdb1>
md: hdc1's event counter: 0000002c
md: hdb1's event counter: 0000002c
md2: max total readahead window set to 496k
md2: 2 data-disks, max readahead per data-disk: 248k
raid0: looking at hdb1
raid0: comparing hdb1(199141632) with hdb1(199141632)
raid0: END
raid0: ==> UNIQUE
raid0: 1 zones
raid0: looking at hdc1
raid0: comparing hdc1(199141632) with hdb1(199141632)
raid0: EQUAL
raid0: FINAL 1 zones
raid0: zone 0
raid0: checking hdb1 ... contained as device 0
(199141632) is smallest!.
raid0: checking hdc1 ... contained as device 1
raid0: zone->nb_dev: 2, size: 398283264
raid0: current zone offset: 199141632
raid0: done.
raid0 : md_size is 398283264 blocks.
raid0 : conf->smallest->size is 398283264 blocks.
raid0 : nb_zone is 1.
raid0 : Allocating 8 bytes for hash.
md: updating md2 RAID superblock on device
md: hdc1 [events: 0000002d]<6>(write) hdc1's sb offset: 199141632
md: hdb1 [events: 0000002d]<6>(write) hdb1's sb offset: 199141632
md: ... autorun DONE.
reiserfs: checking transaction log (device 09:00) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 09:01) ...
Using r5 hash to sort names
ReiserFS version 3.6.25
reiserfs: checking transaction log (device 09:02) ...
Using r5 hash to sort names
ReiserFS version 3.6.25

anonymous
()
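The sizes in this dmesg can be sanity-checked with a little arithmetic. The sketch below assumes a single-zone RAID0 whose capacity is the per-member size times the number of members. For md2 the member size (199141632) is printed directly; for md1 the "is smallest" line is cut off, so the member size is assumed to equal the "current zone offset" (245111552), as it does for md2. The names `arrays` and `raid0_size` are illustrative only.

```python
# Block counts copied from the dmesg output above.
# md1's member size is an assumption (taken from "current zone offset").
arrays = {
    "md1": {"member_size": 245111552, "members": 2, "reported": 490223104},
    "md2": {"member_size": 199141632, "members": 2, "reported": 398283264},
}


def raid0_size(member_size, members):
    """Capacity of a single-zone RAID0: member size times member count."""
    return member_size * members


for name, a in arrays.items():
    expected = raid0_size(a["member_size"], a["members"])
    status = "ok" if expected == a["reported"] else "MISMATCH"
    print(f"{name}: expected {expected}, reported {a['reported']} -> {status}")
```

Under these assumptions both arrays report the size the kernel should have computed, so the array geometry at assembly time looks consistent.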

CSLIP: code copyright 1989 Regents of the University of California
PPP generic driver version 2.4.2
ip_tables: (C) 2000-2002 Netfilter core team
ip_conntrack version 2.1 (4095 buckets, 32760 max) - 292 bytes per conntrack
attempt to access beyond end of device
09:01: rw=0, want=1295835624, limit=490223104
attempt to access beyond end of device
09:01: rw=0, want=1295835624, limit=490223104
attempt to access beyond end of device
ulsata2:[info] scsi abort success
ulsata2:[warning] scsi eh reset disk5 OK
ulsata2:[warning] submit channel 2 busy
ulsata2:[warning] scsi eh reset channel3 <3>ulsata2:[error] disk5 command abort at LBA 0x17390a3f cmd=0xc8 status=0x0 error=0x0
OK
attempt to access beyond end of device
09:01: rw=0, want=696384044, limit=490223104
ulsata2:[warning] submit channel 2 busy
scsi: device set offline - not ready or command retry failed after host reset: host 0 channel 0 id 4 lun 0
SCSI disk error : host 0 channel 0 id 4 lun 0 return code = 50000
I/O error: dev 08:21, sector 389614080
I/O error: dev 08:21, sector 389614088
journal-601, buffer write failed
kernel BUG at prints.c:334!
invalid operand: 0000
CPU: 0
EIP: 0010:[<c01ae8c8>] Not tainted
EFLAGS: 00010282
eax: 00000024 ebx: de870a00 ecx: df33a000 edx: 00000001
esi: 00000000 edi: 00000004 ebp: de870a00 esp: c15b7ed4
ds: 0018 es: 0018 ss: 0018
Process kupdated (pid: 6, stackpage=c15b7000)
Stack: c02bfbd5 c036cf40 c02d1cc0 c15b7ef4 e0e1b8d4 c01b9265 de870a00 c02d1cc0
00000007 00000005 00000000 d80f3730 00000000 00000018 d0722000 00000000
c01bce4a de870a00 e0e1b8d4 00000001 e0e24374 00000004 00000002 00000000
Call Trace: [<c01b9265>] [<c01bce4a>] [<c01bc105>] [<c01a72e0>] [<c01aba78>]
[<c013b714>] [<c013ac7e>] [<c013af1f>] [<c0105000>] [<c0107416>] [<c013ae50>]

Code: 0f 0b 4e 01 99 3e 2c c0 85 db 68 40 cf 36 c0 74 17 66 8b 43
I/O error: dev 08:21, sector 318374040
I/O error: dev 08:21, sector 318374040
ulsata2:[info] Drive 5 Offline: Maxtor 6V250F0
ulsata2:[info] Channel 2 reset status = 0x0 <6>lease broken - owner pid = 1455
I/O error: dev 08:21, sector 170312
I/O error: dev 08:21, sector 170312
I/O error: dev 08:21, sector 170312
zam-7001: io error in reiserfs_find_entry
I/O error: dev 08:21, sector 170312
zam-7001: io error in reiserfs_find_entry

anonymous
()
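The log above names devices only by number (09:01, 08:21). A minimal sketch for turning them into device names follows. It assumes the kernel prints major:minor in hexadecimal, that major 9 is the md driver, and that major 8 is the first SCSI disk major with 16 minors per disk. The helper names `decode_dev` and `describe` are hypothetical.

```python
def decode_dev(dev):
    """Split a 'MM:mm' string into (major, minor), assuming hex digits."""
    major_hex, minor_hex = dev.split(":")
    return int(major_hex, 16), int(minor_hex, 16)


def describe(dev):
    """Map a device number to a conventional name (md and sd majors only)."""
    major, minor = decode_dev(dev)
    if major == 9:  # md software RAID: the minor is the array number
        return f"md{minor}"
    if major == 8:  # SCSI disks: 16 minors per disk, minor 0 is the whole disk
        disk, part = divmod(minor, 16)
        name = "sd" + chr(ord("a") + disk)
        return name if part == 0 else f"{name}{part}"
    return f"unknown ({major}:{minor})"


for dev in ("09:01", "08:21"):
    print(dev, "->", describe(dev))
```

If those assumptions hold, 09:01 is md1 and 08:21 is sdc1, which the earlier dmesg lists as one of the two members of md1.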
In reply to: comment by anonymous

scsi: device set offline - not ready or command retry failed after host reset: host 0 channel 0 id 4 lun 0

That's a lot of I/O errors. Either the controller or the hard drive is acting up. In short, replace the hardware.

anonymous
()
In reply to: comment by anonymous

> attempt to access beyond end of device

Try updating the kernel to the latest version; if that doesn't help, then IMHO it's the controller.

Deleted
()
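To see how far past the end of the device these requests fall, the "want"/"limit" pairs can be pulled out of the log. The sketch below uses two lines copied from the log posted earlier in the thread; the pattern and the `out_of_range` helper are illustrative, not part of any kernel tooling.

```python
import re

# Lines copied from the log posted earlier in the thread.
LOG = """\
attempt to access beyond end of device
09:01: rw=0, want=1295835624, limit=490223104
attempt to access beyond end of device
09:01: rw=0, want=696384044, limit=490223104
"""

PATTERN = re.compile(r"^(\S+): rw=(\d+), want=(\d+), limit=(\d+)$", re.M)


def out_of_range(log):
    """Return (device, want, limit, overshoot) for each request past the limit."""
    rows = []
    for dev, _rw, want, limit in PATTERN.findall(log):
        want, limit = int(want), int(limit)
        if want > limit:
            rows.append((dev, want, limit, want - limit))
    return rows


for dev, want, limit, over in out_of_range(LOG):
    print(f"{dev}: want={want} limit={limit} overshoot={over} ({want / limit:.2f}x)")
```

The limit in both lines equals the md_size that dmesg reports for md1 (490223104 blocks), so the comparison is against the full array size rather than a single member.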
In reply to: comment by Deleted

I had something similar: a RAID on a MegaRAID also kept dropping out.
What helped was append=megaraid_mbox cmd_per_lun=0
Check whether your controller has options like that too.

sova ★★
()
In reply to: comment by sova

By the way, we rebooted the server today and dmesg is clean as glass, not a single error.
The other arrays, both SATA and ATA, work without complaints.
The only difference is that for this particular array /etc/raidtab has
chunk-size = 128,
while the others (which work fine) have 32.

anonymous
()
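For readers unsure what the chunk-size difference changes, here is a simplified sketch of RAID0 striping. It assumes two equal-sized members, a single zone, and a chunk size given in KiB; `raid0_locate` is a hypothetical helper, not the md driver's actual code. It only illustrates the data layout and says nothing about whether the chunk size is related to the hang.

```python
def raid0_locate(offset_kib, chunk_kib, n_disks=2):
    """Map a logical offset (KiB) to (disk index, offset on that disk in KiB)."""
    chunk_no, within = divmod(offset_kib, chunk_kib)
    stripe, disk = divmod(chunk_no, n_disks)
    return disk, stripe * chunk_kib + within


for chunk in (32, 128):
    # Which disk serves each 32 KiB step of the first 256 KiB of the array.
    disks = [raid0_locate(off, chunk)[0] for off in range(0, 256, 32)]
    print(f"chunk-size {chunk}: disk per 32 KiB step -> {disks}")
```

With a 32 KiB chunk, consecutive 32 KiB pieces alternate between the two disks; with a 128 KiB chunk, four consecutive pieces land on the same disk before the array switches to the other one.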