LINUX.ORG.RU

smartctl log. Is the HDD dying?

 


Hi all) I figure there are plenty of hardware experts here who can tell me what's what. In short: the problems started with Windows. Sometimes it would not boot, and the recovery tool would start and "fix disk problems"; some files could not be read. It got to the point where the recovery tool could no longer fix the errors, so I zero-filled the disk with Victoria (waited three days :D ). Everything seems to work now, but I suspect not for long. I installed Windows and then Linux on top. Windows boots quickly, but Linux is quite slow, and the disk makes a loud scratching noise while it loads.

So, here is the log. What problems do you see in it? P.S. I know about the cooling, I'm going to fix it)

Model Family:     Seagate Samsung SpinPoint M8 (AF)
Device Model:     ST1000LM024 HN-M101MBB
Serial Number:    S2SMJ9AD518921
LU WWN Device Id: 5 0004cf 20a4c74dd
Firmware Version: 2AR20002
User Capacity:    1 000 204 886 016 bytes [1,00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 3.0, 3.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Aug 18 06:32:06 2016 MSK
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(12660) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 211) minutes.
SCT capabilities: 	       (0x003f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       40115
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   090   089   025    Pre-fail  Always       -       3056
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2589
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       13301
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       101
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2570
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       344
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   059   046   000    Old_age   Always       -       41 (Min/Max 9/54)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   099   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       2333
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       101
225 Load_Cycle_Count        0x0032   087   087   000    Old_age   Always       -       141179

SMART Error Log Version: 1
ATA Error Count: 1
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 4003 hours (166 days + 19 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 00 00 00 40  Error: 

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 9f 18 9f 18 f0 18 9f      10:54:18.685  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 00 00 00 00 00 00      00:00:04.086  NOP [Abort queued commands]
  61 00 00 a0 30 12 40 00      00:00:04.087  WRITE FPDMA QUEUED
  61 00 00 a0 2f 12 40 00      00:00:04.087  WRITE FPDMA QUEUED
  61 00 00 a0 2e 12 40 00      00:00:04.087  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     13299         1598437016
# 2  Short offline       Completed: read failure       90%     12972         20233448
# 3  Vendor (0x50)       Completed without error       00%      9415         -
# 4  Short offline       Completed: read failure       90%      9415         240811272
# 5  Vendor (0x50)       Completed without error       00%      7688         -
# 6  Short offline       Completed: read failure       90%      7688         177510760
# 7  Vendor (0x50)       Completed without error       00%      3344         -
# 8  Short offline       Completed without error       00%      3344         -
# 9  Vendor (0x50)       Completed without error       00%      2937         -
#10  Short offline       Completed without error       00%      2937         -
#11  Vendor (0x50)       Completed without error       00%      1907         -
#12  Short offline       Completed without error       00%      1907         -
#13  Vendor (0x50)       Completed without error       00%       602         -
#14  Short offline       Completed without error       00%       602         -
#15  Vendor (0x50)       Completed without error       00%       533         -
#16  Short offline       Completed without error       00%       533         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed_read_failure [90% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):

I zero-filled the disk with Victoria (waited three days :D )

That should have taken about 4 hours.
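
As a rough sanity check on that estimate, the sketch below divides the capacity reported by smartctl by an average sequential write speed. The 70 MB/s figure is an assumption for a 5400 rpm 2.5-inch drive, not something measured in this thread; the three-day duration is the one reported by the thread author.

```python
# Back-of-the-envelope timing for zero-filling the whole drive.
CAPACITY_BYTES = 1_000_204_886_016  # "User Capacity" from the smartctl output

# Assumption (not from the thread): average sequential write speed of a
# healthy 5400 rpm laptop drive, in bytes per second.
ASSUMED_HEALTHY_RATE = 70e6


def hours_to_write(capacity_bytes: int, bytes_per_second: float) -> float:
    """Time needed to write the full capacity at a constant rate, in hours."""
    return capacity_bytes / bytes_per_second / 3600


def implied_rate(capacity_bytes: int, hours: float) -> float:
    """Average rate (bytes/s) implied by writing the full capacity in `hours`."""
    return capacity_bytes / (hours * 3600)


expected_hours = hours_to_write(CAPACITY_BYTES, ASSUMED_HEALTHY_RATE)
observed_rate = implied_rate(CAPACITY_BYTES, 72)  # "three days" = 72 hours

print(f"expected at the assumed rate: {expected_hours:.1f} h")
print(f"rate implied by three days:   {observed_rate / 1e6:.2f} MB/s")
```

Under that assumption the full write takes roughly four hours, while three days corresponds to an average of only a few megabytes per second, which is consistent with the drive retrying heavily during the write.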

191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       344

And the disk has been jolted more than once while it was running.

197 Current_Pending_Sector  0x0032   100   099   000    Old_age   Always       -       1

Just one sector (actually a whole eight 512-byte ones) is asking to be overwritten with zeros, which is not bad. But if that procedure was carried out recently, then this is a bad sign.
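
The "eight 512-byte sectors" remark follows from the Sector Sizes line in the log (512 bytes logical, 4096 bytes physical). A minimal sketch of that arithmetic, using a hypothetical helper name and the most recent LBA_of_first_error from the self-test log:

```python
LOGICAL_SECTOR = 512    # bytes, from "Sector Sizes" in the smartctl output
PHYSICAL_SECTOR = 4096  # bytes, from the same line


def physical_sector_span(lba: int) -> tuple[int, int]:
    """Return the first and last logical LBA sharing a physical sector with `lba`.

    Hypothetical helper for illustration; assumes physical sectors are
    aligned to LBA 0.
    """
    per_physical = PHYSICAL_SECTOR // LOGICAL_SECTOR  # 8 logical sectors
    first = lba - lba % per_physical
    return first, first + per_physical - 1


print(physical_sector_span(1598437016))  # (1598437016, 1598437023)
```

One unreadable physical sector therefore affects eight consecutive logical LBAs, even though Current_Pending_Sector shows a raw value of 1.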

gag ★★★★★
()

Maybe try running a check using SMART itself? It will take a few hours if you choose the long test, but it may show something more intelligible.

Xenius ★★★★★
()
In reply to: comment by Xenius

The log already contains sector 1598437016; that is probably the pending one.
IMHO the SMART long test is of little use: it either completes without errors or reports the number of the first bad sector, LBA_of_first_error.
badblocks, on the other hand, can output the whole list of bad sectors at once.
I would run badblocks in read-only mode or with "-n"; there may be more errors. Then overwrite that sector, for example with hdparm. If pending drops to zero and no remap appears, that's good.
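
The final check described here (pending back to zero, no new remap) can be sketched as a comparison of attributes 197 and 5 before and after the rewrite. The BEFORE rows are copied from the log in this thread; the AFTER rows are hypothetical, invented only to exercise the function, and the function names are illustrative.

```python
def raw_values(smart_text: str) -> dict[int, int]:
    """Map attribute ID -> RAW_VALUE for rows of a smartctl attribute table."""
    values = {}
    for line in smart_text.splitlines():
        fields = line.split()
        # Row layout: ID NAME 0xFLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if (len(fields) >= 10 and fields[0].isdigit()
                and fields[2].startswith("0x") and fields[9].isdigit()):
            values[int(fields[0])] = int(fields[9])
    return values


def rewrite_verdict(before: dict[int, int], after: dict[int, int]) -> str:
    """Classify the outcome of overwriting a pending sector."""
    if after.get(197, 0) > 0:
        return "still pending"
    if after.get(5, 0) > before.get(5, 0):
        return "remapped"
    return "cleared without remap"


BEFORE = """\
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   100   099   000    Old_age   Always       -       1
"""

# Hypothetical values after a successful rewrite (not real output).
AFTER_HYPOTHETICAL = """\
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0032   100   099   000    Old_age   Always       -       0
"""

print(rewrite_verdict(raw_values(BEFORE), raw_values(AFTER_HYPOTHETICAL)))
```

With the hypothetical AFTER rows this reports the good outcome; feeding it two real `smartctl -A` captures would give the actual verdict.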

Vasily22
()
In reply to: comment by Vasily22

I would run badblocks in read-only mode or with "-n",

Why so gentle? You need -w. But first improve the cooling: if the drive heats up to 54 °C under load, bad sectors will pop up out of nowhere.

legolegs ★★★★★
()
In reply to: comment by Xenius

run a check using SMART itself

It refuses to run the self-test. It says Self_test_failed.

illusion_Iife
() thread author

badblocks will surely thrash your HDD to death. Better use whdd.

Khnazile ★★★★★
()
In reply to: comment by illusion_Iife

I've been saying it for ages: throw that Victoria in the trash.
hdparm can write sectors.

It refuses to run the self-test. It says Self_test_failed

What do the drive's logs say?

Vasily22
()
In reply to: comment by Vasily22
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      ( 121)	The previous self-test completed having
					the read element of the test failed.
Total time to complete Offline 
data collection: 		(12660) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 211) minutes.
SCT capabilities: 	       (0x003f)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   100   100   051    Pre-fail  Always       -       40128
  2 Throughput_Performance  0x0026   252   252   000    Old_age   Always       -       0
  3 Spin_Up_Time            0x0023   090   089   025    Pre-fail  Always       -       3058
  4 Start_Stop_Count        0x0032   098   098   000    Old_age   Always       -       2590
  5 Reallocated_Sector_Ct   0x0033   252   252   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   252   252   051    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0024   252   252   015    Old_age   Offline      -       0
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       13308
 10 Spin_Retry_Count        0x0032   252   252   051    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       101
 12 Power_Cycle_Count       0x0032   098   098   000    Old_age   Always       -       2571
191 G-Sense_Error_Rate      0x0022   100   100   000    Old_age   Always       -       344
192 Power-Off_Retract_Count 0x0022   252   252   000    Old_age   Always       -       0
194 Temperature_Celsius     0x0002   058   046   000    Old_age   Always       -       42 (Min/Max 9/54)
195 Hardware_ECC_Recovered  0x003a   100   100   000    Old_age   Always       -       0
196 Reallocated_Event_Count 0x0032   252   252   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   100   099   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0030   252   252   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0036   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x002a   100   100   000    Old_age   Always       -       2333
223 Load_Retry_Count        0x0032   100   100   000    Old_age   Always       -       101
225 Load_Cycle_Count        0x0032   087   087   000    Old_age   Always       -       141180

SMART Error Log Version: 1
ATA Error Count: 1
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 4003 hours (166 days + 19 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  40 51 00 00 00 00 40  Error: 

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  00 9f 18 9f 18 f0 18 9f      10:54:18.685  NOP [Reserved subcommand] [OBS-ACS-2]
  00 00 00 00 00 00 00 00      00:00:04.086  NOP [Abort queued commands]
  61 00 00 a0 30 12 40 00      00:00:04.087  WRITE FPDMA QUEUED
  61 00 00 a0 2f 12 40 00      00:00:04.087  WRITE FPDMA QUEUED
  61 00 00 a0 2e 12 40 00      00:00:04.087  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       90%     13299         1598437016
# 2  Short offline       Completed: read failure       90%     12972         20233448
# 3  Vendor (0x50)       Completed without error       00%      9415         -
# 4  Short offline       Completed: read failure       90%      9415         240811272
# 5  Vendor (0x50)       Completed without error       00%      7688         -
# 6  Short offline       Completed: read failure       90%      7688         177510760
# 7  Vendor (0x50)       Completed without error       00%      3344         -
# 8  Short offline       Completed without error       00%      3344         -
# 9  Vendor (0x50)       Completed without error       00%      2937         -
#10  Short offline       Completed without error       00%      2937         -
#11  Vendor (0x50)       Completed without error       00%      1907         -
#12  Short offline       Completed without error       00%      1907         -
#13  Vendor (0x50)       Completed without error       00%       602         -
#14  Short offline       Completed without error       00%       602         -
#15  Vendor (0x50)       Completed without error       00%       533         -
#16  Short offline       Completed without error       00%       533         -

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Completed_read_failure [90% left] (0-65535)
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
illusion_Iife
() thread author
In reply to: comment by illusion_Iife

As I understand it, at the very bottom: Completed_read_failure [90% left]
You still have one sector stuck in the pending state.
Once again: you need to find the bad sectors and overwrite them.
badblocks in read-only mode scans very quickly, leaves the data on the drive intact, and should report the sector. Overwriting: hdparm.
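
A dry-run sketch of that procedure: it pulls the failing LBAs out of the self-test log and only prints candidate commands, without executing anything. The device path `/dev/sdX` is a placeholder, the helper names are hypothetical, and the `-b 4096` block size for badblocks is an assumption chosen to match the physical sector size. Note that `hdparm --write-sector` destroys the contents of the sector it writes; whether a single 512-byte write is enough on this 4096-byte-physical drive is not established in the thread.

```python
import shlex

SELFTEST_LOG = """\
# 1  Short offline       Completed: read failure       90%     13299         1598437016
# 2  Short offline       Completed: read failure       90%     12972         20233448
# 3  Vendor (0x50)       Completed without error       00%      9415         -
"""


def failed_lbas(selftest_log: str) -> list[int]:
    """Collect LBA_of_first_error from 'read failure' rows, newest first."""
    lbas = []
    for line in selftest_log.splitlines():
        fields = line.split()
        if (line.lstrip().startswith("#") and "read failure" in line
                and fields[-1].isdigit()):
            lbas.append(int(fields[-1]))
    return lbas


def read_only_plan(selftest_log: str, device: str) -> list[list[str]]:
    """Non-destructive steps: surface scan, then re-read each logged LBA."""
    # badblocks is read-only by default; with -b 4096 it reports block
    # numbers in 4096-byte units, not 512-byte LBAs.
    plan = [["badblocks", "-sv", "-b", "4096", device]]
    for lba in failed_lbas(selftest_log):
        plan.append(["hdparm", "--read-sector", str(lba), device])
    return plan


def write_command(lba: int, device: str) -> list[str]:
    """DESTRUCTIVE: overwrite one 512-byte sector with zeros via hdparm."""
    return ["hdparm", "--yes-i-know-what-i-am-doing",
            "--write-sector", str(lba), device]


DEVICE = "/dev/sdX"  # placeholder, replace with the real device node
for argv in read_only_plan(SELFTEST_LOG, DEVICE):
    print(shlex.join(argv))
# Only for a sector that the read step reports as failing:
print(shlex.join(write_command(failed_lbas(SELFTEST_LOG)[0], DEVICE)))
print(shlex.join(["smartctl", "-A", DEVICE]))  # re-check attributes 5 and 197
```

Printing instead of executing keeps the sketch safe to run anywhere; the commands themselves need root and should be reviewed before being pasted into a shell.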

Vasily22
()
In reply to: comment by Vasily22

Thanks. In the next few days I'll replace the thermal paste, clean the heatsink, and so on. (Linux no longer wants to start.) Then I'll get to this. I'm new to Linux and to the programs written for it, so I don't really know my way around. Could you explain in more detail what to do and how?)

illusion_Iife
() thread author

I'm no expert, but I think buying a new disk would be the prudent move.

Deathstalker ★★★★★
()