LINUX.ORG.RU

Strange disk error

 


I haven't run into this before. Should I replace the drive right away, or keep an eye on it for a while?

dmesg

[357596.477406] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[357596.477427] ata1.00: failed command: FLUSH CACHE EXT
[357596.477439] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 6
                         res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[357596.477466] ata1.00: status: { DRDY }
[357596.477476] ata1: hard resetting link
[357597.997318] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[357597.999378] ata1.00: configured for UDMA/133
[357597.999380] ata1.00: retrying FLUSH 0xea Emask 0x4
[357597.999484] ata1: EH complete
[379425.886368] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[379425.886388] ata1.00: failed command: FLUSH CACHE EXT
[379425.886400] ata1.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 7
                         res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
[379425.886426] ata1.00: status: { DRDY }
[379425.886435] ata1: hard resetting link
[379431.262527] ata1: link is slow to respond, please be patient (ready=0)
[379434.826387] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[379434.829836] ata1.00: configured for UDMA/133
[379434.829838] ata1.00: retrying FLUSH 0xea Emask 0x4
[379434.830004] ata1: EH complete
[392386.510449] ata1.00: exception Emask 0x0 SAct 0x80070000 SErr 0x0 action 0x0
[392386.510469] ata1.00: irq_stat 0x40000008
[392386.510479] ata1.00: failed command: READ FPDMA QUEUED
[392386.510491] ata1.00: cmd 60/00:80:00:00:00/01:00:00:00:00/40 tag 16 ncq dma 131072 in
                         res 51/10:f8:08:00:00/00:00:00:00:00/40 Emask 0x481 (invalid argument) <F>
[392386.510521] ata1.00: status: { DRDY ERR }
[392386.510530] ata1.00: error: { IDNF }
[392386.512876] ata1.00: configured for UDMA/133
[392386.512884] ata1: EH complete
[392403.066425] ata1.00: exception Emask 0x0 SAct 0x1c00000 SErr 0x0 action 0x0
[392403.066445] ata1.00: irq_stat 0x40000008
[392403.066454] ata1.00: failed command: READ FPDMA QUEUED
[392403.066466] ata1.00: cmd 60/00:c0:00:00:00/01:00:00:00:00/40 tag 24 ncq dma 131072 in
                         res 51/10:00:00:00:00/00:01:00:00:00/40 Emask 0x481 (invalid argument) <F>
[392403.066496] ata1.00: status: { DRDY ERR }
[392403.066505] ata1.00: error: { IDNF }
[392408.245864] ata1.00: qc timeout (cmd 0xec)
[392408.245869] ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
[392408.245870] ata1.00: revalidation failed (errno=-5)
[392408.245888] ata1: hard resetting link
[392413.577927] ata1: link is slow to respond, please be patient (ready=0)
[392418.302017] ata1: COMRESET failed (errno=-16)
[392418.302032] ata1: hard resetting link
[392423.657715] ata1: link is slow to respond, please be patient (ready=0)
[392428.345667] ata1: COMRESET failed (errno=-16)
[392428.345686] ata1: hard resetting link
[392433.705727] ata1: link is slow to respond, please be patient (ready=0)
[392448.725551] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[392448.727648] ata1.00: configured for UDMA/133
[392448.727656] ata1: EH complete
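
For anyone who chooses to "keep an eye on it", here is a minimal sketch of how failed ATA commands could be tallied from saved `dmesg` text. The function name `count_failed_commands` and the regular expression are my own assumptions, based only on the line format shown in the log above; other kernel versions may print it differently.

```python
import re
from collections import Counter

# Matches lines such as:
# "[357596.477427] ata1.00: failed command: FLUSH CACHE EXT"
# The pattern is an assumption derived from the log format in this thread.
FAILED_RE = re.compile(r"\] (ata\d+(?:\.\d+)?): failed command: (.+)$")


def count_failed_commands(dmesg_text):
    """Count 'failed command' entries per (port, command) pair."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2).strip())] += 1
    return counts
```

Feeding it the output of `dmesg` from time to time and comparing the counts gives a rough idea of whether the errors are becoming more frequent.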

smart

smartctl 7.1 2019-12-30 r5022 [x86_64-linux-5.4.44-2-pve] (local build)
Copyright (C) 2002-19, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Hitachi/HGST Travelstar Z5K500
Device Model:     HGST HTS545050A7E380
Serial Number:    TMA55BZJ35BNAP
LU WWN Device Id: 5 000cca 71fecb81f
Firmware Version: GG2ZBF40
User Capacity:    500 107 862 016 bytes [500 GB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ATA8-ACS T13/1699-D revision 6
SATA Version is:  SATA 2.6, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Sat Jul 18 13:00:26 2020 +04
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00)	Offline data collection activity
					was never started.
					Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(   45) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 ( 112) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   062    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0005   100   100   040    Pre-fail  Offline      -       0
  3 Spin_Up_Time            0x0007   214   214   033    Pre-fail  Always       -       1
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       462
  5 Reallocated_Sector_Ct   0x0033   100   100   005    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000b   091   091   067    Pre-fail  Always       -       9
  8 Seek_Time_Performance   0x0005   100   100   040    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0012   001   001   000    Old_age   Always       -       50352
 10 Spin_Retry_Count        0x0013   100   100   060    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       356
191 G-Sense_Error_Rate      0x000a   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       65579
193 Load_Cycle_Count        0x0012   098   098   000    Old_age   Always       -       27668
194 Temperature_Celsius     0x0002   150   150   000    Old_age   Always       -       40 (Min/Max 11/53)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   200   200   000    Old_age   Always       -       0
223 Load_Retry_Count        0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
ATA Error Count: 2
	CR = Command Register [HEX]
	FR = Features Register [HEX]
	SC = Sector Count Register [HEX]
	SN = Sector Number Register [HEX]
	CL = Cylinder Low Register [HEX]
	CH = Cylinder High Register [HEX]
	DH = Device/Head Register [HEX]
	DC = Device Command Register [HEX]
	ER = Error register [HEX]
	ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 2 occurred at disk power-on lifetime: 50342 hours (2097 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 00 00 00 00 00  Error: IDNF at LBA = 0x00000000 = 0

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 18 20 08 78 14 40 00   8d+00:49:36.934  WRITE FPDMA QUEUED
  60 00 c0 00 00 00 40 00   8d+00:49:36.934  READ FPDMA QUEUED
  60 00 b8 00 08 00 40 00   8d+00:49:36.934  READ FPDMA QUEUED
  60 00 b0 00 08 10 40 00   8d+00:49:36.934  READ FPDMA QUEUED
  61 10 a8 c0 09 1f 40 00   8d+00:49:36.934  WRITE FPDMA QUEUED

Error 1 occurred at disk power-on lifetime: 50342 hours (2097 days + 14 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  10 51 f8 08 00 00 00  Error: IDNF at LBA = 0x00000008 = 8

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  61 10 f8 c0 09 1f 40 00   8d+00:49:25.530  WRITE FPDMA QUEUED
  60 00 90 00 08 10 40 00   8d+00:49:24.545  READ FPDMA QUEUED
  60 00 88 00 08 00 40 00   8d+00:49:24.545  READ FPDMA QUEUED
  60 00 80 00 00 00 40 00   8d+00:49:24.545  READ FPDMA QUEUED
  61 08 18 80 50 94 40 00   8d+00:49:23.340  WRITE FPDMA QUEUED

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Vendor (0x50)       Completed without error       00%         1         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


Last edited by: Shaltay (total edits: 1)

Seconded. I get this at system startup:

ata1: softreset failed (device not ready)

Sometimes it's ata1, sometimes ata3.

Google suggested ignoring the error, but I'm still itching to fix it.

Artamudo ★★★★
()

Well, it's not the platters. It's either a bad cable, bad power, or some problem with the contacts or the electronics inside the drive. 50352 power-on hours is a very respectable run, especially for a 2.5" laptop drive. It has earned its retirement.
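
A rough sketch of the check behind "it's not the platters": pull the raw values out of the `smartctl -A` attribute table and look at the counters usually associated with surface damage (5, 196, 197, 198) and at the link CRC counter (199). The helper names and the choice of IDs are my assumptions, and these counters are only a hint, not a diagnosis.

```python
# Attribute IDs commonly read as signs of media damage (assumed rule of thumb).
MEDIA_IDS = (5, 196, 197, 198)
# UDMA_CRC_Error_Count, usually associated with cable or link problems.
CRC_ID = 199


def smart_raw_values(smartctl_text):
    """Map attribute ID -> (name, raw value) from a `smartctl -A` style table."""
    values = {}
    for line in smartctl_text.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns;
        # the 10th column is the start of RAW_VALUE.
        if len(fields) >= 10 and fields[0].isdigit():
            try:
                values[int(fields[0])] = (fields[1], int(fields[9]))
            except ValueError:
                continue  # not an attribute row (e.g. an error-log line)
    return values


def summarize(values):
    """Sum the media-related raw counters and report the CRC counter."""
    media = sum(values[i][1] for i in MEDIA_IDS if i in values)
    crc = values[CRC_ID][1] if CRC_ID in values else 0
    return {"media_errors": media, "crc_errors": crc}
```

On the table posted above both numbers come out as zero, which fits with the surface looking healthy while the link keeps resetting.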

legolegs ★★★★★
()

Check the contacts.

On a desktop of mine, bulging capacitors on the motherboard came with similar symptoms (errors in dmesg, sudden slowdowns, freezes). Replacing the capacitors solved the problem.

http://eliasoenal.com/2012/10/31/power-supply-failures-can-be-pretty-annoying-to-find/

greenman ★★★★★
()
Last edited by: greenman (total edits: 1)
In reply to: comment by SergeySVold

And what is that supposed to show? It's 32 on all my SATA devices too.

legolegs ★★★★★
()
In reply to: comment by Shaltay

That value is the NCQ (Native Command Queuing) queue depth, and disabling NCQ often helps get rid of this kind of error:

echo 1 > /sys/block/sda/device/queue_depth

Any value other than 1 means NCQ is enabled and in use.
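
To see the value for every disk at once, something along these lines should do. It is only a sketch: it assumes the usual Linux sysfs layout (`/sys/block/<disk>/device/queue_depth`), the function name `ncq_status` is made up, and devices that do not expose the file are skipped.

```python
from pathlib import Path


def ncq_status(sys_block="/sys/block"):
    """Return {device: (queue_depth, ncq_in_use)} for disks exposing queue_depth."""
    result = {}
    for depth_file in sorted(Path(sys_block).glob("*/device/queue_depth")):
        device = depth_file.parent.parent.name  # e.g. "sda"
        try:
            depth = int(depth_file.read_text().strip())
        except (OSError, ValueError):
            continue  # unreadable or unexpected content: skip this device
        # Following the rule stated above: a depth of 1 means no queuing.
        result[device] = (depth, depth > 1)
    return result
```

Calling `ncq_status()` with no arguments reads the live system. The path is a parameter only so that the function can be tried against a copy of the directory tree.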

SergeySVold ★★★★★
()
In reply to: comment by SergeySVold

That value is the NCQ (Native Command Queuing) queue depth, and disabling NCQ often helps get rid of this kind of error

No, not often, and these days almost never.

And yes, you have most likely confused NCQ with Queued TRIM.

anonymous
()