I am playing with an empty HDD that smartd considers to be failing.
smartctl -A /dev/sdX
smartctl 7.2 2021-01-17 r5170 [x86_64-linux-5.10.10-200.fc33.x86_64] (local build)
Copyright (C) 2002-20, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x002f 192 192 051 Pre-fail Always - 56286
3 Spin_Up_Time 0x0027 194 165 021 Pre-fail Always - 5291
4 Start_Stop_Count 0x0032 097 097 000 Old_age Always - 3452
5 Reallocated_Sector_Ct 0x0033 133 133 140 Pre-fail Always FAILING_NOW 1265
7 Seek_Error_Rate 0x002e 200 200 000 Old_age Always - 0
9 Power_On_Hours 0x0032 084 084 000 Old_age Always - 12286
10 Spin_Retry_Count 0x0032 100 100 000 Old_age Always - 0
11 Calibration_Retry_Count 0x0032 100 100 000 Old_age Always - 0
12 Power_Cycle_Count 0x0032 097 097 000 Old_age Always - 3226
192 Power-Off_Retract_Count 0x0032 196 196 000 Old_age Always - 3179
193 Load_Cycle_Count 0x0032 135 135 000 Old_age Always - 196190
194 Temperature_Celsius 0x0022 118 102 000 Old_age Always - 32
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age Always - 943
197 Current_Pending_Sector 0x0032 001 001 000 Old_age Always - 64793
198 Offline_Uncorrectable 0x0030 200 200 000 Old_age Offline - 38
199 UDMA_CRC_Error_Count 0x0032 200 200 000 Old_age Always - 1
200 Multi_Zone_Error_Rate 0x0008 197 197 000 Old_age Offline - 1032
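To make the table above easier to read at a glance, here is a minimal Python sketch that flags attributes whose normalized VALUE is at or below THRESH, which is the condition smartctl reports as a failed attribute. The sample rows are copied from the output above; the function name `failing_attributes` is made up for illustration.

```python
# Sketch: flag SMART attributes whose normalized VALUE is at or below THRESH.
# The rows below are copied from the smartctl -A output in this post.

SAMPLE = """\
  1 Raw_Read_Error_Rate 0x002f 192 192 051 Pre-fail Always - 56286
  5 Reallocated_Sector_Ct 0x0033 133 133 140 Pre-fail Always FAILING_NOW 1265
196 Reallocated_Event_Count 0x0032 001 001 000 Old_age Always - 943
197 Current_Pending_Sector 0x0032 001 001 000 Old_age Always - 64793
"""


def failing_attributes(text):
    """Return (id, name, value, thresh, raw) for rows with VALUE <= THRESH."""
    failing = []
    for line in text.splitlines():
        fields = line.split()
        # A data row has at least 10 columns and starts with a numeric ID.
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id, name = int(fields[0]), fields[1]
        value, thresh = int(fields[3]), int(fields[5])
        raw = fields[9]
        # A threshold of 000 is skipped here because such attributes are
        # informational and cannot trip a failure.
        if thresh > 0 and value <= thresh:
            failing.append((attr_id, name, value, thresh, raw))
    return failing


if __name__ == "__main__":
    for row in failing_attributes(SAMPLE):
        print(row)
```

On this drive only attribute 5 (Reallocated_Sector_Ct) crosses its threshold, although the raw counts for attributes 196 and 197 are also very high.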
Game Target: use all the good sectors of the failing HDD to form a "perfect" virtual block device.
My original game plan:
A/. Identify all bad blocks on the device, using badblocks
$ sudo badblocks -wsv -t 0xff -o bb-sdX.txt /dev/sdX
B/. Use bb-sdX.txt to produce an init file that dmsetup can use, as per "btrfs - Can LVM mark / avoid bad blocks?" (Unix & Linux Stack Exchange)
$ sudo dmsetup create nobbsdX bb-sdX-table
C/. Make a filesystem, and use f3 (GitHub - AltraMayor/f3: F3 - Fight Flash Fraud) to test write/read correctness
$ sudo f3write /mnt/.nobbsdX
$ sudo f3read /mnt/.nobbsdX
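For step B, this is a rough Python sketch of how the badblocks output could be turned into a device-mapper table made of linear targets that skip the bad ranges. It assumes badblocks was run with its default 1024-byte block size (no -b option, as in step A) and that the device size in 512-byte sectors is known, for example from blockdev --getsz. The function name `build_dm_table` and the `pad_sectors` safety margin are my own inventions, not part of any tool.

```python
# Sketch: build dmsetup "linear" table lines that cover only the good ranges.
# Each line has the form:
#   <logical_start> <length> linear <device> <physical_start>
# with all numbers in 512-byte sectors.

def build_dm_table(bad_blocks, total_sectors, device,
                   block_size=1024, pad_sectors=0):
    """Return table lines mapping the good sector ranges of `device`.

    bad_blocks    : block numbers from `badblocks -o` (units of block_size bytes)
    total_sectors : device size in 512-byte sectors
    pad_sectors   : hypothetical safety margin skipped on each side of a bad block
    """
    per_block = block_size // 512
    # Convert each bad block into a [start, end) sector interval.
    intervals = sorted(
        (max(0, b * per_block - pad_sectors),
         min(total_sectors, (b + 1) * per_block + pad_sectors))
        for b in set(bad_blocks)
    )
    lines = []
    logical = 0  # next free sector in the virtual device
    cursor = 0   # next unexamined sector on the physical device
    for start, end in intervals:
        if start > cursor:
            length = start - cursor
            lines.append(f"{logical} {length} linear {device} {cursor}")
            logical += length
        cursor = max(cursor, end)
    if cursor < total_sectors:
        lines.append(f"{logical} {total_sectors - cursor} linear {device} {cursor}")
    return lines


if __name__ == "__main__":
    # Toy example: a 40-sector device with bad 1 KiB blocks 2, 3 and 10.
    for line in build_dm_table([2, 3, 10], 40, "/dev/sdX"):
        print(line)
```

The printed lines could be saved as bb-sdX-table and given to dmsetup create as the table file. On a drive with tens of thousands of pending sectors the table may become very long, so a non-zero `pad_sectors` that merges neighbouring bad areas might be worth trying.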
Problems:
- badblocks only completed 80% because the computer was "rebooted", but ZERO bad blocks had been reported up to that point
- I proceeded to use f3 for the read/write test and got lots of errors, and some of them are certainly inside the 80% of the HDD that badblocks had already checked
- because I am using a btrfs filesystem, I also ran btrfs scrub /dev/sdX and got lots of errors as expected; again, many of them are inside the 80% area checked by badblocks
Some of the errors logged during the btrfs scrub:
Jan 29 14:23:49 amdf.lan kernel: BTRFS warning (device sdc1): i/o error at logical 181908836352 on dev /dev/sdc1, physical 181908836352, root 5, inode 425, offset 316645376, length 4096, links 1 (path: 169.h2w)
Jan 29 14:23:49 amdf.lan kernel: BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 66, rd 368, flush 0, corrupt 0, gen 0
Jan 29 14:23:49 amdf.lan kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 181908836352 on dev /dev/sdc1
Jan 29 14:23:52 amdf.lan kernel: ata3.00: exception Emask 0x0 SAct 0x4000 SErr 0x0 action 0x0
Jan 29 14:23:52 amdf.lan kernel: ata3.00: irq_stat 0x40000008
Jan 29 14:23:52 amdf.lan kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 29 14:23:52 amdf.lan kernel: ata3.00: cmd 60/08:70:50:56:2d/00:00:15:00:00/40 tag 14 ncq dma 4096 in
res 41/40:00:50:56:2d/00:00:15:00:00/40 Emask 0x409 (media error)
Jan 29 14:23:52 amdf.lan kernel: ata3.00: status: { DRDY ERR }
Jan 29 14:23:52 amdf.lan kernel: ata3.00: error: { UNC }
Jan 29 14:23:52 amdf.lan kernel: ata3.00: configured for UDMA/133
Jan 29 14:23:52 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#14 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
Jan 29 14:23:52 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#14 Sense Key : Medium Error [current]
Jan 29 14:23:52 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#14 Add. Sense: Unrecovered read error - auto reallocate failed
Jan 29 14:23:52 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#14 CDB: Read(10) 28 00 15 2d 56 50 00 00 08 00
Jan 29 14:23:52 amdf.lan kernel: blk_update_request: I/O error, dev sdc, sector 355292752 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
Jan 29 14:23:52 amdf.lan kernel: ata3: EH complete
Jan 29 14:23:52 amdf.lan kernel: BTRFS warning (device sdc1): i/o error at logical 181908840448 on dev /dev/sdc1, physical 181908840448, root 5, inode 425, offset 316649472, length 4096, links 1 (path: 169.h2w)
Jan 29 14:23:52 amdf.lan kernel: BTRFS error (device sdc1): bdev /dev/sdc1 errs: wr 66, rd 369, flush 0, corrupt 0, gen 0
Jan 29 14:23:52 amdf.lan kernel: BTRFS error (device sdc1): unable to fixup (regular) error at logical 181908840448 on dev /dev/sdc1
Jan 29 14:25:25 amdf.lan NetworkManager[874]: [1611901525.7612] dhcp6 (br24): option dhcp6_name_servers => 'fda0:7018:7174:224:7e8b:8d1c:7b91:10c1'
Jan 29 14:25:25 amdf.lan NetworkManager[874]: [1611901525.7613] dhcp6 (br24): option ip6_address => 'fda0:7018:7174:224:7e8b:8d1c:0:216 2404:c804:927:8c00:9b3:a455:0:216'
Jan 29 14:25:25 amdf.lan NetworkManager[874]: [1611901525.7614] dhcp6 (br24): state changed bound -> bound
Jan 29 14:25:25 amdf.lan systemd[1]: Starting Network Manager Script Dispatcher Service...
Jan 29 14:25:25 amdf.lan audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? add>
Jan 29 14:25:25 amdf.lan systemd[1]: Started Network Manager Script Dispatcher Service.
Jan 29 14:25:35 amdf.lan systemd[1]: NetworkManager-dispatcher.service: Succeeded.
Jan 29 14:25:35 amdf.lan audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=NetworkManager-dispatcher comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr>
Jan 29 14:25:35 amdf.lan kernel: ata3.00: exception Emask 0x0 SAct 0xc0001fff SErr 0x0 action 0x0
Jan 29 14:25:35 amdf.lan kernel: ata3.00: irq_stat 0x40000008
Jan 29 14:25:35 amdf.lan kernel: ata3.00: failed command: READ FPDMA QUEUED
Jan 29 14:25:35 amdf.lan kernel: ata3.00: cmd 60/00:f0:00:51:1f/05:00:16:00:00/40 tag 30 ncq dma 655360 in
res 41/40:00:60:54:1f/00:00:16:00:00/40 Emask 0x409 (media error)
Jan 29 14:25:35 amdf.lan kernel: ata3.00: status: { DRDY ERR }
Jan 29 14:25:35 amdf.lan kernel: ata3.00: error: { UNC }
Jan 29 14:25:35 amdf.lan kernel: ata3.00: configured for UDMA/133
Jan 29 14:25:35 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#30 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE cmd_age=3s
Jan 29 14:25:35 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#30 Sense Key : Medium Error [current]
Jan 29 14:25:35 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#30 Add. Sense: Unrecovered read error - auto reallocate failed
Jan 29 14:25:35 amdf.lan kernel: sd 2:0:0:0: [sdc] tag#30 CDB: Read(10) 28 00 16 1f 51 00 00 05 00 00
Jan 29 14:25:35 amdf.lan kernel: blk_update_request: I/O error, dev sdc, sector 371151968 op 0x0:(READ) flags 0x0 phys_seg 52 prio class 0
Jan 29 14:25:35 amdf.lan kernel: ata3: EH complete
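The kernel messages above already name failing sectors, so the log itself can serve as a bad-sector list. Below is a small Python sketch that pulls the sector numbers and the btrfs "physical" offsets out of such lines and decodes the LBA from a Read(10) CDB. The three sample lines are shortened copies from the log above; the function names are hypothetical. Pairing the I/O error at sector 355292752 with the btrfs warning logged in the same second is my assumption, as is a 512-byte logical sector size.

```python
import re

# Shortened copies of three lines from the journal excerpt in this post.
LOG = """\
kernel: sd 2:0:0:0: [sdc] tag#14 CDB: Read(10) 28 00 15 2d 56 50 00 00 08 00
kernel: blk_update_request: I/O error, dev sdc, sector 355292752 op 0x0:(READ) flags 0x800 phys_seg 1 prio class 0
kernel: BTRFS warning (device sdc1): i/o error at logical 181908840448 on dev /dev/sdc1, physical 181908840448, root 5, inode 425, offset 316649472, length 4096, links 1 (path: 169.h2w)
"""


def io_error_sectors(text):
    """Sector numbers (512-byte units, whole-disk) from block-layer I/O errors."""
    return [int(s) for s in re.findall(r"I/O error, dev \w+, sector (\d+)", text)]


def btrfs_physical_offsets(text):
    """Byte offsets reported as 'physical' in BTRFS i/o error warnings."""
    return [int(p) for p in re.findall(r"physical (\d+),", text)]


def read10_lba(cdb_hex):
    """Bytes 2..5 of a READ(10) CDB hold the starting LBA, big-endian."""
    cdb = bytes.fromhex(cdb_hex)
    return int.from_bytes(cdb[2:6], "big")


if __name__ == "__main__":
    sector = io_error_sectors(LOG)[0]
    physical = btrfs_physical_offsets(LOG)[0]
    print("failing sector from log:", sector)
    print("LBA decoded from CDB:   ", read10_lba("28 00 15 2d 56 50 00 00 08 00"))
    # If the two messages describe the same 4 KiB read, the difference is
    # the start of partition sdc1 in sectors.
    print("sector - physical/512:  ", sector - physical // 512)
```

Under those assumptions the difference comes out as 2048 sectors, which would be consistent with sdc1 starting at sector 2048 and with "physical" being a byte offset inside the partition rather than a sector number.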
Question 1:
Why can't badblocks identify any bad blocks? Are there other tools that can be used to identify bad blocks?
Question 2:
BTRFS warning (device sdc1): i/o error at logical 499311915008 on dev /dev/sdc1, physical 499311915008, root 5, inode 2121, offset 1000136704, length 4096, links 1 (path: 1865.h2w)
I want to try testing with the following, as per "Identify damaged files" (ArchWiki):
$ sudo hdparm --read-sector 4621327 /dev/sdX
$ sudo hdparm --repair-sector 4621327 --yes-i-know-what-i-am-doing /dev/sdX
Output of --read-sector 499311915008:
$ sudo hdparm --read-sector 499311915008 /dev/sdb
/dev/sdb:
reading sector 499311915008: SG_IO: bad/missing sense data, sb: 70 00 05 00 00 00 00 0a 10 51 e0 01 21 00 00 00 a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
succeeded
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000
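The sense buffer printed in the hdparm output above (the bytes after "sb:") can be decoded by hand. This is a minimal Python sketch for fixed-format SCSI sense data (response code 0x70), where byte 2 carries the sense key and bytes 12 and 13 carry ASC and ASCQ. The lookup tables hold only the few codes that appear in this post, and the function names are made up for illustration.

```python
# Sketch: decode sense key / ASC / ASCQ from fixed-format SCSI sense data.

# Sense bytes copied from the hdparm output in this post.
SB = ("70 00 05 00 00 00 00 0a 10 51 e0 01 21 00 00 00 "
      "a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00")

# Deliberately tiny tables: only codes seen in this post.
SENSE_KEYS = {0x3: "MEDIUM ERROR", 0x5: "ILLEGAL REQUEST"}
ASC_ASCQ = {
    (0x11, 0x04): "UNRECOVERED READ ERROR - AUTO REALLOCATE FAILED",
    (0x21, 0x00): "LOGICAL BLOCK ADDRESS OUT OF RANGE",
}


def decode_fixed_sense(sb_hex):
    """Return (sense_key, asc, ascq) from fixed-format sense bytes."""
    sb = bytes.fromhex(sb_hex)
    if len(sb) < 14 or (sb[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return sb[2] & 0x0F, sb[12], sb[13]


def describe(sb_hex):
    """Return human-readable names, falling back to hex for unknown codes."""
    key, asc, ascq = decode_fixed_sense(sb_hex)
    return (SENSE_KEYS.get(key, f"key 0x{key:x}"),
            ASC_ASCQ.get((asc, ascq), f"asc/ascq 0x{asc:02x}/0x{ascq:02x}"))


if __name__ == "__main__":
    print(decode_fixed_sense(SB))
    print(describe(SB))
```

For this buffer the result is sense key 5 with ASC/ASCQ 0x21/0x00, which suggests the requested sector number was rejected as out of range, so the all-zero dump is probably not real sector content despite the "succeeded" message.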
Given that BTRFS warning, is 499311915008, labeled "physical", the correct value for --read-sector?
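One way to sanity-check this is to do the arithmetic. The sketch below treats "physical" as a byte offset within the partition and converts it to a whole-disk sector number. It assumes 512-byte logical sectors and a partition that starts at sector 2048; the 2048 is only inferred from the log above, where the I/O error at sector 355292752 is followed in the same second by the warning for physical 181908840448, and should be verified with fdisk -l or /sys/block/sdc/sdc1/start. The function name is hypothetical.

```python
# Sketch: convert a btrfs "physical" byte offset on a partition into a
# whole-disk sector number suitable for tools that address by LBA.

def btrfs_physical_to_lba(physical_bytes, partition_start_sector,
                          sector_size=512):
    """Return the whole-disk LBA for a byte offset inside a partition."""
    if physical_bytes % sector_size:
        raise ValueError("offset is not sector-aligned")
    return partition_start_sector + physical_bytes // sector_size


if __name__ == "__main__":
    # Cross-check against a pair taken from the kernel log in this post.
    print(btrfs_physical_to_lba(181908840448, 2048))  # log shows 355292752
    # Candidate sector for the warning in Question 2.
    print(btrfs_physical_to_lba(499311915008, 2048))
```

If those assumptions hold, the candidate for --read-sector would be 975220632 rather than 499311915008, and the length of 4096 in the warning would span 8 consecutive sectors starting there.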