# Filesystem problems every time there is a kernel update

On Fedora 25, the first reboot after a kernel update drops the system into emergency mode and I'm forced to run fsck. This has happened consistently over the past 5 or 6 kernel updates, and only after a kernel update. I've run several disk tests and they all came back fine, with no disk problems reported. Is there any way to prevent this? My SysInfo log

I see disk errors logged in your Sysinfo log

( 2017-06-15 23:01:19 +0000 )

Try opening GNOME Disks (installed by default). Select your disk, click the overflow menu button (the “hamburger button”) and choose SMART Data and Self-Test. Click Start Test and choose Extended.

This will help provide insights about the health of your disk. Is it old? Flash or spinning?

The HD is a spinning disk, 3 or 4 years old. The GNOME Disks assessment says the disk is OK, with 160 bad sectors. Every individual test attribute shows a good status. One thing I haven't mentioned is that the error happens on /dev/mapper/fedora-home.

( 2017-06-17 11:17:12 +0000 )
 [ 1177.395245] ata1.00: status: { DRDY ERR }
[ 1177.395247] ata1.00: error: { UNC }


This is an uncorrectable read error, almost certainly a bad sector. The only way to fix it is to overwrite the sector or sectors, which of course means some data loss. The problem may be intermittent, which would explain why it doesn't happen every time. The simplest but slowest option is to back up /home, do a clean install of the OS, and then restore /home. During normal writes the bad sector will get written to, and if that write fails the drive firmware will automatically remap it to a reserve sector.

The fast method is more complicated to explain. The gist is: find the address of the bad sector, which should be in dmesg, and find out whether this drive has 512-byte or 4096-byte physical sectors. Use (with your disk in place of the placeholder /dev/sdX)

blockdev --getpbsz /dev/sdX


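If it's 512 bytes, you can just write a single block of zeros using dd, where /dev/sdX is a placeholder for the whole-disk device and address is the sector number from dmesg

dd if=/dev/zero of=/dev/sdX seek=address count=1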


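If it's a drive with 4096-byte physical sectors, you have to convert the address to a 4096-byte sector address (dmesg normally reports 512-byte units, so divide by 8) and use

dd if=/dev/zero of=/dev/sdX seek=address bs=4096 count=1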
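The address conversion can be sketched with plain shell arithmetic. This is only an illustration: the function names `lba_to_4k` and `lba_offset_in_4k` are made up here, and it assumes the sector number taken from dmesg is in 512-byte units, which is what the kernel's block layer normally reports.

```shell
# Hypothetical helpers, not part of any tool.
# Assumption: the input is a sector number in 512-byte units.

# Index of the 4096-byte physical sector that contains the given 512-byte sector.
# One 4096-byte sector holds 8 sectors of 512 bytes; shell division truncates.
lba_to_4k() {
    echo $(( $1 / 8 ))
}

# Position of the 512-byte sector inside its 4096-byte sector (0 to 7).
lba_offset_in_4k() {
    echo $(( $1 % 8 ))
}
```

The value printed by `lba_to_4k` is what would go in `seek=` together with `bs=4096`. Keep in mind that zeroing one 4096-byte sector wipes all eight 512-byte sectors inside it, not just the one that failed.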


Then iterate with fsck -fv. If dmesg shows any more UNC errors with sector addresses, you'll need to nuke those sectors and rerun fsck. Chances are, if fsck doesn't complain after a sector is erased, what was erased was file data rather than file system metadata. You might still have other problems: if that sector was part of, say, a system binary that's needed, you could get a crash. But chances are that file was already unusable, because a UNC error surfaces as an I/O error that prevents the sector from being read at all.
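To collect the sector addresses between fsck runs, something like the following could be used. It is a rough sketch: `extract_bad_sectors` is a hypothetical helper name, and it assumes the kernel logs each failed read on a line containing "I/O error" followed by "sector" and a number. The exact wording varies between kernel versions, so check it against your own dmesg output first.

```shell
# Hypothetical helper: read kernel log text on stdin and print the unique
# sector numbers found on I/O error lines, in ascending order.
# Assumption: such lines contain "I/O error" and "sector <number>".
extract_bad_sectors() {
    grep 'I/O error' \
        | sed -n 's/.*sector \([0-9][0-9]*\).*/\1/p' \
        | sort -n -u
}
```

It would typically be run as `dmesg | extract_bad_sectors`. Each number it prints is a candidate address for the dd step above.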

So, I located the bad sectors and ran dd on the first one, but it took too long, so I haven't done the same for the others. After overwriting the first bad sector with zeros, everything went back to normal. In fact, even though I only ran dd on the first one, followed by fsck -fv, the count dropped from 160 to 120 bad sectors, and now, without any further repair, GNOME Disks shows 80 bad sectors, which means fewer than 100. If this were related to disk problems, shouldn't the number of bad sectors stay constant or increase?

( 2017-06-30 14:34:00 +0000 )
