Dealing with bitrot

What tools are there for detecting and, especially, fixing bitrot?

I found a couple that index and hash files, then compare the hashes on later scans to detect corruption.
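
For anyone curious what that index-and-compare approach boils down to, here is a minimal sketch in Python. It is not how any particular tool is implemented; the function names (sha256_of, build_manifest, find_suspects) are made up for illustration, and SHA-256 is just one reasonable choice of hash.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its SHA-256 digest."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_suspects(root, manifest):
    """Return files that are missing or whose content no longer matches."""
    root = Path(root)
    suspects = []
    for rel, old_digest in manifest.items():
        path = root / rel
        if not path.is_file() or sha256_of(path) != old_digest:
            suspects.append(rel)
    return suspects
```

Note that a mismatch only says the content changed. A deliberate edit looks the same as corruption here, so a real tool would presumably also look at modification times before raising an alarm. And this only detects; it cannot repair anything.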


Any others? Are there tools that can also repair bitrot?

An example of bitrot would be music files that were previously known to play fully but now suddenly stop before the end.

The worrying fact is that backups do not necessarily protect against bitrot, since a silently corrupted file can be copied over the good copy in the backup. So while I have a backup strategy in place, there are still ways of losing data.

EDIT: One more option is par2, which can repair files and is available in the Fedora repo as par2cmdline.

Filesystems like btrfs and ZFS (the latter is not in the official Fedora repos) can detect bitrot, either automatically when a file is accessed or manually (e.g. via the scrub command on btrfs). They can also correct the error automatically, but AFAIK they need a redundant copy of the data (RAID, or data duplicated on a single disk) to do so.

Yes, I’ve seen btrfs and ZFS mentioned, but that would involve fundamentally changing the system installation. Plus I’m not too familiar with the pros and cons of those filesystems. It seems like quite a diversion, unless protecting against bitrot is a basic requirement of the whole setup (which maybe it should be).

Tools that can deal with bitrot on any system are more what I had in mind. Looks like this isn’t such a simple problem, and the solutions involve data duplication one way or another.

Are there tools that can fix bitrot without prior creation of recovery files?

I haven’t really looked into this much, but fundamentally*, without some sort of backup, how would any system know what the “correct” bit values would be? You don’t necessarily need a full backup (see for example RAID5), but you need some sort of secondary information about what the file looked like before it broke.
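
To make the "secondary information" point concrete, here is a toy sketch of XOR parity, the idea behind RAID5. It is only an illustration with made-up block contents, not how any real RAID implementation lays out data: one parity block, the same size as a data block, is enough to rebuild any single missing block.

```python
def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


# Three data blocks plus one parity block (toy values).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Pretend block 1 went bad: XOR the surviving blocks with the parity
# block to get the original contents back.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

The catch is that parity alone does not tell you which block is bad. You still need something like a per-block checksum to find the damaged one, and only then can the parity rebuild it.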

Some things you can guess at if you know what kind of file you’re dealing with, so there certainly are tools that can do this for specific files/filetypes (for example photorec, which recovers media files from broken partitions by looking for bit patterns that look like the headers of those files), but generally? Again, how would a program know what myprogramsoutput.wdg was supposed to look like pre-rot?
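
The header-matching idea can be sketched in a few lines of Python. To be clear, this is not photorec's actual code, just an illustration of the principle: scan raw bytes for the well-known "magic" byte sequences that certain file formats start with. The SIGNATURES table and find_signatures name are my own.

```python
# Leading "magic" bytes of a few common formats.
SIGNATURES = {
    "jpeg": b"\xff\xd8\xff",
    "png": b"\x89PNG\r\n\x1a\n",
    "flac": b"fLaC",
    "pdf": b"%PDF-",
}


def find_signatures(blob):
    """Return sorted (offset, type) pairs for every known header in blob."""
    hits = []
    for name, magic in SIGNATURES.items():
        start = blob.find(magic)
        while start != -1:
            hits.append((start, name))
            start = blob.find(magic, start + 1)
    return sorted(hits)


# A fake chunk of disk: junk, then a PNG header, then a FLAC header.
raw = b"junk" + b"\x89PNG\r\n\x1a\n" + b"xyz" + b"fLaC" + b"more"
print(find_signatures(raw))
```

Finding where a file starts is the easy part. Working out where it ends, and whether the bytes in between are intact, depends on the format, which is why this only works for file types the tool knows about.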

But again, my experience here is limited; I’ve only ever dealt with broken filesystems, not files.

*To quote a friend: Any answer that starts with “Fundamentally, …” is bad news.

Yes, it makes sense that some information is needed to recover lost information, otherwise it wouldn’t really be lost, would it? Perhaps another question is whether data could be corrupted/rearranged without actual loss of information. But I digress.

Usually, some other hints are available, such as file type. photorec looks interesting (available in Fedora repos from the testdisk package). It supports a long list of file types. I’ll have to give it a try the next time I run into a corrupted file.