It sounds good on paper - download the tiny .SFV file along with whatever you're downloading, and with one click you can discover whether the file is correct or not. An especially good idea on Usenet, you might think, given the unreliable transmission method and the interesting undocumented interactions between user agents. And it would be, too, if the implementation weren't so fundamentally flawed.
SFV has never been rigidly specified in the way that other protocols or file formats have. It can be roughly described as:
name of file, <space>, CRC32 of file (in hexadecimal), <line break>. Any line beginning with a semicolon is a comment.
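As a rough illustration of the format just described, here is a minimal Python sketch that builds one SFV line from a file's contents. The helper names (`crc32_of_chunks`, `sfv_line`) are made up for this example, and the uppercase, zero-padded hexadecimal form is an assumption, since nothing in the loose "specification" fixes the case or padding.

```python
import zlib

def crc32_of_chunks(chunks):
    """Compute the CRC32 of data supplied as an iterable of byte chunks."""
    crc = 0
    for chunk in chunks:
        # zlib.crc32 accepts a running value, so large files can be streamed.
        crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF

def sfv_line(name, chunks):
    """Return 'name<space>CRC32' with the CRC as eight hex digits (assumed uppercase)."""
    return f"{name} {crc32_of_chunks(chunks):08X}"

# The chunks can come from anywhere, e.g. repeated fh.read(65536) calls on a file.
print(sfv_line("check.bin", [b"1234", b"56789"]))
```

The input `123456789` is the usual CRC32 test vector; its checksum is CBF43926.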
The upshot of this loose specification is that very few SFV programs are fully compatible with one another. Some SFV programs generate (and can parse) files with DOS line breaks, others with UNIX ones. When a filename contains spaces, the correct behaviour is undefined; the most common approach seems to be to read the last eight characters on a line as the CRC, and the rest as the filename. To add information that should have been in the specification from day one (such as the file size and date) without breaking backward compatibility, many programs add proprietary extensions as comments. It goes without saying that each program's proprietary extensions are not fully compatible with those of any other program; it is common to see lines such as
; Bogus line to fool Win-SFV and its lame compatiblity.
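To make the parsing problem concrete, here is a sketch of a deliberately tolerant reader that follows the "last eight characters are the CRC" convention mentioned above. It is one possible reading of an unspecified format rather than a reference implementation; `parse_sfv` is a hypothetical name, and silently skipping malformed lines is a choice made for this example.

```python
import re

HEX8 = re.compile(r"[0-9A-Fa-f]{8}")

def parse_sfv(text):
    """Parse SFV text into {filename: crc32}, tolerating DOS and UNIX line breaks."""
    entries = {}
    for raw in text.splitlines():      # splits on \n, \r\n and \r alike
        line = raw.rstrip()
        if not line or line.startswith(";"):
            continue                   # blank line or comment (including "extensions")
        name, crc_hex = line[:-8].rstrip(), line[-8:]
        if not name or not HEX8.fullmatch(crc_hex):
            continue                   # malformed line: skipped in this sketch
        entries[name] = int(crc_hex, 16)
    return entries

sample = "; Generated by some tool\r\nmy holiday video.avi CBF43926\r\nreadme.txt 00000000\n"
print(parse_sfv(sample))
```

Because the filename is simply "everything before the last eight characters", names containing spaces survive. Any size or date information hidden in comment lines is ignored.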
The biggest flaw with SFV, however, is the hash used. CRC32 is not cryptographically secure, meaning that it is easy to work out not only the hash that corresponds to a file, but also the alterations necessary to make a file correspond to a given hash. While CRC32 is effective against random corruption, missing only about 1 in 2^32 errors, it is not in any way secure against deliberate tampering. Unimportant bytes in a file can be manipulated to produce a file with the same file size and CRC32, but different contents, in a trivial amount of time.
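The following Python sketch shows how little work such a forgery takes. It relies on the fact that, for messages of a fixed length, CRC32 is an affine function of the message bits over GF(2). The effect of each of the 32 bits in a 4-byte "unimportant" region can therefore be measured and the right bit pattern found by Gaussian elimination. The function name `forge_in_place` and the sample payloads are invented for illustration; this is one of several known ways to do it, not a description of any particular tool.

```python
import zlib

def forge_in_place(data, offset, target_crc):
    """Rewrite the 4 bytes at data[offset:offset+4] so that CRC32 == target_crc.

    The length of the data is unchanged. Requires offset + 4 <= len(data).
    """
    buf = bytearray(data)
    buf[offset:offset + 4] = b"\x00\x00\x00\x00"
    base = zlib.crc32(bytes(buf))

    # pivots maps "highest set bit of a CRC delta" -> (crc_delta, patch_bits),
    # where setting patch_bits in the 4-byte region changes the CRC by crc_delta.
    pivots = {}
    for bit in range(32):
        trial = bytearray(buf)
        trial[offset:offset + 4] = (1 << bit).to_bytes(4, "little")
        delta, mask = zlib.crc32(bytes(trial)) ^ base, 1 << bit
        while delta:
            top = delta.bit_length() - 1
            if top not in pivots:
                pivots[top] = (delta, mask)
                break
            delta ^= pivots[top][0]
            mask ^= pivots[top][1]

    # Express the wanted CRC change as an XOR of pivot rows. All 32 pivots
    # exist because CRC32 detects every burst error of at most 32 bits.
    want, patch = target_crc ^ base, 0
    while want:
        delta, mask = pivots[want.bit_length() - 1]
        want ^= delta
        patch ^= mask

    buf[offset:offset + 4] = patch.to_bytes(4, "little")
    return bytes(buf)

original = b"echo safe   # pad:XXXX\n"
evil = b"rm -rf ~/   # pad:XXXX\n"
forged = forge_in_place(evil, evil.index(b"XXXX"), zlib.crc32(original))
assert len(forged) == len(original)
assert zlib.crc32(forged) == zlib.crc32(original)
```

The whole attack costs 33 CRC computations over the file plus some bit twiddling. The four patched bytes will usually be unprintable, which is why an attacker would place them somewhere that does not affect how the file behaves.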
Trusting an SFV file to verify any form of executable code (applications, Java applets, shell scripts, WMV/ASF movies, etc.) is simply asking for it, as a malicious third party can (with a minimum of effort) produce a backdoored executable file that is verified as correct by an 'official' SFV file.
Despite their flaws, .SFV files will be around as long as uninformed people keep producing them. Please consider using any of the following solutions instead of .SFV files:
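- GNU md5sum, a GPL application that generates files very similar to .SFV, but using the cryptographic MD5 hash function. Unlike with CRC32, there is no known practical way to tamper with an existing file and keep the same MD5sum. (Collision attacks on MD5 are now known, however, so a stronger hash such as SHA-256 is preferable where your tools support it.)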
- .PAR/.PAR2, well-specified file formats that use Reed-Solomon coding to detect and repair errors in a file. .PAR/.PAR2 can repair a file if sections of it are missing or damaged. PAR/PAR2 also stores an MD5sum of the file, so tampering is detected in the same way as with md5sum.
PAR/PAR2 programs and specifications from http://parchive.sourceforge.net/
GNU md5sum as part of GNU textutils from http://www.gnu.org/software/textutils/textutils.html
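For comparison with the SFV format, here is a small Python sketch of the md5sum-style checksum line: the hex digest first, then a space, a mode character (a space for text mode or `*` for binary mode), then the filename. The helper names are invented for this example and the data is passed in as bytes to keep it short; in practice you would simply run md5sum itself.

```python
import hashlib

def md5sum_line(name, data):
    """Return a checksum line in the layout md5sum writes: '<digest>  <name>'."""
    return f"{hashlib.md5(data).hexdigest()}  {name}"

def check_md5sum_line(line, data):
    """Return (name, ok) for one md5sum-style line checked against data."""
    digest = line[:32].lower()      # an MD5 digest is always 32 hex digits
    name = line[34:].rstrip("\r\n") # skip the space and the mode character
    return name, hashlib.md5(data).hexdigest() == digest

line = md5sum_line("abc.txt", b"abc")
print(line)
print(check_md5sum_line(line, b"abc"))
print(check_md5sum_line(line, b"abd"))
```

Because the digest has a fixed width and comes first, filenames containing spaces cause none of the ambiguity seen in SFV. Swapping `hashlib.md5` for `hashlib.sha256` (and 32 for 64 hex digits) gives the equivalent sha256sum layout.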