The Distributed Checksum Clearinghouse is another weapon in the war against spam. Similar to Vipul's Razor, the DCC is a collaborative spam filtering network. The key difference between the two is that Razor relies on human operators, whereas DCC is completely automated.

With a network of 200 public servers, and countless more participating servers, the DCC client works by creating a checksum of each incoming message and submitting it to the nearest server. The servers periodically update statistical information based on the number of messages received that generate the same checksum. Using this method, DCC is able to deflect a major spam campaign as it is initiated, by broadcasting to the rest of the network that a large volume of unsolicited mail matching the advertised checksum is being delivered.
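
The counting idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class ToyClearinghouse, the function body_checksum and the bulk_threshold parameter are hypothetical names, and a SHA-256 digest of a normalised body stands in for whatever checksums the real DCC client computes.

```python
import hashlib
from collections import Counter


def body_checksum(body: str) -> str:
    """Digest of a case- and whitespace-normalised body.

    Assumption: SHA-256 is a stand-in here, not DCC's actual checksum scheme.
    """
    normalised = " ".join(body.lower().split())
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


class ToyClearinghouse:
    """Hypothetical in-memory server that counts reports per checksum."""

    def __init__(self, bulk_threshold: int = 3) -> None:
        self.bulk_threshold = bulk_threshold  # illustrative value only
        self.counts = Counter()

    def report(self, body: str) -> int:
        """Record one sighting of a message and return the running count."""
        digest = body_checksum(body)
        self.counts[digest] += 1
        return self.counts[digest]

    def is_bulk(self, body: str) -> bool:
        """True once the same checksum has been reported often enough."""
        return self.counts[body_checksum(body)] >= self.bulk_threshold
```

A client would call report() for every message it receives and treat the message as bulk once is_bulk() returns True. Note that nothing in the count says whether the mail was solicited, only that many recipients saw the same content.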

DCC is usually used in conjunction with a Bayesian filtering product like SpamAssassin (which ships with support for both DCC and Vipul's Razor). By default, a message matching the DCC registry of known spam has 4 points added to its spam score, so it is much more likely to be flagged as spam.
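
To see why a 4-point bump matters, here is a rough sketch of additive scoring. The names DCC_HIT_SCORE, SPAM_THRESHOLD, total_score and is_spam are hypothetical, and the threshold of 5.0 is an assumption chosen for illustration; this is not SpamAssassin's code or API.

```python
DCC_HIT_SCORE = 4.0   # the default bump quoted in the text
SPAM_THRESHOLD = 5.0  # assumed cut-off, for illustration only


def total_score(other_rules_score: float, dcc_hit: bool) -> float:
    """Add the DCC bump to whatever the other rules already scored."""
    return other_rules_score + (DCC_HIT_SCORE if dcc_hit else 0.0)


def is_spam(other_rules_score: float, dcc_hit: bool) -> bool:
    """Flag the message once the combined score reaches the threshold."""
    return total_score(other_rules_score, dcc_hit) >= SPAM_THRESHOLD
```

Under these assumptions, a message that scores 1.5 from other rules stays under the threshold on its own, but the same message with a DCC match reaches 5.5 and is flagged.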

The obvious weakness in this system is that legitimate mailing list traffic can easily be mistaken for spam when it has a large number of recipients whose mail is filtered through a DCC client like SpamAssassin. In this situation it is a good idea to locally whitelist the domains of any digest-type mailing list you subscribe to.
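
A local whitelist of this kind could look roughly like the following sketch. The set WHITELISTED_DOMAINS, its example domain and the function names are hypothetical; in practice you would use whatever whitelist mechanism your filter provides rather than code like this.

```python
from email.utils import parseaddr

# Hypothetical list of mailing-list domains you trust.
WHITELISTED_DOMAINS = {"lists.example.org"}


def sender_domain(from_header: str) -> str:
    """Extract the lower-cased domain from a From: header value."""
    _, address = parseaddr(from_header)
    return address.rpartition("@")[2].lower()


def should_check_dcc(from_header: str) -> bool:
    """Skip the bulk check for mail from whitelisted list domains."""
    return sender_domain(from_header) not in WHITELISTED_DOMAINS
```

Matching on the From: header alone is a simplification, since that header is easy to forge; it only illustrates the idea of exempting known lists from the bulk check.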

Using the data it already captures during normal operation, the DCC project displays graphs on its web servers that clearly illustrate the amount of spam versus the amount of legitimate email flowing through participating mail servers. These graphs are updated in real time (generated with a nifty rrdtool script). They're quite cool to show to clients you're trying to sell on the merits of spam filtering. (The graphs are available on the DCC project's website.) At the time of this writing, the volume of delivered spam relative to delivered legitimate mail is reported as 60%.

Used in conjunction with the Realtime Blackhole List and Bayesian filtering rules, DCC can drastically reduce the volume of delivered spam.