Technology Review - Published By MIT
Should E-mail Still Be Free?

By Technology Review

8/13/2003

Don't Break E-Mail To Save It
By Vipul Ved Prakash
July 23, 2003

I couldn't agree more with David Crocker's post "Controlling Spam: Ready, Fire, Aim." A lot of people seem to be in a rush to blame the spam deluge on the lack of authentication in the venerable Simple Mail Transfer Protocol (SMTP) standard. The truth is that SMTP authentication would, by itself, do almost nothing to alleviate the spam problem. It's important to understand what SMTP authentication really adds to the equation, how it might be useful in the fight against spam, and what its limitations are.

There are two kinds of authentication. One is domain-based authentication, a mechanism that connects each sender of an e-mail to his or her domain name. Several such schemes have been proposed recently, most notably Reverse MX and SPF, which assure the recipient that the e-mail actually originated from the sender's domain. For example, if you receive mail from alice@wonderland.com, and wonderland.com participates in one of the domain authentication schemes, you can be sure that her e-mail actually originated from wonderland.com.
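The idea behind domain-based authentication can be illustrated with a minimal sketch. This is not the actual Reverse MX or SPF algorithm: the `AUTHORIZED_SENDERS` table, the function names, and the IP ranges are all hypothetical stand-ins for a policy that a real scheme would publish in and fetch from the sender's DNS.

```python
import ipaddress

# Hypothetical policy table: domain -> networks allowed to send its mail.
# A real scheme would look this up in the sender domain's DNS records.
AUTHORIZED_SENDERS = {
    "wonderland.com": ["192.0.2.0/24"],  # documentation-only address range
}


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an e-mail address."""
    return address.rsplit("@", 1)[-1].lower()


def check_origin(sender_address: str, connecting_ip: str) -> str:
    """Return 'pass', 'fail', or 'unknown' (the domain publishes no policy)."""
    networks = AUTHORIZED_SENDERS.get(domain_of(sender_address))
    if networks is None:
        return "unknown"
    ip = ipaddress.ip_address(connecting_ip)
    if any(ip in ipaddress.ip_network(net) for net in networks):
        return "pass"
    return "fail"


if __name__ == "__main__":
    print(check_origin("alice@wonderland.com", "192.0.2.25"))
    print(check_origin("alice@wonderland.com", "198.51.100.7"))
    print(check_origin("bob@example.org", "192.0.2.25"))
```

Note that a "pass" only ties the message to a domain. It says nothing about whether the message is wanted, and a domain with no published policy yields "unknown" rather than "fail".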

The second kind of authentication is identity authentication with digital signatures, in which the sender digitally signs the e-mail so that the recipient can verify it with the help of the sender's public key. PGP has been used for identity authentication for the last 10 years by the cryptography and security community, but has not been widely integrated into e-mail applications.

Both kinds of authentication have one thing in common: they allow us to trust the "From" field. But what would we do next? Should we "whitelist" known correspondents and quarantine the mail from unknown people in a folder labeled something like "potential spam"? Clearly, that just shifts the problem from the Inbox to the "potential spam" folder; we'd still have to sift through the spam to get at legit messages. We could issue challenge/responses to unknown but authenticated senders. Several challenge/response tools exist today (Spam Arrest, Matador, etc.) that respond to unknown senders with a graphical or analytical question to establish that they are indeed human (as opposed to spam-spewing automata). Such systems are flawed without authentication, however, since they end up sending challenges to whatever fake address the spammer puts in the From field. Authentication would benefit these systems, but challenge/response interactions alter the e-mail experience (and don't work for mailing lists), and are perhaps the least desired "solution" to spam. We could use network origination information coupled with the sender's e-mail address as a vector in a statistical spam filtration system. Cloudmark does this to some extent already and SPF-based systems intend to do this as well; the technique turns out to be a decent metric to prioritize and filter e-mail. However, it does not "solve" the spam problem better than the filtration technologies available today.

A major change to a well-established protocol will invariably have an extended adoption timeline. Spam filtration systems would have to choose a cut-off point after which they start using authentication for filtration purposes. If this adoption cut-off is 95 percent (for legitimate mail), then the false positive rate of the system would be 5 percent, because the remaining 5 percent of legitimate mail would still arrive unauthenticated. I doubt we are going to see 100 percent adoption very quickly. For this reason, I believe that e-mail authentication would be useful only in a statistical context, and not as a deterministic function to drop e-mail on the floor.
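The arithmetic behind the cut-off argument can be made concrete with a short sketch. The function names and all input figures other than the 95 percent adoption example are illustrative assumptions, not measurements; the second function shows one way "missing authentication" could be treated as statistical evidence (via Bayes' rule) rather than as a hard reject.

```python
def deterministic_false_positive_rate(legit_adoption: float) -> float:
    """Share of legitimate mail lost if unauthenticated mail is dropped outright."""
    if not 0.0 <= legit_adoption <= 1.0:
        raise ValueError("adoption must be between 0 and 1")
    return 1.0 - legit_adoption


def spam_probability_if_unauthenticated(
    spam_share: float, legit_adoption: float, spam_adoption: float
) -> float:
    """Bayes' rule: P(spam | message is unauthenticated).

    spam_share and spam_adoption are assumed inputs for illustration only.
    """
    spam_unauth = spam_share * (1.0 - spam_adoption)
    legit_unauth = (1.0 - spam_share) * (1.0 - legit_adoption)
    return spam_unauth / (spam_unauth + legit_unauth)


if __name__ == "__main__":
    for adoption in (0.50, 0.95, 0.99):
        rate = deterministic_false_positive_rate(adoption)
        print(f"adoption {adoption:.0%}: false positives {rate:.0%}")
    # Assumed scenario: half of all mail is spam and no spam authenticates.
    print(spam_probability_if_unauthenticated(0.5, 0.95, 0.0))
```

Under the assumed scenario, an unauthenticated message is very likely spam but not certainly so, which is why it makes a better input to a score than a trigger for dropping mail.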

My perspective on the design of spam filtration solutions centers on exploiting the various constraints of the spammer. One thing we don't talk about enough is the fact that spammers have rather serious constraints. They have to send out a marketing message (containing the same meme) to millions of people from (at most) a few thousand different IP addresses. They have to do this in a relatively short period of time. They have to differentiate their content from other spam. They have to defeat existing spam filtration systems.

A successful anti-spam solution will be able to leverage one or more of these constraints to differentiate spam from legitimate mail. Cloudmark's product, SpamNet, for example, leverages the "same meme to millions of people" constraint by allowing the first few recipients to identify the meme and share the knowledge with other intended recipients before they receive the message. Bayesian classifiers, on the other hand, leverage the fact that marketing messages are not statistically representative of the mail a person receives. Blacklists, like RBL and SpamCop, leverage the fact that spammers have a limited number of IP addresses from which to send their spam. Cloudmark's Authority product looks for "mutations" in a message crafted to defeat anti-spam systems.
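The "same meme to millions of people" constraint can be sketched as a shared report counter. This is a toy illustration only and not how SpamNet actually works: the `SharedReports` class, the exact-hash fingerprint, and the threshold are assumptions, and an exact hash would be trivially defeated by the message "mutations" mentioned above.

```python
import hashlib
import re
from collections import Counter


def fingerprint(body: str) -> str:
    """Hash a message body after lowercasing and collapsing whitespace."""
    normalized = re.sub(r"\s+", " ", body.lower()).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class SharedReports:
    """Hypothetical shared store of spam reports keyed by fingerprint."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold  # reports needed before others filter
        self.counts = Counter()

    def report(self, body: str) -> None:
        """An early recipient flags this message as spam."""
        self.counts[fingerprint(body)] += 1

    def is_known_spam(self, body: str) -> bool:
        """Later recipients check before the message reaches the inbox."""
        return self.counts[fingerprint(body)] >= self.threshold


if __name__ == "__main__":
    reports = SharedReports(threshold=2)
    reports.report("Buy  NOW")
    reports.report("buy now")
    print(reports.is_known_spam("BUY now"))
    print(reports.is_known_spam("Lunch on Friday?"))
```

The design point is that the cost of identifying a meme is paid once by a few early recipients, while the benefit is shared with everyone who would have received it later.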

Some of these constraints make for good differentiators and others don't. IP addresses, for example, are a bad differentiator. A spammer can use a major Internet service provider to send out spam, and blocking the ISP's IP would block all the legit mail originating from the ISP. However, the knowledge of IP combined with another constraint (e.g., meme X originating from IP Y) can be a good differentiator. SMTP authentication would provide us another constraint to exploit. Like the IP address, SMTP authentication makes for a poor differentiator by itself (especially domain-based authentication), but could be useful when combined with other constraints.
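A small sketch can show why a joint feature such as "meme X from IP Y" separates spam better than the IP alone. The `Evidence` class, the smoothing choice, and the traffic counts are invented for illustration; they are not drawn from any real filter or dataset.

```python
from collections import defaultdict


class Evidence:
    """Counts spam and legitimate observations per feature key."""

    def __init__(self):
        self.spam = defaultdict(int)
        self.legit = defaultdict(int)

    def observe(self, key, is_spam: bool) -> None:
        if is_spam:
            self.spam[key] += 1
        else:
            self.legit[key] += 1

    def spam_ratio(self, key) -> float:
        """Smoothed share of spam for this key (add-one smoothing)."""
        s, h = self.spam[key], self.legit[key]
        return (s + 1) / (s + h + 2)


if __name__ == "__main__":
    isp_ip = "192.0.2.10"  # assumed shared ISP mail server
    by_ip, by_pair = Evidence(), Evidence()
    # Assumed traffic: 90 legitimate newsletters and 10 spam messages.
    traffic = [("newsletter", False)] * 90 + [("pills", True)] * 10
    for meme, is_spam in traffic:
        by_ip.observe(isp_ip, is_spam)
        by_pair.observe((meme, isp_ip), is_spam)
    print(by_ip.spam_ratio(isp_ip))               # low: mostly legit mail
    print(by_pair.spam_ratio(("pills", isp_ip)))  # high: this pair is spam
```

Blocking on the IP score would either miss the spam or discard the newsletters, while the paired key isolates the spam without touching the ISP's legitimate mail.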

But do we really need to introduce another constraint to defeat spam? I don't believe that we do. If we develop clever systems that can effectively leverage existing constraints, we can solve the problem without requiring authentication. Still, if I had to choose, I would be more supportive of identity-based authentication techniques that employ digital signatures, since such schemes would be based on identifying individual users (instead of domains). As far as I know, no such scheme has yet been proposed for fighting spam, but one can't be far off.

Barry Shein talks about augmenting e-mail to use payment systems. Assuming such a system ever gets widely deployed, it will change e-mail a lot more radically than authentication would. The tradeoff is more apparent in this case: is the addition of another constraint that helps us solve the spam problem worth radically changing the way e-mail functions? I think not.

I am a fan of the current Internet e-mail architecture. If e-mail had strong authentication or payment systems built into it, I doubt it would have been as wildly successful as it is today. David Crocker's warning is well grounded: Let's not go off and change e-mail, especially not without an understanding of what can be done with the resources we have today.

Here's a list of three rules (created after the most important features of e-mail) that anti-spam software should strive to follow:

    1) Ability to send and receive e-mail from a stranger. (Whitelisting, payment systems, and challenge/response break this rule.)
    2) Ability to send and receive pseudo-anonymous e-mail. (Domain-based authentication breaks this rule.)
    3) E-mail should be free. (Payment systems break this rule.)

If we can solve the spam problem while maintaining the three features of e-mail, it would be a much sweeter victory.

Vipul Ved Prakash is founder and chief scientist for Cloudmark.

Previous entries:
Crocker (June 20)
Shein (June 20)
Shein (June 16)
Crocker (June 16)

MIT Massachusetts Institute of Technology © 2009 Technology Review. All Rights Reserved.