MIT Technology Review



Researchers have shed new light on how spammers harvest e-mail addresses from the Web and relay bulk messages through multiple computers. They say the findings could provide additional ammunition in the fight against junk e-mail campaigns.

The problem of unwanted e-mail, or spam, continues to vex computer users and security professionals. Currently, more than 90 percent of the e-mail messages traversing the Internet appear to be spam, according to information released in June by the e-mail security firm MessageLabs.

In one paper scheduled to be presented this week at the Conference on E-mail and Anti-Spam, in Mountain View, CA, researchers from Indiana University studied how spammers obtain the e-mail addresses in the first place. The researchers used a variety of techniques to match the programs that cull e-mail addresses from Web pages to the resulting spam. “We are basically trying to figure out how spammers get your address–the addresses of people that they try to victimize,” says Craig Shue, a graduate student at Indiana University who now works at Oak Ridge National Laboratory.

The study involved exposing 22,230 unique e-mail addresses on the Web over a five-month period and watching for spam sent to them. The researchers found that an e-mail address included in a comment posted to a website was far more likely to attract spam than one submitted during site registration. While only four of the e-mail addresses submitted to 70 websites during registration resulted in spam, half of the e-mail addresses posted to popular sites did.

The researchers also set up a website on their own domain and waited for their pages to be crawled. Each visitor to the website would see a different e-mail address, a strategy the researchers hoped would gauge how often programs that automatically crawl sites are operated by spammers. “We are giving out a unique e-mail address to every visitor to our webpage,” Shue says. “If we ever get an e-mail to that address, we know that the crawler gave that e-mail address to a spammer.”
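The unique-address idea Shue describes can be illustrated with a minimal Python sketch. It is not the researchers' code: the `AddressTracker` class, its method names, and the `example.org` domain are hypothetical, and a real deployment would persist the visit records rather than hold them in memory.

```python
import secrets


class AddressTracker:
    """Hypothetical helper: hands each visitor a distinct address and remembers who got it."""

    def __init__(self, domain):
        self.domain = domain.lower()
        self._visits = {}  # local part of the address -> details of the visit that received it

    def issue(self, visitor_ip, user_agent):
        """Create a fresh address for this page view and record the visitor."""
        local = "u" + secrets.token_hex(8)  # random, so addresses cannot be guessed
        self._visits[local] = {"ip": visitor_ip, "user_agent": user_agent}
        return f"{local}@{self.domain}"

    def trace(self, recipient):
        """Given the recipient of an incoming message, return the visit that harvested it."""
        local, _, domain = recipient.partition("@")
        if domain.lower() != self.domain:
            return None
        return self._visits.get(local.lower())
```

Under these assumptions, any spam arriving at an issued address can be traced to exactly one page view:

```python
tracker = AddressTracker("example.org")
bait = tracker.issue("192.0.2.10", "ExampleBot/1.0")
print(tracker.trace(bait))  # the visit record for the crawler that saw this address
```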

The researchers also found that the programs that crawl the Web looking for e-mail addresses–dubbed spamming crawlers–have characteristics that could make it easier to detect them. For example, the parts of a network from which a crawler operates tend to be a good predictor of whether it is a legitimate crawler, such as those used by Google or other search engines, or a spamming crawler. “It may be feasible to block a small number of [network numbers] associated with spammer Web crawlers to eliminate the harvesting of e-mail addresses on a site,” the Indiana University researchers wrote.
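A site operator acting on that suggestion might check each request against a short blocklist of networks. The sketch below is only an illustration: it assumes the blocklist is expressed as address ranges in CIDR notation, which may differ from the "network numbers" the paper has in mind, and the ranges shown are reserved documentation addresses, not real spammer networks. It uses Python's standard `ipaddress` module.

```python
import ipaddress

# Placeholder ranges (reserved for documentation), standing in for networks
# that an operator has associated with address-harvesting crawlers.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/25"),
]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if the client address falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)
```

A Web server could call `is_blocked` on each incoming request and refuse to serve pages containing e-mail addresses when it returns `True`.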


Credit: Technology Review

Tagged: Communications, Web, security, spam, e-mail, spammers
