Technology Review

Communications

Catching Spammers in the Act

Researchers show how spammers harvest e-mail addresses and send out bulk messages.

  • Wednesday, July 15, 2009
  • By Robert Lemos

Researchers have shed new light on the methods by which spammers harvest e-mail addresses from the Web and relay bulk messages through multiple computers. They say that the findings could provide additional ammunition in the fight against junk e-mail campaigns.

The problem of unwanted e-mail messages, or spam, continues to vex computer users and security professionals. Currently, more than 90 percent of the e-mail messages traversing the Internet appear to be spam, according to information released in June by the e-mail security firm MessageLabs.

In one paper scheduled to be presented this week at the Conference on E-mail and Anti-Spam, in Mountain View, CA, researchers from Indiana University studied how spammers obtain the e-mail addresses in the first place. The researchers used a variety of techniques to match the programs that cull e-mail addresses from Web pages to the resulting spam. "We are basically trying to figure out how spammers get your address--the addresses of people that they try to victimize," says Craig Shue, a graduate student at Indiana University who now works at Oak Ridge National Laboratory.

This involved exposing 22,230 unique e-mail addresses on the Web over a five-month period and watching for spam sent to those destinations. The researchers found that an e-mail address included in a comment posted to a website was far more likely to attract spam than one submitted during site registration. While only four of the e-mail addresses submitted to 70 websites during registration resulted in spam, half of the e-mail addresses posted to popular sites did.

The researchers also set up a website on their own domain and waited for their pages to be crawled. Each visitor to the website would see a different e-mail address, a strategy that the researchers hoped would gauge how often programs that automatically crawl sites are operated by spammers. "We are giving out a unique e-mail address to every visitor to our webpage," Shue says. "If we ever get an e-mail to that address, we know that the crawler gave that e-mail address to a spammer."
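
The unique-address-per-visitor idea can be illustrated with a minimal Python sketch. This is not the researchers' code: the `AddressTrap` class, its method names, the random-token scheme, and the `example.org` domain are all assumptions made for illustration.

```python
import secrets


class AddressTrap:
    """Hands out one unique e-mail address per visit and remembers who received it.

    Hypothetical helper for illustration; not taken from the study.
    """

    def __init__(self, domain):
        self.domain = domain.lower()
        self.issued = {}  # local part -> details of the visitor who was shown it

    def issue(self, visitor_ip, user_agent):
        """Create a fresh address for this visit and record the visitor."""
        token = secrets.token_hex(8)
        while token in self.issued:  # regenerate on the (unlikely) collision
            token = secrets.token_hex(8)
        self.issued[token] = {"ip": visitor_ip, "user_agent": user_agent}
        return f"{token}@{self.domain}"

    def trace(self, recipient):
        """Map the recipient of an incoming message back to the visitor, if any."""
        local, _, domain = recipient.partition("@")
        if domain.lower() != self.domain:
            return None
        return self.issued.get(local.lower())
```

Because each address is shown exactly once, any message arriving at it points back to a single page request, for example `trap.trace(address)["ip"]` for the crawler's network address.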

The researchers also found that the programs that crawl the Web looking for e-mail addresses--dubbed spamming crawlers--have characteristics that could make it easier to detect them. For example, the parts of a network from which a crawler operates tend to be a good predictor of whether it is a legitimate crawler, such as those used by Google or other search engines, or a spamming crawler. "It may be feasible to block a small number of [network numbers] associated with spammer Web crawlers to eliminate the harvesting of e-mail addresses on a site," the Indiana University researchers wrote.
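
As a rough illustration of blocking by network, the sketch below checks a visitor's address against a short blocklist using Python's standard `ipaddress` module. The article does not say how the networks are identified, so the use of CIDR prefixes, the `is_blocked` function, and the documentation-range prefixes are assumptions.

```python
import ipaddress

# Placeholder prefixes from the reserved documentation ranges; a real list
# would come from observing which networks the spamming crawlers use.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip, networks=BLOCKED_NETWORKS):
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in networks)
```

A web server could call `is_blocked(request_ip)` before serving a page and refuse, or serve a page without addresses, when it returns True.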

timbrady

1 Comment

  • 945 Days Ago
  • 07/15/2009

spam

If ALL of us deleted ALL our spam messages, they would become unprofitable and might stop.  Don't buy anything, don't even look at them.  Just delete them.

Guest (jpdemers)

  • 945 Days Ago
  • 07/15/2009

Re: spam

When you send out millions of spam messages, it takes only a very small percentage of fools to make the enterprise profitable. I don't see the number of fools declining as time goes on.

Perhaps if we had laws making the advertisers (who are easier to locate) liable for violating anti-spam statutes, we could discourage some of the abuses.  Charging a nominal voluntary fee for email (a tenth of a cent per message), and filters that block unpaid-for transmissions, would probably put an immediate end to the problem.

fiberman

186 Comments

  • 943 Days Ago
  • 07/17/2009

Re: spam

Aren't you tired of subsidizing spammers? You know your cost of using the Internet pays for their usage of the majority of the bandwidth?

martwill38

9 Comments

  • 945 Days Ago
  • 07/15/2009

Don't delete -- fight back!

Instead of simply deleting every spam message I forward it to SPAM@UCE.GOV, including the full header information - in Thunderbird etc., ctrl-U brings up the message source info, ctrl-A and ctrl-C selects all and copies, then in the forwarded message pane, ctrl-V pastes it.  Then click send and you're done.  It takes only a few seconds to do this, and the FTC will go and investigate the spammer.  If it is a phishing scam appearing to come from a financial institution, go to the real institution's web site (do not click on any links in the message!), find their fraudulent email reporting address and CC it to that address before sending.  Simple and effective.

nhjeff

6 Comments

  • 945 Days Ago
  • 07/15/2009

Why is spam so hard to stop?

At the risk of showing how little I know, it seems to me that it would be reasonably easy to create a verification system for e-mail:

1)  ISPs certify that they know the originator of all messages.  All mail from ISPs that forward e-mail without properly verifying the sender's ID will be suspect.
2)  For individual users, only a nominal number of e-mails would be allowed before the user becomes subject to greater scrutiny.
3)  Large volume e-mailers will prove that they meet guidelines--such as easily removing one's address from their lists.  The mailer will pay the cost of certification.

Anti-spam programs will allow all mail from the system defined above to pass.  All other e-mail will be considered suspect, and legitimate e-mailers will quickly demand to be included in the secured system.

Individual PCs that are taken over by bots would quickly be shut off under such a system.  Mass e-mailers will also monitor their own systems to assure that their volume is consistent with their business practices.  The biggest problems seem to be creating a secure transport network and verifying the identity of each member of that network.

carlhage

84 Comments

  • 945 Days Ago
  • 07/15/2009

Re: Why is spam so hard to stop?

So the web crawler finds the IP of spammers-- so what? There is no problem finding spammers-- the problem is that there is nothing reasonable to be done about it. I see a constant attack on the servers I manage in logs, but I can't do anything about it other than hope the break-in doesn't work. We get tons of spam, mostly from hijacked computers, but does anyone help the victims or isolate these computers? There is a way to trace back to the source for payment, registration, bank login, etc., but there is no one to follow up and stop the spammers. The government, for the most part, is not punishing spammers, not dealing with compromised computers that have been hijacked, even resists passing laws against it, and so there is no process other than partly effective self-defense (firewall/filter). If you ask me the Libertarian ideal is a nice concept but doesn't work. It doesn't surprise me that there was a government denial of service attack-- it could be a rogue government (e.g. N Korea) but just as easily be a high school student in a bedroom somewhere.

carbonmind

9 Comments

  • 941 Days Ago
  • 07/19/2009

Re: Why is spam so hard to stop?

The WSJournal Reports: "When it comes to identifying spam, two-thirds used the sender’s name as a gauge, followed by 45% who looked at subject lines and 22% who spot other “visual indicators.” About 3% relied on the time a message was sent to identify whether or not it’s legitimate. So what’s driving them to click on Cialis offers or fake Michael Jackson photos? About 17% said it was a mistake. Twelve percent were interested in the product or service, and 13% don’t know why they acted on the message. Six percent “wanted to see what would happen.”"

- spam is so cheap to send, and people are gullible - unless we use closed-loop environments for messaging we are going to be stuck with spam for a long time. Migrating off the web and onto your cellphone may mitigate the volume, i.e. acceptance of only "known" numbers and addresses - but so long as people keep opening and responding to spam, it will continue.

BTW: This particular research doesn't even mention that spammers don't all scrape or harvest their addresses from the web, they create them on the fly. Major ISPs create "honeypot" or decoy email addresses to catch spammers and receive spam almost within the hour at these fictitious addresses!

reid

3 Comments

  • 945 Days Ago
  • 07/15/2009

Suggested Solution

We need to change the economics of e-mail. Here's a suggestion: Create an alternate e-mail system. In order to participate, you have to register and pre-fund an account. The amount could be modest, say $5 or $10. Then every time you send an e-mail message, you pay a penny out of your account. Every time you receive a message, the sender's penny is deposited into your account. If you reply, the penny goes back. Suddenly spam is no longer profitable.
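
The accounting in this proposal can be sketched as a toy ledger in Python. The `PennyMail` class and its method names are hypothetical, and the sketch ignores everything a real system would need (authentication, delivery, fraud handling). It only shows how the pennies move.

```python
class PennyMail:
    """Toy ledger for the pay-a-penny-per-message idea (illustrative only)."""

    def __init__(self, price_cents=1):
        self.price = price_cents
        self.balance = {}  # user -> balance in cents

    def register(self, user, deposit_cents):
        """Open a pre-funded account, e.g. 500 cents for a $5 deposit."""
        self.balance[user] = deposit_cents

    def send(self, sender, recipient):
        """Move one message's fee from the sender to the recipient."""
        if self.balance[sender] < self.price:
            raise ValueError(f"{sender} has insufficient funds to send")
        self.balance[sender] -= self.price
        self.balance[recipient] += self.price
```

Under these assumptions, a message that gets a reply costs neither party anything, while a sender whose messages are never answered exhausts a $5 deposit after 500 messages.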

a3an

1 Comment

  • 945 Days Ago
  • 07/15/2009

Re: Suggested Solution

I guess one does not need an ISP to send emails. However, your idea to fight this with weapons of economy is interesting.  How about defining a mailing system based on mail notifications? Such a system would tell you that you got mail and send the header plus an Internet address where you can pick up the actual contents. With such a system one would know the originating system, but foremost the originator has to provide the resources to send the actual contents when you ask for them. This will be costlier than just dumping a million emails to the net at little or no cost.

fiberman

186 Comments

  • 944 Days Ago
  • 07/16/2009

Change the economics

The idea of charging a small amount per email is the only easy solution. As long as a spammer can send all the email they want for no cost, we'll be facing this problem. And charging a penny (or even less) an email is easy to implement.
Unfortunately, the suppliers of hardware and software to the Internet and the suppliers of the backbone connections will fight you on this - they make their money supplying products and services that make this possible, and cutting spam would adversely affect their revenue for years!
You know, they're like the bankers that got us in this financial mess and the healthcare companies who profit from our illnesses; they only care about the profit, not the problems they cause.
Second solution - a secure email system that only allows authenticated emails - not from the bastards that use my email for spamming which fills my inbox with undeliverables every few months, or the ones who sent me over 1,000 emails last week because I caught and stopped one scam!
Come on, industry, just once do something because it's right, not just to make a buck!!

ru4sure

1 Comment

  • 944 Days Ago
  • 07/16/2009

Re: Change the economics

The answer is simple. Use Gmail by Google.

vaniderstine

1 Comment

  • 930 Days Ago
  • 07/30/2009

Re: Change the economics

Masking the problem does not solve it.   Just because GMail doesn't deliver 100 spam messages/day into your inbox doesn't mean that they weren't sent to you.

r0s3

1 Comment

  • 945 Days Ago
  • 07/15/2009

I'm not afraid of spam

rose@askauntrose.com

There, spammers, come 'n get me.  I have amazing spam filters that are impenetrable.

hahaha.

johnjones

1 Comment

  • 945 Days Ago
  • 07/15/2009

authenticate email...

well this could be solved relatively easily by organizations actually signing their email

by using DKIM this then provides a way to tell if the domain that they purport to be from is actually who they say they are without breaking email standards

once identity works spammers have a much harder time !

regards

John Jones
http://www.johnjones.me.uk/

DogStar76

2 Comments

  • 943 Days Ago
  • 07/17/2009

Additional Authentication

Having worked for a Fortune 10 company that handled over 3M messages of spam a week I can say that you'll never stop it all.

In addition to DKIM, which respectable business partners should be moving toward using with each other along with TLS, LDAP authentication to verify legitimate addresses and some nice RegEx content filters help.

After moving from multiple RBL hosting parties to only Spamhaus ZEN, which includes the PBL (Policy Block List), we saw a big drop in bot-net spam, since the bots run on cable-modem and DSL hosts on home-user-class ISP networks. Those networks are blocked by the PBL.
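
For readers unfamiliar with how a mail server consults a DNS-based blocklist, here is a minimal Python sketch of the conventional lookup: the IPv4 octets are reversed and prepended to the list's zone, and the resulting name is resolved. The function names are hypothetical, IPv6 is not handled, and a production mail server would use its own resolver setup and respect the list operator's usage terms.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone="zen.spamhaus.org"):
    """Build the DNS name to query: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError for anything but IPv4
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def dnsbl_lookup(ip, zone="zen.spamhaus.org"):
    """Return the list's answer (by convention an address in 127.0.0.0/8)
    if the IP is listed, or None if the name does not resolve."""
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return None
```

For instance, `dnsbl_query_name("192.0.2.99")` yields `99.2.0.192.zen.spamhaus.org`; the mail server rejects or flags the connection when that name resolves.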

Phineas

127 Comments

  • 943 Days Ago
  • 07/17/2009

What's Not To Like?

I have to assume that all of us have won that EuroLottery and Queen Elizabeth wants to send us money for some vague reason. It must have been that night Liz and I spent with Paris Hilton at the Motel Six. And the widow of deposed Nigerian president Obasanjo wants to deposit 25 mill in our bank accts.

Seriously, spam is a war. Any victory is likely to be momentary, followed by a notice from your bank to tell you that your pwrd has been compromised. Let me know who wins this one, and besides, if I didn't have junk spam, I'd never get any mail.
