Technology Review

Communications

A Better Way to Shoot Down Spam

Junk mail can now be identified based on a single packet of data.

  • Wednesday, July 29, 2009
  • By Rachel Kremen

New software developed at the Georgia Institute of Technology can identify spam before it hits the mail server. The system, known as SNARE (Spatio-temporal Network-level Automatic Reputation Engine), scores each incoming e-mail based on a variety of new criteria that can be gleaned from a single packet of data. The researchers involved say the automated system puts less of a strain on the network and minimizes the need for human intervention while achieving the same accuracy as traditional spam filters.

Separating spam from legitimate e-mail, also known as ham, isn't easy. That's partly because of the sheer volume of messages that need to be processed and partly because of e-mail expectations: users want their e-mail to arrive minutes, if not seconds, after it was sent. Analyzing the content of every e-mail might be a reliable method for identifying spam, but it takes too long, says Nick Feamster, an assistant professor at Georgia Tech who oversaw the SNARE research. Letting spam flow into our in-boxes unfiltered isn't a sensible option, either. According to a report released by the e-mail security firm MessageLabs, spam accounted for 90.4 percent of all e-mail sent in June.

"If you're not concerned about spam, I would suggest you turn off your spam filter for about an hour and see what happens," says Sven Krasser, senior director of data-mining research at McAfee. The Santa Clara, CA, company provided raw data for analysis by the Georgia Tech team.

The team analyzed 25 million e-mails collected by TrustedSource.org, an online service developed by McAfee to collate data on trends in spam and malware. Using this data, the Georgia Tech researchers discovered several characteristics that could be gleaned from a single packet of data and used to efficiently identify junk mail. For example, their research revealed that ham tends to come from computers that have a lot of channels, or ports, open for communication. Bots, automated systems that are often used to send out reams of spam, tend to keep open only the e-mail port, known as the Simple Mail Transfer Protocol port.

Furthermore, the researchers found that by plotting the geodesic distance between the Internet Protocol (IP) addresses of the sender and receiver--measured on the curved surface of the earth--they could determine whether the message was junk. (Much like every house has a street address, every computer on the Internet has an IP address, and that address can be mapped to a geographic area.) Spam, the researchers found, tends to travel farther than ham. Spammers also tend to have IP addresses that are numerically close to those of other spammers.
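The two address-based signals described above can be sketched in a few lines of Python. This is only an illustration and not SNARE's own code: it assumes the latitude and longitude of each IP address have already been obtained from some geolocation lookup (not shown), and the function names are hypothetical.

```python
import ipaddress
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def geodesic_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance between two points, in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def numeric_ip_distance(ip_a, ip_b):
    """Absolute difference between two IPv4 addresses treated as integers."""
    return abs(int(ipaddress.IPv4Address(ip_a)) - int(ipaddress.IPv4Address(ip_b)))
```

A classifier could then treat a large geodesic distance, or a small numeric distance to known spam senders, as evidence for spam. How SNARE actually weights such features is not described in the article.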

Dean Malmgren, a PhD candidate at Northwestern University whose work includes developing new methods for identifying spam, says he finds the research interesting. But he wonders how robust SNARE will be once its methodology is widely known. IP addresses, he notes, are easy to fake. So, if spammers got wind of how SNARE works, they might, for example, use a fake IP address close to the recipient's.

fiberman

186 Comments

  • 929 Days Ago
  • 07/29/2009

The odds are in your favor!

Well, if 90+% of all email is SPAM, just delete all emails and you'll be right 90+% of the time - a helluva lot better than other SPAM filters!
C'mon, based on my recent experiences, SPAMers are a lot smarter than the people who run the Internet. And those who profit from building ever more Internet bandwidth have no interest in stopping SPAM. If it was stopped, they'd lose a lot of business selling product and services to build more capacity for a couple of years.
We can stop SPAM in 10 minutes - just charge a penny per email. Make it economically unfeasible - like the US Post Office has with most types of junk mail - and you stop it immediately. And implementing it is simple - charge everyone a few extra bucks a month for the first couple of hundred emails they send so only the big users get billed.
But it will never happen - as Internet profiteers will never go for it - as I said - they make too much money off the Spammers.
Maybe we need to compare them to the Big Pharma or the Health Insurance Industry - everybody knows what profiteers they are!!!

smithsomian

182 Comments

  • 929 Days Ago
  • 07/29/2009

Re: The odds are in your favor!

We can stop SPAM in 10 minutes - just charge a penny per email. Make it economically unfeasible - like the US Post Office has with most types of junk mail - and you stop it immediately. And implementing it is simple - charge everyone a few extra bucks a month for the first couple of hundred emails they send so only the big users get billed.

But it will never happen - as Internet profiteers will never go for it - as I said - they make too much money off the Spammers.


It has nothing to do with profit, and everything to do with implementation.

Using the post office as an example is disingenuous. The post office is a monopoly. You cannot just set up your own post office, issue your own stamps, and expect letters you sent to actually get anywhere. The post office is able to charge people money for sending letters because you simply have no other way to send said letters. Yes, you could always use UPS or Brown or DHL, but these are massive companies that have required billions of dollars to set up. And you still have to pay to use their services. They can charge you because they control EVERYTHING from receipt of your letter from you, to its delivery at its destination. At no stage does it ever leave their control.

E-mail servers are a whole different ball of wax. When I set up an e-mail server, I am (quite literally) setting up my own post office. Using that server, I can send e-mail to anyone else in the world without having to rely on anyone else to actually "carry" that e-mail as an e-mail. In other words, the ISPs that own the fibre optic lines that make up the Internet wouldn't be able to tell my e-mail from a web page without directly inspecting the data packet. To do that with every data packet passing through their system would be cost-prohibitive, mainly because any data over a certain size (usually a fraction of what an e-mail or web page usually is) is broken up into many packets which can take any possible path between source and destination. To track and confirm an e-mail would require re-assembling the entire e-mail from its individual packets -- and packets may have bypassed their network entirely (by using a competitor's), rendering any re-assembly impossible. The Internet is based on a LACK of control... and that is its strength. If a network from, say, AT&T goes down, there is always the network from Sprint which can allow data to be re-routed around the blackout area. That way, no one ISP actually "controls" what goes over its network... it can throttle the data, but it cannot control it in any way which is reliable for everyone on the Internet. This lack of control - this flexibility, actually - is the very keystone of what makes the Internet reliable and resistant to interruption.

Plus, the barrier to entry for e-mail is so low. Setting up your own post office and postal network would require millions - if not billions - of dollars. Setting up a single e-mail server costs me nothing in terms of software (Linux) and may even cost me nothing in terms of hardware (cast-offs from people upgrading).

Then we get to the issue of who would collect your "penny per e-mail". Since I have just created my own postal service (e-mail server), and since the e-mail server connects directly to the recipient server to deliver e-mail, who would step in to demand payment? If the recipient is another "independent" like me, why should we even acknowledge the transaction? Who would be the central arbiter for collecting these funds, where would they go, and what rights would this central arbiter have to force independents like me to pay? Essentially, forcing senders of e-mail to pay would fracture and break up the Internet into the "haves" and the "have-nots".

And finally, there is the issue of inertia. E-mail is now so pervasive, and so broadly spread around the world, that any attempted implementation would be stillborn. Entire countries would refuse to play along (especially those poor countries struggling to improve their lot with the Internet - even a penny per e-mail would be too costly for citizens who might earn only a few hundred dollars per year). Businesses would balk. And individual consumers - especially those who are tech savvy - would be the biggest protesters. They would find ways around the pay system, rendering it impotent. Without 100% compliance from day one (and we are 34 years too late for that), setting up a "pay system" for e-mail would be impossible.

fiberman

186 Comments

  • 929 Days Ago
  • 07/29/2009

Re: The odds are in your favor!

Methinks you do complicate too much!
Is the PO a monopoly? Then why do I send paperwork by UPS or FEDEX more often than USPS? What happens when I send a letter outside the US? What about phone calls? The billing systems are well-developed and essential to international trade.
C'mon, implementing a counter on email and charging usage over say 1000 or even 5000 per month per user would be much easier than fighting all the @#$%^&*()_ SPAM.
And the Internet is not FREE - we're paying for all those $%^&*() to fill my inbox with thousands of undeliverable notices because they use my email address as the spoofed source of their SPAM!

pjduncan

20 Comments

  • 929 Days Ago
  • 07/29/2009

Re: The odds are in your favor!

I believe some of the people proposing the charging scheme even suggest having the penny flow through to the final recipient.  The way this would probably work is that any ISP or corporation with mail servers would simply refuse to receive mail from entities who have not joined the payment scheme.  If you ran an independent mail server and wished to accept email from any servers without payment, that would be your option.  If you ran an independent mail server and didn't want to join the payment scheme, you would be severely limited in who you could send mail to.

This wouldn't require inspecting packets anywhere along the network since it could be handled by the mail servers.  Obviously, this requires a secure way for mail servers to identify themselves to each other, but that's a fairly simple problem.

sorgfelt

23 Comments

  • 929 Days Ago
  • 07/29/2009

spammers send free

If you set up a system to charge for emails sent, it will have to be administered by our ISPs.  The legitimate users will have to pay the charges to the ISPs.  However, spammers use bots, open SMTP servers and/or ISPs in Russia that don't care about this issue.  They will still send spam for free.

fiberman

186 Comments

  • 929 Days Ago
  • 07/29/2009

Re: spammers send free

ISPs all around the world pay for their connections.

pjduncan

20 Comments

  • 928 Days Ago
  • 07/30/2009

Re: spammers send free

ISPs and Internet mail services (Verizon, Yahoo, Google, etc.) would have SMTP mail servers which would keep track of the traffic from other mail servers.  They would bill those organizations for the incoming traffic.  If an organization is in Russia or elsewhere with an open SMTP server or simply lax policies, they would either lose money by having to pay the recipient organizations or they would not pay and their mail servers would be refused connections in the future.
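The bookkeeping described here can be sketched as a small ledger kept by the receiving mail server. This is a toy model under stated assumptions, not a real SMTP extension: the class name, the one-cent price, and the credit limit are all hypothetical, and it assumes sending servers can be identified reliably.

```python
from collections import defaultdict


class IncomingMailLedger:
    """Hypothetical per-sender ledger kept by a receiving mail server."""

    def __init__(self, price_cents=1.0, credit_limit_cents=500.0):
        self.price_cents = price_cents                # assumed fee per message
        self.credit_limit_cents = credit_limit_cents  # assumed unpaid-balance cap
        self.owed_cents = defaultdict(float)          # sending server -> amount owed

    def accept(self, sending_server):
        """Record one message; refuse it if the sender would exceed its limit."""
        if self.owed_cents[sending_server] + self.price_cents > self.credit_limit_cents:
            return False  # connection refused until the sender pays
        self.owed_cents[sending_server] += self.price_cents
        return True

    def record_payment(self, sending_server, amount_cents):
        """Reduce what a sending server owes, never below zero."""
        remaining = self.owed_cents[sending_server] - amount_cents
        self.owed_cents[sending_server] = max(0.0, remaining)
```

A server that never pays is cut off once its unpaid balance reaches the limit, which is the "refused connections in the future" outcome described above.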

arnetwork

85 Comments

  • 929 Days Ago
  • 07/29/2009

pay the recipient

I like the idea of pay the recipient. I simply instruct my isp to not send me any email that hasn't been paid for. People who want to send me email then have to have an account to which they are paying with their isp. Assuming networks have a secure way of identifying themselves then payment is assured.

Only people who wish to pay a penny to email me get through. Those networks that don't charge will find their email options limited to those who don't mind getting spam.

Some companies will pay the penny because they see me as a worthwhile prospect. Let them. They will soon be out of business if their selection criteria are out of line with reality. If they are correct in their assumptions I will look at the email. If they are wrong I will ignore it.

Some people will opt for the free email networks because they don't want to pay for their emailing and don't care if they get a lot of spam. Fine.  

Since the networks wouldn't be in business if they didn't already have a secure way of correctly billing each other, it seems like it ought to be feasible.

Eric D

1 Comment

  • 920 Days Ago
  • 08/07/2009

Pay *anyone* -- it's easier

If I can see that the sender has paid anyone something, I'm willing to look at the message, because a spammer can't afford to do that. This makes implementation easier than it would be if the payment had to go to the email recipient, or to the originating ISP:

A sender need only set up an account with any known, trusted payment-certifier, which will sign email messages as being paid-for. A spam filter need only check to see whether a message has been signed. Assume 0.01 cents per message, and a $10.00 payment-account lets you send 100,000 paid-for messages. The payments can go anywhere but a spammer's own pocket -- a designated charity, a dollar-bill bonfire, whatever.
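A minimal sketch of such a payment-certifier follows. It is an assumption-laden toy, not a proposed standard: the class and function names are hypothetical, and it uses a shared-secret HMAC from Python's standard library, which means a filter would have to ask the certifier (or share its key) to verify a tag. A real deployment would more plausibly use public-key signatures so that any filter could check them independently.

```python
import hashlib
import hmac

PRICE_CENTS_PER_MESSAGE = 0.01  # figure taken from the comment above


def messages_for_deposit(deposit_dollars):
    """How many paid-for messages a deposit buys at the assumed price."""
    return round(deposit_dollars * 100 / PRICE_CENTS_PER_MESSAGE)


class PaymentCertifier:
    """Hypothetical certifier: debits a prepaid balance and signs messages."""

    def __init__(self, secret_key):
        self._key = secret_key
        self._credits = {}  # sender -> remaining paid-for messages

    def deposit(self, sender, dollars):
        self._credits[sender] = self._credits.get(sender, 0) + messages_for_deposit(dollars)

    def sign(self, sender, message):
        """Debit one credit and return a hex tag, or None if no credit is left."""
        if self._credits.get(sender, 0) < 1:
            return None
        self._credits[sender] -= 1
        return hmac.new(self._key, message, hashlib.sha256).hexdigest()

    def verify(self, message, tag):
        """Check that a tag was issued by this certifier for this exact message."""
        if tag is None:
            return False
        expected = hmac.new(self._key, message, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, tag)
```

With these numbers, `messages_for_deposit(10.0)` gives 100,000, matching the figure in the comment.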

The incentive to pay, and to use an email service that notices payments: For serious email users to be able to exchange messages that don't get thrown away. A few more details are left as an easy exercise for the reader.

Yahoo could do it. Google could do it. Others could do it. There's no need to change infrastructure.

I want this. Do you?

Phineas

127 Comments

  • 928 Days Ago
  • 07/30/2009

Much Much More...

Set up your penny-per-email and see if the fees escalate.

goulding

1 Comment

  • 928 Days Ago
  • 07/30/2009

a penny is way too much

While I realize most people are just using a penny as a trivially small amount, I think it's worth noting that one could probably use a far smaller amount to deter spammers.

See "Spam gets 1 response per 12,500,000 emails."

If you accept that rate, even if the spammers make $100 profit for each "response" (which seems generous), one would still need to charge only 0.0008 cents per email (or 1 penny every 1250 emails) in order to eliminate that profit.
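The break-even arithmetic can be checked directly. The sketch below uses exact fractions to avoid rounding noise; the function name is made up for illustration, and the inputs are the figures quoted in the comment.

```python
from fractions import Fraction


def break_even_fee_cents(profit_dollars_per_response, emails_per_response):
    """Per-email fee, in cents, that cancels the expected profit per email."""
    return Fraction(profit_dollars_per_response * 100, emails_per_response)


fee = break_even_fee_cents(100, 12_500_000)
print(float(fee))  # 0.0008 (cents per email)
print(1 / fee)     # 1250 (emails per penny)
```

Any fee above this level makes the expected return per message negative under the stated response rate and profit.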

OnSeeker

1 Comment

  • 927 Days Ago
  • 07/31/2009

A risky thing!

"In the cloud" technologies are really new and still a little unstable, because good emails or other things can get stopped before one gets access to them.

I also think that spam can be reduced on your own PC. I use a very good antivirus with an excellent detection rate for SPAM and viruses!

I'm using BitDefender and I recommend it!

Mulea

1 Comment

  • 927 Days Ago
  • 07/31/2009

bad idea

On the original article: because I've spent time securing my server and closing ports I'll be classified as a spammer? Gee thanks!

On the charge scheme: so who's going to deliver these payments? How much are they going to charge? Who will keep them honest?

The last attempt to do something like this has already sold out for money. Look at spamhaus. They run a "spam list" that most of the major free email carriers subscribe to. It includes every address provided by any of the cable companies. Even if you don't send spam you are blocked from sending email just because you're using cable for your service provider. There's no opt out and nobody will remove me from that list unless I pay for it.

You guys really need to live closer to the real world for a while.

Atwes

1 Comment

  • 927 Days Ago
  • 07/31/2009

Fixing a flawed system...

This entire discussion has been about fixing a communication system with fundamental flaws. The protocols used for email were never designed for security. It's far too easy to spoof email information (sender, sender IP, etc). Instead of trying to patch another layer of software on top of the protocols to detect and deny spam, why not generate a new and improved email protocol?
Email 2.0 could be designed for security and reduction of spam.

rhsimard

1 Comment

  • 927 Days Ago
  • 07/31/2009

IP address spoofing

The article says that IP addresses are easy to fake.  That's true for such things as, for one example, DNS cache poisoning, where attackers firing intense bursts of UDP packets with faked IP addresses at target servers have no need for any kind of reply, but a TCP connection (or, for that matter, a data exchange over UDP where the recipient's application communicates with the sending application and does not blindly accept what's thrown at it) can't work without valid IP addresses at both ends.

Perhaps what the author means is faked Received: headers.  Essentially, the spammer can stick anything he likes into the headers and body of a message, including bogus Received: headers, but once it leaves his control, Received: headers (which include the IP addresses of all concerned) are as reliable as the servers that generate them. Presumably at least the last SMTP server along the line, normally operated by the recipient's provider, is trustworthy.
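To make the point about Received: headers concrete, here is a small sketch using Python's standard `email` package. The sample message, host names, and the `received_ips` helper are invented for illustration, and the regular expression only handles the common case of an IPv4 address in square brackets.

```python
import re
from email import message_from_string

IPV4_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def received_ips(raw_message):
    """Return bracketed IPv4 addresses from Received: headers, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        match = IPV4_IN_BRACKETS.search(header)
        ips.append(match.group(1) if match else None)
    return ips


# Hypothetical message: the top header was added by the recipient's own server;
# the one below it could have been forged by the sender.
SAMPLE = (
    "Received: from mail.example.net (mail.example.net [203.0.113.7])\n"
    "\tby mx.recipient.example with ESMTP; Wed, 29 Jul 2009 10:00:00 -0400\n"
    "Received: from trusted.bank.example ([192.0.2.1])\n"
    "\tby mail.example.net; Wed, 29 Jul 2009 09:59:00 -0400\n"
    "From: someone@example.net\n"
    "Subject: hello\n"
    "\n"
    "body\n"
)

print(received_ips(SAMPLE))  # ['203.0.113.7', '192.0.2.1']
```

Each server prepends its own header, so the first entry is the one recorded by the recipient's provider and is the only address in the list that the recipient can take at face value.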

pegdashfab

1 Comment

  • 926 Days Ago
  • 08/01/2009

Re: IP address spoofing

I agree with rhsimard: IP addresses are not easy to spoof in an SMTP conversation (and I am guessing that Dean Malmgren was misquoted).

Feamster's idea seems promising to me.  I look forward to reading the paper.

Mighty_BPFH

1 Comment

  • 926 Days Ago
  • 08/01/2009

@ Rachel

"Bots, automated systems that are often used to send out reams of spam, tend to keep open only the e-mail port, known as the Simple Mail Transfer Protocol port."

Wow ... In one sentence you just shoot the whole research down.
So they are not even able to understand SMTP, a protocol invented 29 years ago?
Hum very promising, ok we can move...

rosh

1 Comment

  • 924 Days Ago
  • 08/03/2009

Why use E-mail ?

Why continue with a communication tool (e-mail) that is only 10% correct? I would say we should stop this "free push" way of communicating and use something that "pulls on demand" from the "trusted senders".  Contrast e-mail spam levels with those of social networking sites, and you could get an idea of what I am saying.

Killerhawk

1 Comment

  • 878 Days Ago
  • 09/18/2009

The trouble with trouble is....

Maybe someone will pose a question that certain others do not want to tolerate at any cost: how many IUSERS (Internet users) would like to subscribe to a non-commercial version of the Internet with the following characteristics:
1. Only private individuals (real humans - no corporations allowed - ads, photos, porn (;>, etc., etc.)...
2. A FULLY COMMERCIAL ONLY Salesnet, with all of the accompanying yada, yada...
3. Each 'net' space to have distinct, non-cooperating ips systems, with a $5,000,000 fine and 10-20 years minimum jail time (no parole until ninety percent of the sentence is served), including HARD LABOR (and no "white, white collar" prison time category), imposed by the court system on any and all who would violate the non-cooperative ips (info sharing) feature.
Before anyone starts B'ing about all of the expenses and improbabilities, impossibilities, and whining for the "$$$-tit", you must submit a documented dissertation-level economic study which outlines clearly the monetary costs to everyone in America, at the very least, who've struggled to recover their "good name". Indicate some vehicle that quantifies the value of opportunities lost because "The Commercial Wizzes of Sleez'n Greed and their Political Pole Dancers" refused to implement or even recognize the words PERSONAL PRIVACY.

Please enter ALL complaints, criticisms, and whiny statements in the BOX following the next period (punctuation mark!!!).
