How Spammers Use Low-cost Labor to Solve CAPTCHAS
Workers in Russia, Southeast Asia, and China are paid a pittance to solve millions of CAPTCHAS.
Christopher Mims 08/11/2010
What can only be described as an epic new analysis by a cadre of researchers at UC San Diego has uncovered the seedy underbelly of a sophisticated, highly automated, world-wide network of services that help email, blog and forum spammers get past the CAPTCHAS that are designed to keep them out.
A CAPTCHA, for those of you not up on your reverse Turing tests, is that little bit of distorted text you have to type back at a webpage when you're trying to sign up for a new email account or leave a comment on a forum or blog that happens to use them. The original idea was that a CAPTCHA would prevent spammers from flooding public forums with their dreck, because CAPTCHAS are by definition easy for humans to solve but challenging or impossible for computers to get right often enough. A computer that keeps guessing wrong will be recognized as a computer after its 6th or 7th failure.
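The failure-count idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular site does it: the class name `FailureTracker`, the `client_id` key, and the threshold of 6 (borrowed from the "6th or 7th failure" figure) are all hypothetical.

```python
from collections import defaultdict


class FailureTracker:
    """Counts failed CAPTCHA attempts per client and flags likely bots."""

    def __init__(self, max_failures=6):
        # Hypothetical threshold; a real site would tune this value.
        self.max_failures = max_failures
        self.failures = defaultdict(int)

    def record_attempt(self, client_id, solved):
        """Record one attempt and return True if the client is now flagged."""
        if not solved:
            self.failures[client_id] += 1
        return self.is_flagged(client_id)

    def is_flagged(self, client_id):
        # A client is flagged once its failure count reaches the threshold.
        return self.failures.get(client_id, 0) >= self.max_failures


tracker = FailureTracker(max_failures=6)
for _ in range(6):
    flagged = tracker.record_attempt("bot-1", solved=False)
print(flagged)  # True: six straight failures reach the threshold
```

A counter like this only catches a solver that fails often. A human who answers correctly never trips it, whoever is paying for the answer.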
But the inventors of CAPTCHAS probably didn't anticipate this: hundreds, possibly thousands, of laborers working for less than $50 a month to solve an endless stream of CAPTCHAS. Automated middlemen deliver the puzzles to them and sell the results to spammers in real time, so that spam bots can use those solutions to post to forums and blogs and to set up fraudulent email accounts, says a paper about to be delivered at the USENIX Security Symposium.
Clever analysis of the location of the workers involved in this scheme revealed that they are based in India, Russia, Southeast Asia and China. The system is so efficient at delivering CAPTCHAS to workers in these remote locales that the average time for delivery of a solution hovers around 20 seconds.
One of the CAPTCHA services the researchers experimented with - ImageToText - was so good that its workers were able to deliver correct results in "a remarkable range of languages," including Dutch, Korean, Vietnamese, Greek and Arabic.
Even setting the sample CAPTCHAS in Klingon - a language readable by so few people on earth that the scientists thought they could use it as a control in their experiment - wasn't enough to stop ImageToText, whose workers managed to solve a handful of these CAPTCHAS despite odds of less than one in one thousand of their randomly getting the right answer.
The results of this landmark study show that a number of sites, including those run by Microsoft, AOL, Google and the widely used reCaptcha, are regularly compromised by spammers employing these services.
Here's an actual screenshot of what workers for these services see when solving a CAPTCHA:

The researchers conclude that their investigation, which included interviews with an anonymous "Mr. E" who actually runs one of these services, proves that for sophisticated spammers, CAPTCHAS aren't so much a barrier as a cost of doing business.





mattgroom
290 Comments
Solutions
Offer a choice to the user:
1. Make the user enter the answer in under 10 seconds. Make them do this three times.
2. Take a computer-specific ID by allowing a piece of software to grab information from the person's machine in real time.
That computer ID will only work for 5 created accounts.
I prefer number 2.
I have additional things I could say to improve the computer imaging of these, but if I do that they could be picked up by the wrong people. I hate spam.
rsanchez1
213 Comments
Re: Solutions
I think the very best solution is to spread awareness of spam. If people don't click spam links, or if those who are tricked into clicking one immediately leave the site they were brought to, then the spammers won't have people to spam. EVERYONE has to do this for it to be effective. I read once that as few as 1 in 100,000 people have to fall for spam for spam to be effective. This is a problem with internet culture, and if you tell people that by not responding to spam they're helping poor people in Asia who are being exploited for spam, you'll get people to stop responding to spam.
jsmitty2212
1 Comment
Re: Solutions
Those are both bad ideas.
#1 has all sorts of problems. Most people would have a hard time solving a good CAPTCHA in under 10 seconds, but I bet someone who does it all day would get to be faster than average at it. I have seen people take minutes to figure out what one says. That is what makes the 20 second number so amazing: even with the overhead involved, the service gets the correct solution really fast. And making people solve 3 in a row is going to frustrate legitimate users, while people being paid to solve them will just keep plugging away.
As for #2, you just hand-waved away all the important details. What piece of computer-specific information, and what software should you use to capture it? Flash? Java? Silverlight? Whatever you pick, there are going to be platforms like smartphones that don't run it. And do you really want to give some random website permission to install software that can read your hardware details? Of course, as soon as the software is created, people will start working on cracking it. You will replace a simple image with something that has privacy/security problems and won't work on many devices, and it will just be a speed bump.
Ultimately, the problem is that, barring strong AI, there isn't a technical solution to the problem of determining if you like the intent of the human behind the keyboard.