The loser in all this is the consumer who is conned into making a purchase decision based on false premises. And for the moment, consumers have little legal redress or even ways to spot the practice.
Today, Cheng Chen at the University of Victoria in Canada and a few pals describe how Cheng worked undercover as a paid poster on Chinese websites to understand how the Internet Water Army works. He and his friends then used what he learnt to create software that can spot paid posters automatically.
Paid posting is a well-managed activity involving thousands of individuals and tens of thousands of different online IDs. The posters are usually given a task to register on a website and then to start generating content in the form of posts, articles, links to websites and videos, even carrying out Q&A sessions.
Often, this content is pre-prepared or the posters receive detailed instructions on the type of things they can say. And there is even a quality control team who check that the posts meet a certain ‘quality’ threshold. A post is not validated if it is deleted by the host or is composed of garbled words, for example.
Having worked undercover to find out how the system worked, Cheng and co then studied the pattern of posts that appeared on a couple of big Chinese websites: Sina.com and Sohu.com. In particular, they studied the comments on several news stories about two companies that they suspected of paying posters and that were involved in a public spat over each other’s services.
The Sina dataset consisted of over 500 users making more than 20,000 comments; the Sohu dataset involved over 200 users and more than 1,000 comments.
Cheng and co went through all the posts manually, identifying those they believed were from paid posters, and then set about looking for patterns in behaviour that could differentiate them from legitimate users. (Just how accurate their initial impressions were is a potential problem, they admit, but it is the same one that spam filters also have to deal with.)
They discovered that paid posters tend to post more new comments than replies to other comments. They also post more often, with 50 per cent of them posting every 2.5 minutes on average. And they move on from a discussion more quickly than legitimate users, discarding their IDs and never using them again.
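To make those three signals concrete, here is a minimal Python sketch of how they might be computed per user from a comment log. It is an illustration only, not the researchers' software: the function name `poster_features`, the `(user_id, timestamp, is_reply)` record layout and the sample data are all assumptions made up for this example.

```python
from collections import defaultdict
from statistics import median


def poster_features(comments):
    """Summarise each user's posting behaviour.

    `comments` is an iterable of (user_id, timestamp_seconds, is_reply)
    tuples. This record layout is hypothetical, chosen for illustration.
    """
    by_user = defaultdict(list)
    for user_id, timestamp, is_reply in comments:
        by_user[user_id].append((timestamp, is_reply))

    features = {}
    for user_id, posts in by_user.items():
        posts.sort()  # order each user's posts by time
        times = [t for t, _ in posts]
        replies = sum(1 for _, is_reply in posts if is_reply)
        gaps = [later - earlier for earlier, later in zip(times, times[1:])]
        features[user_id] = {
            # Share of posts that reply to someone else (low = mostly new comments).
            "reply_ratio": replies / len(posts),
            # Typical wait between consecutive posts; None if only one post.
            "median_gap_s": median(gaps) if gaps else None,
            # How long the ID stayed active in the discussion.
            "active_span_s": times[-1] - times[0],
        }
    return features


# Invented sample data: u1 posts only new comments in quick succession,
# u2 replies once and comes back a day later.
sample = [
    ("u1", 0, False), ("u1", 120, False), ("u1", 300, False),
    ("u2", 0, True), ("u2", 86400, False),
]
features = poster_features(sample)
```

In this toy data, `u1` shows the pattern described above: no replies, short gaps between posts and a brief active span. Any thresholds for flagging a user would have to be tuned on labelled data, which this sketch does not attempt.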