
Mining for Meaning

Software

Online newsgroups are popular gathering spots; over the years they’ve logged millions of opinions on topics ranging from politics to appliances. The largest newsgroup network, Usenet, boasts 500 million messages posted since 1995; unlike postings in chat rooms and online forums, such messages tend to be uncensored, and preserved.

All these postings add up to a trove of public opinion that sociologists, linguists and market researchers would love to analyze, and software projects at IBM and the University of California at Berkeley are beginning to develop the analytical tools they’ll need. Unlike Web search engines, which try to find the best matches for any one query, these efforts focus on understanding how communities of individuals interact online, and how their opinions evolve.

To begin taking on this difficult task, IBM’s Babble software depicts conversations as dynamic circular graphs in which icons representing frequent talkers cluster at the center, and less chatty participants move toward the circumference. “People do in fact cluster together when talking, then drift apart,” says Thomas Erickson, research analyst at IBM.
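The kind of layout described here can be illustrated with a minimal sketch. It is not IBM's code: the function name `radial_layout`, the scoring rule (distance from the center shrinks as a participant's message count grows) and the even spacing of angles are all assumptions made for illustration.

```python
import math
from collections import Counter


def radial_layout(authors):
    """Place each participant on a unit disc: busiest at the center.

    `authors` is a list with one entry per message (the poster's name).
    Hypothetical scoring: radius = 1 - (messages / busiest poster's messages).
    """
    counts = Counter(authors)
    if not counts:
        return {}
    busiest = max(counts.values())
    layout = {}
    # Sort names so the angular placement is deterministic.
    for i, (name, n) in enumerate(sorted(counts.items())):
        angle = 2 * math.pi * i / len(counts)
        radius = 1.0 - n / busiest  # 0.0 for the most frequent talker
        layout[name] = (radius * math.cos(angle), radius * math.sin(angle))
    return layout
```

Recomputing the layout over a sliding window of recent messages would make the icons drift toward and away from the center over time, which is one plausible way to get the "dynamic" behavior the article describes.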

But that’s only a first step. Beyond charting the chatters lies the task of examining what they’re saying. At the University of California, Berkeley, computational linguist Warren Sack’s software maps how often words or phrases appear, and how close they are to one another. “In effect you’re building a thesaurus of terms that relate directly to the conversation being studied,” says Sack. “You can see constellations of conversations, and see which topics are being discussed more than others.” One test of this Conversation Map tool helped pinpoint when online participants began thinking of Gulf War syndrome as a “disease” rather than a cluster of symptoms.
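A rough sketch of the counting idea, how often terms appear and how close they sit to one another, might look like the following. It is not the Conversation Map implementation; the tokenizer, the `window` size and the function names `term_stats` and `related_terms` are assumptions chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def term_stats(messages, window=4):
    """Count term frequencies and how often two terms fall within `window` words."""
    freq = Counter()
    pairs = Counter()
    for message in messages:
        tokens = tokenize(message)
        freq.update(tokens)
        for i, left in enumerate(tokens):
            # Pair each word with the next `window` words in the same message.
            for right in tokens[i + 1 : i + 1 + window]:
                if left != right:
                    pairs[tuple(sorted((left, right)))] += 1
    return freq, pairs


def related_terms(pairs, term, top=3):
    """Return the terms that most often appear near `term`."""
    scores = Counter()
    for (a, b), n in pairs.items():
        if a == term:
            scores[b] += n
        elif b == term:
            scores[a] += n
    return scores.most_common(top)
```

Run over postings grouped by month, a tally like this could in principle show when a word such as "disease" starts turning up near "syndrome", the sort of shift the Gulf War example refers to. A real tool would also need to filter out common words such as "is" and "a".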

Sack and others say they’re still years away from a commercial product. When the software is available, though, market researchers just might be the customers: with the right tools, they could turn newsgroups containing millions of opinions into the ultimate focus group.
