One problem with tweets is that people often lard them up with so-called “hashtags.” These are labels made up of a pound sign (#) followed by a word or phrase that names a topic, often a very popular current one such as “Nexus One” or “Earthquake.” When a hashtag is included in a tweet, that tweet shows up when other Twitter users click the same hashtag elsewhere on the site.
While such tags can usefully maximize a tweet’s exposure, they can also be red flags for low-quality tweets and spam-like content, Singhal says. He wouldn’t get into details, but he says Google modeled this hashtagging behavior in ways that tend to reduce the exposure of low-quality tweets. “We needed to model that [hashtagging] behavior. That is the technical challenge which we went after with our modeling approaches,” he says.
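Google has not said how it models hashtagging, so the following is only a toy illustration of the general idea, not the company's method. It assumes a tweet already has some relevance score and lowers that score when hashtags make up an unusually large share of the tweet. The function names, the `free_density` threshold, and the `penalty` weight are all hypothetical.

```python
def hashtag_density(tweet):
    """Return the fraction of whitespace-separated tokens that are hashtags."""
    tokens = tweet.split()
    if not tokens:
        return 0.0
    tags = [t for t in tokens if t.startswith("#") and len(t) > 1]
    return len(tags) / len(tokens)


def adjusted_score(base_score, tweet, free_density=0.25, penalty=1.0):
    """Scale down base_score when hashtag density exceeds free_density.

    Both free_density and penalty are made-up illustrative values.
    """
    excess = max(0.0, hashtag_density(tweet) - free_density)
    return base_score * max(0.0, 1.0 - penalty * excess)


normal = "Big quake near the coast #Earthquake"
stuffed = "#Earthquake #NexusOne #win buy now"

normal_score = adjusted_score(1.0, normal)    # one tag in six tokens: no penalty
stuffed_score = adjusted_score(1.0, stuffed)  # three tags in five tokens: penalized
```

A tweet with a single relevant tag keeps its score, while one stuffed with tags is pushed down. A real system would presumably combine many more signals than tag density alone.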
Another problem: when someone searches for “Obama,” how to sift through White House press tweets and thousands of others to find the most timely and topical information. Google scans tweets to find the “signal in the noise,” he says. Such a “signal” might be a sudden surge of tweets and blog posts that mention “Cambridge police” or “Harry Reid” near mentions of “Obama.” By looking out for such signals, Google is able to furnish real-time hits that contain the freshest subject matter even for very common search terms.
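One simple way to picture this kind of signal is to compare which words accompany a search term in recent posts against which words accompanied it in an earlier window. The sketch below does that with plain word counts. It is an assumption-laden illustration rather than a description of Google's system: the function names, the `min_count` and `ratio` thresholds, and the sample posts are all invented for the example.

```python
import re
from collections import Counter


def cooccurring_terms(posts, query):
    """Count words appearing in posts that mention the query.

    Returns (counts, number_of_posts_mentioning_query).
    """
    counts = Counter()
    matching = 0
    q = query.lower()
    for text in posts:
        words = set(re.findall(r"[a-z']+", text.lower()))
        if q in words:
            matching += 1
            counts.update(words - {q})
    return counts, matching


def burst_terms(recent_posts, baseline_posts, query, min_count=2, ratio=2.0):
    """Return words that co-occur with the query far more often in the
    recent window than in the baseline window (hypothetical thresholds)."""
    recent, n_recent = cooccurring_terms(recent_posts, query)
    baseline, n_base = cooccurring_terms(baseline_posts, query)
    if n_recent == 0:
        return {}
    bursts = {}
    for term, count in recent.items():
        if count < min_count:
            continue
        recent_rate = count / n_recent
        # Add-one smoothing so unseen baseline terms do not divide by zero.
        base_rate = (baseline.get(term, 0) + 1) / (n_base + 1)
        score = recent_rate / base_rate
        if score >= ratio:
            bursts[term] = score
    return bursts


baseline_posts = [
    "obama speaks on health care",
    "obama health care vote",
    "obama visits ohio",
    "health care debate",
]
recent_posts = [
    "obama comments on cambridge police",
    "cambridge police arrest discussed by obama",
    "obama health care",
]

bursts = burst_terms(recent_posts, baseline_posts, "obama")
print(sorted(bursts))  # ['cambridge', 'police']
```

Here “cambridge” and “police” stand out because they suddenly appear alongside “obama,” while long-running companions such as “health” and “care” do not. A search engine could then favor fresh posts containing those bursting terms.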
In the future, both Twitter and Google hope to improve the relevance of search results in all contexts by incorporating geolocation data, which can be attached to postings sent from smartphones. In general, real-time search “is evolving,” says Dylan Casey, the Google product manager for real-time search. “I talk with the guys at Twitter on a regular basis to learn where the feature is going. We get feedback from them, we give them feedback, and our engineers collaborate. It is truly symbiotic.”
Singhal adds that Twitter is hardly the only source of real-time information. “Twitter is indeed a very important component of the real-time Web. However, what we are observing is that it is just one of the components. There’s a lot of value in news, blogs, and Web pages that are being generated in real-time, because news organizations work very hard to get quality to a certain level,” he says. “Twitter is indeed useful because it is short-form content. However, we are finding that the real-time Web is much bigger.”