Real-time search means retrieving information about what’s happening, everywhere, now. The amount of real-time data that’s available is growing rapidly with the proliferation of mobile devices. At Yahoo, we have already begun to incorporate real-time search results from Twitter and sources of developing news. But the scope of real-time data reaches far beyond tweets and Facebook updates. For example, users are uploading photos on Flickr to show what’s happening around them, chatting about the latest news, and answering questions live on sites like Yahoo Answers. That’s just the beginning of the real-time information that can be made available to search engines (see “TR10: Real-Time Search”).
The sheer amount of real-time data presents unique challenges for search. Because a lot of the data is nonauthoritative, noisy, or spammy, search engines need to build trust models that can determine what data is important and influential. For example, retweets are seldom useful results, and some data providers carry more authority than others. Search engines must also determine the right balance between timeliness and relevance for each user. Further, real-time data needs to be indexed and updated almost instantaneously. A few years ago, search engines took several hours to index new content. Today, they take only a few seconds, but they need to become even faster.
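To make the trade-off concrete, here is a minimal sketch of how a ranking function might weigh timeliness against relevance and source trust. It is an illustration only, not a description of how Yahoo or any other search engine scores results: the `Item` fields, the exponential freshness decay, the ten-minute half-life, and the retweet penalty are all assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    text: str
    relevance: float          # 0..1, hypothetical query-match score
    authority: float          # 0..1, hypothetical trust score for the source
    age_seconds: float        # time elapsed since the item was published
    is_retweet: bool = False  # repeats of existing content are down-weighted


def score(item, half_life_seconds=600.0, retweet_penalty=0.5):
    """Combine relevance, source authority, and freshness into one score.

    Freshness halves every `half_life_seconds` (an assumed decay model).
    """
    freshness = 0.5 ** (item.age_seconds / half_life_seconds)
    result = item.relevance * item.authority * freshness
    if item.is_retweet:
        result *= retweet_penalty  # assumed fixed penalty for retweets
    return result


def rank(items, **kwargs):
    """Return items ordered from highest to lowest score."""
    return sorted(items, key=lambda it: score(it, **kwargs), reverse=True)
```

In this sketch, a shorter half-life favors the newest items, while a longer one lets relevant and authoritative items stay near the top for longer. Tuning that single parameter per user is one simple way to express the balance between timeliness and relevance.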
With the challenges of using real-time data come some exciting possibilities for reimagining search. As in the early days of the Web, when Yahoo built a directory to identify authoritative sites, we are seeing search engines building better trust models. Aggregators are emerging to qualify the reputations of sources. Many other types of self-organization are possible in this new realm.