The company plans to release demo versions of the search engine on its Powerlabs website, where consumers can test-drive the product beginning in September. User feedback will be taken into consideration as Powerset makes the final product, which is slated for release next year.
“The key challenge is to get the system to the point where people can understand how to use it and get real value out of these systems even though they are not perfect,” Pell says. “We are finally at the point where we are going to cross that threshold.”
IBM is also developing a semantic search engine, code-named Avatar, which is targeted at enterprise and corporate customers; it’s currently in beta testing within IBM. Project manager Shivakumar Vaithyanathan says that the hardest problem to overcome with natural-language search is finding a way to extract higher-level semantics from large documents while preserving precision and speed.
IBM’s engine is targeted toward searches of internal documents such as e-mail and intranet correspondence. It’s designed to be used in cases in which the user seeks to find one particular piece of information that could not be easily located, such as a specific phone number or package-tracking URL that’s located in one of thousands of e-mails that a person may have stored on her computer.
Avatar’s semantic search seeks to develop “interpretations” of keyword queries that model the real intent behind the query. For example, if the query is “phone number,” the search engine would scan the thousands of e-mails that a person may have received for strings that resemble a phone number. The search engine would then return the information the user is actually looking for, and not just e-mails that happen to contain the words in the query.
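The difference between matching keywords and interpreting a query can be illustrated with a small sketch. This is not IBM's implementation, whose internals the article does not describe; the `INTERPRETATIONS` table, the function names, the sample e-mails, and the simplified U.S.-style phone pattern are all assumptions made for illustration.

```python
import re

# Hypothetical table mapping a keyword query to a pattern that describes
# what the user most likely wants to find (an assumption, not Avatar's design).
INTERPRETATIONS = {
    # Simplified pattern for numbers such as 555-123-4567 or 555.123.4567.
    "phone number": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}

# Made-up sample mailbox.
SAMPLE_EMAILS = [
    "Please review the phone number policy before Friday.",
    "Hi, you can reach me at 555-123-4567 after lunch.",
]


def keyword_search(query, emails):
    """Return indexes of e-mails that literally contain the query words."""
    needle = query.lower()
    return [i for i, text in enumerate(emails) if needle in text.lower()]


def interpreted_search(query, emails):
    """Return (e-mail index, matched text) pairs for the query's interpretation.

    Falls back to plain keyword matching when no interpretation is known.
    """
    pattern = INTERPRETATIONS.get(query.lower().strip())
    if pattern is None:
        return [(i, query) for i in keyword_search(query, emails)]
    results = []
    for i, text in enumerate(emails):
        for match in pattern.finditer(text):
            results.append((i, match.group()))
    return results
```

On this sample data, `keyword_search("phone number", SAMPLE_EMAILS)` finds only the first e-mail, which mentions the words but contains no number, while `interpreted_search` returns the actual number from the second e-mail. A production system would need far richer interpretations than a single regular expression.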
To quickly extract all the meaningful information from both the underlying text and the query, Vaithyanathan says, it’s necessary to use either a lot of computers or a large number of people. Both options are expensive and can be difficult to implement. IBM hopes to find a way to extract meaning in less time and with fewer machines.
“If we do a better job of extracting, then we can do a better job of answering the questions that users give,” Vaithyanathan says.