
MIT Technology Review



EveryZing’s system is built on two technologies from Boston-based BBN. The core speech-to-text system, called Byblos, has received $50 million in research funding through a series of government grants over the past five years, says Wilde. Using probabilistic machine-learning algorithms, the system takes one minute to convert each minute of audio content into text.

The second part of the technology, says Wilde, consists of algorithms that process the content of the text. BBN’s natural-language technology draws on huge stores of words and phrases for context, which helps it make sense of a video. For instance, a news segment about health might use language that’s specific to the medical field; in that case, the system would be able to recognize certain obscure words. Understanding the meaning of the text is a powerful tool, says Wilde, because it lets EveryZing offer users high-level concepts with which to fine-tune their search. Importantly, it also enables the company to pair targeted ads with the right content.

The time is right for a video search engine with these capabilities, says Carnegie Mellon’s Stern. “Video is a much more compelling and entertaining medium than just plain text,” he says, and now so much of it is available on the Internet. He adds that BBN’s 80 percent accuracy is “really quite a feat,” and it should be adequate for searching the troves of content online.

While the technology is good, it’s not perfect, says EveryZing’s Wilde. Accuracy drops when background music is present or when several people are talking at once. But for the infotainment and news market that the company is targeting right now, the technology should offer a significant improvement over what’s currently available, he says. “I think we’ll look back in a couple of years and say, ‘Of course the content of multimedia files needs to be searchable,’” says Wilde. “It’d be as if the Web pages could only be searched by title and tag.”


Credit: everyzing

Tagged: Computing, Google, search, video, Yahoo!, speech recognition
