Technology Review - Published By MIT
More-Accurate Video Search

Speech-recognition software could improve video search.

By Kate Greene

Tuesday, June 12, 2007

Boston-based startup EveryZing has launched a search engine that it hopes will change the way that people search for audio and video online. Formerly known as PodZinger, a podcast search engine, EveryZing is leveraging speech systems developed by technology company BBN that can convert spoken words into searchable text with about 80 percent accuracy. This bests other commercially available systems, says EveryZing CEO Tom Wilde.

Audio cues: A new video and audio search engine can convert audio into a text transcript with 80 percent accuracy. That’s good enough to show snippets of the transcript, direct users to the spot in the file where a search term appears, and summarize key concepts.
Credit: EveryZing

This high accuracy is enabling new search capabilities, Wilde says, such as the ability to provide entire transcripts of video and audio, and the ability to direct people to the exact spot in a file where a word or phrase is spoken. The technology will also let the company provide targeted ads associated with specific content, much in the way that Google provides ads based on the text of a Web page.

"The big challenge [in online video and audio] ... is the opaqueness of media content," says Wilde. It's extremely difficult to know what range of content is inside a video or audio clip. "The problem we want to solve," he says, "is the discoverability of multimedia within Web search." EveryZing does this by extracting the content of multimedia files and outputting text so that it can take advantage of the preexisting text-search tools developed by the likes of Google and Yahoo.

The Web is exploding with multimedia from YouTube, podcasts, TV news reports, and National Public Radio shows. But it's still difficult to search for "Barack Obama" and pull up all the instances on the Web in which his name is mentioned. Typically, the titles of clips and the tags that people assign to them don't contain enough information to give useful search results. This is why, over the past couple of years, a handful of companies have begun exploring the use of audio content as a guide. For instance, video search engine Blinkx uses speech-recognition technology to scour the entire Web for relevant content, aggregating it on a single site, much as Google aggregates Web pages. (See "Surfing TV on the Internet.")

EveryZing's business goals differ from Blinkx's, says Wilde, and he suspects that the two approaches can complement each other. "We're about merchandising content, not trolling the Web," he says. EveryZing (which, like Blinkx, provides a search portal for Web surfers) mainly wants to partner with content providers to make their multimedia searchable. For instance, the company wants to convert all the audio and video content within ABC.com into searchable text, adding time stamps to that text (as well as preexisting closed-captioned text) so a person can immediately jump to a specific word in a clip.
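
To make the time-stamp idea concrete, here is a minimal sketch in Python of how a time-stamped transcript could let a player jump to the moment a phrase is spoken. It is an illustration only, not EveryZing's or BBN's implementation: the transcript format (word, start time in seconds), the sample data, and the function names build_index and find_phrase are all assumptions made for this example.

```python
from collections import defaultdict

def build_index(transcript):
    """Map each lowercased word to the positions where it occurs."""
    index = defaultdict(list)
    for pos, (word, _start) in enumerate(transcript):
        index[word.lower()].append(pos)
    return index

def find_phrase(transcript, index, phrase):
    """Return the start times (in seconds) at which the phrase is spoken."""
    words = phrase.lower().split()
    if not words:
        return []
    hits = []
    # Look up candidate positions for the first word, then verify the rest.
    for pos in index.get(words[0], []):
        window = transcript[pos:pos + len(words)]
        if [w.lower() for w, _ in window] == words:
            hits.append(transcript[pos][1])
    return hits

# Hypothetical recognizer output: (word, start time in seconds).
transcript = [
    ("Senator", 0.0), ("Barack", 0.6), ("Obama", 1.1), ("spoke", 1.7),
    ("at", 2.0), ("a", 2.1), ("rally", 2.3), ("today", 2.9),
    ("Obama", 3.8), ("said", 4.3),
]
index = build_index(transcript)
print(find_phrase(transcript, index, "Barack Obama"))  # [0.6]
print(find_phrase(transcript, index, "obama"))         # [1.1, 3.8]
```

A player could seek to any of the returned offsets. A real system would also have to cope with recognition errors, which this sketch ignores.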

In addition, unlike Blinkx's current technology, BBN's technology lets EveryZing extract high-level concepts that originally might not have been searched for. If someone searched for "Barack Obama," for instance, EveryZing might also offer other keywords in the clip, such as "rally."
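
The article does not say how BBN's technology extracts concepts. As a deliberately simple stand-in, the Python sketch below suggests related keywords by counting the most frequent non-trivial words in a transcript, excluding the query itself. The stopword list, the sample text, and the function name related_keywords are assumptions made for illustration.

```python
from collections import Counter

# A tiny, hypothetical stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "at", "the", "in", "of", "and", "to", "his", "for", "was"}

def related_keywords(transcript_text, query, top_n=3):
    """Suggest frequent transcript words that are not part of the query."""
    query_terms = set(query.lower().split())
    # Strip surrounding punctuation and normalize case.
    words = [w.strip(".,!?\"'").lower() for w in transcript_text.split()]
    counts = Counter(
        w for w in words
        if w and w not in STOPWORDS and w not in query_terms
    )
    return [word for word, _ in counts.most_common(top_n)]

text = ("Barack Obama spoke at a rally in Iowa. The rally drew a large crowd, "
        "and Obama thanked the crowd for the rally.")
print(related_keywords(text, "Barack Obama", top_n=2))  # ['rally', 'crowd']
```

Raw frequency is only a rough proxy for a "concept"; it stands in here for whatever analysis the production system performs.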

The idea of using audio transcripts to search for multimedia has been around in research labs for decades, and basic speech-recognition research dates back even earlier. Much of the seminal work occurred at BBN, MIT, Carnegie Mellon University, IBM, and SRI International. In 1995, Carnegie Mellon had a working demonstration of a similar video search system, says Richard Stern, professor of electrical and computer engineering at the university. This system, called Informedia, spurred other research in the field, he says, and was the precursor to BBN's modern video analysis approach.

MIT Massachusetts Institute of Technology © 2009 Technology Review. All Rights Reserved.