Imagine you were a transcriptionist at the federal government’s trial of Microsoft last year. Say you were trying to find instances of when Bill Gates testified between May 15 and June 1. Using existing tools like full-text search engines, natural language query, or speech recognition, you’d have to transcribe the audio into a text file, then index it with a lexicon of terms that included “Gates.” Such an undertaking would be labor-intensive, time-consuming, and error-prone. But only then could congressmen quickly locate testimony in which they were interested.
The key to expediting the process is eliminating the need for transcription, indexing, or both. This has long appeared to be an insoluble problem. But Fast-Talk Communications, a company that spun out of Georgia Tech, has created a way for users to locate subject matter in an actual audio file simply by phonetically spelling and entering any term they want to find.
Say, for example, that you want to locate the word “Sudetenland” in an audio account of events leading up to World War II. According to Mark Clements, co-founder of the Atlanta-based company, you’d simply “sound out what Sudetenland sounds like. Take the name, ‘Sue,’ the city, ‘Dayton,’ and the word, ‘land,’ and string those together, type it in. That gets resolved into the set of phonemes you’re looking for.” (Phonemes are the basic units of sound that make up the spoken words of a language.) The Fast-Talk software finds the string of phonemes that corresponds to the letters you enter and guides you to all spoken references to Sudetenland in the audio file. Because this tool bypasses the whole transcription and indexing process, it delivers results fast. According to Clements, the system processes “on the order of 30 hours of material per second.”
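To make the idea concrete, here is a minimal sketch of phonetic search in Python. It is not Fast-Talk's actual method, which the company does not describe in detail here. It assumes the audio has already been reduced to a time-stamped sequence of phonemes by some earlier step, and it uses a tiny hand-written pronunciation table (`TOY_LEXICON`, hypothetical, with loosely ARPAbet-style symbols) to resolve the sounded-out query. The function names are also invented for illustration.

```python
# Hypothetical pronunciation table, hand-written for this example only.
TOY_LEXICON = {
    "sue": ["S", "UW"],
    "dayton": ["D", "EY", "T", "AH", "N"],
    "land": ["L", "AE", "N", "D"],
}


def to_phonemes(sounded_out):
    """Resolve a sounded-out query such as 'Sue Dayton land' into one phoneme list."""
    phonemes = []
    for word in sounded_out.lower().split():
        phonemes.extend(TOY_LEXICON[word])
    return phonemes


def find_phoneme_matches(track, query):
    """Return the start time (in seconds) of every exact occurrence of query.

    track is a list of (start_time, phoneme) pairs, assumed to come from an
    audio-analysis step that is not shown here.
    """
    symbols = [phoneme for _, phoneme in track]
    hits = []
    for i in range(len(symbols) - len(query) + 1):
        if symbols[i:i + len(query)] == query:
            hits.append(track[i][0])
    return hits


# A made-up phoneme track standing in for a short stretch of audio.
track = [
    (0.0, "DH"), (0.1, "AH"),
    (0.2, "S"), (0.3, "UW"),
    (0.4, "D"), (0.5, "EY"), (0.6, "T"), (0.7, "AH"), (0.8, "N"),
    (0.9, "L"), (1.0, "AE"), (1.1, "N"), (1.2, "D"),
]

query = to_phonemes("Sue Dayton land")
print(find_phoneme_matches(track, query))
```

The sketch only finds exact matches. A practical system would presumably need to score approximate matches as well, since spoken phonemes rarely line up perfectly with a typed spelling.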
This is important, says Dan Rasmus, an analyst at the market research firm Giga/Forrester, because “voice is one of those untapped resources that companies have.” Jackie Fenn, who follows emerging technologies at Gartner, contends that Fast-Talk’s “main value is in tapping into audio streams that you probably wouldn’t really be able to get access to” otherwise. “It’s not cost-effective to have a human do that,” Fenn says.