Better speech-based error correction for dictation tools
CONTEXT: Extreme multitasking is the modern fad, but no person has enough hands to manage a cell phone, a digital organizer, a steering wheel, and coffee all at the same time. Accordingly, people want a hands-free way to interact with computers. Although speech recognition systems are more accurate than ever, typical users still spend more time correcting errors than dictating text; half of their correction time is spent just moving a cursor to errors identified in, say, a dictated e-mail. “Confidence scores” – the software’s estimates of how likely it is to have captured the right word – can be used to identify possible errors. Now Jinjuan Feng and Andrew Sears at the University of Maryland, Baltimore County, have shown that confidence scores can also be used to accelerate the correction process.
METHODS AND RESULTS: Twelve participants dictated 400-word documents using a speech recognition system. It interpreted 17 percent of the words incorrectly, a typical rate; it was the correction process that was atypical. The software used confidence scores to tag words throughout the text as “navigation anchors.” Users could quickly jump to each anchor with short voice commands and then move a cursor word by word to the error. The researchers measured the number of navigation commands the participants used, the failure rates of the navigation commands, and the time spent dictating and navigating. Average failure rates reported for other techniques are about 5 percent for direction-based navigation (“move right”) and 10 to 20 percent for word-based navigation (“select December”). In a test of Feng and Sears’s technique, the failure rate was only 3.2 percent. Even better, the time users spent navigating to errors was cut by nearly a fifth. These are significant gains over other error-correction techniques, and the work is promising because it suggests ways to improve the approach further.
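The anchor idea described above can be illustrated with a short sketch. It is not Feng and Sears's implementation: the fixed confidence threshold, the rule that every low-confidence word becomes an anchor, and the names Word, tag_anchors, and commands_to_reach are all assumptions made for illustration. The sketch tags anchors and then counts how many voice commands it would take to reach a given word by jumping forward from anchor to anchor and finishing with word-by-word moves.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    confidence: float  # recognizer's estimate (0 to 1) that the word is correct


def tag_anchors(words, threshold=0.5):
    """Return indices of words to use as navigation anchors.

    Assumption: any word scoring below `threshold` becomes an anchor.
    The source does not say how anchors were actually chosen.
    """
    return [i for i, w in enumerate(words) if w.confidence < threshold]


def commands_to_reach(target, anchors, start=0):
    """Count the fewest commands needed to move the cursor to `target`.

    Hypothetical command model: each "next anchor" jump costs one command,
    and each single-word move left or right costs one command.
    """
    best = abs(target - start)  # baseline: word-by-word moves only
    jumps = 0
    for anchor in sorted(anchors):
        if anchor <= start:
            continue  # only anchors ahead of the cursor can be jumped to
        jumps += 1
        best = min(best, jumps + abs(target - anchor))
    return best


# Made-up dictation result; the confidence values are invented for the example.
words = [
    Word("please", 0.95), Word("send", 0.90), Word("the", 0.92),
    Word("retort", 0.40), Word("by", 0.90), Word("December", 0.35),
    Word("tent", 0.80),
]
anchors = tag_anchors(words)
print(anchors)                        # [3, 5]
print(commands_to_reach(6, anchors))  # 3 (two jumps, then one word right)
```

In this toy example the error at the last word is not itself flagged, yet it is reached in three commands rather than the six that word-by-word movement would need, which is the kind of saving the anchors are meant to provide.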
WHY IT MATTERS: The Lilliputian buttons on PDAs and other pocket-sized wonders are quickly shrinking under a constant-sized thumb. Multitasking is on the rise, and more people with physical disabilities are entering the workforce. Both trends will steer users away from computer systems with manual interfaces. Speech recognition, but for its high error rate and long correction times, is an obvious alternative.
This work clearly shows that using confidence scores for navigation can shrink users’ correction times. With further improvements, the technique promises to boost the usability of hands-free error correction and so engender a surge of new gadgets and applications.
SOURCE: Feng, J., and A. Sears. 2004. Using confidence scores to improve hands-free speech based navigation in continuous dictation systems. ACM Transactions on Computer-Human Interaction 11:329-356.