Synopses: Information Technology

Better speech-based error correction for dictation tools; how to check errors in a quantum computer; machines learn to analyze brain activity.

Monya Baker

March 1, 2005

Verbal Compass
Better speech-based error correction for dictation tools

CONTEXT: Extreme multitasking is the modern fad, but no person has enough hands to manage a cell phone, a digital organizer, a steering wheel, and coffee all at the same time. Accordingly, people want a hands-free way to interact with computers. Although speech recognition systems are more accurate than ever, typical users still spend more time correcting errors than dictating text; half of their correction time is spent just moving a cursor to errors identified in, say, a dictated e-mail. “Confidence scores” – the software’s estimates of how likely it is to have captured the right word – can be used to identify possible errors. Now Jinjuan Feng and Andrew Sears at the University of Maryland, Baltimore County, have shown that confidence scores can also be used to accelerate the correction process.

METHODS AND RESULTS: Twelve participants dictated 400-word documents using a speech recognition system. It interpreted 17 percent of the words incorrectly, a typical rate; it was the correction process that was atypical. The software used confidence scores to tag words throughout the text as “navigation anchors.” Users could quickly jump to each anchor with short voice commands and then move a cursor word by word to the error. The researchers measured the number of navigation commands the participants used, the failure rates of the navigation commands, and the time spent dictating and navigating. Average failure rates reported for other techniques are about 5 percent for direction-based navigation (“move right”) and 10 to 20 percent for word-based navigation (“select December”). In a test of Feng and Sears’s technique, the failure rate was only 3.2 percent. Even better, the time users spent navigating to errors was cut by nearly a fifth. That is a significant gain over other error-correction techniques, and it is promising because the work suggests a means for further improvement.
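To make the anchor idea concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not Feng and Sears's implementation: the 0.5 confidence threshold, the command names "next anchor" and "previous anchor", and the greedy hop-then-step strategy are all hypothetical.

```python
def pick_anchors(confidences, threshold=0.5):
    """Tag low-confidence words as navigation anchors (threshold is an assumption)."""
    return [i for i, c in enumerate(confidences) if c < threshold]


def plan_navigation(cursor, target, anchors):
    """Return the voice commands that move the cursor from `cursor` to `target`.

    Greedy strategy: hop to the next anchor in the target's direction whenever
    that lands strictly closer to the target; otherwise move one word.
    `anchors` must be sorted in ascending order.
    """
    commands = []
    while cursor != target:
        if target > cursor:
            hop = next((a for a in anchors if a > cursor), None)
            hop_name, step_name, step = "next anchor", "move right", 1
        else:
            hop = next((a for a in reversed(anchors) if a < cursor), None)
            hop_name, step_name, step = "previous anchor", "move left", -1

        if hop is not None and abs(hop - target) < abs(cursor - target):
            cursor = hop
            commands.append(hop_name)
        else:
            cursor += step
            commands.append(step_name)
    return commands


# Made-up confidence scores for a ten-word sentence.
confidences = [0.9, 0.95, 0.3, 0.9, 0.92, 0.88, 0.4, 0.91, 0.2, 0.97]
anchors = pick_anchors(confidences)
# The misrecognized word sits at index 7, next to an anchor rather than on one.
commands = plan_navigation(cursor=0, target=7, anchors=anchors)
```

In this toy sentence the error is reached with two anchor hops and one word move, compared with seven "move right" commands without anchors. The saving depends on how well low confidence predicts where the errors are.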

WHY IT MATTERS: The Lilliputian buttons on PDAs and other pocket-sized wonders are quickly shrinking under a constant-sized thumb. Multitasking is on the rise, and more people with physical disabilities are entering the workforce. Both trends will steer users away from computer systems with manual interfaces. Speech recognition, but for its high error rate and long correction times, is an obvious alternative.

This work clearly shows that using confidence scores for navigation can shrink users’ correction times. With further improvements, the technique promises to boost the usability of hands-free error correction and so engender a surge of new gadgets and applications.

SOURCE: Feng, J., and A. Sears. 2004. Using confidence scores to improve hands-free speech based navigation in continuous dictation systems. ACM Transactions on Computer-Human Interaction 11:329-356.

Quantum Corrections
How to check errors in a quantum computer

CONTEXT: To an outsider, the logic of quantum computing can seem mystical. While a standard bit represents data as one way or another (digital 0 or 1), a quantum bit stores data as one way and another (0 and 1 and all possibilities in between). While a standard computer must crunch through possible solutions one at a time, a quantum computer could, in theory, survey all solutions at once and pick the correct one in a single step. This is ideal for problems that rely on trial and error, such as breaking encryption codes.

But, like some cursed mythical creature, much of the information contained in a quantum system will vanish if it is observed, because the process of looking at it disturbs the system. That means a user can look at the answer to a question but can’t check the calculations behind it. A quantum computer therefore needs to correct errors reliably without anyone actually seeing them. Now, for the first time, John Chiaverini and colleagues from the National Institute of Standards and Technology (NIST) have done this in a quantum system that could be scaled up.

METHODS AND RESULTS: In the NIST quantum computer, information is encoded in a single atom’s quantum state. Using a process called entanglement, the fate of this “parent atom” is linked to that of two companion atoms, so that changes to the parent’s condition are reflected in the companions. Using beryllium ions (atoms with electric charge) to carry quantum information, the researchers were able to disentangle, decode, and compare the states of the two companion ions and thus indirectly deduce whether an error had occurred. A laser pulse could then correct the original ion’s quantum state without actually observing it.
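The logic of correcting an error by inspecting only the companions can be sketched with a small classical simulation. What follows is the textbook three-qubit bit-flip code, swapped in as an assumption, since the summary above does not specify the exact code or pulse sequence NIST used. It handles only a single bit-flip error, and every function name is hypothetical. A quantum state is stored as a dictionary from basis labels such as "011" to amplitudes, with qubit 0 as the parent.

```python
def flip(bits, qubit):
    """Return the basis label `bits` with one qubit flipped."""
    flipped = "1" if bits[qubit] == "0" else "0"
    return bits[:qubit] + flipped + bits[qubit + 1:]


def apply_x(state, qubit):
    """Bit-flip (X) gate: used here both as the error and as the correction."""
    return {flip(bits, qubit): amp for bits, amp in state.items()}


def apply_cnot(state, control, target):
    """Controlled-NOT: flip `target` wherever `control` is 1."""
    return {
        (flip(bits, target) if bits[control] == "1" else bits): amp
        for bits, amp in state.items()
    }


def encode(alpha, beta):
    """Entangle the parent with two companions: alpha|000> + beta|111>."""
    state = {"000": alpha, "100": beta}
    state = apply_cnot(state, 0, 1)
    return apply_cnot(state, 0, 2)


def decode_and_correct(state):
    """Disentangle, read only the companions, and fix the parent if needed."""
    state = apply_cnot(state, 0, 1)
    state = apply_cnot(state, 0, 2)
    # After decoding, the companions share one definite value (the syndrome),
    # so reading them reveals nothing about alpha or beta.
    syndromes = {bits[1:] for bits in state}
    if len(syndromes) != 1:
        raise ValueError("more than a single bit-flip error occurred")
    syndrome = syndromes.pop()
    if syndrome == "11":  # both companions flipped: the parent was hit
        state = apply_x(state, 0)
    parent = {bits[0]: amp for bits, amp in state.items()}
    return syndrome, parent


alpha, beta = 0.6, 0.8  # arbitrary normalized amplitudes
results = {}
for error_qubit in (None, 0, 1, 2):
    state = encode(alpha, beta)
    if error_qubit is not None:
        state = apply_x(state, error_qubit)
    results[error_qubit] = decode_and_correct(state)
```

In every case the parent ends up back in the state with amplitudes 0.6 and 0.8, and the code never reads those amplitudes to decide what to do. Only the two companion bits are examined.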

WHY IT MATTERS: Many encryption techniques depend on the difficulty of factoring very large numbers through trial and error. A quantum computer could, in theory, defeat all such encryption systems and promises to be orders of magnitude more powerful than the most advanced systems today. So anyone interested in keeping digital secrets – from credit card numbers for Web transactions to classified information for governments and corporations – cares about quantum computing. Although a useful quantum computer is still far, far away, the work at NIST has shown how to lift one of the most bedeviling curses of quantum mechanics.

SOURCE: Chiaverini, J., et al. 2004. Realization of quantum error correction. Nature 432:602-605.

Scanning Your Thoughts
Machines learn to analyze brain activity

CONTEXT: Can computers learn to read the human mind? Detecting thoughts may be beyond their abilities, but computers can be trained to recognize certain mental tasks from scans that monitor brain activity. One popular scanning technique, functional magnetic resonance imaging (fMRI), already aids the study of learning, memory, emotion, neural disorders, and psychiatric drugs. Using statistics and data analysis, researchers can identify patterns of activity as characteristic of certain mental activities and states. Now, Tom Mitchell and his colleagues at Carnegie Mellon University have shown that computers can automate this process, at least for some simple tasks.

METHODS AND RESULTS: Using fMRI data from subjects engaged in various tasks, the CMU team trained computers to recognize which fMRI patterns accompanied cognitive states for different tasks. During this process, the computer developed mathematical models to distinguish between different cognitive states. Then, given new fMRI data, the computers predicted the subjects’ mental states from the brain scans. Though imperfect, the automatically trained computers convincingly outperformed chance in discriminating whether a subject was looking at sentences or pictures, reading ambiguous or nonambiguous sentences, and reading words associated with different categories such as people, tools, or fruit.
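The train-then-predict loop described above can be illustrated with a toy classifier. This is a sketch under loud assumptions: the summary does not say which learning algorithm the CMU team used, so a Gaussian naive Bayes classifier is swapped in here. The "scans" are synthetic random numbers, not fMRI data, and the voxel counts and signal strength are invented for illustration.

```python
import math
import random


def fit_gnb(examples, labels):
    """Estimate a per-class prior plus a mean and variance for every voxel."""
    model = {}
    for label in sorted(set(labels)):
        rows = [x for x, y in zip(examples, labels) if y == label]
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        variances = [
            sum((v - m) ** 2 for v in col) / n + 1e-6  # floor avoids zero variance
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(labels)), means, variances)
    return model


def predict(model, x):
    """Pick the class with the highest log posterior, treating voxels as independent."""
    def log_posterior(label):
        log_prior, means, variances = model[label]
        return log_prior + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
            for v, m, var in zip(x, means, variances)
        )
    return max(model, key=log_posterior)


def synthetic_scan(label, rng, n_voxels=30, n_informative=5):
    """Hypothetical stand-in for one scan: only the first few voxels carry signal."""
    shift = 3.0 if label == "picture" else 0.0
    return [
        rng.gauss(shift if i < n_informative else 0.0, 1.0)
        for i in range(n_voxels)
    ]


rng = random.Random(0)

# Train on labeled scans.
train_labels = ["sentence", "picture"] * 30
train_scans = [synthetic_scan(y, rng) for y in train_labels]
model = fit_gnb(train_scans, train_labels)

# Predict the mental state behind scans the model has never seen.
test_labels = ["sentence", "picture"] * 20
test_scans = [synthetic_scan(y, rng) for y in test_labels]
hits = sum(predict(model, x) == y for x, y in zip(test_scans, test_labels))
accuracy = hits / len(test_labels)
```

Most of the 30 inputs here are pure noise, which mirrors the situation in the study: many inputs, few training examples, and only some inputs that carry information. Real fMRI images have thousands of voxels and much weaker signals, so accuracy on this toy data says nothing about accuracy on brain scans.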

WHY IT MATTERS: This work shows that a computer can use the results from one set of brain scans to predict what a brain was doing during other scans. This capability could eventually lead to more accurate use of MRI scans in medicine. It might also speed up data analysis, particularly when one individual is being studied over time. And, since the computers learned to recognize brain activity from a single short interval rather than a composite of several scans over a longer time period, it might reduce the time each patient spends in an MRI machine, making expensive equipment more readily available.

More broadly, this work is an important application in the field of machine learning. With relatively few training examples, the computers were able to detect meaningful patterns in data containing thousands of inputs, many of them irrelevant or inaccurate. As scientists collect ever more detailed data sets from the brain and other complex systems, these techniques proffer a way to use the information more effectively.

SOURCE: Mitchell, T. M., et al. 2004. Learning to decode cognitive states from brain images. Machine Learning 57:145-175.
