New computing devices are inspiring new ways to input text.
Per Ola Kristensson is making it easy, fast, and intuitive to input text on mobile devices. He helped invent the popular gestural text-entry method known as ShapeWriter, but that’s just the beginning. Kristensson, a lecturer in human-computer interaction at the University of St. Andrews in Scotland, thinks gestures could be combined with speech recognition and even gaze recognition in a text-entry system that makes it easier to correct mistakes and enter unpronounceable information like passwords. “I’m interested in optimizing the flow of information from your brain into the computer,” he says.
ShapeWriter lets you enter text by dragging a finger over the letters in a word. The software then stores the squiggle or shape that you make when you touch those letters as a stand-in for the word itself. The shapes for common words are easy to recall; any time you want to enter such a word, you can quickly reproduce its shape instead of pecking at the letters again. Practiced users can gesture-type in excess of 30 words per minute—blinding speed on the typical mobile device. The ShapeWriter app was downloaded more than a million times from Apple’s App Store before it was bought by Nuance Communications in 2010. Now the technology is built into Android, where it’s called “gesture typing.”
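To make the idea of a word's shape concrete, here is a minimal sketch in Python. It is not ShapeWriter's actual algorithm: the key coordinates, the resampling count, and the function names (`word_path`, `resample`, `shape_distance`, `recognize`) are all assumptions made for illustration. Each word in a small lexicon is treated as the path through the centers of its letters, and a finger trace is matched to the word whose path lies closest.

```python
import math

# Hypothetical layout: unit-width keys, each row shifted as on a typical QWERTY keyboard.
ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.5), ("zxcvbnm", 1.5)]
KEY_CENTERS = {
    ch: (x_offset + i, float(row))
    for row, (letters, x_offset) in enumerate(ROWS)
    for i, ch in enumerate(letters)
}

def word_path(word):
    """Ideal shape of a word: the polyline through its key centers."""
    return [KEY_CENTERS[ch] for ch in word]

def resample(points, n=32):
    """Return n points spaced evenly along the polyline."""
    seg_lengths = [math.dist(a, b) for a, b in zip(points, points[1:])]
    total = sum(seg_lengths)
    if total == 0:  # single key or repeated key: no movement at all
        return [points[0]] * n
    result = []
    for k in range(n):
        target = total * k / (n - 1)
        travelled = 0.0
        for (a, b), length in zip(zip(points, points[1:]), seg_lengths):
            if length > 0 and travelled + length >= target:
                t = (target - travelled) / length
                result.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
                break
            travelled += length
        else:  # rounding pushed the target past the end of the path
            result.append(points[-1])
    return result

def shape_distance(path_a, path_b, n=32):
    """Mean distance between corresponding resampled points."""
    pairs = zip(resample(path_a, n), resample(path_b, n))
    return sum(math.dist(p, q) for p, q in pairs) / n

def recognize(gesture, lexicon):
    """Pick the lexicon word whose ideal shape is closest to the gesture."""
    return min(lexicon, key=lambda w: shape_distance(gesture, word_path(w)))

# A slightly sloppy trace over t, h, e.
gesture = [(4.1, 0.1), (5.4, 0.9), (2.2, 0.1)]
best = recognize(gesture, ["the", "that", "this", "tie", "toe"])
```

A shipping recognizer would presumably also weigh how common each word is and tolerate far messier traces; this sketch compares geometry only.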
Kristensson, who has a quick smile and an easy laugh, has always sought to fuse disparate fields of inquiry. Growing up in Sweden, he bucked an educational system designed to channel students into narrow specializations. He was drawn to computer science but couldn’t bear spending four years studying nothing else. So he opted for cognitive science, which enabled him to study not only computer science but also linguistics, philosophy, and psychology. That combination launched him on the path to creating user interfaces that are fundamentally changing the way we interact with computers.
His work on tools for disabled people illustrates his approach to problem solving. Many people who can’t speak and have very limited manual dexterity communicate by slowly typing words and prompting a computer to pronounce them. Their communication speed averages one or two words per minute. In such a laborious process, predicting the speaker’s intent can greatly accelerate the task. This requires what is known as a statistical language model. “I was amazed to find that in 30 years of development of this kind of technology, no one had produced a good statistical model for the things these people need to say,” Kristensson explains.
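A statistical language model can be as simple as a table of which words tend to follow which. The Python sketch below is a toy bigram predictor, not Kristensson's model: the four training sentences and the names `train_bigrams` and `predict_next` are invented for illustration. It shows how a system could offer likely next words after each word the user enters.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = ["<s>"] + sentence.lower().split()  # <s> marks the sentence start
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_word, k=3):
    """Return up to k of the most frequent followers of prev_word."""
    followers = counts.get(prev_word.lower(), Counter())
    return [word for word, _ in followers.most_common(k)]

# Invented training sentences, for illustration only.
corpus = ["i need water", "i need help", "i need water now", "i am cold"]
model = train_bigrams(corpus)
suggestions = predict_next(model, "need", k=2)
```

At one or two words per minute, every correct suggestion saves a great deal of effort, which is why the quality of the underlying training data matters so much.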
The main problem is the dearth of data from which to derive statistical relationships. You can’t wiretap the computers used by large numbers of disabled people. So Kristensson came up with an alternative: ask people who are not disabled to imagine what they would say if they had to communicate by this method. He used Amazon’s Mechanical Turk to crowdsource imagined communications such as “Who will drive me to the doctor tomorrow?” and “I need to make a shopping list.” Then he combed through Twitter, blogs, and Usenet for phrases that were statistically similar to the ones generated by Mechanical Turk. After several iterations, he had the tens of millions of phrases he needed to build a useful model.
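One common way to measure whether a phrase is “statistically similar” to a seed set is to score it under a model trained on that seed set and keep the phrases the model finds least surprising. The Python sketch below assumes that approach; the article does not say which measure Kristensson used. It relies on a unigram model with add-one smoothing, and the helper names and the two candidate phrases are hypothetical.

```python
import math
from collections import Counter

def train_unigram(phrases):
    """Word counts and total token count for a set of seed phrases."""
    counts = Counter(w for p in phrases for w in p.lower().split())
    return counts, sum(counts.values())

def cross_entropy(phrase, model):
    """Average bits per word under the seed model (lower means more similar)."""
    counts, total = model
    words = phrase.lower().split()
    if not words:
        return math.inf
    vocab_size = len(counts) + 1  # one extra slot for any unseen word
    log_prob = sum(
        math.log2((counts[w] + 1) / (total + vocab_size))  # add-one smoothing
        for w in words
    )
    return -log_prob / len(words)

def rank_by_similarity(candidates, seed_phrases):
    """Sort candidate phrases from most to least similar to the seed set."""
    model = train_unigram(seed_phrases)
    return sorted(candidates, key=lambda p: cross_entropy(p, model))

# Seed phrases quoted in the article; the candidates are invented.
seed = [
    "who will drive me to the doctor tomorrow",
    "i need to make a shopping list",
]
candidates = ["stock markets rallied overnight", "i need to see the doctor"]
ranked = rank_by_similarity(candidates, seed)
```

Phrases that pass the cut could then be added to the seed set and the process repeated, which is one plausible reading of the “several iterations” described above.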
These days, Kristensson is working on technology that supports super-fast typing: a gargantuan statistical language model that accurately interprets typed input despite large numbers of mistakes. He’s also working on new ways to enter text in the absence of a touch screen or keyboard. Such technology will be necessary to make the most of wearable computing devices such as Google Glass, but it will have to work nearly perfectly to be of any benefit, given how frustrating a bad speech-to-text system can be. “In a few years, we’ll have amazing sensors that will help us generate contextual information to create truly intelligent, adaptive interfaces,” he says.