Applying the tools of social science to make robots easier to live and work with.
People often find robots baffling and even frightening. Leila Takayama, a social scientist, has found ways to smooth out their rough edges. Through numerous studies and experiments that look at how people react to every aspect of robots, from their height to their posture, Takayama has come up with key insights into how robots should look and act to gain acceptance and become more useful to people.
Takayama has had an especially big influence on the design of an advanced robot from Willow Garage, the startup she works for in Menlo Park, California. Called PR2 (see “Robots That Learn from People”), it’s an early prototype of a new generation of robots that promise to be indispensable to the elderly, people with physical challenges, or anyone who simply needs a little help around the home or office.
PR2 can fold laundry and fetch drinks, among other impressive tasks. But Takayama suspected that the nest of a half-dozen cameras originally perched on PR2’s head would alienate users. To find out, she turned to crowdsourcing, showing images of the robot head to an online audience recruited for the purpose. The results verified her concerns, and she successfully lobbied to jettison all but a few of the cameras, some of which were redundant.
More recently, Takayama has devoted effort to improving a robot called Project Texai, which is operated directly by humans rather than running autonomously. She ran an extensive field study to find out how Project Texai fit into the office environment of several different companies, coming by each office every two weeks to collect feedback and observe interactions between on-site staff and robots operated by remote colleagues. That study led to a surprising insight: “When you control a telepresence robot, there comes a point for a lot of people when they feel as if the robot is their body,” she explains. “They don’t want people to stand too close or touch the buttons on the screen.”
She also discovered that people in the offices ended up being less comfortable with Project Texai if they were allowed to dress it up. Personalizing the robot led people to feel more possessive about it and less accepting of the fact that someone else was controlling it. Project Texai should be personalized, Takayama concluded, but only by the “pilot,” and not by those who are around the machine. She also found that robot size can have a big impact on acceptance and is conducting a study to nail down the optimal height for Project Texai. Another key question: is it better to have the robot at eye level with a person who is sitting or standing?
Takayama is now conducting home interviews with the elderly and disabled to figure out which sorts of tasks would be most helpful to them. She predicts that someday soon, older people will employ personal robots to help them communicate with family and friends.
Rana el Kaliouby
Teaching devices to tell a frown from a smile.
Computers are good with information—but oblivious to our feelings. That’s a real shortcoming, believes MIT Media Lab scientist Rana el Kaliouby, because it leaves them unable to usefully respond to many of our needs until we take the trouble to tap out instructions. To close that gap, el Kaliouby has come up with technologies that help computers recognize facial expressions and other physical indicators of how someone is feeling. Someday this could help make our machines more adept at assisting us.
El Kaliouby is not the first researcher to try to map facial expressions. But where others have focused on trying to get computers to recognize a half-dozen exaggerated expressions recorded in the lab, she is identifying the more varied and subtle faces that people commonly make. “It’s a problem that requires pushing the state of the art of computer vision and machine learning,” she says.
To break the problem down, she zeroed in on 24 “landmarks” on the face. Then she trained a computer to identify how those parts of the face change shape in response to different emotions, creating expressions such as a furrowed brow. To ensure that the technology would work with people in different cultures, el Kaliouby, who lives in Cairo and spends one week a month at MIT, enlisted the help of thousands of people on six continents. They have allowed their computers’ embedded cameras to record their expressions while they watch a video, resulting in what she says is the largest database of facial images in the world.
One early experimental application of the technology was a set of camera-equipped glasses intended for people with Asperger’s syndrome, who tend to have difficulty recognizing others’ emotional states. The device could recognize whether someone facing the wearer appeared bored; if so, it could use small lights in the glasses to signal that to the wearer. (El Kaliouby herself was known to sport a head-cam in and out of the lab, tucked into the head scarf she wears.)
El Kaliouby has cofounded a company called Affectiva in Waltham, Massachusetts, to commercialize the facial recognition technology and a wristband that she helped develop to measure skin conductance, which is associated with emotional arousal and can be used to detect anxiety in real time. For now, Affectiva uses facial recognition mainly to give advertisers a better sense of how their ads are affecting viewers. The company convenes enormous virtual focus groups made up of online viewers who allow their expressions to be tracked, and then analyzes the resulting data. But in the longer term, el Kaliouby also wants to bring her technology to classrooms to help teachers identify which material students respond to best.
The technology could eventually become a critical component of many electronic devices, making it possible for them to recognize when we’re puzzled, frustrated, happy, or sad—and enabling them to respond with the right information, music, or human assistance. And there’s a lot to be said for getting our phones, PCs, and GPS systems to recognize when we just want to be left alone.
Letting advertisers send targeted pitches to your mobile phone without ever seeing your personal information.
Saikat Guha is convinced that privacy and profit don’t have to conflict online. The Microsoft Research India computer scientist has developed a software platform that allows advertisers to precisely target potential customers without exposing the customers’ personal information.
The trick involves flipping the basic model of targeted advertising. Companies now track your browsing and purchasing behavior and then sell your data to advertisers. But instead of acquiring data from your phone or PC so that companies can send the right ads to websites you visit, Guha’s system calls for companies to send potential ads to you; then software on your device figures out which of them are targeted effectively. Thus, if you search for video games, the software will fetch entertainment-related ads. If your computer or phone recognizes that, say, you often buy DVDs, the device will pick out a DVD ad to show you. Guha’s ad-selecting software could be built into browsers, or into websites such as Facebook. And he estimates that the ads wouldn’t take up significant amounts of memory on your machine.
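To make the flipped model concrete, here is a minimal sketch of on-device ad selection. It is an illustration under stated assumptions, not Guha's actual software: the `Ad` class, the keyword-overlap score, and the sample data are all hypothetical. The point it shows is that the interest profile is only read locally, and only the chosen ad comes out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ad:
    """A candidate ad pushed to the device (hypothetical structure)."""
    ad_id: str
    keywords: frozenset


def select_ad(candidate_ads, local_interests):
    """Return the ad whose keywords overlap most with the local profile.

    local_interests is read only on the device; nothing in this function
    sends it anywhere. Returns None when no ad matches at all.
    """
    best, best_score = None, 0
    for ad in candidate_ads:
        score = len(ad.keywords & local_interests)
        if score > best_score:
            best, best_score = ad, score
    return best


# Hypothetical batch of ads sent to the device by an ad network.
ads = [
    Ad("dvd-sale", frozenset({"dvd", "movies"})),
    Ad("game-console", frozenset({"video games", "entertainment"})),
    Ad("garden-tools", frozenset({"gardening"})),
]

# Hypothetical profile built from local browsing and purchase history.
local_interests = frozenset({"dvd", "movies", "music"})

chosen = select_ad(ads, local_interests)
```

In this sketch the ad network learns nothing from the selection step itself; how views and clicks would be reported back without leaking the profile is a separate problem that the example does not address.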
Since companies wouldn’t be able to see or store your data or toss it around the Web, risking accidental leakage, even data normally too private to share with advertisers could be brought to bear in picking from among them.
Today, for instance, Google can’t determine your birth date unless you offer it up. But Guha’s software might come across it on your PC and use it to enhance the targeting of Google’s ad network, without ever revealing the date to Google. It’s a privacy protection scheme that, unlike almost all others, indirectly gives businesses an even richer set of data to work with.
Guha has also addressed the privacy threat from smartphone apps that package and sell sensitive information such as a user’s name and location. “Today someone could construct a full history of where you are at any given time of the day,” he says. His idea is a platform that cryptographically splits information such as a person’s name, the name of the store the person is visiting, and the amount of time spent at the previous store into disconnected fragments before sending it to the cloud. Software on the phone or tablet could then use all or most of those fragments to target advertisements, but no party involved could connect them to create a privacy-violating portrait of the user.
There will always be those who will try to get around privacy protection schemes to scope out more about you than you care to share. Guha is on top of that problem, too. He’s working on algorithms that detect when websites and apps are surreptitiously using your personal data, so you can block them.
Liberating us from the touch screen by turning skin and objects into input devices.
Chris Harrison recently helped develop an invention, called Touché, that can turn practically anything into a computer input device—a table, a doorknob, a pool of water, your hand. To do this, he relies on the natural conductivity of some things, or he adds electrodes to objects that aren’t conductive. Then he wires up a controller that registers the range of electronic signals the objects generate when they are changed by, say, a particular hand gesture or body posture. A sensor attached to a sofa, for instance, can continuously monitor voltage changes to detect the signatures of particular motions and events and link them to actions. A dog leaping on the couch might trigger a harsh noise to scare it off; a person sitting down might cause the TV to switch on. (Yes, even a couch potato’s life can be made easier.)
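The idea of linking signal signatures to actions can be sketched as a nearest-template match. This is a simplified illustration, not Touché's real signal processing: the four-value readings, the template numbers, and the action names are invented for the example.

```python
import math

# Hypothetical signal profiles recorded in advance for each event.
TEMPLATES = {
    "empty": [0.10, 0.10, 0.10, 0.10],
    "dog_jumps_on": [0.60, 0.20, 0.50, 0.10],
    "person_sits": [0.30, 0.70, 0.70, 0.40],
}

# Hypothetical mapping from a recognized event to an action.
ACTIONS = {
    "dog_jumps_on": "play_noise",
    "person_sits": "tv_on",
}


def classify(reading, templates=TEMPLATES):
    """Return the name of the template closest to the reading (Euclidean distance)."""
    return min(templates, key=lambda name: math.dist(reading, templates[name]))


def action_for(reading):
    """Return the action linked to the recognized event, or None if there is none."""
    return ACTIONS.get(classify(reading))
```

A reading such as `[0.32, 0.65, 0.72, 0.38]` sits closest to the `person_sits` template, so `action_for` returns `"tv_on"`. A real system would need many more samples per event and a more robust classifier.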
Harrison, a PhD student in Carnegie Mellon’s Human-Computer Interaction Institute, says his mission is to liberate our fingers from having to command our phones and other devices by poking at squished keyboards and teensy screens. “If you think about all the ways we use our hands, being limited to only poking would make the world really hard to use,” he says.
He is enlisting technologies ranging from cameras to stethoscopes to miniature projectors. Before Touché, which he developed while at Disney Research, he invented a device called Skinput that turns skin into the equivalent of an interactive touch screen: a tiny body-mounted optical system projects “buttons” onto the wearer’s hand and arm and detects any tapping of the buttons so that a device can be controlled. As an intern at Microsoft, he helped create OmniTouch, a roughly similar system that makes it possible to turn any object in the environment into a multitouch screen. And he’s made a device called Scratch Input that uses a modified stethoscope and generic microphone to convert the sound of a fingernail dragging over just about any surface into an electrical control signal.
Harrison notes that as computers become better integrated into almost everything we do, we will find it increasingly convenient to be able to interact with them in a variety of ways, without always having to resort to a screen or keyboard. “Eventually we’ll develop input technologies so good that we don’t need a touch screen,” he says. Our tired fingers salute that quest.
Securing our smartphones from spyware and rogue apps, with a little help from the crowds.
In 2005 John Hering notoriously invented a hacking “rifle” called the BlueSniper that enabled him to take control of a Nokia handset from a record-setting distance of 1.2 miles. But though he’s been a hacker since childhood, Hering isn’t the kind of hacker you have to worry about. In fact, his mission is to keep your cell phone safe from malware.
The BlueSniper stunt was all about exposing security weaknesses in Bluetooth technology. Hering used the attention he got from it to further a more ambitious idea: that there should be a central database of information about phone malware. In 2007 he cofounded Lookout Mobile Security with two college buddies and created a free app that protects Android users from malicious apps, such as a fake version of a game that tacks an easy-to-miss $5 charge onto your monthly smartphone bill. Last year Lookout identified 1,000 instances of virus-infected apps and found that Android users had a 4 percent chance of encountering malware, a number expected to rise.
To stay on top of the bad guys, Lookout has built what it calls the Mobile Threat Network: a giant database, tallying more than a million rogue apps, that it continuously adds to as the company’s software scans and analyzes apps worldwide. When an Android smartphone owner uses Lookout’s app, it compares installed apps against its database of known threats and notifies the user when it detects a match.
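Comparing installed apps against a database of known threats can be sketched as a set lookup on file fingerprints. The article does not say how Lookout matches apps, so the use of SHA-256 hashes here is an assumption, and the app names and contents are made up.

```python
import hashlib


def fingerprint(app_bytes):
    """Return a SHA-256 hex digest used as the app's fingerprint (an assumed scheme)."""
    return hashlib.sha256(app_bytes).hexdigest()


def scan(installed_apps, known_threats):
    """Return the sorted names of installed apps whose fingerprint is a known threat.

    installed_apps maps an app name to its raw bytes;
    known_threats is a set of fingerprints.
    """
    return sorted(
        name
        for name, data in installed_apps.items()
        if fingerprint(data) in known_threats
    )


# Hypothetical threat database and device contents.
known_threats = {fingerprint(b"fake-game-with-hidden-charge")}
installed = {
    "Real Game": b"legit-game",
    "Free Game!!": b"fake-game-with-hidden-charge",
}

flagged = scan(installed, known_threats)
```

Exact-hash matching only catches byte-identical copies of a known app; detecting repackaged or slightly altered malware takes analysis that goes beyond this sketch.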
Users can help by allowing Lookout to collect data from their mobile devices, essentially crowdsourcing the job of finding threats. That approach to identifying malware stands in contrast to the methods used by traditional security software for desktop computers, which rely on professionals working in the background to find threats in the digital wild.
Last year, Lookout blocked millions of mobile threats, according to the company. More than 20 million people have downloaded the app. (Most of Lookout’s revenue comes from users who pay $3 a month to subscribe to a premium service that also secures mobile devices’ Web browsers and makes it possible to lock or erase stolen phones remotely. But Hering won’t say whether the privately held company is profitable yet.)
Hering says he thinks of his approach to mobile security as one that will empower users, not hamper them, as desktop security programs sometimes do. “Security is typically something that’s thought of as a burden,” he says. “It slows down your computer, it tries to scare you. It’s all these things that we don’t stand for.”
Hiding all the complexities of remote file storage behind a small blue box.
One day in 2009, Drew Houston and his business partner, Arash Ferdowsi, pulled their Zipcar into Apple headquarters in Cupertino, California. “We went to the front desk,” Houston recalls. “And what do you say at that point? ‘We’re here to see Steve.’”
Steve Jobs had invited them largely because he wanted to explore acquiring Houston’s fast-growing company, Dropbox. Founded in 2007, Dropbox conferred iPhone-like ease and reliability on cloud-based file storage—something Apple itself hadn’t yet begun offering. People using any browser or operating system, on any kind of device, could drag any kind of file to Dropbox’s icon of an open blue box. The files were stored on Dropbox’s servers and synched each time you saved a file, so that it would be available on any device running Dropbox.
Houston and his team hammered out thousands of issues to create an easy system free of the typical annoyances. Dropbox knows that while Linux file names are case-sensitive, Windows file names aren’t, so on Windows a file called “ABC.doc” would overwrite one called “abc.doc.” It can keep antivirus software from interfering with its file-synching system. It integrates smoothly with different user interfaces: on a Mac, for example, the Finder displays a check mark in the Dropbox icon when files are in sync.
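The case-sensitivity problem is easy to illustrate. The sketch below, which is not Dropbox's code, groups file names that differ only by case so that a sync tool could rename one of them before writing to a case-insensitive file system. Using `str.casefold()` to compare names is an approximation of how Windows treats case, assumed here for simplicity.

```python
from collections import defaultdict


def case_collisions(filenames):
    """Return groups of names that would collide on a case-insensitive file system."""
    groups = defaultdict(list)
    for name in filenames:
        groups[name.casefold()].append(name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


collisions = case_collisions(["ABC.doc", "abc.doc", "notes.txt"])
```

Here `collisions` holds one group containing `ABC.doc` and `abc.doc`, while `notes.txt` passes through untouched.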
Its ability to shield users from myriad mind-numbing details and housekeeping chores—“the acrobatics to support all these different situations,” as Houston puts it—is what made Dropbox a hit. “It sounds like what we do is simple,” says Houston, who wrote the original code on a bus ride from Boston to New York and is now Dropbox’s CEO. “But sanding down the thousand rough edges to make something work 100 percent of the time is really, really hard. Even something simple, like synching a file, is actually really complicated to do in a bulletproof way a billion times.”
That’s how many times people are updating files with Dropbox every two days. And as consumers slide more stuff into their Dropbox folders, more blow past their free two-gigabyte limit and start paying $10 a month for additional storage. Dropbox says it now has more than 50 million users, with 4 percent paying.
The other big technical challenge was how to make Dropbox work fast on any device. Users often store thousands of files, and tracking and synching every one of them could easily eat up memory and processor time. The first version of the service hogged two full gigabytes of memory, but Dropbox eventually whittled that down to a mere 100 megabytes. And to keep Dropbox from dropping the ball when operating systems are revised or upgraded on users’ PCs, the company created custom analysis tools that rapidly detect and resolve any software conflicts.
Houston’s team is now working on advanced capabilities for synching and sharing photos, and gearing up for the demands that will be imposed on the software by continued rapid growth. “We’re designing a system that can connect billions of devices,” he says. The company has tripled its staff in the past year, to 150, and taken over a large office space in San Francisco.
Back at that meeting at Apple in 2009, Houston told Jobs he wasn’t interested in selling, after which Apple went on to bring out its competing iCloud service. But it’s hard to argue that Houston was being shortsighted, given that private investors recently valued Dropbox at $4 billion.
By tracking the direction of light, a camera takes pictures that can be refocused on different objects in a scene.
Today’s digital cameras do the focusing for you, but they occasionally blow the shot with a blurred subject. That’s never a problem with Ren Ng’s camera. His company, Lytro, sells a $399 model that captures light in a very different way from conventional cameras, recording the angle at which each ray enters the lens. The resulting photo can be sharply focused on any part of the scene, and then refocused on a different part—all long after the picture has been taken. “This is going to drive even larger transformations than the transition from film to digital photography,” says Ng.
Ng’s camera is at the leading edge of the new field of computational photography, which uses software to wring new tricks out of conventional optical components and a few novel ones. Lytro is preparing to release software upgrades that will allow shots taken with one of its cameras to be viewed in 3-D, and it is developing methods that could get professional-quality shots from cameras with cheap lenses, such as those on cell phones.
The focusing trick is an impressive enough start. When a photo taken with the Lytro camera is displayed on a computer, anyone can click on any object in the picture to get the software to instantly bring that object (and anything else in the photo that was the same distance from the camera) into sharp focus, leaving the rest artfully blurred. The focus point can then be changed with a click elsewhere in the photo. Friends can refocus Lytro photos for themselves when they are shared on Facebook or elsewhere online.
Whereas a conventional digital camera captures a focused image as light strikes a sensor chip, the Lytro camera has a plastic sheet of thousands of tiny lenses directly in front of its sensor. These lenses take rays that come into the camera at different angles and direct them to different points on the sensor. That leaves an unfocused image, but it doesn’t matter—because Ng’s software in the camera can use the information about the angle of the light rays to bring any part of the image into sharp focus.
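One textbook way to refocus from angle information is "shift and add": shift each angular view in proportion to its angle, then average. The one-dimensional sketch below illustrates that general technique only; it makes no claim about how Lytro's software works, and the toy scene (a single bright point seen from three angles) is invented.

```python
def refocus(views, shift):
    """Shift-and-add refocusing on a 1-D light field.

    views maps an angular offset u to a list of pixel values seen from
    that angle. Each view is sampled at x + shift * u and the samples are
    averaged; the shift value selects the depth that comes into focus.
    """
    width = len(next(iter(views.values())))
    out = []
    for x in range(width):
        total = 0.0
        for u, row in views.items():
            src = x + shift * u
            if 0 <= src < width:  # samples outside the sensor count as dark
                total += row[src]
        out.append(total / len(views))
    return out


def make_view(width, position):
    """Return a dark row of pixels with one bright point at the given position."""
    return [1.0 if x == position else 0.0 for x in range(width)]


# A point whose apparent position moves by one pixel per unit of angle.
views = {u: make_view(9, 4 + u) for u in (-1, 0, 1)}

sharp = refocus(views, 1)    # shift matches the point's depth
blurred = refocus(views, 0)  # shift does not match, so the point smears
```

With the matching shift, all three views line up and the point lands at full brightness on pixel 4; with a shift of zero, its light is spread across pixels 3, 4, and 5 at a third of the brightness each.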
In 2006 Ng was a PhD student at Stanford University studying the illumination of virtual objects. But he wanted to work on something with a more tangible impact, so he put off finishing his degree and started researching ideas for better camera designs. He wasn’t sure how to proceed until one day he found himself staring in frustration at a poorly focused photo he had recently taken. “I thought, ‘Does the camera have to focus before you take the shot?’” he recalls. He had a strong hunch the answer was no, and he immediately set out to prove it.
Once he hit on the idea for his camera system with multiple lenses inside, Ng started tearing apart and rejiggering conventional digital cameras to build prototypes. When he wasn’t screwing together camera parts, he was networking to scrounge up the expertise, technology, and funding he needed. After about nine months, he finally found himself at his kitchen table assembling what he hoped would be his first fully functioning prototype capable of after-the-fact focus. It worked, and became the subject of Ng’s prize-winning PhD thesis.
Ng decided to start a company based on the technology. The easier path would have been to license it to one of the established camera manufacturers, such as Nikon or Canon, rather than trying to take them on. But he feared that a big company would simply try to add the technology to its existing cameras as an incremental improvement. “A transformational technology requires a transformational product,” he says. So he started Lytro, and after four years of stealthy development, the company’s first camera began shipping in February.
Lytro has raised over $50 million in investments. It is currently working on introducing software to expand the capabilities of the existing camera model, with the 3-D upgrade expected this year. A bit further down the road, says Ng, could be cameras that will take refocusable videos.
Mobile apps that tell you what you need to know before you have to ask.
PROBLEM: We’re forced to interact with smartphones in much the same way that we do with desktop computers—by selecting applications, typing in information, choosing from menus, hunting down snippets on websites, and clicking links. That’s okay at a desk, but it can be a huge inconvenience when you’re dealing with a tiny screen on the go.
SOLUTION: Hossein Rahnama, research and innovation director of the Digital Media Zone at Toronto’s Ryerson University, decided that smartphones ought to offer us useful information where and when we need it.
Through his startup, Flybits, Rahnama is laying the technical groundwork for a wave of mobile software that can identify and respond to contextual cues like location and time of day—and integrate them with information such as a user’s travel itinerary. It can then guess at what information would be most relevant to display, such as directions to a car-rental counter when you get off the plane after arriving at an airport.
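A context-aware app of this kind can be pictured as a list of rules evaluated against the current context. The sketch below is a hypothetical illustration, not the Flybits platform: the `Context` fields, the rule list, and the messages are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """Contextual cues available on the phone (hypothetical fields)."""
    location: str
    hour: int
    itinerary: frozenset


# Each rule pairs a predicate over the context with a message to show.
RULES = [
    (
        lambda c: c.location == "airport_arrivals" and "car_rental" in c.itinerary,
        "Directions to the car-rental counter",
    ),
    (
        lambda c: c.location == "transit_station" and 6 <= c.hour < 10,
        "Morning departure times",
    ),
]


def suggest(context, rules=RULES):
    """Return the message of the first rule that matches, or None."""
    for predicate, message in rules:
        if predicate(context):
            return message
    return None


arrival = Context("airport_arrivals", 14, frozenset({"flight", "car_rental"}))
suggestion = suggest(arrival)
```

For the arriving traveler above, `suggestion` is the car-rental directions; a context that matches no rule simply yields nothing to display.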
Others have been working on so-called context-aware computing, but Rahnama’s software platform is already being used as the basis for inexpensive, commercially practical applications that also protect privacy. Several Canadian airports and the transit systems in Toronto and Paris have used the Flybits platform to create apps that automatically serve up personalized, location-keyed guidance to travelers, and a small U.K. telecommunications company is using it to develop apps that can route calls to the appropriate number to help you avoid roaming fees (for example, it knows to send your mom’s call to your hotel landline rather than your cell if it detects that you’re overseas).
Flybits can also make it easier to find the people most relevant to your location and interests. The company is rolling out a service called Flybits Lite that prompts users to form spontaneous social networks limited to a certain space, such as the office or a concert. So eventually, after you’ve navigated the Metro to the Louvre, perhaps you can find out who else is there to admire the Mona Lisa.
His ultracheap computer is perfect for tinkering.
Eben Upton thought a new generation of youngsters might never develop valuable hardware and software hacking skills unless they had access to cheap, hobbyist-friendly computers. So he set out to build one himself. The resulting tiny box, which sells for just $25, has been a big hit. It could boost computer skills not only among children but among adults in poor countries as well.
Upton came up with the idea in 2006, when he was finishing his PhD in computer science at the University of Cambridge. Having agreed to help out with undergraduate computer science admissions, he was looking forward to interacting with teenagers who loved messing around with computers as much as he had when he was younger.
Upton had done all that messing around partly for the thrill of bending the machines to his will, and partly because the 1980s boom in video games had made it easy to imagine making a fortune working with computers. “I was a mercenary child,” he says, sounding a bit apologetic. “One of the things that drew me to computing was that there were 15-year-old kids who made so much money from computing they actually bought Ferraris.”
To judge by the applicants Upton was looking at, however, kids had lost interest. They were still messing around on computers, but they weren’t messing around with them. They weren’t writing programs and taking apart circuit boards. They were the kinds of kids who played World of Warcraft and exchanged cat pictures on Facebook. They had changed from active hackers to passive consumers.
Perhaps the dot-com bust had killed some of the enthusiasm for hacking. But to Upton, one other possible factor loomed large. In the 1980s, he and his friends had learned basic computer science on a BBC Micro, a line of computers built for the British Broadcasting Corporation by Acorn Computers and installed in most English schools. Small, rugged, inexpensive, and expandable, the Micro introduced a generation of British children to hardware engineering and software programming.
There was no contemporary equivalent to the Micro. “Sure, everyone in the middle class has a PC,” Upton says. “But even then, often there is only the one family PC. You won’t let kids screw around with it.” Schools aren’t going to let students take apart their machines, either. As a result, he observes, “computing” classes teach children how to use Microsoft Word and PowerPoint. “Even Microsoft wants schools to produce software engineers,” Upton says. To successfully restore literacy in computer tinkering, he decided, the world needed a modern analogue of the BBC Micro.
Being a hardware guy at heart, Upton went ahead and built a prototype of a next-generation hobbyist machine—the sort of stripped-down device that would enable its users to become acquainted with the guts of a computer. It would also allow its users to put the machine to work in projects ranging from robotics to wearable computing to gaming. He eventually took up a Cambridge professor’s suggestion to call his device Raspberry Pi, tipping his hat to the old tech tradition of naming computers after fruit. But he didn’t immediately see a way to produce Raspberry Pi in sufficient numbers to make a difference, so he reluctantly mothballed the project.
After finishing his PhD, Upton went to work at the Cambridge, U.K., office of Broadcom, a networking company based in Southern California. (He is now one of the company’s technical directors for Europe.) Upton was instrumental in the creation of Broadcom’s first microprocessor intended for multimedia applications—the BCM2835. Released in 2011, it is a single chip that’s small enough to fit in a phone but big enough to contain vital parts such as a central processing unit and a graphics processor. By some measures it was the most powerful chip in the mobile market at the time, and it was a tremendous success for Broadcom.
It was also, Upton realized, the way to restart Raspberry Pi, given that a single-chip computer would be much less costly to produce. He and half a dozen volunteers worked on the new version on evenings and weekends. But the BCM2835 wasn’t easy to deal with: it was dauntingly jammed with tiny components, including no fewer than five power supplies.
To keep Raspberry Pi small and cheap, the team wanted to build it on a single circuit board that could be stamped out, no further assembly required. But to enable the phone chip to work with computer peripherals and run full-scale computer software, they would, it appeared, need to build a board with more than eight stacked layers of circuitry, a prohibitively complex and expensive proposition. Working furiously to simplify the circuitry, the team eventually managed to shave the board design down to six layers.
The first prototypes were ready in December 2011, but Upton discovered, to his horror, that they didn’t work at all. Fighting panic at the thought of all the various subtle flaws that might be buried in all those layers of tangled circuitry, the team discovered that one pin on the chip had been inadvertently disconnected. It was a blessedly easy fix, and within minutes, his invention was popping to life.
The Raspberry Pi is strikingly unlike other computers. About the size of an Altoids box, the computer has no keyboard, monitor, or disk drive—it doesn’t even have an internal clock or an operating system. In other words, the machine requires a fair amount of hardware and software tinkering just to get started. It almost dares you to take it on and try to hack together a robot or gaming system.
It can’t get by on looks. Lacking a case, the Raspberry Pi offers a dense, bristling cluster of tiny electronics to the owner’s view, with five ports: HDMI, to hook the computer up to a television; USB, to hook it to multiple devices; Ethernet, for data; and analog TV and analog stereo. But having to face the guts of the device is a good thing, according to Upton. “Kids can see what they ordinarily can’t see, unless they smash a phone,” he says.
The really surprising feature of the Raspberry Pi is the $25 price: about a tenth the cost of the lowest-priced computers available in stores (if you ignore tablets, which no one can hack anyway).
It was intended for kids, but hackers of all ages wanted it, and so did budding computer scientists in poor countries. Almost the instant the Raspberry Pi went on sale, orders crashed the websites of its two vendors, RS Components and Premier Farnell. The companies reported that they were taking in orders fast enough to tear through the entire initial stock of 10,000 computers in minutes.
Thrilled with the reception, Upton is making more of the devices through a nonprofit Raspberry Pi Foundation he put together—his mercenary tendencies having abated over the years. In fact, he says, he intends to sell two million Raspberry Pis a year in order to reach a critical mass that will support an active community of owners to share tips and applications. He also hopes that the existence of this community will prompt schools to adopt the Raspberry Pi for courses.
Even more important, Upton hopes, is that kids start to take them apart. “That would be real success,” he says.
Spotting tiny problems with help from an ultrafast camera.
Nothing moves too fast for Andreas Velten’s camera—not even light. Last year Velten, who built the camera while a postdoc at the MIT Media Lab’s Camera Culture Group, made a video of laser light zipping through a plastic soda bottle. Capturing the equivalent of 600 billion frames per second, the slow-mo footage showed a ghostly light moving from one end of the bottle to the other. Equally remarkable, the camera can harness light reflected off surfaces to see around corners. Because the camera is so fast, it can detect how long it takes the different light rays to reach it, and an image can be reconstructed from that information.
It’s not just amazing gimmickry. Velten’s technology could lead to ultrafast medical imagers and scanners that use light instead of sound to detect tiny imperfections, whether in cancerous tissue or in airplane wings. It also suggests an approach to taking high-quality photos of scenes lit only by the tiny flash on a cell phone.
Velten’s table-mounted camera uses 672 carefully positioned and timed optical sensors, each capable of capturing a trillionth of a second’s worth of reflected laser light. The technical advance was figuring out how to modify a streak camera, a common piece of equipment in chemistry labs that measures the optical properties of laser light. That type of camera can capture only one horizontal line, or “streak,” of light at a time. Velten, combining his expertise in optics and computer science, developed custom software to repeat the scan over and over and combine the resulting data.
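A little arithmetic shows why these speeds matter. The sketch below computes how far light travels between frames at 600 billion frames per second, and how a measured arrival delay converts to a path length. It uses the speed of light in a vacuum, which is an assumption; light moves more slowly through plastic or water.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458  # in a vacuum
FRAMES_PER_SECOND = 600e9             # figure quoted for the video

# Time between consecutive frames, and the distance light covers in it.
frame_interval_s = 1 / FRAMES_PER_SECOND
distance_per_frame_mm = SPEED_OF_LIGHT_M_PER_S * frame_interval_s * 1000


def path_length_m(arrival_delay_s):
    """Return the total distance light traveled for a given arrival delay."""
    return SPEED_OF_LIGHT_M_PER_S * arrival_delay_s


# Path length corresponding to a trillionth of a second.
picosecond_path_mm = path_length_m(1e-12) * 1000
```

Light covers roughly half a millimeter between frames, and a trillionth of a second corresponds to about 0.3 millimeters of travel. Timing differences that small are what allow an image to be reconstructed from how long different rays take to arrive.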
Now at the Morgridge Institute at the University of Wisconsin, Velten is applying his ultrafast imaging techniques to help develop new types of microscopy and biomedical imaging for clinical applications. One of the tools he envisions, for example, is a less invasive endoscope that could travel shorter distances to see deeper inside the body.