Three Questions for Microsoft’s New Head of Research, Peter Lee

As Microsoft prepares to absorb Nokia’s handset business, a new research strategy emerges.

David Talbot

October 15, 2013

Microsoft’s new head of research, Peter Lee, is tasked with helping the company invent the future. His bosses hope that it will be one in which the computing giant has more than just 4 percent of the market for mobile operating systems.

Lee’s strategy is to funnel resources toward technologies he believes could revolutionize our relationships with computers, mobile and otherwise. He also faces the challenge of managing an increasingly rare breed in the computer industry: a large and sprawling corporate research division. Microsoft Research currently has 1,100 researchers and engineers in 13 labs around the world, from Cairo, Egypt, to New York City. A 14th is planned for Rio de Janeiro, Brazil, and Microsoft may be about to absorb Nokia’s research wing when it completes its takeover of the handset maker.

Lee took on his new position after running Microsoft Research’s flagship lab at the company’s headquarters in Redmond, Washington, and was previously head of the computer science department at Carnegie Mellon University. He recently spoke with David Talbot, chief correspondent of MIT Technology Review.

Can Microsoft Research reverse Microsoft’s failure to significantly dent the smartphone market?

We are committed to making sure that the best concepts in hardware, devices, and sensors are in phones. Those can make a big difference. There are other sensors and wearable technologies that we think are fairly promising. And we want to give the user a more natural interaction, with a phone that is aware of what you are doing.

MSR will be closely involved. The converged OS for Windows Phone 8 was really a skunkworks project that originally was a joint effort led by MSR, joined by a team in the phone group.

We have committed to bringing highly personalized machine learning technology into the phone. The soft keyboard technology in the Windows phone is widely considered to be the best: you see the keys on the touch screen, but the system is able to learn where your fingers actually type—depending on the word, and even on the sentence. That’s the reason why typing feels like it’s working better than it does on the iPhone.

The Nokia acquisition has not been concluded and we don’t know what happens to the Nokia research centers right now. But we do have a lot of exciting collaborations going on between Microsoft and Nokia. No matter what happens, we will keep those going. The smartphone battle isn’t over yet.

Many major IT companies—such as HP, Intel, Yahoo, and Nokia—have killed or scaled back research units. What did they do wrong that Microsoft is doing right?

It’s hard for me to say what went wrong in those other places, but for Microsoft it is easier to get to the scale to make research work. We can make a lot of different bets. At the same time, in the context of Microsoft, we are very small, about 1 percent of the company in terms of number of employees.

To pick one success story: in the mid-1990s, we did research on the cocktail-party conversation problem. How do you hear someone in a noisy environment? People use adaptive beam-forming to focus on someone’s voice; you cock your head a little to create different times-of-flight to your ears, and your mind processes these to focus on the person’s voice. In 2003, we built a nine-microphone array to mimic that and tested it in people’s homes. Then Alex Kipman [the product leader who developed the Kinect gestural interface] came by and said, “I want that!” and asked if it could be built into a four-microphone array in Kinect sensors. We embedded with the product team and delivered it for Christmas in 2010. Now if you use a Kinect and wave your hand, the microphone array will focus on your mouth and you can control it with speech, in a noisy environment, without yelling.
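For readers curious about the time-of-flight idea Lee describes, here is a minimal sketch of fixed delay-and-sum beamforming, the simplest relative of the adaptive technique he mentions. It is not Microsoft's or Kinect's implementation. The four-microphone setup, the signal frequencies, the sample delays, and every function name are assumptions chosen for illustration.

```python
import math

def delayed(signal, d):
    """Return the signal as a microphone hears it d samples late (zero-padded)."""
    return [0.0] * d + signal[: len(signal) - d]

def delay_and_sum(channels, delays):
    """Undo each microphone's arrival delay, then average the aligned channels."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    return [
        sum(ch[i + d] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(n)
    ]

def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(x * x for x in signal) / len(signal)

# Hypothetical scene: a slow "voice" tone and a faster "noise" tone reach
# four microphones from different directions, so their delays differ.
N = 400
voice = [math.sin(2 * math.pi * 5 * i / N) for i in range(N)]
noise = [math.sin(2 * math.pi * 50 * i / N) for i in range(N)]
voice_delays = [0, 1, 2, 3]  # assumed arrival delays of the voice, in samples
noise_delays = [0, 3, 6, 9]  # assumed arrival delays of the noise, in samples

# Each microphone records the delayed voice plus the delayed noise.
channels = [
    [v + n for v, n in zip(delayed(voice, dv), delayed(noise, dn))]
    for dv, dn in zip(voice_delays, noise_delays)
]

# Steer toward the voice: its copies line up, the noise copies do not.
steered = delay_and_sum(channels, voice_delays)
residual = [s - v for s, v in zip(steered, voice)]
single_mic_residual = [c - v for c, v in zip(channels[0], voice)]

print("noise power at one microphone:", power(single_mic_residual))
print("noise power after steering:   ", power(residual))
```

Aligning the channels on the voice's delays makes the voice add up coherently, while the noise, arriving with a different delay pattern, is averaged out of phase and largely cancels. An adaptive beamformer would estimate those delays or channel weights from the audio itself rather than take them as given.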

What are your biggest research bets right now?

Machine learning is the really big one. It is our number one investment. We think we’re well within reach of solving speech recognition, making a big dent in translation, and devices that see and hear with capabilities approaching human ones. So, for example, a camera could understand what is being said and what it is looking at. A photo could include this extra information. Or [a phone] could look at a plate of food and understand what it is, to help your diet and monitor your health.

We are expanding our activity in quantum computing pretty dramatically now. I predict that within five years, there will be a Nobel Prize related to quantum computing, for the basic science and physics of the ability to encode and compute using quantum effects. These are becoming technologies that will be the equivalent of a transistor for a new age. And it will help with other major efforts in security and privacy.
