TR: What is the solution?
RUDOLPH: Before pervasive computing, I had been working on parallel processing, where it is well known that debugging is a nightmare. In a parallel computing system you have to handle ten, twenty, a hundred, a thousand, ten thousand processors. But those processors are all the same. In pervasive computing, on the other hand, we're talking about lots and lots of pieces that are all different: different technologies, different generations, different software. How do I debug that? IBM has been pushing something they call autonomic computing: techniques to do self-reflection, self-healing, and other things with beautiful names. The computer automatically finds what's wrong and fixes it. I think we're really far from that. But the one thing we do know is that when something was working and it suddenly stopped working, it's because something has changed. So our systems should at least give the human a chance to find the problem. It is easier to tell what has recently changed than to decide if that change is right or wrong.
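The idea of reporting what has recently changed, without judging whether the change is right or wrong, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the snapshot keys (`tv_firmware`, `phone_link`, and so on) and the `recent_changes` helper are hypothetical and do not come from the Oxygen project.

```python
def recent_changes(before, after):
    """Report what differs between two snapshots of system state.

    Both arguments are plain dicts mapping a (hypothetical) component
    name to its observed state. The function only lists differences;
    deciding whether a change is harmful is left to the human.
    """
    added = {k: after[k] for k in after.keys() - before.keys()}
    removed = {k: before[k] for k in before.keys() - after.keys()}
    modified = {
        k: (before[k], after[k])
        for k in before.keys() & after.keys()
        if before[k] != after[k]
    }
    return {"added": added, "removed": removed, "modified": modified}


# Hypothetical snapshots taken while the system worked and after it stopped.
working = {"tv_firmware": "1.2", "phone_link": "wired", "codec": "mp3"}
broken = {"tv_firmware": "1.3", "phone_link": "wired", "browser": "running"}

changes = recent_changes(working, broken)
print(changes)
```

The output is a short list of candidates for the person to inspect, which is the modest goal described above: narrow down where to look rather than repair anything automatically.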
TR: Can you make a machine send the user a message saying it's going to fail?
RUDOLPH: Things change all the time and mostly it is normal behavior. We are trying to develop systems that can figure out typical patterns of behavior for individual system components and communication links. These systems can learn the common patterns in communication connections, and the typical patterns of input and output values of certain processes.
TR: Can you give an example of this?
RUDOLPH: Suppose my computer music system starts having annoying pauses. It might be due to a network congestion problem because I started up a Web browser. Or, if I'm listening to a CD, it might be due to a scratch on the disc. In the first case, the system will notice a change in the communication rates, whereas in the second case it might notice a change in the values of the audio stream itself. So in the first case, starting up a browser, the system may recognize that this is typical behavior and the user should just wait for the connection to get better. But if the cause is a scratch, then the user should be told to examine the disc and the CD player. The user has a hope, sometimes a very slim hope, of knowing where to look for the problem.
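A minimal sketch of the music example, assuming each monitored quantity has a numeric history from which a "typical" range can be estimated. The `Baseline` class, the three-standard-deviation threshold, and the sample numbers are all illustrative assumptions, not the method used in the research described here.

```python
from statistics import mean, stdev


class Baseline:
    """Typical behavior of one measurement, estimated from its history."""

    def __init__(self, history, threshold=3.0):
        self.mu = mean(history)
        self.sigma = stdev(history)
        self.threshold = threshold  # assumed cutoff, in standard deviations

    def is_unusual(self, value):
        if self.sigma == 0:
            return value != self.mu
        return abs(value - self.mu) / self.sigma > self.threshold


def diagnose(link_rate, audio_level, link_baseline, audio_baseline):
    """Point the user toward the part of the system that changed."""
    if link_baseline.is_unusual(link_rate):
        return "network"
    if audio_baseline.is_unusual(audio_level):
        return "audio source"
    return "no change detected"


# Hypothetical histories: link throughput and average audio sample level.
link = Baseline([100, 98, 102, 101, 99])
audio = Baseline([0.50, 0.52, 0.48, 0.51, 0.49])

print(diagnose(40, 0.50, link, audio))   # throughput dropped
print(diagnose(100, 0.00, link, audio))  # audio values changed
```

A drop in the communication rate points toward the network, while a change in the audio values points toward the disc or player. A fuller system would also have to learn that a browser starting up is itself typical behavior, which this sketch does not attempt.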
TR: Does the fact that devices are going wireless make things more difficult?
RUDOLPH: Yes. Imagine a television set that can answer my telephone. I'm watching TV and the telephone rings. I answer the call using the TV, which activates my TiVo digital video recorder. Suppose that suddenly the TV starts ringing nonstop, and I want to disconnect the telephone from the TV. How do I do that? If I'm lucky, there's a plug on the telephone outlet in the wall going into the TV. So I could just disconnect that plug. Very soon, though, we're not going to have wires anymore. The communication will all be wireless: 802.11, Bluetooth, whatever. I might have to stand in front of an annoying, ringing TV fumbling with buttons trying to disconnect the telephone.
TR: Why didn't engineers think about failure-detecting systems before?
RUDOLPH: Before the Internet, people built systems that were very well engineered, such as the telephone network. AT&T understood its behavior and owned the whole system. Then things like the Internet came around. Now no one owns the whole thing: it's too big, it's too distributed. We are no longer able to engineer the whole world. We can't rebuild the Internet. What's great about the Oxygen experience is that we're building new systems, so we can try to do something right from the start without the pressure of having to meet release dates. Universities have time to do something right.
TR: That's why MIT is doing this kind of development work, instead of the companies that will sell the products?
RUDOLPH: That's right. Academia has an important role here in that we are helping to figure out how to build systems defensively. Nokia is a partner of Oxygen. Nokia cares a lot about security and privacy. But how much is Nokia willing to spend on research on security and privacy when it knows that teenage girls dominate the cell phone market? They are not worried about privacy and security; they care about color and style and games and other features. So if Nokia spends a lot of research money on security and privacy and some other company spends its research money on the finicky tastes of teenagers, Nokia is going to lose market share. On the other hand, if MIT, Stanford, Boston University, or any university can develop a solid system with security and privacy and make it public, it would be much easier for Nokia to incorporate that technology.
TR: What have you done so far?
RUDOLPH: We're just starting. We talked about the example of a telephone talking to a TV. But how can grandma use this system? When there's no wire, how does grandma know that the phone is talking to the computer? And how does she stop it? Does she find the IP address of the telephone and delete it? No, she's not going to do that. One possible solution: you hold up a handheld device with a camera and have it view the room. Whenever it sees devices it can figure out what they are, so it knows, for instance, that it is pointing at a TV, or a telephone. Then it consults a database and concludes, "that telephone is talking to that TV." So now we can give feedback to grandma, probably visually. You can use the image of the room and overlay a blue line connecting the telephone and the TV. And then by touching the screen you can choose to break that connection.
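The lookup step of the handheld idea can be sketched as follows. This assumes the camera has already recognized which devices are in view; the connection "database" is a plain list of device pairs, and every name here (`visible_links`, `break_link`, the device labels) is hypothetical rather than part of any real Oxygen interface.

```python
def visible_links(devices_in_view, connections):
    """Return the connections whose two endpoints are both in the camera view.

    These are the pairs the display would join with an overlay line.
    """
    seen = set(devices_in_view)
    return [pair for pair in connections if pair[0] in seen and pair[1] in seen]


def break_link(connections, pair):
    """Return the connection list with the chosen pair removed.

    The pair is matched regardless of the order of its endpoints.
    """
    return [c for c in connections if set(c) != set(pair)]


# Hypothetical connection database for one room.
connections = [("telephone", "tv"), ("tv", "tivo"), ("laptop", "printer")]

# Devices the (assumed) recognizer reports for the current camera image.
in_view = ["tv", "telephone"]

links = visible_links(in_view, connections)
print(links)  # pairs to draw as lines over the image

# The user touches the line on screen to break that connection.
connections = break_link(connections, links[0])
print(connections)
```

The hard parts, recognizing devices in an image and actually tearing down a wireless link, are outside this sketch. It only shows how a table of connections turns a set of recognized devices into something a person can see and act on.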
TR: How do you plan to simulate failures in the systems you are developing?
RUDOLPH: We don't have to. They just happen!