Software that Learns by Watching

KarDo learns how to perform common IT support tasks by observing what the experts do.

Duncan Graham-Rowe

May 19, 2010

Overworked and much in demand, IT support staff can’t be in two places at once. But software designed to watch and learn as they carry out common tasks could soon help–by automatically performing the same jobs across different computers.

The new software system, called KarDo, was developed by researchers at MIT. It can automatically configure an e-mail account, install a virus scanner, or set up access to a virtual private network, says Dina Katabi, an associate professor at MIT.

Crucially, the software needs to watch an administrator perform a task only once before it can carry out the same job on computers running different software. Businesses spend billions of dollars each year on simple and repetitive IT tasks, according to reports from the analyst groups Forrester and Gartner. KarDo could reduce these costs by as much as 20 percent, Katabi says.

In some respects, KarDo resembles software that can be used to record macros–a set sequence of user actions on a computer. But KarDo attempts to learn the goal of each action in the sequence so it can be applied more generally later, says MIT post-graduate Hariharan Rahul, who codeveloped the system.

When IT staff want KarDo to learn a new task, they press a “start” button beforehand and a “stop” button afterwards. During a “learning phase,” KarDo attempts to map each of the actions performed in the graphical user interface, such as clicking on particular icons or buttons, to system-level actions, such as starting or closing a program, or opening a Web page. This allows a task to be applied across machines running different software, says Katabi. “I can go to my desktop, click on the Internet Explorer icon, go to a website, and then click on a particular link to download a file,” she says. The same actions could then be applied by KarDo on a machine running a different Web browser, like Firefox or Chrome. KarDo compares actions performed during the learning phase with a database of other tasks.
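To make the idea concrete, here is a minimal sketch in Python of mapping recorded interface actions to application-independent steps and then replaying them with a different browser. It is an illustration only and not KarDo's actual implementation: the class names, the `GOAL_OF_ICON` table, the action kinds, and the example address are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuiAction:
    kind: str    # hypothetical kinds: "click", "navigate", "click_link"
    target: str  # icon label, Web address, or link text


@dataclass(frozen=True)
class SystemAction:
    verb: str          # hypothetical verbs: "launch_browser", "open_url", "download_link"
    argument: str = ""


# Hypothetical table: desktop icons that all serve the same system-level goal.
GOAL_OF_ICON = {
    "Internet Explorer icon": "launch_browser",
    "Firefox icon": "launch_browser",
    "Chrome icon": "launch_browser",
}


def generalize(trace):
    """Map each recorded GUI action to a system-level action."""
    steps = []
    for action in trace:
        if action.kind == "click" and action.target in GOAL_OF_ICON:
            steps.append(SystemAction(GOAL_OF_ICON[action.target]))
        elif action.kind == "navigate":
            steps.append(SystemAction("open_url", action.target))
        elif action.kind == "click_link":
            steps.append(SystemAction("download_link", action.target))
        else:
            raise ValueError(f"cannot generalize action: {action}")
    return steps


def replay(steps, browser):
    """Describe how the generalized steps would run with the given browser."""
    lines = []
    for step in steps:
        if step.verb == "launch_browser":
            lines.append(f"start {browser}")
        elif step.verb == "open_url":
            lines.append(f"{browser}: open {step.argument}")
        elif step.verb == "download_link":
            lines.append(f"{browser}: download via link '{step.argument}'")
        else:
            raise ValueError(f"unknown step: {step}")
    return lines


# The demonstration Katabi describes, recorded once with Internet Explorer.
recorded = [
    GuiAction("click", "Internet Explorer icon"),
    GuiAction("navigate", "http://example.com/tools"),  # placeholder address
    GuiAction("click_link", "Download installer"),      # placeholder link text
]

for line in replay(generalize(recorded), "Firefox"):
    print(line)
```

Because the generalized steps no longer mention Internet Explorer, the same list can be replayed with any browser name. A real system would also have to handle the case this sketch simply rejects: an interface action with no known system-level meaning.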

KarDo is able to reliably infer how to reproduce each of the subtasks after watching it being performed just once, says Rahul. For example, after watching an e-mail account being set up using Microsoft Outlook, it can do the same on other computers running different e-mail software. KarDo has been tested on hundreds of combinations of real tasks by IT staff at MIT and was found to get tasks right 82 percent of the time. When KarDo doesn’t perform a task correctly, the results aren’t serious, Katabi says.

The ultimate goal is for KarDo to intervene completely automatically, although this has not yet been tested. The idea is that when a user sends a request to the IT department, KarDo would perform the task automatically.

This sort of “programming by demonstration” is not a new idea, says Stephen Muggleton, an expert in machine learning at Imperial College London. But the approach has remained very much a research curiosity, he says. “An obvious concern from a user point of view will be the accuracy of the learned model,” says Muggleton. Normally it takes relatively large amounts of data to generate error-free machine learning models, he notes.

“There’s a great deal of promise in learning procedures and plans by watching,” says Eric Horvitz of Microsoft Research in Redmond, WA. However, in general, this is very challenging to pull off. It is usually hard to do anything useful without constraining the nature of the task, says Horvitz.

KarDo was announced last week as the winner of the Web/IT track of MIT’s $100K Entrepreneur Competition.
