Hot Commodity

For Andrej Karpathy, a Stanford PhD candidate who makes software that can imitate us, landing a job after graduation was easy.

Katherine Bourzac

March 28, 2016

Andrej Karpathy is holding a classroom full of Stanford grad students and undergrads rapt with his description of the pros and cons of different kinds of algorithms used in training a neural network to recognize objects in an image. Suddenly, from the middle of the room, the distinctive artificial voice of Apple’s Siri pipes up: “I’m not sure what you said.”

Siri, probably activated accidentally, draws big laughs. In this room, where students are deep into the intricacies of learning how to make software that better understands humans and our data, the error message is a reminder of the technology’s exploding real-world applications.

There’s a huge demand for AI experts from deep-pocketed companies like Siri’s parent Apple, as well as from IBM, Google, and Facebook. As a consequence, the students in Karpathy’s class are likely to graduate into a favorable job market. It’s not uncommon these days for big companies to buy whole startups to get the talent. Competition is so stiff that smaller companies are starting to broaden recruiting beyond computer science majors to fields like cosmology and physics. At AI startup Maluuba, CEO Sam Pasupalak has research recruitment specialists poring over the academic papers published every day, looking for authors who might make good staffers, and going to conferences to buttonhole leading researchers after their talks. Joshua Clarke, a partner at recruiter Heidrick & Struggles, says an AI background commands a high premium today because technology companies aren’t the only ones competing for these candidates. Fortune 500 companies are also assessing how AI will affect their businesses.

No one better personifies the war for AI talent than Karpathy himself. The 29-year-old PhD student is a rising star in the field of neural networks, a trendy area of artificial intelligence. When he graduates in May, he’ll become one of the founding researchers at OpenAI, a nonprofit research startup. Karpathy has seen what it’s like working at startups, and he’s spent two summers at behemoth Google. OpenAI, which offers the ability to build a new institution from the ground up, also promises the intellectual freedom of academia and the money to make the work possible, he says. OpenAI has already announced $1 billion in donations from Peter Thiel, Elon Musk, and companies including Amazon Web Services.

Karpathy has been interested in computers as far back as he can remember. When he was just five or six years old in Kosice, Slovakia, he begged his parents for a PC; he was the first person in town to get one. He remembers playing games and making pictures with MS Paint. “Programming is an act of creation, too,” he says.

After moving to Canada as a teenager, Karpathy enrolled in the University of Toronto, expecting to work on quantum computers. He changed his mind after taking a class from machine-learning expert Geoffrey Hinton, a pioneer in programming neural networks.

While older approaches to AI gave computers smarts through brute-force data searches, says Karpathy, neural networks are designed to learn in a way that’s analogous to the brain. These programs make associations and recognize patterns, enabling them to beat other kinds of AI technology in tests of image recognition, drug discovery, and Siri’s bread and butter—listening to and speaking like humans.

Making computers that can learn and understand more like people is “the ultimate meta-problem” in computing, says Karpathy. If computers can combine humanlike understanding with their ability to store and access tremendous amounts of data, he says, AI will pave the way for great progress in robotics, self-driving cars, security systems that recognize faces and voices, art, and just about anything you can think of.

It was through a side project he took on while working on his PhD that Karpathy drew the attention of Greg Brockman, founder of OpenAI.

For fun, Karpathy had programmed a neural network that can learn to generate text in any style—Shakespeare’s, Obama’s, whatever it’s trained on. One piece of code just 100 lines long can find patterns in poems, mathematics, or any stream of symbols, Karpathy says. His network can then produce strings of characters in that style. To a human reading even somewhat closely, what the network currently produces is mostly nonsense with a ring of Shakespeare or presidential oratory. But Karpathy says it gets better and better the more training text it’s fed.
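To give a feel for the idea of learning a style from a stream of characters, here is a minimal sketch in Python. It is not Karpathy's code and it is not a neural network: as a much simpler stand-in, it counts which character tends to follow each short run of characters and then samples from those counts. The function names, the three-character context, and the tiny training text are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def train_char_model(text, order=3):
    """Record, for every `order`-character context, the characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context].append(text[i + order])
    return model


def generate(model, seed, length=80, rng=None):
    """Extend `seed` one character at a time by sampling from the recorded followers.

    The seed must be exactly as long as the context size used in training.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is repeatable
    order = len(seed)
    out = seed
    for _ in range(length):
        followers = model.get(out[-order:])
        if not followers:  # context never seen in training: stop early
            break
        out += rng.choice(followers)
    return out


# Hypothetical toy training text; a real experiment would use far more.
corpus = "to be, or not to be, that is the question. " * 3
model = train_char_model(corpus, order=3)
sample = generate(model, seed="to ", length=40)
print(sample)
```

Like the network described above, this toy produces text with the ring of its source rather than its sense. A recurrent neural network differs in that it learns a compressed internal representation instead of a lookup table, which is what lets it pick up longer-range patterns as it sees more text.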

Karpathy’s decision to post the network’s underlying code online for anyone to use impressed Brockman. “Engaging the public is one way OpenAI hopes to get progress in AI and machine learning,” Brockman says.

Once Brockman had Karpathy on the list of people he’d like to bring on board for OpenAI, he began to leverage each new hire to entice Karpathy to join. “The best people want to work with the best people,” Brockman says. Indeed, Karpathy says, he is generally recruited by engineers he knows, and he won’t answer recruiter calls. A key hire was John Schulman, a recent PhD graduate from the University of California, Berkeley. Once Schulman said he was going to work with Brockman, Karpathy says, he knew the project was serious. OpenAI’s focus on creativity and on AI’s potential to benefit humanity was also appealing. “We want to make sure no company has a monopoly on AI, and guide the field in the most beneficial way for the general public,” says Karpathy.

In class, Karpathy has a knack for bringing the technology to life. After 60 minutes parsing the pros and cons of image-processing algorithms, he describes a Google project that reveals which parts of an image a neural network prioritizes as it identifies objects on view. Up on the screen pops a funny photo of a sheep that the program enhanced with a dog’s face. The data sets used to train neural networks contain so many images of dogs that “neural networks end up hallucinating dogs,” he says.
