Startups Aim to Exploit a Deep-Learning Skills Gap
Entrepreneurs see an opportunity to optimize deep-learning code to run on graphics processing chips.
Algorithms that allow machines to learn from data could revolutionize many industries, but making them perform efficiently is difficult.
The latest machine-learning techniques promise to transform whole industries by making it easier for computers to recognize patterns in data, to make accurate predictions, and to generally behave more intelligently. Unfortunately, the experts capable of crafting and optimizing the code needed to make this magic possible are in pretty short supply.
Such is the demand for machine-learning talent, in fact, that startups see an opportunity to offer deep technical expertise to companies hoping to harness AI, from financial and insurance firms to Web startups and carmakers. A few startups now offer to accelerate machine-learning algorithms so that they run well on arrays of computer chips. At least one is designing its own computer chips to squeeze the best performance out of the latest algorithms.
The technique at the center of the current boom in AI, called deep learning, relies on simulating large, multilayered webs of virtual neurons, which enable a computer to learn to recognize abstract patterns, such as cats in images (see “10 Breakthrough Technologies 2013: Deep Learning”). Training such a network involves performing many parallel calculations, usually across large arrays of graphics processing units (GPUs), which are well suited to the computational task. And while the basic principles of deep learning are straightforward, configuring these networks so that they learn efficiently and run quickly across many GPUs requires significant expertise.
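The layered structure described above can be illustrated with a minimal sketch (not from the article): a tiny two-layer network, written here in plain Python, learning the XOR function, which no single-layer network can represent. Production systems run these same forward and backward passes over millions of weights, as large matrix operations parallelized across GPUs.

```python
import math
import random

random.seed(0)

def sigmoid(x):
    """Squashing nonlinearity applied by each virtual neuron."""
    return 1.0 / (1.0 + math.exp(-x))

# XOR truth table: not linearly separable, so a hidden layer is required.
data = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

H = 4                                    # hidden-layer width
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(H)]
b1 = [0.0] * H
w2 = [random.uniform(-1, 1) for _ in range(H)]
b2 = 0.0
lr = 0.5                                 # learning rate

def predict(x1, x2):
    """Forward pass: inputs -> hidden layer -> output neuron."""
    h = [sigmoid(w1[j][0] * x1 + w1[j][1] * x2 + b1[j]) for j in range(H)]
    return sigmoid(sum(w2[j] * h[j] for j in range(H)) + b2)

def total_loss():
    return sum((predict(x1, x2) - y) ** 2 for (x1, x2), y in data)

loss_before = total_loss()

# Training: repeated forward and backward passes, nudging every weight
# downhill on the squared error (plain stochastic gradient descent).
for epoch in range(5000):
    for (x1, x2), y in data:
        h = [sigmoid(w1[j][0] * x1 + w1[j][1] * x2 + b1[j]) for j in range(H)]
        o = sigmoid(sum(w2[j] * h[j] for j in range(H)) + b2)
        do = (o - y) * o * (1 - o)       # error gradient at the output neuron
        for j in range(H):
            dh = do * w2[j] * h[j] * (1 - h[j])  # error backpropagated to hidden unit j
            w2[j] -= lr * do * h[j]
            w1[j][0] -= lr * dh * x1
            w1[j][1] -= lr * dh * x2
            b1[j] -= lr * dh
        b2 -= lr * do

loss_after = total_loss()
print(loss_before, loss_after)
```

Note that every tweak to the architecture, say, more hidden units or a different nonlinearity, means rerunning the entire training loop from scratch; at industrial scale those inner loops become the hours- or days-long GPU workloads that the startups in this article are trying to shorten.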
Speaking at the Neural Information Processing Systems (NIPS) conference in Montreal last month, Tijmen Tieleman, a Dutch machine-learning expert who studied in Geoffrey Hinton’s group and specializes in optimizing neural networks, said that it can take many hours or even days to train a deep-learning network on a large data set, and that each time a network is tweaked the training process must begin again.
Tieleman’s startup, Minds.ai, offers software libraries that let a deep-learning network communicate efficiently with graphics chips. This could help businesses perform cutting-edge deep learning without hiring top talent. A company training an algorithm for self-driving cars to recognize particular objects, for example, would normally need a team with strong technical expertise to do it efficiently. “When you build a serious neural network these days, it takes a long time to train it,” Tieleman says. “This is a very real concern, and we are training neural networks faster.”
Minds.ai has shown that its library can train a neural network more quickly than some other leading systems. The company benchmarked its software on AlexNet, a well-known deep-learning network designed for image recognition, and found it to be faster than 99 percent of other implementations.
Another company aiming to accelerate deep learning is Nervana Systems. The company plans to launch its own computer chip optimized for deep neural nets, as well as software libraries, in the coming year. CEO Naveen Rao, who previously designed chips for Sun Microsystems and Qualcomm, says the goal is not simply to speed up deep learning but to design a computer system around this machine-learning approach. “We set out to build a new architecture centered on neural networks,” he says. “But we also saw an opportunity to change how the computer looks from the architecture side.”
Companies like Minds.ai and Nervana may well find willing customers for now, but the market for deep learning is expanding and changing at a rapid pace. Big companies working on machine learning are now releasing their software frameworks and libraries, with the goal of establishing the standards that everyone will use (see “Facebook Joins Stampede of Tech Giants Giving Away Artificial Intelligence Technology”). And so, as the technology matures and more code is released, it may become easier for companies to build highly optimized deep-learning networks for themselves.
“Open source will eventually catch up with all the inefficiencies and potential optimizations,” says Rajan Goyal, a distinguished engineer at the chip maker Cavium, who is exploring the merits of designing silicon for deep learning.
However, Goyal says, companies like Google and Facebook, which have a huge vested interest in improving deep learning, will probably build their own computer chips for deep learning before long, perhaps by acquiring a company already working on such technology. “Currently, the deep-learning market is fragmented and in a nascent state,” he says. “GPUs served the initial need of market, but there is growing interest in more efficient solutions.”