Information firehose: The standard practice for teaching a machine-learning algorithm is to give it all the details at once. Say you’re building an image classification system to recognize different species of animals. You show it examples of each species and label them accordingly: “German shepherd” and “poodle” for dogs, for example.
But when a parent is teaching a child, the approach is entirely different. They start with much broader labels: any species of dog is at first simply “a dog.” Only after the child has learned how to distinguish these simpler categories does the parent break each one down into more specifics.
Dispelled confusion: Drawing inspiration from this approach, researchers at Carnegie Mellon University created a new technique that teaches a neural network to classify things in stages. In each stage, the network sees the same training data. But the labels start simple and broad, becoming more specific over time.
To determine this progression of difficulty, the researchers first showed the neural network the training data with the final detailed labels. They then computed what’s known as a confusion matrix, which shows the categories the model had the most difficulty telling apart. The researchers used this to determine the stages of training, grouping the least distinguishable categories together under one label in early stages and splitting them back up into finer labels with each iteration.
Better accuracy: In tests with several popular image-classification data sets, the approach almost always led to a final machine-learning model that outperformed one trained by the conventional method. In the best-case scenario, it increased classification accuracy up to 7%.
Curriculum learning: While the approach is new, the idea behind it is not. The practice of training a neural network on increasing stages of difficulty is known as “curriculum learning” and has been around since the 1990s. But previous curriculum learning efforts focused on showing the neural network a different subset of data at each stage, rather than the same data with different labels. The latest approach was presented by the paper’s coauthor Otilia Stretcu at the International Conference of Learning Representations last week.
Why it matters: The vast majority of deep-learning research today emphasizes the size of models: if an image-classification system has difficulty distinguishing between different objects, it means it hasn’t been trained on enough examples. But by borrowing insight from the way humans learn, the researchers found a new method that allowed them to obtain better results with exactly the same training data. It suggests a way of creating more data-efficient learning algorithms.
Yann LeCun has a bold new vision for the future of AI
One of the godfathers of deep learning pulls together old ideas to sketch out a fresh path for AI, but raises as many questions as he answers.
Inside a radical new project to democratize AI
A group of over 1,000 AI researchers has created a multilingual large language model bigger than GPT-3—and they’re giving it out for free.
Sony’s racing AI destroyed its human competitors by being nice (and fast)
What Gran Turismo Sophy learned on the racetrack could help shape the future of machines that can work alongside humans, or join us on the roads.
DeepMind has predicted the structure of almost every protein known to science
And it’s giving the data away for free, which could spur new scientific discoveries.
Get the latest updates from
MIT Technology Review
Discover special offers, top stories, upcoming events, and more.