Deep Neural Network Learns to Judge Books by Their Covers

A machine-vision algorithm can tell a book’s genre by looking at its cover. This paves the way for AI systems to design the covers themselves.

Emerging Technology from the arXiv

November 7, 2016

The idiom “never judge a book by its cover” warns against evaluating something purely by the way it looks. And yet book covers are designed to give readers an idea of the content, to make them want to pick up a book and read it. Good book covers are designed to be judged.

And humans are quite good at it. It’s relatively straightforward to pick out a cookery book or a biography or a travel guide just by looking at the cover.

And that raises an interesting question: can machines judge books by their covers, too? We already know they judge people by their faces.

Today we get an answer thanks to the work of Brian Kenji Iwana and Seiichi Uchida at Kyushu University in Japan. The pair have trained a deep neural network to study book covers and determine the category of book each one comes from.

Their method is straightforward. Iwana and Uchida downloaded 137,788 unique book covers from Amazon.com along with each book's genre. There are 20 possible genres, but where a book was listed in more than one category, the researchers used just the first.

Next, the pair used 80 percent of the data set to train a neural network to recognize the genre by looking at the cover image. Their neural network has four layers, each with up to 512 neurons, which together learn to recognize the correlation between cover design and genre. The pair used a further 10 percent of the data set to validate the model and then tested the neural network on the final 10 percent to see how well it categorizes covers it has never seen.
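To make the 80/10/10 division concrete, here is a minimal Python sketch of one way to split a data set of this size. It is an illustration only: the article does not say how the researchers shuffled or partitioned the covers, so the random shuffle, the seed, and the function name `split_dataset` are all assumptions, and integer IDs stand in for the cover images.

```python
import random

def split_dataset(items, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle items and split them into train, validation, and test lists.

    Assumption: a plain random shuffle. The original study may have
    partitioned its data differently.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train_frac)
    n_val = int(len(shuffled) * val_frac)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # whatever remains, roughly 10 percent
    return train, val, test

# Hypothetical stand-in for the 137,788 covers: integer IDs, not images.
covers = range(137_788)
train, val, test = split_dataset(covers)
print(len(train), len(val), len(test))
```

The point of keeping the three parts disjoint is that the test covers play no role in training or tuning, so the reported accuracy reflects covers the network has never seen.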

The results make for interesting reading. The algorithm listed the correct genre in its top 3 choices over 40 percent of the time and found the exact genre more than 20 percent of the time. That’s significantly better than chance. “This shows that classification of book cover designs is possible, although a very difficult task,” say Iwana and Uchida.
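To put those figures in context: with 20 genres, guessing uniformly at random would hit the exact genre 5 percent of the time and would include it in three distinct guesses 15 percent of the time. The Python sketch below shows how top-1 and top-3 accuracy can be computed from a classifier's per-genre scores. The function name `top_k_accuracy` and the toy scores are invented for illustration and are not taken from the paper.

```python
def top_k_accuracy(scores, labels, k):
    """Fraction of examples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices ordered from highest score to lowest.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(labels)

# Chance baselines for uniform random guessing among 20 genres.
NUM_GENRES = 20
chance_top1 = 1 / NUM_GENRES  # 0.05
chance_top3 = 3 / NUM_GENRES  # 0.15

# Toy example with 4 classes and 4 covers (made-up scores, not real results).
scores = [
    [0.60, 0.30, 0.10, 0.00],
    [0.10, 0.20, 0.30, 0.40],
    [0.25, 0.35, 0.30, 0.10],
    [0.40, 0.10, 0.20, 0.30],
]
labels = [0, 0, 2, 3]

print(top_k_accuracy(scores, labels, k=1))  # 0.25
print(top_k_accuracy(scores, labels, k=3))  # 0.75
```

Against those baselines, the reported results of more than 20 percent (exact genre) and over 40 percent (top 3) are several times better than random guessing.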

Some categories turn out to be easier to recognize than others. For example, travel books and books about computers and technology are relatively easy for the neural network to spot because book designers consistently use similar images and design for these genres.

The neural net also found that cookbooks were easy to recognize if they used pictures of food but were entirely ambiguous if they used a different design such as a picture of the chef.

Biographies and memoirs were also problematic, with the algorithm often selecting history as the category. Interestingly, for many of these books, history is the secondary genre listed on Amazon, suggesting that the algorithm wasn’t entirely bamboozled.

The algorithm also confused children’s books with comics and graphic novels, and medical books with science books. Perhaps that’s also understandable given the similarities between these categories.
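Confusions like these are usually found by counting how often each true genre is predicted as each other genre. The Python sketch below shows one simple way to do that tally. The helper names and the handful of labels are made up to mirror the examples in the text and are not data from the study.

```python
from collections import Counter

def confusion_counts(true_genres, predicted_genres):
    """Count how often each (true genre, predicted genre) pair occurs."""
    return Counter(zip(true_genres, predicted_genres))

def most_confused(counts, n=3):
    """Return the n most frequent pairs where the prediction was wrong."""
    errors = {pair: c for pair, c in counts.items() if pair[0] != pair[1]}
    # Sort by descending count, then alphabetically so ties are deterministic.
    return sorted(errors.items(), key=lambda kv: (-kv[1], kv[0]))[:n]

# Made-up labels for illustration only.
true_genres = ["Biography", "Biography", "Biography", "Children",
               "Children", "Medical", "Travel"]
predicted = ["History", "History", "Biography", "Comics",
             "Children", "Science", "Travel"]

counts = confusion_counts(true_genres, predicted)
for (true, pred), c in most_confused(counts, n=2):
    print(f"{true} predicted as {pred}: {c}")
```

In this toy data the most common mistake is a biography labeled as history, which is the kind of pattern the researchers describe.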

There is one shortcoming in this work. Iwana and Uchida have not compared the performance of their neural network against humans’ ability to recognize book genres by their covers. That would be an interesting experiment and one that would be relatively straightforward to do with an online crowdsourcing service such as Amazon’s Mechanical Turk.

Until that work is done, there is no way of knowing whether machines are any better at this task than humans. That said, no matter how good humans are at this task, it is surely only a matter of time before machines outperform them.

Nevertheless, this is interesting work that could help designers improve their skills when it comes to book covers. A more likely outcome, however, is that it could be used to train machines to design book covers without the need for human input. And that means book cover design is just another job that is set to be consigned to the history books.

Ref: arxiv.org/abs/1610.09204: Judging a Book by Its Cover
