AI Algorithm Identifies Humorous Pictures

The latest work with AI machines is expanding the new field of computational humor.

Emerging Technology from the arXiv

January 8, 2016

Humor is a uniquely human quality. Most people can recognize funny sentences, incidents, pictures, videos, and so on. But it is not always easy to say why these things are humorous.

So it’s easy to imagine that humor will be one of the last bastions that separate humans from machines. Computers, the thinking goes, cannot possibly develop a sense of humor until they can grasp the subtleties of our rich social and cultural settings. And even the most powerful AI machines are surely a long way from that.

That thinking may soon have to change. Today, Arjun Chandrasekaran from Virginia Tech and pals say they’ve trained a machine-learning algorithm to recognize humorous scenes and even to create them. They say their machine can accurately predict when a scene is funny and when it is not, even though it knows nothing of the social context of what it is seeing.

Psychologists have a relatively poor understanding of the mechanisms behind humor. Most theories of humor suggest that its key components are qualities such as unexpectedness, incongruity, pain, and so on. When one or more of these elements are present in sentences, pictures, and videos, the chances of raising a smile are increased.

Chandrasekaran and co limit their study to pictures. And to keep things simple, they confine themselves to pictures created with a clip art program. This contains 20 “paper doll” human models of various ages, genders, and races with moveable arms and legs and eight different expressions. It also contains 31 animals in various poses and around 100 indoor and outdoor objects such as doors, windows, tables, sun, clouds, trees, and so on.

A key part of any machine-learning process is creating a database that contains good examples of the thing the algorithm has to learn. This is no easy task, particularly when it comes to something as subjective as humor.

The team tackles this by asking workers on Amazon’s Mechanical Turk service to create funny scenes using the clip art program, along with a short sentence describing why they think the scenes are funny. They also asked these people—turkers, as they are called—to create unfunny scenes.

In this way, the team amassed a database of 6,400 images, half of which were funny and half unfunny. They calibrated the database by asking other turkers to rate the funniness of each scene and found that most fell into their intended categories, although a few unfunny scenes turned out to be unintentionally funny and vice versa.
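The calibration step described above can be pictured with a small sketch. It is only an illustration: the 1-to-5 rating scale, the threshold, the `find_mismatches` helper, and the ratings themselves are assumptions made up for this example, not details taken from the paper.

```python
from statistics import mean

def find_mismatches(scenes, threshold):
    """Return ids of scenes whose mean rating disagrees with the intended label.

    scenes: iterable of (scene_id, intended_funny, ratings) tuples.
    threshold: mean rating at or above which a scene counts as rated funny
               (an assumed cutoff, chosen for illustration).
    """
    mismatched = []
    for scene_id, intended_funny, ratings in scenes:
        rated_funny = mean(ratings) >= threshold
        if rated_funny != intended_funny:
            mismatched.append(scene_id)
    return mismatched

# Made-up ratings on an assumed 1-5 funniness scale.
scenes = [
    ("a", True, [4, 5, 4]),   # meant to be funny, rated funny
    ("b", False, [1, 2, 1]),  # meant to be unfunny, rated unfunny
    ("c", False, [4, 4, 5]),  # meant to be unfunny, rated funny
    ("d", True, [1, 1, 2]),   # meant to be funny, rated unfunny
]
print(find_mismatches(scenes, threshold=3))  # ['c', 'd']
```

Scenes "c" and "d" play the role of the few images that landed in the wrong category.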

After further analysis, it turned out that the scenes rated most funny were usually associated with animals or people doing something unusual.

That prompted the team to think about ways to alter the funniness of an image. One way to do this is to replace the object or person doing something unusual with a different object or person. So they asked turkers to replace objects with other objects that were as similar as possible to the first object, but that made the scene unfunny. “This helps us understand fine-grained semantics that causes a specific object category to contribute to humor,” they say.

In this way, they altered each of the 3,000 funny images in five different ways to create a database of 15,000 unfunny counterparts of funny images.

With this database under their belt, Chandrasekaran and co began the task of training a machine-learning algorithm to spot the difference between funny and unfunny images (holding back 20 percent of the database to test it on later).
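Holding back part of a database for testing is a standard step, and a minimal sketch of it might look like the following. The `split_holdout` helper, the fixed seed, and the stand-in list of scenes are hypothetical; the article does not say how the team actually drew its 20 percent.

```python
import random

def split_holdout(items, test_fraction=0.2, seed=0):
    """Shuffle a copy of items and hold back test_fraction of them for testing."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (train, test)

# Hypothetical stand-in for a database: (scene_id, is_funny) pairs.
scenes = [(i, i % 2 == 0) for i in range(6400)]
train, test = split_holdout(scenes)
print(len(train), len(test))  # 5120 1280
```

The algorithm never sees the held-back scenes during training, so its accuracy on them is a fairer measure of how it handles new images.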

They gave the machine two tasks. The first was to predict the funniness of a scene and the second was to alter the funniness of a scene by replacing an object within it.

The results make for interesting reading. In general, the algorithm performs quite well in predicting the funniness of a scene, certainly better than a random guess.

The task of altering the funniness of a scene consists of two parts. The first is to recognize the elements of the scene that contribute to the humor and the second is to choose a replacement object that reduces the funniness.

In the first part, the algorithm makes some interesting progress. “We observe that the model learns that, in general, animate objects like humans and animals are more likely sources of humor compared to inanimate objects and therefore tends to replace these objects,” say Chandrasekaran and co.

The algorithm makes progress in the second part too. “It eliminates humor in most scenes by choosing to replace objects contributing to humor with other objects that blend into the background well,” says the team. For example, the algorithm might replace the unusual object in an indoor scene with a potted plant that fits in well, or with a butterfly in an outdoor scene.

And the techniques work well. “In human evaluations, scenes made unfunny by our [algorithm] were found to be less funny than the original funny scene 95 percent of the time,” say Chandrasekaran and co.

The machine did less well in making scenes funnier but that’s clearly something to work on in future.

Of course, an important question is what exactly the machine is learning to do. In this work, funniness may be a proxy for something else entirely. Indeed, if Chandrasekaran and co’s paper were rewritten with every instance of the word “funniness” replaced with the word “oddness” or “incongruity” or “unexpectedness,” the results would be no less valid.

Nevertheless, the team has an interesting approach that could lead to some fascinating applications. The ability to judge the humor in a scene could help researchers develop better photo editing tools, tools that choose funny pictures to post on social media, or even smart cameras that can pick better moments to take humorous pictures.

It is also part of a new field of computational humor—using machine intelligence to make us laugh. Perhaps one day, machines may even share the joke.

Ref: arxiv.org/abs/1512.04407: We Are Humor Beings: Understanding and Predicting Visual Humor
