Machine Creativity Beats Some Modern Art
If machines can outperform humans at playing games and driving cars, can they also produce better art? A new kind of Turing test aims to find out.
Creativity is one of the great challenges for machine intelligence. There is no shortage of evidence that machines can match and even outperform humans in many areas of endeavor, such as face and object recognition, doodling, image synthesis, language translation, and a wide variety of games such as chess and Go. But when it comes to creativity, the machines lag well behind.
Not through lack of effort. For example, machines have learned to recognize artistic style, separate it from the content of an image, and then apply it to other images. That makes it possible to convert any photograph into the style of Van Gogh’s Starry Night, for instance. But while this and other work provides important insight into the nature of artistic style, it doesn’t count as creativity. So the challenge remains to find ways of exploiting machine intelligence for creative purposes.
Today, we get some insight into progress in this area thanks to the work of Ahmed Elgammal at the Art & AI Laboratory at Rutgers University in New Jersey, along with colleagues at Facebook’s AI labs and elsewhere.
These guys have trained a machine to generate images that are recognizably similar to human art and yet different in measurable ways. What’s more, they’ve carried out a kind of Turing test for creativity by asking humans whether they can tell the difference between human art and this machine-generated art. They also asked humans which kind of art they prefer, with results that are somewhat unexpected.
The approach is relatively straightforward. It relies on a machine called a generative adversarial network. This consists of two neural networks that together bootstrap the learning process.
One of these networks is a traditional machine-vision algorithm that learns to recognize images of a specific type. Elgammal and co trained it using the WikiArt database, which consists of over 80,000 paintings by more than 1,000 artists dating from the 15th century to the 20th century.
Each image is tagged with its artistic style. So the database contains over 13,000 Impressionist paintings, 2,000 Cubist paintings, over 1,000 early Renaissance paintings, and so on. The machine learns to recognize each of these styles.
The next stage is for another network to generate random images and show them to the trained network, which either recognizes them as representing a particular artistic style or rejects them. By producing lots of images, this second network learns what the first network recognizes as art by a process of trial and error. And after many iterations, it learns to produce images that match specific styles.
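The trial-and-error idea can be illustrated with a toy sketch. This is not the researchers' code and not a real generative adversarial network: the "trained network" here is a frozen, hypothetical scoring function (`critic_score`), an "image" is just a short list of numbers, and random search stands in for the gradient-based training that real networks use. All names and values are assumptions made for illustration.

```python
import random


def critic_score(image, known_style):
    """Stand-in for the trained network (hypothetical).

    Returns a higher score the closer `image` is to a style it recognizes.
    """
    return -sum((a - b) ** 2 for a, b in zip(image, known_style))


def train_generator(known_style, steps=2000, step_size=0.1, seed=0):
    """Start from a random image and keep only changes the critic scores higher."""
    rng = random.Random(seed)
    image = [rng.random() for _ in known_style]
    best = critic_score(image, known_style)
    for _ in range(steps):
        # Propose a slightly perturbed image (the "trial").
        candidate = [x + rng.uniform(-step_size, step_size) for x in image]
        score = critic_score(candidate, known_style)
        # Keep it only if the critic likes it more (learning from the "error").
        if score > best:
            image, best = candidate, score
    return image, best


# A made-up four-number "style" for the critic to recognize.
style = [0.2, 0.8, 0.5, 0.1]
final_image, final_score = train_generator(style)
```

After enough iterations, the generated list drifts toward what the critic recognizes, which is the sense in which the second network "learns to produce images that match specific styles."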
However, the team does not consider these images creative because they are simply copying known styles in art. By contrast, a human artist would push the boundaries by producing something new.
There are plenty of hypotheses from art historians and psychologists about the creative process that leads to new art. For example, a well-known idea is that new artistic work has to be firmly rooted in an artistic tradition. In other words, it has to be different, but not too different.
In particular, theorists say that art must stimulate the viewer in specific ways. “The most significant arousal-raising properties for aesthetics are novelty, surprisingness, complexity, ambiguity, and puzzlingness,” say Elgammal and co.
“Novelty refers to the degree a stimulus differs from what an observer has seen/experienced before. Surprisingness refers to the degree a stimulus disagrees with expectation. Surprisingness is not necessarily correlated with novelty, for example it can stem from lack of novelty. Unlike novelty and surprisingness, which rely on inter-stimulus comparisons of similarity and differences, complexity is an intra-stimulus property that increases as the number of independent elements in a stimulus grows. Ambiguity refers to the conflict between the semantic and syntactic information in a stimulus. Puzzlingness refers to the ambiguity due to multiple, potentially inconsistent, meanings.”
But whatever the effect, the level of arousal it generates must be moderate rather than extreme. “Too little arousal potential is considered boring, and too much activates the aversion system, which results in negative response,” say Elgammal and co.
That has important implications for the way their generative adversarial network, or agent, is set up. “The agent’s goal is to generate art with increased levels of arousal potential in a constrained way without activating the aversion system,” they say. “In other words, the agent tries to generate art that is novel, but not too novel.”
The researchers say they have found a way to make their generative adversarial network do this. Having learned to reproduce certain artistic styles, the machine is set up to produce images that fall within accepted limits of art as a whole but maximize the difference from known styles. “The agent tries to explore the creative space by deviating from the established style norms and thereby generates new art,” say Elgammal and co. They call this machine a creative adversarial network.
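One way to picture "maximizing the difference from known styles" is as a penalty on images that the style classifier can label confidently. The sketch below assumes that penalty is the cross-entropy between an even spread over all styles and the classifier's predicted style probabilities. The function names, the four-style example, and the way the two signals are combined are illustrative assumptions, not the team's implementation.

```python
import math


def style_ambiguity_loss(style_probs):
    """Cross-entropy between a uniform target and the predicted style distribution.

    Lowest when the classifier spreads its belief evenly across all styles.
    """
    k = len(style_probs)
    eps = 1e-12  # guard against log(0)
    return -sum((1.0 / k) * math.log(max(p, eps)) for p in style_probs)


def generator_score(art_prob, style_probs):
    """Higher is better: judged to be art, yet hard to pin to one style (assumed form)."""
    eps = 1e-12
    return math.log(max(art_prob, eps)) - style_ambiguity_loss(style_probs)


# Hypothetical classifier outputs over four styles.
confident = [0.97, 0.01, 0.01, 0.01]   # clearly one known style
ambiguous = [0.25, 0.25, 0.25, 0.25]   # fits no known style in particular

loss_confident = style_ambiguity_loss(confident)
loss_ambiguous = style_ambiguity_loss(ambiguous)
```

Under these assumptions, an image that is accepted as art with the same probability scores higher when its style is ambiguous than when it clearly copies one style, which is the "novel, but not too novel" trade-off in miniature.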
The acid test, of course, is how humans react to this machine-generated art. To find out, Elgammal and co showed a range of images—both human- and machine-generated—to human workers on Mechanical Turk, an online crowdsourcing service.
The human-generated images included those from the WikiArt database of Abstract Expressionism as well as a selection of images from a flagship contemporary-art fair held in Basel, Switzerland, in 2016. These images represent the pinnacle of modern art. The reason for choosing Abstract Expressionism was to largely remove images of people and objects that clearly help distinguish machine art from human art.
Some of the machine-generated images were produced by the creative adversarial network, but others were produced by the generative adversarial network that simply reproduces artistic styles it has learned.
The researchers asked each person to rate the images in various ways, such as how much they liked each one, how novel it seemed, and whether it was created by a human or a machine. They also asked whether participants could sense the artist’s intention, whether they could see a structure emerging in the picture, and whether the image inspired them.
The results make for interesting reading. Humans were pretty good at telling apart Abstract Expressionist images created by a human and those created by a machine. But in the case of the art from Basel, human viewers had a hard time telling the difference.
The humans also rated the images created by the creative adversarial network more highly than the human-generated art shown at Basel. They identified with it more closely and found it more inspiring.
It is tempting to interpret this as a damning indictment of the state of modern art and the level of creativity it engenders. But Elgammal and co avoid a knee-jerk reaction. “We leave open how to interpret the human subjects’ responses that ranked the CAN art better than the Art Basel samples in different aspects,” they say diplomatically.
The bigger question is whether the process that Elgammal and co have used to make their images can truly be thought of as creative. Another interpretation is that it is a purely algorithmic process that has learned to exploit humanity’s emotional vulnerabilities. If so, perhaps a future definition of art will have to include the stipulation that it must be created by a human.
Either way, this kind of work is set to push the boundaries of art and creativity just a little bit further.
Ref: arxiv.org/abs/1706.07068 : CAN: Creative Adversarial Networks Generating “Art” by Learning About Styles and Deviating from Style Norms