Those cool AI-generated images you’ve seen across the internet? There’s a good chance they are based on the works of Greg Rutkowski.
Rutkowski is a Polish digital artist who uses classical painting styles to create dreamy fantasy landscapes. He has made illustrations for games such as Sony’s Horizon Forbidden West, Ubisoft’s Anno, Dungeons & Dragons, and Magic: The Gathering. And he’s become a sudden hit in the new world of text-to-image AI generation.
His distinctive style is now one of the most commonly used prompts in the new open-source AI art generator Stable Diffusion, which was launched late last month. The tool, along with other popular image-generation AI models, allows anyone to create impressive images based on text prompts.
For example, type in “Wizard with sword and a glowing orb of magic fire fights a fierce dragon Greg Rutkowski,” and the system will produce something that looks not a million miles away from works in Rutkowski’s style.
But these open-source programs are built by scraping images from the internet, often without permission and proper attribution to artists. As a result, they are raising tricky questions about ethics and copyright. And artists like Rutkowski have had enough.
According to the website Lexica, which tracks over 10 million images and prompts generated by Stable Diffusion, Rutkowski’s name has been used as a prompt around 93,000 times. Some of the world’s most famous artists, such as Michelangelo, Pablo Picasso, and Leonardo da Vinci, each appear in around 2,000 prompts or fewer. Rutkowski’s name also features as a prompt thousands of times in the Discord of another text-to-image generator, Midjourney.
Rutkowski was initially surprised but thought it might be a good way to reach new audiences. Then he tried searching for his name to see if a piece he had worked on had been published. The online search brought back work that had his name attached to it but wasn’t his.
“It’s been just a month. What about in a year? I probably won’t be able to find my work out there because [the internet] will be flooded with AI art,” Rutkowski says. “That’s concerning.”
Stability.AI, the company that built Stable Diffusion, trained the model on the LAION-5B data set, which was compiled by the German nonprofit LAION. LAION put the data set together and narrowed it down by filtering out watermarked images and those that were not aesthetic, such as images of logos, says Andy Baio, a technologist and writer who downloaded and analyzed some of Stable Diffusion’s data. Baio analyzed 12 million of the 600 million images used to train the model and found that a large chunk of them come from third-party websites such as Pinterest and art shopping sites such as Fine Art America.
Many of Rutkowski’s artworks have been scraped from ArtStation, a website where lots of artists upload their online portfolios. There are several reasons for his popularity as an AI prompt.
First, his fantastical and ethereal style looks very cool. He is also prolific, and many of his illustrations are available online in high enough quality, so there are plenty of examples to choose from. An early text-to-image generator called Disco Diffusion offered Rutkowski as an example prompt.
Rutkowski has also added alt text in English when uploading his work online. These descriptions of the images are useful for people with visual impairments who use screen reader software, and they help search engines rank the images as well. This also makes them easy to scrape, and the AI model knows which images are relevant to prompts.
Stability.AI released the model into the wild for free and allows anyone to use it for commercial or noncommercial purposes, although Tom Mason, the chief technology officer of Stability.AI, says Stable Diffusion’s license agreement explicitly bans people from using the model or its derivatives in a way that breaks any laws or regulations. This places the onus on the users.
Some artists may have been harmed in the process
Other artists besides Rutkowski have been surprised by the apparent popularity of their work in text-to-image generators—and some are now fighting back. Karla Ortiz, an illustrator based in San Francisco who found her work in Stable Diffusion’s data set, has been raising awareness about the issues around AI art and copyright.
Artists say they risk losing income as people start using AI-generated images based on copyrighted material for commercial purposes. But it’s also a lot more personal, Ortiz says, arguing that because art is so closely linked to a person, it could raise data protection and privacy problems.
“There is a coalition growing within artist industries to figure out how to tackle or mitigate this,” says Ortiz. The group is in its early days of mobilization, which could involve pushing for new policies or regulation.
One suggestion is that AI models could be trained on images in the public domain, and AI companies could forge partnerships with museums and artists, Ortiz says.
“It’s not just artists … It’s photographers, models, actors and actresses, directors, cinematographers,” she says. “Any sort of visual professional is having to deal with this particular question right now.”
Currently artists don’t have the choice to opt in to the database or have their work removed. Carolyn Henderson, the manager for her artist husband, Steve Henderson, whose work was also in the database, said she had emailed Stability.AI to ask for her husband’s work to be removed, but the request was “neither acknowledged nor answered.”
“Open-source AI is a tremendous innovation, and we appreciate that there are open questions and differing legal opinions. We expect them to be resolved over time, as AI becomes more ubiquitous and different groups come to a consensus as to how to balance individual rights and essential AI/ML research,” says Stability.AI’s Mason. “We strive to find the balance between innovating and helping the community.”
Mason encourages any artists who don’t want their works in the data set to contact LAION, which is an independent entity from the startup. LAION did not immediately respond to a request for comment.
Berlin-based artists Holly Herndon and Mat Dryhurst are working on tools to help artists opt out of being in training data sets. They launched a site called Have I Been Trained, which lets artists search to see whether their work is among the 5.8 billion images in the data set that was used to train Stable Diffusion and Midjourney. Some online art communities, such as Newgrounds, are already taking a stand and have explicitly banned AI-generated images.
An industry group called the Content Authenticity Initiative, which includes the likes of Adobe, Nikon, and the New York Times, is developing an open standard that would create a sort of watermark on digital content to prove its authenticity. It could help fight disinformation as well as ensure that digital creators get proper attribution.
“It could also be a way in which creators or IP holders can assert ownership over media that belongs to them or synthesized media that’s been created with something that belongs to them,” says Nina Schick, an expert on deepfakes and synthetic media.
AI-generated art poses tricky legal questions. In the UK, where Stability.AI is based, scraping images from the internet without the artist’s consent to train an AI tool could be a copyright infringement, says Gill Dennis, a lawyer at the firm Pinsent Masons. Copyrighted works can be used to train an AI under “fair use,” but only for noncommercial purposes. While Stable Diffusion is free to use, Stability.AI also sells premium access to the model through a platform called DreamStudio.
The UK, which hopes to boost domestic AI development, wants to change laws to give AI developers greater access to copyrighted data. Under these changes, developers would be able to scrape works protected by copyright to train their AI systems for both commercial and noncommercial purposes.
While artists and other rights holders would not be able to opt out of this regime, they would be able to choose where they make their works available. The art community could end up moving into a pay-per-play or subscription model like the one used in the film and music industries.
“The risk, of course, is that rights holders simply refuse to make their works available, which would undermine the very reason for extending fair use in the AI development space in the first place,” says Dennis.
In the US, LinkedIn lost a case in an appeals court, which ruled last spring that scraping publicly available data from sources on the internet is not a violation of the Computer Fraud and Abuse Act. Google also won a case against authors who objected to the company’s scraping their copyrighted works for Google Books.
Rutkowski says he doesn’t blame people who use his name as a prompt. For them, “it’s a cool experiment,” he says. “But for me and many other artists, it’s starting to look like a threat to our careers.”