Sam Altman: This is what I learned from DALL-E 2

Three things the groundbreaking generative model taught OpenAI’s CEO.

Will Douglas Heaven

December 16, 2022

Sam Altman, OpenAI’s CEO, has been at the heart of the San Francisco–based firm since cofounding it with Elon Musk and others in 2015. His vision for the future of AI and how to get there has shaped not only what OpenAI does, but also the direction in which AI research is heading in general. OpenAI ushered in the era of large language models with its launch of GPT-3 in 2020. This year, with the release of its generative image-making model DALL-E 2, it has set the AI agenda again.

When it dropped back in April, DALL-E 2 set off an explosion of creativity and innovation that is still going. Other models soon followed: models that are better, or are free to use and adapt. But DALL-E 2 was where it began, the first “Wow” moment in a year that will leave a mark not only on AI but on mainstream society and culture for years to come. As Altman acknowledges, that impact is not all positive.

I spoke to Altman about what he’d learned from DALL-E 2. “I think there’s an important set of lessons for us about what the next decade’s going to be like for AI,” he says. (You can read my piece on generative AI’s long-term impact here.)

These extracts from our conversation have been edited for clarity and length.

Here is Sam Altman, in his own words, on:

1/ Why DALL-E 2 made such an impact

It crossed a threshold where it could produce photorealistic images. But even with non-photorealistic images, it seems to really understand concepts well enough to combine things in new ways, which feels like intelligence. That didn’t happen with DALL-E 1.

But I would say the tech community was more amazed by GPT-3 back in 2020 than DALL-E. GPT-3 was the first time you actually felt the intelligence of a system. It could do what a human did. I think it got people who previously didn’t believe in AGI [artificial general intelligence] at all to take it seriously. There was something happening there none of us predicted.

But images have an emotional power. The rest of the world was much more amazed by DALL-E than GPT-3.

2/ What lessons he learned from DALL-E 2’s success

I think there’s an important set of lessons for us about what the next decade’s going to be like for AI. The first is where it came from, which is a team of three people poking at an idea in, like, a random corner of the OpenAI building.

This one single idea about diffusion models, just a little breakthrough in algorithms, took us from making something that’s not very good to something that can have a huge impact on the world.

Another thing that’s interesting is that this was the first AI that everyone used, and there’s a few reasons why that is. But one is that it creates, like, full finished products. If you’re using Copilot, our code generation AI, it has to have a lot of help from you. But with DALL-E 2, you tell it what you want, and it’s like talking to a colleague who’s a graphic artist. And I think it’s the first time we’ve seen this with an AI.

3/ What DALL-E means for society

When we realized that DALL-E 2 was going to be a big thing, we wanted to have it be an example of how we’re going to deploy new technology—get the world to understand that images might be faked and be like, “Hey, you know, pretty quickly you’re going to need to not trust images on the internet.”

We also wanted to talk to people who are going to be most negatively impacted first, and have them get to use it. It’s not the current framework, but the world I would like us, as a field, to get to is one where if you are helping train an AI by providing data, you should somehow own part of that model.

But, look, it’s important to be transparent. This is going to impact the job market for illustrators. The amount one illustrator is able to do will go up by, like, a factor of 10 or 100. How that impacts the job market is very hard to say. We honestly don’t know. I can see it getting bigger just as easily as I can see it getting smaller. There will, of course, be new jobs with these tools. But there will also be a transition.

At the same time, there’s huge societal benefit, where everybody gets this new superpower. I’ve used DALL-E 2 for a lot of things. I’ve made art that I have up in my house. I did a remodel of my house, too, and I used it quite successfully for architectural ideas.

Some friends of mine are getting married. Every little part of their website has images generated by DALL-E, and they’re all meaningful to the couple. They never would have hired an illustrator to do that.

And finally, you know, we just wanted to use DALL-E 2 to educate the world that we are actually going to do it—we’re actually going to make powerful AI that understands the world like a human does, that can do useful things for you like a human can. We want to educate people about what’s coming so that we can participate in what will be a very hard societal conversation.
