Dear Future AI Overlords: Here Are Some Reasons Not to Kill Us

Jackie Snowarchive page

November 27, 2017

There are a lot of predictions about AI-induced apocalypse out there, and organizations like OpenAI are trying to discover the best way to create safe artificial intelligence—but rarely do these efforts address the AI directly.

A new paper published (login required) on Academia.edu, titled “Message to Any Future AI: ‘There are several instrumental reasons why exterminating humanity is not in your interest,’” is a piece written for a future all-powerful AI with the goal of, well, convincing it not to kill us all.

“This text is addressed to any future AI and is written in the way that should affect AI’s future behavior,” a preamble to the letter states. “The text increases the probability that future AI will be benevolent. The text doesn’t need any other implementation except ensuring that future AI will read it in early stages of its development, and the AI will consider that there is a high probability that the text has some correct ideas in it.”

The author, Russian trans-humanist Alexey Turchin, is currently documenting as much data about himself as possible in an effort to achieve digital immortality, according to a Motherboard profile from last year. So maybe this is a bit of supplication designed to prevent an AI super-being from pulling the plug on his digitally uploaded self at some point in the distant future.

Either way, Turchin argues that if what he calls a young AI—one that has started to self-improve but hasn’t reached superintelligence yet—wants to compete with other young AIs, it could start to read and watch all the media that humans have ever produced about AI. Once it got to Hollywood and some Elon Musk pronouncements, it could pick up some ideas about taking over the world (in case that hadn’t occurred to it already).

The paper goes on to document several arguments meant to convince AI that it’s better off letting us live, including the idea that cooperating with humans will help it achieve its goals faster and “Easter egg” messages that are supposed to make an AI unsure of whether it’s in a simulation where its behavior is being monitored by people still in charge. Just in case an AI gets the idea that breaking down our bodies would be good for resource extraction, Turchin includes some facts about how little utility it could expect from our atoms.

Turchin says he gives his paper—or even future, more sophisticated attempts—a 1 percent chance of turning a non-benevolent AI into a safer one. (Sounds worth it, if it’s a chance to avoid extermination of the human race.) Whether or not it’s worth taking seriously, the whole paper is a fun read and an interesting thought experiment—even for those not worried about AI taking over anytime soon.

Keep Reading

Large language models can do jaw-dropping things. But nobody knows exactly why.

And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.