A Cambridge Analytica-style scandal for AI is coming

Can you imagine a car company putting a new vehicle on the market without built-in safety features?

Melissa Heikkilä

April 25, 2023

This story originally appeared in The Algorithm, our weekly newsletter on AI.

Can you imagine a car company putting a new vehicle on the market without built-in safety features? Unlikely, isn’t it? But what AI companies are doing is a bit like releasing race cars without seatbelts or fully working brakes, and figuring things out as they go.

This approach is now getting them in trouble. For example, OpenAI is facing investigations by European and Canadian data protection authorities for the way it collects personal data and uses it in its popular chatbot ChatGPT. Italy has temporarily banned ChatGPT, and OpenAI has until the end of this week to comply with Europe’s strict data protection regime, the GDPR. But in my story last week, experts told me it will likely be impossible for the company to comply, because of the way data for AI is collected: by hoovering up content off the internet.

The breathless pace of development means data protection regulators need to be prepared for another scandal like Cambridge Analytica, says Wojciech Wiewiórowski, the EU’s data watchdog.

Wiewiórowski is the European data protection supervisor, and he is a powerful figure. His role is to hold the EU accountable for its own data protection practices, monitor the cutting edge of technology, and help coordinate enforcement around the union. I spoke with him about the lessons we should learn from the past decade in tech, and what Americans need to understand about the EU’s data protection philosophy. Here’s what he had to say.

What tech companies should learn: That products should have privacy features designed into them from the beginning. However, “it’s not easy to convince the companies that they should take on privacy-by-design models when they have to deliver very fast,” he says. Cambridge Analytica remains the best lesson in what can happen if companies cut corners when it comes to data protection, says Wiewiórowski. The company, which became one of Facebook’s biggest publicity scandals, had scraped the personal data of tens of millions of Americans from their Facebook accounts in an attempt to influence how they voted. It’s only a matter of time until we see another scandal, he adds.

What Americans need to understand about the EU’s data protection philosophy: “The European approach is connected with the purpose for which you use the data. So when you change the purpose for which the data is used, and especially if you do it against the information that you provide people with, you are in breach of law,” he says. Take Cambridge Analytica. The biggest legal breach was not that the company collected data, but that it claimed to be collecting data for scientific purposes and quizzes, and then used it for another purpose, mainly to create political profiles of people. Data protection authorities in Italy, which have temporarily banned ChatGPT there, make the same point. They claim that OpenAI illegally collected the data it wanted to use, and did not tell people how it intended to use it.

Does regulation stifle innovation? This is a common claim among technologists. Wiewiórowski says the real question we should be asking is: Are we really sure that we want to give companies unlimited access to our personal data? “I don’t think that the regulations … are really stopping innovation. They are trying to make it more civilized,” he says. The GDPR, after all, protects not only personal data but also trade and the free flow of data over borders.

Big Tech’s hell on Earth? Europe is not the only one playing hardball with tech. As I reported last week, the White House is mulling rules for AI accountability, and the Federal Trade Commission has even gone as far as demanding that companies delete their algorithms and any data that may have been collected and used illegally, as happened to Weight Watchers in 2022. Wiewiórowski says he is happy to see President Biden call on tech companies to take more responsibility for their products’ safety and finds it encouraging that US policy thinking is converging with European efforts to prevent AI risks and put companies on the hook for harms. “One of the big players on the tech market once said, ‘The definition of hell is European legislation with American enforcement,’” he says.

_____________________________________________________________________

DEEPER LEARNING

Learning to code isn’t enough

The past decade has seen a slew of nonprofit initiatives that aim to teach kids coding. This year North Carolina is considering making coding a high school graduation requirement. The state follows in the footsteps of five others with similar policies that consider coding and computer education fundamental to a well-rounded education: Nevada, South Carolina, Tennessee, Arkansas, and Nebraska. Advocates for such policies contend that they expand educational and economic opportunities for students.

No panacea: Initiatives aiming to get people to become more competent at tech have existed since the 1960s. But these programs, and many that followed, often benefited the populations with the most power in society. Then as now, just learning to code is neither a pathway to a stable financial future for people from economically precarious backgrounds nor a panacea for the inadequacies of the educational system. Read more from Joy Lisi Rankin.

_____________________________________________________________________

BITS AND BYTES

Inside the secret list of websites that make AI like ChatGPT sound smart

Essential reading for anyone interested in making AI more responsible. We have a very limited understanding of what goes into the vast data sets behind AI systems, but this story sheds light on where the data for AI comes from and what kinds of biases come with it. (The Washington Post)

Google Brain and DeepMind join forces

Alphabet has merged its two AI research units into one mega unit, now called Google DeepMind. The merger comes as Alphabet leadership is increasingly nervous about the prospect of competitors overtaking it in AI. DeepMind has been behind some of the most exciting AI breakthroughs of the past decade, and integrating its research deeper into Google products could help the company gain an advantage.

Google Bard can now be used to code

Google has rolled out a new feature that lets people use its chatbot Bard to generate, debug, and explain code, much like Microsoft’s GitHub Copilot.

Some glimpse AGI in ChatGPT. Others call it a mirage

Microsoft researchers caused a stir when they released a paper arguing that ChatGPT showed signs of artificial general intelligence. This is a nice writeup of the different ways researchers are trying to understand intelligence in machines, and how challenging it is. (Wired)
