Campaigns and elections have always been about data—underneath the empathetic promises to fix your problems and fight for your family, it’s a business of metrics. If a campaign is lucky, it will find its way through a wilderness of polling, voter attributes, demographics, turnout, impressions, gerrymandering, and ad buys to connect with voters in a way that moves or even inspires them. Obama, MAGA, AOC—all have had some of that special sauce. Still, campaigns that collect and use the numbers best win.
That’s been true for some time, of course. In 2017, Hillary Clinton lamented that the Democratic National Committee had supplied her team with out-of-date data. She blamed this in part for her loss to Donald Trump, whose campaign sat atop an impressive Republican data-crunching machine. (The DNC retorted that it wasn’t the data, but how it was used, that was inadequate.)
In 2020, campaigns have added new wrinkles to their tactics for gathering and manipulating data. Traditional polling is giving way to AI-powered predictive modeling; massive data exchanges, once considered questionably legal, allow campaigns, PACs, and other groups to coordinate their efforts. And who can forget microtargeting? Both campaigns seek to arm themselves with comprehensive views of each potential voter and are using algorithms to segment and target voters more specifically and strategically. Here is our guide to what’s new and improved, and what it means for you, the voter.
Voter data galore
Over the last few years, campaigns have been steadily adding to the vast amount of personal information they keep on voters. That’s partly a result of a practice called acquisition advertising, in which campaigns run direct response ads that seek to get either contact information or opinions straight from a person. As of May, both presidential campaigns were spending upwards of 80% of their ad budgets on direct response ads.
Campaign officials don’t like to talk about exactly how much data they keep—but most voter files probably have somewhere between 500 and 2,500 data points per person. (A voter file is an integral data set that consolidates state-level voter registration info. Learn more about them here.) Each ad, phone call, email, and click increases that number. Since the Democratic Data Exchange (or DDx) came online in June, it has aggregated over a billion data points, most of which DDx says is contact information.
Contrary to what one might think, though, many of these personal details come from people who’ve already made up their minds about the candidates. The Trump campaign’s app, for example, allows automatic Bluetooth pairing that can help identify a user’s location—something that has drawn scrutiny. (Bluetooth beacons have been found in Trump yard signs in the past.) This kind of surveillance isn’t considered the norm, but it makes sense. People who download a candidate’s app probably already support that candidate, and committed voters are the most likely to donate.
Data exchanges allow campaigns and PACs to share data, making outreach and messaging more efficient and comprehensive. Republicans have used Data Trust since 2013; it’s a one-stop shop that includes an exchange, voter data, and data hosting services. Democrats initially felt this was a violation of Federal Election Commission rules against cooperation between different types of political organizations, such as PACs, nonprofits, and the campaigns themselves. The American Democracy Legal Fund, a Democratic group, sued Data Trust and lost … so naturally Democrats spun up their own version. That’s the Democratic Data Exchange that went live in June.
The promise of data exchanges is to let all aligned organizations share data. According to a demo given to the New York Times, DDx can produce a dashboard that shows how comfortable each voter is with voting by mail, and this is shared among all liberal groups in the exchange. In previous years, local canvassing groups, state parties, and issue-oriented PACs might all have been spending money in parallel collecting that kind of information. On the Republican side, Data Trust has proved its worth many times over. For example, it gathered information on voters who cast their ballots early during the 2018 midterm elections. Campaigns stopped reaching out to those people, saving a reported $100 million.
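The early-voting example comes down to a simple filtering step: once a shared record says someone has already voted, every group can drop that person from its contact list. The sketch below is purely illustrative. The record layout, the `voter_id` field, and the function name are assumptions made for this example, not a description of how Data Trust or DDx actually store or exchange data.

```python
def remaining_contacts(outreach_list, already_voted_ids):
    """Return the voters still worth contacting.

    outreach_list: list of dicts, each with a hypothetical "voter_id" key.
    already_voted_ids: IDs reported (for example, by a shared exchange)
    as having already cast a ballot.
    """
    voted = set(already_voted_ids)  # set membership checks are fast
    return [v for v in outreach_list if v["voter_id"] not in voted]


# Hypothetical data for illustration only.
outreach = [
    {"voter_id": "A1", "name": "Voter One"},
    {"voter_id": "B2", "name": "Voter Two"},
    {"voter_id": "C3", "name": "Voter Three"},
]
early_voters = ["B2"]

still_to_contact = remaining_contacts(outreach, early_voters)
```

Every call, mailer, or ad that would have gone to a filtered-out voter is money a campaign can spend elsewhere, which is where the reported savings come from.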
In ancient Rome, slaves were trained to memorize the names of voters who might be persuaded to vote for their master, so that he could find and greet them personally. These days, the strategy behind personal targeting comes from computer models that can slice the electorate into highly specific groups. Messaging is honed using extensive A/B testing.
Social platforms vary in the kind of microtargeting they allow. Facebook lets campaigns target small groups and individuals. Through its “custom audience” feature, campaigns can upload a spreadsheet of users’ profiles and deploy their message with surgical precision. They can also leverage a tool called “look alike” that uses those custom lists to find profiles likely to respond in similar ways. (Here’s how you can opt out of that type of targeting.) Both presidential campaigns have been doing this, and a project out of New York University is tracking these types of advertisements. It shows, for example, that from July 30 to August 4, an ad splashed with the message “Our Recovery Will Be Made in America” appeared in the feeds of about 2,500 Facebook users in Wisconsin. Those users were selected specifically by profile name from a list uploaded by the Biden campaign. It’s nearly impossible to trace where this small list of names came from, though it was most likely purchased from a third party.
Other platforms are more restrictive. Google banned political microtargeting early this year, while Twitter has banned political ads from campaigns—though it allows ads from politically aligned advocacy groups.
Out with the polls, in with the AI models
You’ve probably heard: polls don’t work the way they used to. The 2016 presidential election touched off an industry crisis centered on the rise of “non-response bias,” a fancy way of saying that cell-phone users tend not to answer calls from numbers they don’t recognize (like pollsters’), and that people have grown increasingly coy when asked about their political views.
In response, campaigns are turning to machine learning and AI to predict how voters will behave. Instead of relying on intermittent benchmarking of the populace, models are now run using continuously updated data sets. The most common technique campaigns use is called scoring, where each voter in a group is assigned a number from 1 to 100 based on how likely they are to do something or hold a certain opinion. Campaigns use those likelihoods to inform their strategy, either by attempting to persuade undecided voters or by leveraging strongly held opinions for money or mobilization.
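To make scoring concrete, here is a minimal sketch of how a 1-to-100 support score could be computed and then used to sort voters into outreach groups. Everything in it is assumed for illustration: the feature names, the weights, the logistic formula, and the cutoffs are hypothetical, and real campaign models are trained on far more data and are not public.

```python
import math

# Hypothetical weights; a real model would learn these from data.
WEIGHTS = {
    "voted_last_election": 1.2,
    "attended_rally": 0.9,
    "opened_campaign_email": 0.4,
    "donated_before": 0.7,
}
BIAS = -1.5  # hypothetical baseline for a voter with no signals


def support_score(voter):
    """Map a voter's 0/1 attributes to a score from 1 to 100."""
    z = BIAS + sum(w * voter.get(name, 0) for name, w in WEIGHTS.items())
    probability = 1 / (1 + math.exp(-z))  # logistic function
    return max(1, min(100, round(probability * 100)))


def outreach_group(score):
    """Pick a strategy from the score (cutoffs are illustrative)."""
    if score >= 70:
        return "mobilize or fundraise"  # strongly held opinion
    if score > 30:
        return "persuade"  # plausibly undecided
    return "skip"


committed = {name: 1 for name in WEIGHTS}  # every signal present
unknown = {}  # no signals on file

committed_score = support_score(committed)
unknown_score = support_score(unknown)
```

In this toy version, a voter with every signal lands in the mobilize-or-fundraise group and a voter with none is skipped; the voters in between are the ones a campaign would try to persuade.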
The models aren’t perfect. In 2016, they predicted Clinton’s win with a margin of error similar to that assumed in the polls. But models have an easier time overcoming some of the problems with polling, and the more data the models ingest, the more accurate they are.
The result: No shared truths
As collective messaging fades in importance, it becomes harder to police the myriad tailored messages political groups are churning out and putting in front of voters. Personalized messaging means that each person’s view of a campaign differs, because each is taking in a different information stream. Embellishment, distortion, and outright lying become that much easier, especially for public figures, whose posts on social platforms often get special treatment. The technologies being fervently employed right now are enabling a reality in which campaigns can manufacture cleavages in the public, fundamentally altering how we form opinions and, ultimately, vote.
All is not lost. Though the 2020 election cycle is in its final stretches, public pressure to redirect these technologies is increasing. In a newly published study, the Pew Research Center showed that 54% of the American public doesn’t think social-media platforms should allow any political advertisements, while 77% of Americans believe data collected on social platforms shouldn’t be used for political targeting.
There are several bills in Congress that reflect this sentiment, like the bipartisan Designing Accounting Safeguards to Help Broaden Oversight and Regulations on Data Act and the Banning Microtargeted Political Ads Act. These bills are due to be addressed in 2021, and experts think some form of regulation is likely, regardless of who wins the White House.