As a violent mob incited by President Donald Trump stormed the US Capitol on January 6, halting the procedure in Congress to formally certify Joe Biden as president-elect, a Redditor with the username Adam Lynch began a thread on the subreddit r/DataHoarder—a forum dedicated to saving data that might be erased or deleted. “Archiving videos before potential removal from various websites …” it began.
The thread included a link to upload files to Mega, a New Zealand–based cloud storage service. Within minutes, the thread was so inundated with Twitter links, Snapchat uploads, and other videos that Mega briefly shut the link down. Since it was reopened, the Reddit thread has received over 2,000 comments with detailed data from the incident.
Lynch (who asked to be identified only by username, citing death threats) is Canadian and was shocked to see the images from Washington. Having seen videos, posts, and livestreams get quickly taken down by both platforms and users afraid of repercussions in the aftermath of the Black Lives Matter protests last summer, Lynch felt an urgency to archive this new data as soon as possible: “I knew I had to start immediately.”
Livestreams were turned off by platforms and broadcast news networks during the attack on the Capitol, and companies like Facebook, YouTube, Twitch, and Twitter have since systematically removed posts that violated policies against violent or incendiary content. As Redditors send in content, Lynch has spent hours each day uploading it to Mega, as well as to offline hard drives for backup.
“If it weren’t for the [Reddit] thread, I am very confident a substantial part of this would not be kept,” Lynch says. But many others are also working to protect information before it disappears. An Instagram account, @homegrownterrorists, garnered about 242,000 followers, crowdsourcing efforts to identify members of the mob. (The account was briefly deactivated and cleared of posts; it was reactivated and started posting ordinary links to news articles on January 8. The account holder did not respond to a request for comment.) The journalism site Bellingcat, which specializes in investigations based on publicly available online material, invited the public to contribute to a publicly editable Google spreadsheet of links, and the Woke collective is protecting livestreams from being erased by publishing them on its own YouTube and Twitch accounts. Other firms, like European search engine Intelligence X, are also collecting and storing data.
These efforts are notable for their broad reach, says Gabriella Coleman, an anthropologist at McGill University who studies the politics and ethics of hacking. “Places like Reddit were really central in the past [for doxxing, or revealing people’s identifying information] and continue to be, because you get subreddits and threads where everybody is contributing to particular efforts,” Coleman says. “The difference now is that people share that information on Twitter, and once that person is identified, that information is far more visible. It used to just be [hacktivist group] Anonymous that did that.”
Coleman says that Anonymous’s efforts were once considered extreme, but with each passing protest, doxxing has become more mainstream. “Of course, you’ve also got groups like Bellingcat who are like amateur professionals when it comes to open-source intelligence formalized into an organization,” Coleman says. “But you’re continuing to see masses of people come together online [and doxx].”
That creates ethical quandaries. The data now being archived could haunt people in the photos for years to come, even if they later renounce their actions or pay criminal penalties for them. On r/DataHoarder, for instance, someone asked, “Do you think it’s ethical to preserve content that features someone who now wants the content to no longer be public?”
I asked Lynch whether it was hypocritical for someone working to expose members of the mob to ask a reporter for anonymity.
“I believe people have the right to protest and share their voice,” was the response. “If they [mob members] wanted to protect their identity, they could have easily worn a mask or not livestreamed. But they didn’t wear a ski mask—not even a covid mask.”
“I think certainly a lot of this is context dependent,” Coleman says. “If you are engaging in an activity that is meant to call attention to the activity itself and don’t take precautions to hide your identity, it’s understandable how there will be people who will take that information and make it public.”
Lynch, who plans to ultimately submit the data to the Library of Congress, believes this activity is preserving history, saying: “We can only hoard what the world gives us. We’re just librarians.”
Correction: An earlier version of this story incorrectly stated that Reddit moderators had shut down the thread on r/DataHoarder, rather than Mega shutting down the upload link.