Net Worth

Efforts to preserve the Web should make use of the powerful, distributed collaboration it allows.

The challenge of collecting and preserving the Web, or even a representative sample of it, is a daunting one (see “Fire in the Library”). It is not enough to simply capture the information a website contained, be that text, images, or video. We must preserve something of the experience and activity a site supported. How a site was accessed, who linked to it, and how that changed over time provide important context for critical events such as the recent tsunami in Japan or the events of 9/11, both of which already feel distant given the speed at which the Web evolves and leaves data behind. No single institution can preserve all of that alone. It will take the commitment of a critical mass of government institutions, companies, nonprofits, and more to ensure the longevity of our digital heritage, nationally and globally.

Current notions of what the Web represents socially, culturally, politically, economically, legally, and even scientifically vary depending on where you happen to live in the world. The value systems to which you subscribe shape what you see in the Web. This is an advantage when thinking of how to preserve the diversity of experience online. Unfortunately, many factors work against the cross-cultural collaboration needed to preserve the Web’s diversity at scale. Local legislation can hinder attempts to share information; companies can fear negative commercial consequences from providing access to their data; and limited budgets constrain the few organizations, such as the Internet Archive, that are dedicated to preserving the Web.

In a perfect world, this would not be the case. Individuals, governments, universities, libraries, and corporations would all work to preserve the world’s most vibrant cultural medium. Imagine for a moment an approach to preservation that builds on the fundamental strengths of the Internet itself: distributed, ubiquitous, relatively inexpensive, not easily quelled or manipulated by any single actor. “Netizens” from around the globe would work to build a unified Web archive spanning cultural, political, and commercial boundaries. Subject-matter experts would ensure that their spheres were adequately represented; others would confirm that a representative sample across all domains was being collected.

The result would not be a single resource but, rather, a distributed collection of them. We would need the equivalent of search engines for this Web of the past, and new tools to mine, graph, and study it.

Making this happen would require a global willingness to exchange data for long-term preservation. Is this too far-out to imagine? Perhaps. But such coöperation is appearing within international research communities and cultural groups in both Europe and the United States. This work creates a foundation we can build upon. Only by encouraging this type of collaboration among like-minded communities can we hope to preserve any significant slice of the Web. The future does not afford anyone the luxury of the unlimited time, funds, computing power, and storage capacity that would be needed to do it alone.

Kris Carpenter Negulescu is director of Web archiving at the Internet Archive, a nonprofit Internet library that preserves digital content.
