Memo to Washington: Save the Data.

The National Archives’ slow pace in preserving digital federal records, which exist in 16,000 different formats, could lead to serious data losses.
July 1, 2005

If you wander along the National Mall in Washington, DC, you can pop into the marble rotunda of the National Archives for a glimpse of the original Declaration of Independence, Constitution, and Bill of Rights. These calfskin parchments are preserved under glass, bathed in argon gas. But no such care is extended to digital federal records. The government is presumed to have used (or received) data in every format ever crafted by the computer industry – some 16,000 formats at last count – and has stored this data on every kind of hardware. But the fast-changing computer industry never stopped to think about long-term preservation, which means records of contemporary history are fast becoming obsolete – and there’s no existing system to permanently and reliably archive them.

That’s beginning to change, as our story “The Fading Memory of the State” reports. The U.S. National Archives and Records Administration (NARA) is in the early stages of developing an Electronic Records Archives that will harmonize and preserve all these digital records and make them available online, thereby saving the nation’s contemporary history from destruction. Solving the problem will in some ways test the limits of computer-science research: NARA must not only preserve every data format ever dreamt up but also contend with a volume of material that far exceeds that of even the largest private enterprise. What’s more, it has a responsibility to save all this data for the uniquely long (if ill-defined) time period NARA calls “the life of the Republic.”

Like any other federal agency, NARA is saddled with a cumbersome procurement process. It has hired two major contractors – Lockheed Martin and Harris Corporation – to generate competing preliminary designs, which are scheduled to be unveiled next month. Common sense suggests that the project will need close and continuing scrutiny from the U.S. Congress – and from the National Academies panel of industry and academic experts that has been advising NARA. The goal: to ensure that the resulting digital archive is not a rigid, custom-built system doomed to obsolescence but rather a flexible system grounded as much as possible in commercial offerings and able to evolve with the IT industry. As a good start, NARA could model its digital archive after early versions of digital archives already built by some nations and academic institutions, including MIT.

Clearly, the general problem of digital-record decay needs more attention than it’s currently getting in Washington. Yes, $136 million has been budgeted to date for NARA’s digital-archive project, but not enough has been done to actually force federal departments to harmonize how they store digital records. And some shortsighted cuts can be found in the administration’s proposed 2006 budget – specifically, eliminating $10 million in funding to the National Historical Publications and Records Commission (NHPRC), a grant agency within NARA that supports research in digital archiving and curation. That cut would effectively kill the NHPRC. Yet this is just the kind of small, nimble program that can help find ways to stanch the bleeding of the contemporary historical record. Congress should take the problem more seriously.
