The official repository of retired U.S. government records is a boxy white building tucked into the woods of suburban College Park, MD. The National Archives and Records Administration (NARA) is a subdued place, with researchers quietly thumbing through boxes of old census, diplomatic, or military records, and occasionally requesting a copy of one of the computer tapes that fill racks on the climate-controlled upper floors. Researchers generally don’t come here to look for contemporary records, though. Those are increasingly digital, and still repose largely at the agencies that created them, or in temporary holding centers. It will take years, or decades, for them to reach NARA, which is charged with saving the retired records of the federal government (NARA preserves all White House records and around 2 percent of all other federal records; it also manages the libraries of 12 recent presidents). Unfortunately, NARA doesn’t have decades to come up with ways to preserve this data. Electronic records rot much faster than paper ones, and NARA must either figure out how to save them permanently, or allow the nation to lose its grip on history.
One clear morning earlier this year, I walked into a fourth-floor office overlooking the woods. I was there to ask Allen Weinstein – sworn in as the new Archivist of the United States in February – how NARA will deal with what some have called the pending “tsunami” of digital records. Weinstein is a former professor of history at Smith College and Georgetown University and the author of Perjury: The Hiss-Chambers Case (1978) and coauthor of The Story of America (2002). He is 67, and freely admits to limited technical knowledge. But a personal experience he related illustrates quite well the challenges he faces. In 1972, Weinstein was a young historian suing for the release of old FBI files. FBI director J. Edgar Hoover – who oversaw a vast machine of domestic espionage – saw a Washington Post story about his efforts, wrote a memo to an aide, attached the Post article and penned into the newspaper’s margin: “What do we know about Weinstein?” It was a telling note about the mind-set of the FBI director and of the federal bureaucracy of that era. And it was saved – Weinstein later found the clipping in his own FBI file.
But it’s doubtful such a record would be preserved today, because it would likely be “born digital” and follow a convoluted electronic path. A modern-day J. Edgar Hoover might first use a Web browser to read an online version of the Washington Post. He’d follow a link to the Weinstein story. Then he’d send an e-mail containing the link to a subordinate, with a text note: “What do we know about Weinstein?” The subordinate might do a Google search and other electronic searches of Weinstein’s life, then write and revise a memo in Microsoft Word 2003, and even create a multimedia PowerPoint presentation about his findings before sending both as attachments back to his boss.
Megabyte: 1,024 kilobytes. The length of a short novel or the storage available on an average floppy disk.
Gigabyte: 1,024 megabytes. Roughly 100 minutes of CD-quality stereo sound.
Terabyte: 1,024 gigabytes. Half of the content in an academic research library.
Petabyte: 1,024 terabytes. Half of the content in all U.S. academic research libraries.
Exabyte: 1,024 petabytes. Half of all the information generated in 1999.
What steps in this process can be easily documented and reliably preserved over decades with today’s technology? The short answer: none. “They’re all hard problems,” says Robert Chadduck, a research director and computer engineer at NARA. And they are symbolic of the challenge facing any organization that needs to retain electronic records for historical or business purposes.
Imagine losing all your tax records, your high school and college yearbooks, and your child’s baby pictures and videos. Now multiply such a loss across every federal agency storing terabytes of information, much of which must be preserved by law. That’s the disaster NARA is racing to prevent. It is confronting thousands of incompatible data formats cooked up by the computer industry over the past several decades, not to mention the limited lifespan of electronic storage media themselves. The most famous documents in NARA’s possession – the Declaration of Independence, the Constitution, and the Bill of Rights – were written on durable calfskin parchment and can safely recline for decades behind glass in a bath of argon gas. It will take a technological miracle to make digital data last that long.
But NARA has hired two contractors – Harris Corporation and Lockheed Martin – to attempt that miracle. The companies are scheduled to submit competing preliminary designs next month for a permanent Electronic Records Archives (ERA). According to NARA’s specifications, the system must ultimately be able to absorb any of the 16,000 software formats believed to be in use throughout the federal bureaucracy – and, at the same time, cope with any future changes in file-reading software and storage hardware. It must ensure that stored records are authentic, available online, and impervious to hacker or terrorist attack. While Congress has authorized $100 million and President Bush’s 2006 budget proposes another $36 million, the total price tag is unknown. NARA hopes to roll out the system in stages between 2007 and 2011. If all goes well, Weinstein says, the agency “will have achieved the start of a technological breakthrough equivalent in our field to major ‘crash programs’ of an earlier era – our Manhattan Project, if you will, or our moon shot.”