Encapsulation: Digital Cryonics

Neither migration nor emulation, then, offers a satisfactory long-term way to wrest digital bits from what Shakespeare called “the wrackfull siege of batt’ring days.” The only real way to keep digital things alive for the duration, many believe, is to lift them out of this inexorable march of digital progress, but to leave signposts that will tell future generations how to reconstruct what has passed.

Consortia of libraries and archivists worldwide are working on a solution called encapsulation: a way to group digital objects together with descriptive “wrappers” containing instructions for decoding their bits in the future. A wrapper would include both a physical outer layer, similar to the jacket of a floppy disk, imprinted with human-readable text describing the encapsulated content and how to use it, and a digital inner layer containing the specifications for the software, operating system and hardware needed to read the object itself. A Microsoft Word document, for example, might be packaged with instructions for re-creating Word, Windows and perhaps even an emulated version of a Wintel PC. For text documents, at least, encapsulation seems likely to be a viable method for long-term preservation, especially once international standards bodies agree on a uniform system for building wrappers. But if the documents being preserved contain more than simple text, encapsulation seems less likely to succeed: there are simply too many new software releases, compression schemes and hardware formats each year to describe all of them through encapsulation.
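To make the wrapper idea concrete, here is a minimal Python sketch of a digital inner layer bundled with the object it describes. It is an illustration only, not any consortium's actual format: the `Wrapper` class, its field names, the one-line JSON header and the SHA-256 checksum are all assumptions made for the example.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import Tuple


@dataclass
class Wrapper:
    """Hypothetical wrapper; field names are illustrative, not from any standard."""
    outer_label: str    # human-readable description of the content and how to use it
    software_spec: str  # application needed to read the object
    os_spec: str        # operating system that application runs on
    hardware_spec: str  # hardware that operating system expects
    sha256: str = ""    # checksum of the payload (an addition for this sketch)


def encapsulate(payload: bytes, wrapper: Wrapper) -> bytes:
    """Bundle as: one line of ASCII JSON, a newline, then the untouched payload."""
    stamped = replace(wrapper, sha256=hashlib.sha256(payload).hexdigest())
    # json.dumps escapes newlines inside strings, so the header stays on one line.
    header = json.dumps(asdict(stamped), sort_keys=True).encode("ascii")
    return header + b"\n" + payload


def open_capsule(capsule: bytes) -> Tuple[Wrapper, bytes]:
    """Split the header from the payload and verify the payload is intact."""
    header, payload = capsule.split(b"\n", 1)
    wrapper = Wrapper(**json.loads(header.decode("ascii")))
    if hashlib.sha256(payload).hexdigest() != wrapper.sha256:
        raise ValueError("payload does not match the checksum in its wrapper")
    return wrapper, payload
```

The header is kept as plain text on purpose: a future reader who knows nothing about the payload can still read the decoding instructions before tackling the bits that follow.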

“The pagination is off even when you open a last-generation Word document,” observes Steve Gilheany, senior systems engineer at Archive Builders, a Manhattan Beach, CA-based records-management consulting group that has assisted the city of Los Angeles in its digital-document preservation. “Imagine then what happens when you try to open it in a hundred years or try to access a digital object more complicated than pages of text.”

Gilheany’s proposed solution is simpler, borrowing the concept behind that archetypal decryption key, the Rosetta stone. He recommends archiving critical files in at least three formats: The first would be a standard raster or bit-map format, where there is a one-to-one correspondence between how coordinates are stored and how they are displayed, without the kind of compression used today for large files like JPEG images. The second would be the file’s native format, whatever it happens to be, to simplify any future modifications. The third would be a “vector-based” format storing each letter, symbol or image in the form of a mathematical description of its shape on the page; Adobe Systems’ Portable Document Format is one example. In theory, each version could be used to decode the others. Gilheany has spent eight years assisting the Los Angeles city government in converting its original infrastructure documents into raster and PDF files, and in the absence of a better solution, most government agencies and others with critical archival needs are taking a similar approach.
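The “one-to-one correspondence” of an uncompressed raster can be sketched in a few lines of Python. The example assumes binary PGM, a simple grayscale bitmap format chosen here for illustration; the article does not say which raster format Gilheany uses, and the function names are hypothetical.

```python
def to_pgm(pixels):
    """Encode rows of 0-255 gray values as an uncompressed binary PGM (P5) file."""
    height = len(pixels)
    width = len(pixels[0])
    if any(len(row) != width for row in pixels):
        raise ValueError("all rows must have the same width")
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    # One byte per pixel, row after row: no compression, no reordering.
    body = bytes(value for row in pixels for value in row)
    return header + body


def pixel_at(pgm, x, y):
    """Read one pixel straight from the file using only its coordinates."""
    # The three header lines end at the third newline; the rest is raw pixels.
    _magic, dims, _maxval, body = pgm.split(b"\n", 3)
    width, _height = map(int, dims.split())
    return body[y * width + x]
```

Because the pixel at column x, row y always sits at offset y × width + x after the header, a future reader needs only the image dimensions to redraw the page, which is what makes such a file a plausible key for decoding its companions.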

Encapsulation and conversion, though, require foresight; as Smith notes, anything that isn’t expressly encapsulated or converted will surely disappear. These solutions also aren’t particularly long-lived, at least compared with things like stone hieroglyphs or even paper. “Some researchers predict very long lifetimes for some types of media,” says Raymond Lorie, a research fellow at IBM’s Almaden Research Center in San Jose, CA. “But if a medium is good for N years, what do we do for N-plus-one years? Whatever N is, the problem does not go away.”


