Indeed, for every Domesday Project that has lost its data to proprietary equipment and file formats, it is easy to point to another project for which information created decades ago is still available. The Internet “Request for Comments” (RFC) series, started in 1969, is readable on practically every computer on the planet today because the RFCs were stored in plain ASCII text. Similarly, you can download images sent back from the Voyager space probes 30 years ago and view them on your PC because NASA stored those pictures as bitmaps: pixel-by-pixel copies of the images without any compression whatsoever.
Some argue that it’s impossible to look into the future and determine which of today’s formats will survive and which will go the way of the VP 415. Poppycock! As a society we have a very good understanding of what will make one file format endure while another one is likely to perish. The key to survival is openness and documentation.
It is simply inconceivable that documents created today in Adobe’s Portable Document Format (PDF), or images stored in the Joint Photographic Experts Group (JPEG) format, won’t be decipherable on computers in the year 2030. That’s because both the PDF and the JPEG formats are well-defined and widely understood. Adobe has lost control of PDF: there are more than a dozen programs that can create PDFs and display them on a wide range of computers. In other words, PDF is no longer a proprietary format. The same goes for JPEG. Yes, Adobe may fail and new 3D cameras may make two-dimensional photography obsolete. But we will always be able to read files in these formats, because the detailed technical knowledge of how to do so is widely distributed throughout society.
What about the physical media themselves? Although there are many examples of tapes and floppy disks being unreadable five or ten years after they were created, there are many counterexamples as well. Generally speaking, people who make an effort to preserve digital documents have no problem doing so.
Take, for example, the electrical standard (sometimes called IDE, now called ATA) that’s used by the disk drives in most PCs. Developed in the 1980s, the ATA interface has been significantly enhanced over the past 20 years. Yet with rare exceptions, you can take a hard disk drive from the late 1980s or early 1990s, plug it into a modern desktop computer, and read the files that the disk contains. That’s because the power cables, physical mounting brackets, data connectors, and even the electrical signals used by today’s computers are compatible with the old drives. What’s more, today’s PCs, Macs, and Linux boxes all can read DOS file systems created in the 1980s. If the disk spins, you can frequently get back the data.
Consumer optical storage media has evolved into an even more stable standard. Music CDs and CD-ROMs created in the 1980s are still readable on today’s DVD drives. When the next generation of optical storage comes out, it’s likely to be backwards compatible as well. A disk drive unable to read old CDs would not be commercially viable.
Electronic archivists do have a significant challenge facing them: computer systems make it easy to put a tremendous amount of information in a single place. If you aren’t careful, it’s easy to lose all of this information at once. And today’s computer systems are so tremendously reliable that fewer and fewer users are properly backing up their data; people just don’t remember the bad old days when a computer might fail at a moment’s notice.
But on the whole, I think that electronic records are far more stable, more durable, and more likely to last than their paper equivalents. The technical problems are largely solved. We know how to create David Stork’s Digital Lock Box. What’s needed now is a plan to make long-term electronic archival services available to the masses.