The Bodleian Library at the University of Oxford in England is the only place you are likely to find an Ethernet port that looks like a book. Built into the ancient bookcases dominating the oldest wing of the 402-year-old library, the brown plastic ports share shelf space with handwritten catalogues of the university’s medieval manuscripts and other materials. Some of the volumes are still chained to the shelves, a 17th-century innovation designed to discourage borrowing. But thanks to the Ethernet ports and the university’s effort to digitize irreplaceable books like the catalogues – which often contain the only clue to locating an obscure book or manuscript elsewhere in the vast library – users of the Bodleian don’t even need to take the books off the shelves. They can simply plug in their laptops, connect to the Internet, and view the pertinent pages online. In fact, anyone with a Web browser can read the catalogues, a privilege once restricted to those fortunate enough to be teaching or studying at Oxford.
The digitization of the world’s enormous store of library books – an effort dating to the early 1990s in the United Kingdom, the United States, and elsewhere – has been a slow, expensive, and underfunded process. But last December librarians received a pleasant shock. Search-engine giant Google announced ambitious plans to expand its “Google Print” service by converting the full text of millions of library books into searchable Web pages. At the time of the announcement, Google had already signed up five partners, including the libraries at Oxford, Harvard, Stanford, and the University of Michigan, along with the New York Public Library. More are sure to follow.
Most librarians and archivists are ecstatic about the announcement, saying it will likely be remembered as the moment in history when society finally got serious about making knowledge ubiquitous. Brewster Kahle, founder of a nonprofit digital library known as the Internet Archive, calls Google’s move “huge…. It legitimizes the whole idea of doing large-volume digitization.”
But some of the same people, including Kahle, believe Google’s efforts and others like them will force libraries and librarians to reëxamine their core principles – including their commitment to spreading knowledge freely. Letting a for-profit organization like Google mediate access to library books, after all, could either open up long-hidden reserves of human wisdom or constitute the first step toward the privatization of the world’s literary heritage. “You’d think that if libraries are serious about providing access to high-quality material, the idea of somebody digitizing that stuff very quickly – well, what’s not to like?” says Abby Smith, director of programs for the Council on Library and Information Resources, a Washington, DC, nonprofit that helps libraries manage digital transformation. “But some librarians are very concerned about the terms of access and are very concerned that a commercial entity will have control over materials that libraries have collected.”
They’re also concerned about the book business itself. Publishers and authors count on strict copyright laws to prevent copying and reuse of their intellectual property until after they’ve recouped their investments. But libraries, which allow many readers to use the same book, have always enjoyed something of an exemption from copyright law. Now the mass digitization of library books threatens to make their content just as portable – or piracy-prone, depending on one’s point of view – as digital music. And that directly involves libraries in the clash between big media companies and those who would like all information to be free – or at least as cheap as possible.
Whatever happens, transforming millions more books into bits is sure to change the habits of library patrons. What, then, will become of libraries themselves? Once the knowledge now trapped on the printed page moves onto the Web, where people can retrieve it from their homes, offices, and dorm rooms, libraries could turn into lonely caverns inhabited mainly by preservationists. Checking out a library book could become as anachronistic as using a pay phone, visiting a travel agent to book a flight, or sending a handwritten letter by post.
Surprisingly, however, most backers of library digitization expect exactly the opposite effect. They point out that libraries in the United States are gaining users, despite the advent of the Web, and that libraries are being constructed or renovated at an unprecedented rate (architect Rem Koolhaas’s Seattle Central Library, for example, is the new jewel of that city’s downtown). And they predict that 21st-century citizens will head to their local libraries in even greater numbers, whether to use their free Internet terminals, consult reference specialists, or find physical copies of copyrighted books. (Under the Google model, only snippets from these books will be viewable on the Web, unless their authors and publishers agree otherwise.) And considering that the flood of new digital material will make the job of classifying, cataloguing, and guiding readers to the right texts even more demanding, librarians could become busier than ever.
“I chafe at the presumption that once you digitize, there is nothing left to do,” says Donald Waters, a former director of the Digital Library Federation who now oversees the Andrew W. Mellon Foundation’s extensive philanthropic investments in projects to enhance scholarly communication. “There is an enormous amount to do, and digitizing is just scratching the surface.”
Digitization itself, of course, is no small challenge. Scanning the pages of brittle old books at high speed without damaging them is a problem that’s still being addressed, as is the question of how to store and preserve their content once it’s in digital form. The Google initiative has also amplified a long-standing debate among librarians, authors, publishers, and technologists over how to guarantee the fullest possible access to digitized books, including those still under copyright (which, in the United States, means everything published after January 1, 1923). The stakes are high, both for Google and for the library community – and the technologies and business agreements being framed now could determine how people use libraries for decades to come.
“Industry has resources to invest that we don’t have anymore and never will have,” points out Gary Strong, university librarian at the University of California, Los Angeles, which has its own aggressive digitization programs. “And they’ve come to libraries because we have massive repositories of information. So we’re natural partners in this venture, and we all bring different skills to the table. But we’re redefining the table itself. Now that we’re defining new channels of access, how do we make sure all this information is usable?”