But others are more cautious about the leap Google’s partner libraries are taking. Brewster Kahle, who is often described as an inspiring visionary and sometimes as an impractical idealist, founded the nonprofit Internet Archive in 1996 under the motto “universal access to human knowledge.” Since then, the archive has preserved more than a petabyte’s worth of Web pages (a petabyte is a million gigabytes), along with 60,000 digital texts, 21,000 live concert recordings, and 24,000 video files, from feature films to news broadcasts. It’s all free for the taking at www.archive.org, and as you might guess, Kahle argues that all digital library materials should be as freely and openly accessible as physical library materials are now.
That’s not such a radical idea; free and open access is exactly what public libraries, as storehouses of printed books and periodicals, have traditionally provided. But the very fact that digital files are so much easier to share than physical books (which scares publishers just as MP3 file sharing scares record companies) could lead to limits on redistribution that prevent libraries from giving patrons as much access to their digital collections as they would like. “Google has brought us to a tipping point that could define how access to the world’s literature may proceed,” Kahle says.
In Kahle’s view, every previous digitization effort has followed one of three paths; with a bit of oratorical flourish, he calls them Door One, Door Two, and Door Three. (Kahle acknowledges up front that his picture is simplified, and that these aren’t necessarily the only paths open to libraries today.)
Door One, says Kahle, is epitomized by Corbis, an image-licensing firm owned by Microsoft cofounder Bill Gates. Since the early 1990s, Corbis has acquired rights to digital reproductions of works from the National Gallery in London, the State Hermitage Museum in St. Petersburg, Russia, the Philadelphia Museum of Art, and more than 15 other museums. In some cases, it’s now impossible to use these images without paying Corbis. “This organization got its start by digitizing what was in the public domain and essentially putting it under private control,” says Kahle. “The same thing could happen with digital literature. In fact, it’s the default case.”
Behind Door Two, parallel public and private databases coexist peacefully. Here Kahle cites the Human Genome Project, which culminated in two versions of the DNA sequence of the human genome – a free version produced by government-funded scientists and a private version produced by Rockville, MD–based Celera Genomics and used by pharmaceutical companies to identify new drug candidates. The model has worked well in genomics, and Google seems to be setting out on a similar path, as it keeps one copy of each library’s collection for itself and gives away the other. Kahle worries, however, that the restrictions Google imposes on libraries will prevent them from working with other companies or organizations to disseminate digital texts. Libraries might be barred, for example, from contributing material to projects such as the Internet Archive’s Bookmobile, a van with satellite Internet access that can download and print any of 20,000 public-domain books.
Door Three, Kahle’s favorite, hinges on new partnerships in which private companies offer commercial access to digital books while public entities, such as libraries, are allowed to provide free access for research and scholarship. Here his main example is the Internet Archive’s collaboration with Alexa, a company founded by Kahle himself in 1996 and sold to Amazon in 1999. Alexa ranks websites according to the traffic they attract, and its servers, like Google’s, constantly crawl the Internet, making copies of each page they find. But after six months, Alexa donates those copies to the Internet Archive, which preserves them for noncommercial use. “Jeff [Bezos, Amazon’s CEO] was okay with the idea that there are some things you can exploit for commercial purposes for a certain amount of time, and then you play the open game,” says Kahle. “Libraries and publishing have always existed in the physical world without damaging each other; in fact they support each other. What we would like to see is this tradition not die with this digital transformation.”
So which alternative comes closest to Google’s plans? Google is no Corbis, says Wojcicki, but is nonetheless limited in what it can share. “Door One was never our intention, nor is it even practical,” she says. “And we can’t do Door Three, because we’re not the rights holders for much of this material. So Door Two is probably where we’re headed. We’re trying to be as open as possible, but we need to hold to our agreements with different parties.”
Precisely to avoid questions about copyright, Oxford librarians have decided that only 19th- and early 20th-century books will be handed over to Google for digitization. “Some of the other libraries, including Harvard, have agreed to have some in-copyright material digitized,” says Ronald Milne, acting director of the Bodleian Library. “They are quite brave in taking it on. But we didn’t particularly want to go there, because it’s such a hassle, and we didn’t want to get on the wrong side of the book laws.”
At the same time, though, the American Library Association is one of the loudest advocates of proposed legislation to reinforce the “fair use” provisions of federal copyright law, which entitle the public to republish portions of copyrighted works for purposes of commentary or criticism. And two of Google’s partner universities – Harvard and Stanford – are also supporters of the Chilling Effects Clearinghouse, a website that monitors allegations of copyright infringement brought against webmasters, bloggers, and other online publishers under the controversial Digital Millennium Copyright Act (DMCA) of 1998. Mass digitization may eventually force a redefinition of fair use, some librarians believe. The more public-domain literature that appears on the Web through Google Print, the greater the likelihood that citizens will demand an equitable but low-cost way to view the much larger mass of copyrighted books. “I think this will be another piece of good pressure, another factor in the whole debate over the DMCA,” says Wilkin.