Digitize This

Yahoo hopes to trump Google with its Open Content Alliance publishing venture.

  • October 20, 2005
  • By Wade Roush

Google shook up the worlds of publishing and library science last year when it announced it would digitize millions of books from several of the world's greatest libraries -- including Oxford's Bodleian Library and the New York Public Library -- and make their contents searchable on the Web (see "The Infinite Library").

Many librarians applauded Google's move and predicted it would jump-start a broader effort to ensure universal electronic access to human knowledge. But publishers weren't as pleased -- particularly because Google said it would not seek permission to scan and index books still covered by copyright.

Now a group led by one of Google's main rivals, Yahoo, is trying a more collective approach to digitization. On October 4, Yahoo and ten partner organizations announced the formation of the Open Content Alliance, which plans to build a free, permanent online repository for a wide range of print and multimedia content, including both copyrighted works and those that have passed into the public domain.

Yahoo's partners in the alliance are Adobe Systems, the European Archive, Hewlett-Packard Labs, the Internet Archive, the National Archives of the United Kingdom, O'Reilly Media, the Prelinger Archives, the University of California, and the University of Toronto.

In contrast to Google's approach, which requires publishers to "opt out" if they don't want their works to be included, the alliance will disseminate copyrighted works only after their publishers have explicitly opted into the program, according to David Mandelbrot, Yahoo's vice president for search technology.

Mandelbrot says the alliance will encourage other entities, including Google, to contribute to the repository, and will create a set of standards for digitization intended to make it easier to pool the products of various digitization efforts and to make them searchable from any search engine. Technology Review's executive Web editor, Wade Roush, recently interviewed Mandelbrot about Yahoo's approach to digitizing the world's literature.

Wade Roush: How did the Open Content Alliance come about?

David Mandelbrot: In March of last year we launched our effort to partner with content rights holders. We wanted to move beyond what we could provide just by crawling the Web and improve the quality of Yahoo search. Soon after, we connected with the folks at the Internet Archive, who are doing great work with digitizing works. They were hosting a lot of great content and we wanted to integrate that into our search engine.

As we started that discussion, Brewster [Kahle, the founder of the Internet Archive] became focused on what we can do together to digitize content. They've developed a great scanning technology and a really good way to digitize works of literature, but they were looking for partners to help them get their message out there and get funding flowing. From those discussions, we decided to form this Open Content Alliance.
