Digitize This

  • October 20, 2005
  • By Wade Roush

WR: What are the Alliance's goals, and how will this program differ from other efforts -- notably Google's -- to digitize large amounts of non-digital content?

DM: Over the time we were discussing forming the alliance, Google did launch their program, and we looked at their program for ideas about what they were doing and things we might want to do differently. We do want to have copyrighted works available through the Open Content Alliance -- but only with the express permission of the copyright holder.

Secondly, we mainly want the alliance to focus on this theme of openness. One of the things we've seen with other [digitization] programs is they tend to use proprietary technologies to host the content, so it's impossible for third-party search engines to crawl it. So we're using XML and PDF and making the content easily crawlable by search engines. It was important to make this project open so that entities that contribute know they're not just benefiting one search engine.

WR: So you feel you learned directly from the reaction Google's project provoked from the publishing community.

DM: When the topic of making copyrighted works available came up, we always assumed we would need to get permission from copyright holders. We were surprised to see that in other programs, the copyrighted work would be made available without permission.

WR: Okay. But in Google's defense, once you decide you're going to seek express permission to digitize every work, that drastically limits the amount of content that will be available online, doesn't it? Basically, we're talking about everything published after 1923.

DM: There are great gains to be made just by digitizing public-domain works. Edgar Allan Poe and Henry James can be made available in their entirety online. In addition, we've been very excited about the response we've received from both the major publishing associations and the publishers themselves. Many are showing interest in working with us on this program. While there will be agreements that will need to be ironed out, we're confident that we will be able to get a lot of copyrighted work online.

WR: That was going to be my next question: Do you think the Open Content Alliance's approach will be more palatable to publishers?

DM: When it comes to these digitization efforts, the publishers have primarily been speaking through the publishers' associations rather than individually, because they're concerned about any kind of retribution that could come from search engines if they're critical of any particular effort. But what we have heard from the publishers' associations is that they're very happy about the approach we're taking. The Association of Learned and Professional Society Publishers, for instance, has been very positive about our program, because of the fact we are working with the copyright holders in advance.
