How Google Is Making Books Mobile

The company released mobile versions of more than 1.5 million public-domain books.

Erica Naone

February 6, 2009

Google is working on a better way to read books on mobile devices. Yesterday, the company launched mobile versions of more than 1.5 million public-domain books, available on iPhones and Android phones.

Google Book Search already supports a huge library of scanned books, which can be searched and read online if they’re in the public domain, or previewed if they’re still subject to copyright. But the company says that the existing versions of these books (images of scanned pages) didn’t work well on mobile devices.

I talked with Frances Haugen, product manager on Google Book Search, for some of the details. The mobile books project turns out to be a side effect of an effort to rework the way Google represents its library of scanned books. In the course of trying to restructure books to make searching them more efficient and accurate, Google’s engineers developed better algorithms for converting scanned text to HTML. Essentially, Haugen says, the technology allows Google to automatically understand the structure of the book (headers, paragraphs, and in-line illustrations) so that it can be reformatted for mobile devices.

But what particularly sets the system apart, she says, is its scale. The mobile versions of these books are full of small, clever touches. For example, the system loads up the next few pages of a book in advance, reducing lag time when the user turns the page (similar to the way a YouTube video buffers).
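To make the buffering idea concrete, here is a minimal sketch of look-ahead page loading. It is not Google's code: the class name `PrefetchingReader`, the `fetch_page` callback, and the `lookahead` parameter are all hypothetical, and the prefetch runs inline where a real mobile client would presumably do it in the background.

```python
from typing import Callable, Dict


class PrefetchingReader:
    """Illustrative look-ahead cache for book pages (hypothetical design)."""

    def __init__(self, fetch_page: Callable[[int], str],
                 total_pages: int, lookahead: int = 3) -> None:
        self.fetch_page = fetch_page      # assumed callback: page index -> page content
        self.total_pages = total_pages
        self.lookahead = lookahead        # how many upcoming pages to load early
        self.cache: Dict[int, str] = {}

    def _ensure_cached(self, page: int) -> None:
        # Fetch a page only if it has not been loaded already.
        if page not in self.cache:
            self.cache[page] = self.fetch_page(page)

    def open(self, page: int) -> str:
        """Return the requested page and load the next few pages ahead of time."""
        if not 0 <= page < self.total_pages:
            raise IndexError(f"page {page} is out of range")
        self._ensure_cached(page)
        # Load upcoming pages so the next page turn is served from the cache.
        last = min(page + self.lookahead, self.total_pages - 1)
        for upcoming in range(page + 1, last + 1):
            self._ensure_cached(upcoming)
        return self.cache[page]
```

With `lookahead=2`, opening page 0 also loads pages 1 and 2, so turning to page 1 costs only one new fetch (page 3) instead of a wait for page 1 itself.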

However, there are still plenty of hiccups in the process, and not all of the scanned text gets converted properly. But, says Haugen, this fits with Google’s philosophy of releasing features early and updating them along the way.

In the meantime, features are in place to make it easier to read through a difficult section. Anywhere the reader encounters an incomprehensible paragraph, he or she can tap that paragraph on the phone’s screen, replacing it with a scanned image. Google has also added a “garbage detection” feature that searches for problematic sections and replaces them with the scanned image without requiring a tap from the user.
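The article does not say how the “garbage detection” works, so the following is only a sketch of one plausible approach: score each converted paragraph by the share of tokens that look like ordinary words, and fall back to the scanned image when the score is low or the reader taps. The function names, the regular expression, and the 0.6 threshold are assumptions made for illustration.

```python
import re
from typing import Tuple

# Letters only, optionally joined by an apostrophe or hyphen (e.g. "they're", "in-line").
WORD_RE = re.compile(r"[A-Za-z]+(?:['’-][A-Za-z]+)*")
VOWEL_RE = re.compile(r"[aeiouyAEIOUY]")
PUNCTUATION = ".,;:!?\"'()[]“”‘’"


def plausible_word_ratio(paragraph: str) -> float:
    """Fraction of whitespace-separated tokens that look like ordinary words."""
    tokens = paragraph.split()
    if not tokens:
        return 0.0
    plausible = 0
    for token in tokens:
        core = token.strip(PUNCTUATION)   # drop leading and trailing punctuation
        if WORD_RE.fullmatch(core) and VOWEL_RE.search(core):
            plausible += 1
    return plausible / len(tokens)


def choose_rendering(text: str, scan_ref: str, user_tapped: bool = False,
                     threshold: float = 0.6) -> Tuple[str, str]:
    """Return ("text", text) or ("image", scan_ref) for one paragraph.

    The scanned image is shown when the reader taps the paragraph, or when
    the converted text scores below the (assumed) plausibility threshold.
    """
    if user_tapped or plausible_word_ratio(text) < threshold:
        return ("image", scan_ref)
    return ("text", text)
```

A heuristic like this is tuned for English prose and would misjudge tables, numerals, or other languages, which is one reason a manual tap remains useful alongside any automatic check.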

Interestingly, Haugen says that Google eventually plans to make copyrighted books available for mobile devices. Google Book Search already has deals with 20,000 publishing partners covering more than a million books, she says. Users can preview these books, and the system makes it easy to buy them. A similar system will be rolled out for the mobile versions after these public-domain books help Google work out the kinks, she says.
