
“It’s a valid approach,” says Bruce Maggs, a professor of computer science at Duke University in Durham, NC, and vice president of research at Akamai, a Web content delivery and caching company based in Cambridge, MA. Fully replicating a database at multiple sites, as search companies typically do now, is inefficient, Maggs says, since only a small proportion of data is accessed at each site. A distributed approach “also saves considerably on everything else in the same proportion, such as capital costs and real estate,” he says. This is because, overall, the number of servers required goes down.
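The savings Maggs describes can be put in back-of-the-envelope terms. The Python sketch below is purely illustrative: the server counts, the number of sites, and the fraction of the index each site actually needs for its own traffic are assumptions, not figures from the article.

```python
# Back-of-the-envelope comparison of full replication versus a
# partitioned ("distributed") index. All figures are illustrative assumptions.

TOTAL_INDEX_SERVERS = 10_000   # servers needed to hold one full copy of the index (assumption)
NUM_SITES = 5                  # number of geographically separate data centers (assumption)
LOCAL_FRACTION = 0.4           # share of the index a site actually needs to serve
                               # its own regional query traffic (assumption)

# Full replication: every site holds a complete copy of the index.
replicated_total = NUM_SITES * TOTAL_INDEX_SERVERS

# Distributed approach: each site holds only the slice of the index its
# local users query, and forwards the rest elsewhere.
distributed_total = NUM_SITES * int(TOTAL_INDEX_SERVERS * LOCAL_FRACTION)

print(f"Fully replicated: {replicated_total:,} servers")
print(f"Distributed:      {distributed_total:,} servers")
print(f"Reduction:        {1 - distributed_total / replicated_total:.0%}")
```

Because capital costs, power, and real estate scale roughly with the number of machines, the same reduction carries over to those costs, which is the proportionality Maggs is pointing to.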

For users, the advantage would be quicker search results, because most answers would come from a data center that’s geographically closer. A small number of results would take longer than normal, but only 20 to 30 percent longer, says Baeza-Yates. “On average, most queries will be faster,” he says.

Maggs says the performance improvement would need to be high enough to counteract any delay in those search queries that have to be sent further afield.
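Whether the trade-off pays off is a question of expected values. In the sketch below, only the 20-to-30-percent penalty for forwarded queries comes from Baeza-Yates; the share of queries answered locally and the speed-up for those local queries are illustrative assumptions.

```python
# Expected query latency under the distributed scheme, relative to a
# centralized baseline of 1.0. Only the forwarding penalty comes from the
# article; the other figures are assumptions for illustration.

BASELINE = 1.0           # latency of today's centralized setup (normalized)
LOCAL_SHARE = 0.85       # fraction of queries answered at the nearest site (assumption)
LOCAL_LATENCY = 0.70     # nearer data center, so faster than baseline (assumption)
REMOTE_LATENCY = 1.25    # queries sent farther afield: 20-30% slower, per Baeza-Yates

expected = LOCAL_SHARE * LOCAL_LATENCY + (1 - LOCAL_SHARE) * REMOTE_LATENCY
print(f"Expected relative latency: {expected:.2f} "
      f"({1 - expected / BASELINE:.0%} faster on average)")
```

On these assumed numbers the average query still comes back faster than under the centralized baseline; Maggs’s caveat is that the local speed-up and the local share have to be large enough to keep that average below 1.0.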

Another trade-off is that users would be more likely than they are now to get different results depending on where they are, says Peter Triantafillou, a researcher at the University of Patras in Greece who studies large-scale search. This already happens to some extent even under a centralized model, he says, but it could become a bigger concern if many more searches were inconsistent.
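The inconsistency Triantafillou describes follows from answering a query out of whichever slice of the index the nearest site happens to hold. A toy sketch, with made-up documents and relevance scores:

```python
# Two sites hold overlapping but different slices of the index, so the same
# query can surface different top hits. Documents and scores are invented.

SITE_EAST = {"doc1": 0.9, "doc2": 0.7, "doc3": 0.4}   # slice held at one site
SITE_WEST = {"doc2": 0.7, "doc4": 0.8, "doc5": 0.3}   # slice held at another

def top_hit(local_index):
    """Return the best-scoring document available in this site's slice."""
    return max(local_index, key=local_index.get)

print("User near the east site sees:", top_hit(SITE_EAST))   # doc1
print("User near the west site sees:", top_hit(SITE_WEST))   # doc4
```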

However, with search engine data centers already housing tens of thousands of servers, it’s questionable whether they can continue to grow and still function efficiently, Triantafillou says. “Will they be able to go to hundreds of thousands or millions?” he says. Just the practicality of installing the cabling and optics in and out of such facilities would pose serious problems, he says.

The distributed approach remains a long-term aim, Baeza-Yates admits. “But for the Internet,” he adds, “long-term is only about five years.”
