Mims's Bits

MapRejuice Is SETI@Home on Steroids

Millions of Web surfers have spare computer cycles--why not use browsers to tap them?

Christopher Mims 09/03/2010

Unless a Flash ad in one of your open browser tabs has gone rogue, it's likely you've got a few spare processor cycles available on the PC you're using to read this.

And if someone were to appropriate just a few of them -- too few for you to notice -- then you and all the millions of other web surfers out there could be harnessed together like some kind of gigantic insectoid computing hive mind.

The result would be one of the world's most powerful supercomputers.

That's the dream, anyway, of a project called MapRejuice. It's MapReduce - the programming model invented at Google that powers more or less all the massively parallel processing on which the web's biggest businesses rely - but implemented in JavaScript. And it runs in the background of any webpage on which it's installed.
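
For readers who haven't met MapReduce, here is a minimal sketch of the idea in Python rather than MapRejuice's JavaScript. It is not MapRejuice's code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `map_reduce`) and the word-count job are assumptions chosen for illustration, and the "workers" are plain sequential function calls.

```python
from collections import defaultdict

def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]

def shuffle(pairs):
    """Group emitted values by key so each key can be reduced independently."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single result."""
    return key, sum(values)

def map_reduce(chunks):
    # In a browser-based system each chunk might go to a different visitor;
    # in this sketch the chunks are simply processed one after another.
    emitted = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())

counts = map_reduce(["spare cycles", "spare browsers", "cycles cycles"])
```

Because each chunk is mapped independently, the map calls could in principle be handed to any number of machines, which is what makes the model a candidate for spare browser cycles.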

You could be running it right now without even knowing it.

Using the spare cycles of the world's internet-connected PCs is nothing new - SETI@Home has been doing just that in service of parsing radio signals from outer space since the late '90s. There are also dozens of projects that use BOINC, the system on which SETI@Home runs, to serve scientists who need cheap access to supercomputer-scale computing power for problems ranging from cryptography to protein folding.

But MapRejuice doesn't require special client software, which means pitching in on giant computational problems could be as simple as leaving the appropriate browser window open.

Right now MapRejuice, which remains primarily a proof of concept, isn't doing that much. The graph on its homepage shows, in real time, that it's processing about 300 jobs per minute.

Its ultimate capacity? Well, how many spare cycles do all the world's internet browsers have?

If you've got a site that serves users who wouldn't mind contributing a little processing power to the project, you can grab the line of code you'll need to drop into your site's HTML here. Or, if you've got a gigantic problem in need of a MapReduce cluster of nontrivial scale, sign up your project here. And if you just think it's a cool idea, you can vote for it (it's an entry in the node.js contest) here.

rsanchez1

213 Comments

  • 523 Days Ago
  • 09/03/2010

Maybe Good

This is a good idea, and it remains to be seen if it can beat SETI@home. On the plus side, it doesn't require you to download and install software like you have to do with SETI. All that is required is for a web developer to put the lines of code in their web pages to get a similar effect.

The downside I see to it is that most people don't spend too long on a single web page. SETI runs as a screensaver making use of the time when someone steps away from a computer. This is the optimum operation mode for MapRejuice as well, so web developers would need to ask visitors to the website to please leave the browser window open.

I read somewhere that the average user spends only 30 seconds on a webpage before moving on, so for this technology to be more useful it would have to be very widespread to cover a large portion of the user's browsing time. It's the same as asking web developers to load JavaScript libraries (jQuery, Prototype, etc.) off a CDN instead of their own server to save on load time, but many developers still don't do it.

ryanmcgrath

2 Comments

  • 522 Days Ago
  • 09/04/2010

Re: Maybe Good

You're right about this. We believe it can still be useful - look at how much time people waste on Farmville, for instance.

david.carew

1 Comment

  • 521 Days Ago
  • 09/05/2010

Re: Maybe Good

Is anyone bothered by the ethical/moral dimension of hiding this CPU power theft on the server side? For SETI@home, I approve and download a CPU parasite; for MapRejuice, simply touching particular URLs downloads this parasite without my knowledge and consent. If I own the computer and pay for the power, shouldn't I know of and approve of what uses my computer power is put to?

jhuckestein

4 Comments

  • 520 Days Ago
  • 09/06/2010

Re: Maybe Good

One way to solve that is to give users a toolbar to interrupt or stop the calculation. I doubt many users will do that if it helps finance the site they are currently enjoying.

colinnwn

88 Comments

  • 519 Days Ago
  • 09/07/2010

Re: Maybe Good

It is a concern. Unfortunately, all the Flash ads and bloated JavaScript of dubious merit on many pages also rob CPU cycles.

The power increase is so negligible, it isn't worth consideration.

Personally, I'd rather sites dump Flash ads for static ones, pare down their JavaScript, and do something worthwhile with my CPU cycles like this.

It would be nice if they provided a disable button out of courtesy. Otherwise I can set my browser to disable JavaScript, and the problem would be solved.

Mapou

356 Comments

  • 522 Days Ago
  • 09/04/2010

Latency Problem

This is a very interesting and useful technology. However, internet-linked computational networks are limited to just a handful of distributed applications, mostly the scientific or data analysis kind. If you need massively parallel, real-time computation, this technology won't be much use. For example, the latency would be way too high if your application must recognize faces or speech instantly or process very high resolution streaming video.

In my opinion, the vast majority of future parallel applications will fall into the real-time, super high bandwidth category that technologies like MapRejuice cannot handle.

jhuckestein

4 Comments

  • 522 Days Ago
  • 09/04/2010

Re: Latency Problem

Yeah, large-data problems are not well-suited for us at all. Most problems commonly associated with MapReduce (in particular at Google) fall into this category.

There are classes of problems that don't require large amounts of data. There are simulation applications in which the data is randomized by the worker before the computation, so no data is transferred initially. Apparently protein folding problems also use little data and involve a search over a large space. That we may be able to provide.
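
A rough sketch of that "randomized by the worker" pattern, written in Python for brevity rather than in the project's JavaScript: a Monte Carlo estimate of pi in which each task ships only a seed and a sample count, and each worker returns two integers. The names `map_task` and `reduce_results`, the task sizes, and the choice of problem are all illustrative assumptions, not anything taken from MapRejuice.

```python
import random

def map_task(seed, samples):
    """Worker side: generate random points locally and count how many
    fall inside the unit quarter-circle. Only the seed and sample count
    travel to the worker; only two integers travel back."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits, samples

def reduce_results(results):
    """Server side: combine per-worker counts into one estimate of pi."""
    total_hits = sum(hits for hits, _ in results)
    total_samples = sum(samples for _, samples in results)
    return 4.0 * total_hits / total_samples

# Ten hypothetical tasks, each described by nothing more than a seed and a size.
tasks = [(seed, 20_000) for seed in range(10)]
pi_estimate = reduce_results([map_task(seed, n) for seed, n in tasks])
```

Seeding each task also means a given task always produces the same counts, so a result could be re-run elsewhere and compared.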

Next we'd like to find a few real-world problems and try to solve them reasonably fast. Let us know if you have any ideas.

ryanmcgrath

2 Comments

  • 522 Days Ago
  • 09/04/2010

Thanks!

Wow, thanks for the write-up! My team and I are really happy to see this kind of enthusiasm surrounding the subject.

We're actually allowed to fiddle with the code base now that the contest is over, so expect more work to be done on this in the coming weeks. We've got a lot more planned. ;)

jhuckestein

4 Comments

  • 522 Days Ago
  • 09/04/2010

Next steps

I'm one of the developers, thanks for the shout-out. The next steps for us are to figure out classes of computational problems that are well suited for the restrictions of our framework (large numbers of untrustworthy and unreliable workers with slow data transfer speeds).
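
One common way volunteer-computing systems cope with untrustworthy and unreliable workers is to send the same task to several of them and accept an answer only when enough of them agree. The Python sketch below illustrates that general technique under stated assumptions; it is not a description of how MapRejuice handles the problem, and `accept_result`, `run_with_redundancy`, the quorum size, and the toy workers are all hypothetical.

```python
from collections import Counter

def accept_result(answers, quorum):
    """Return the answer reported by at least `quorum` workers,
    or None if no answer has enough agreement.
    Answers must be hashable (numbers, strings, tuples)."""
    if not answers:
        return None
    value, votes = Counter(answers).most_common(1)[0]
    return value if votes >= quorum else None

def run_with_redundancy(task, workers, quorum):
    """Send one task to every worker and keep whatever comes back."""
    answers = []
    for worker in workers:
        try:
            answers.append(worker(task))
        except Exception:
            # An unreliable worker (closed tab, dropped connection)
            # simply contributes no answer.
            continue
    return accept_result(answers, quorum)

# Toy workers standing in for browsers: two honest, one lying, one that vanishes.
def honest(n):
    return n * n

def cheater(n):
    return -1

def flaky(n):
    raise ConnectionError("tab closed")

result = run_with_redundancy(7, [honest, cheater, flaky, honest], quorum=2)
```

The cost is that every task is computed several times, so the redundancy factor trades raw throughput for confidence in the results.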

If anyone is interested in chatting or helping us out, we put an email sign-up form on maprejuice.com. There's also a more in-depth discussion of what comes next here. Get in touch!

As for the contest: Voting has ended and we came out second in the innovation category and seventh overall :) Thanks to all who voted!

killerbunny

1 Comment

  • 522 Days Ago
  • 09/04/2010

nice twist on community computing

I've been a contributor of cycles to the BOINC-based World Community Grid. It will be interesting to see the performance of this browser-based JavaScript solution.

henchan

1 Comment

  • 521 Days Ago
  • 09/05/2010

scalability?

This seems like a good thing in principle, but I don't understand how it should scale in practice.
Say you have w participating web sites with u users of those web sites. Also, behind the scenes somewhere there must be n nodejs servers and c CouchDB servers.

The MapRejuice documentation implies that the system's processing power scales with u. But where exactly is the number-crunching taking place? As far as I can tell, c is the true computation choke point. But c does not scale with u or w - it is constrained by the server resources available to MapRejuice.com.

I hope to be wrong, because I might be able to use this in my own (node/couch) app - both as a computer and to submit jobs.

jhuckestein

4 Comments

  • 520 Days Ago
  • 09/06/2010

Re: scalability?

The number-crunching power scales with u because the calculations are run inside users' web browsers. The only crunching that is done on the server is partitioning of data, but most key-value stores are built to do that very well. Google's MapReduce, for example, runs on BigTable, which essentially is a long, sorted list of values.
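
To make the server-side partitioning step concrete, here is a minimal Python sketch of hash-partitioning mapped key-value pairs into sorted buckets, one per reducer. It assumes a stable CRC32 hash and in-memory lists; the names `partition_for` and `partition` are hypothetical, and a real deployment would lean on its key-value store for this rather than on code like the following.

```python
import zlib

def partition_for(key, num_partitions):
    """Pick a partition with a stable hash (CRC32), so the same key
    always lands in the same partition on every run."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def partition(pairs, num_partitions):
    """Distribute (key, value) pairs into buckets, one per reducer."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        buckets[partition_for(key, num_partitions)].append((key, value))
    # Sorting each bucket puts equal keys next to each other,
    # which is the form a reducer wants to consume.
    return [sorted(bucket) for bucket in buckets]

pairs = [("b", 1), ("a", 1), ("b", 1), ("c", 1), ("a", 1)]
buckets = partition(pairs, 3)
```

The work here is a hash and an append per pair, which is cheap next to the map and reduce computations themselves.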

Bio

Christopher Mims is a journalist who covers technology and science for just about everybody.
