Wednesday, March 4th, 2009

Map Reduce in the browser

Category: JavaScript

Ilya Grigorik of Igvita has proposed and built a collaborative Map Reduce system in JavaScript that lets visitors' browsers pitch in and contribute CPU time to a shared job.

On the JavaScript side you can do something like:

  function map() {
    /* count the number of words in the body of the document */
    var words = document.body.innerHTML.split(/\n|\s/).length;
    emit('reduce', {'count': words});
  }

  function reduce() {
    /* sum up all the word counts */
    var sum = 0;
    var docs = document.body.innerHTML.split(/\n/);
    /* plain for loop instead of the Mozilla-only "for each ... in" */
    for (var i = 0; i < docs.length; i++) {
      var num = parseInt(docs[i], 10);
      sum += num > 0 ? num : 0;
    }
    emit('finalize', {'sum': sum});
  }

  function emit(phase, data) { ... }
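To see what the two phases compute without a browser or a job server, here is a minimal sketch that takes plain strings instead of reading `document.body`. The names `countWords` and `sumCounts` are hypothetical, not from Ilya's code. Unlike the map step above, this version drops empty strings, so repeated whitespace does not inflate the count.

```javascript
// Hypothetical helpers: the post's map() and reduce() read document.body instead.

// Map step: count whitespace-separated words in one chunk of text.
function countWords(text) {
  // Split on runs of whitespace and ignore the empty strings at the edges.
  return text.split(/\s+/).filter(function (word) {
    return word.length > 0;
  }).length;
}

// Reduce step: sum the positive integer counts, one per line.
function sumCounts(text) {
  return text.split(/\n/).reduce(function (sum, line) {
    var n = parseInt(line, 10);
    // Lines that are not positive integers (blank lines, markup) add nothing.
    return sum + (n > 0 ? n : 0);
  }, 0);
}

// Simulate two map jobs followed by a single reduce job.
var counts = ['the quick brown fox', 'jumps over\nthe lazy dog'].map(countWords);
var total = sumCounts(counts.join('\n'));
console.log(counts, total);
```

In the real system each call to the map step would run in a different visitor's browser, and the counts would reach the reduce step through the job server rather than through a local array.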

And you can have a job server on the other end (here is an example using Ruby):

  require "rubygems"
  require "sinatra"

  configure do
    set :map_jobs, Dir.glob("data/*.txt")
    set :reduce_jobs, []
    set :result, nil
  end

  get "/" do
    redirect "/map/#{options.map_jobs.pop}" unless options.map_jobs.empty?
    redirect "/reduce"                      unless options.reduce_jobs.empty?
    redirect "/done"
  end

  get "/map/*"  do erb :map,    :file => params[:splat].first; end
  get "/reduce" do erb :reduce, :data => options.reduce_jobs;  end
  get "/done"   do erb :done,   :answer => options.result;     end

  post "/emit/:phase" do
    case params[:phase]
    when "reduce" then
      options.reduce_jobs.push params['count']
      redirect "/"

    when "finalize" then
      options.result = params['sum']
      redirect "/done"
    end
  end

  # To run the job server:
  # > ruby job-server.rb -p 80
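The post leaves the body of `emit` elided. Judging only from the `/emit/:phase` route above, which reads `count` or `sum` from the posted parameters, the request a client sends back might look roughly like this. The helper `buildEmitRequest` is a hypothetical name, and the sketch assumes a runtime that provides `URLSearchParams`. It only describes the request. Actually sending it, whether by form submission or XMLHttpRequest, is left out.

```javascript
// Hypothetical helper, not the author's emit(): it only describes the request
// that would be posted back to the job server.
function buildEmitRequest(phase, data) {
  return {
    method: 'POST',
    // Matches the server's post "/emit/:phase" route.
    path: '/emit/' + encodeURIComponent(phase),
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    // Form-encode the payload so the server can read it from params.
    body: new URLSearchParams(data).toString()
  };
}

var request = buildEmitRequest('reduce', { count: 42 });
console.log(request.path, request.body);
```

Because the server answers each emit with a redirect, a browser that submits this as an ordinary form post would be sent on to its next job, or to the done page, without any extra client code.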

And with Web Workers you can have the work churn :)

Posted by Dion Almaer at 6:59 am
4 Comments


This would be perfect for something like SETI@Home, IMHO.

Comment by elfpoet — March 4, 2009

You guys must be hard up for stories… The guy might as well have written a Hello World application.

Comment by wleingang — March 4, 2009

The main problem with any serious distributed computing task in JavaScript is that the tasks would have to execute unnoticeably (which Web Workers and Gears may help with), and also within a short time span of about 30 seconds. SETI@Home and Folding@Home work with relatively large datasets and relatively large operations, to the point where the cost of networking all the users may be larger than the cost of computing it in a cloud.
A year ago, I did something similar for the purpose of breaking some MD5 hashes (no real possibility of this helping humanity; hash cracking is mostly malicious) (GAE server, http://jsdc.appspot.com/). I also made something that calculates pi (appjet server, http://distpi.appjet.net/).

Comment by antimatter15 — March 4, 2009

Hmmm. Probably not a good idea to have visitors’ browsers doing computing work for you. Somehow I think people who do things like that eventually get into trouble…

Comment by pianoroy — March 4, 2009
