Multi-Threaded JavaScript — A Quick Look.

The strange thing about this post is that it isn’t exactly news. Firefox 3.5 came out in June of ’09, and between it, Safari 4, and Google Chrome (which uses a slightly different mechanism), these browsers all support OS-level multi-threading.

The question is, should you care?

The short answer, for right now, is absolutely. Just don’t expect to place any of your multi-threaded code into a live environment any time soon. Until Internet Explorer implements this API, we’re left with a world where more than half your users won’t see your threaded code anyway.

Of course this is a pity, as it’s not hard to imagine the many uses spawning worker threads would have. As I type this I’m looking at one right now…the TinyMCE text editor. Sure it’s fine for most web editing tasks, but in FormBoss there are several instances of hooks I need to write into the processing engine to update various data structures, and sometimes these really take a hit with larger data sets. Of course loading times are also quite slow, even on relatively fast machines like I’m using now.

Opining aside, the purpose of this post is to learn a bit more about these guys, how they are used, and what type of performance boost you can expect.

General use

I’m going to stick with the kind of threads that Firefox and Safari use: Web Workers.

The basic workflow is that you write a calling script like this (index.html):

    var iteration = 30;

    // Create the worker from an external script file.
    var worker1 = new Worker('fibonacci.js');

    // Handle messages posted back by the worker.
    worker1.onmessage = function(event) {
      if (event.data == 'Done!') {
        StopTimer('fib1timer');
      }
      document.getElementById('fib1').innerHTML += event.data + '<br>';
    };

    // Handle errors raised inside the worker.
    worker1.onerror = function(error) {
      // dump() is Firefox-specific debug output.
      dump("Error with thread: " + error.message);
      throw error;
    };

    // Call worker 1.
    worker1.postMessage(iteration);

…whose job is to create a worker, listen for messages from the worker, listen for errors from the worker, and finally start the worker with some task (the postMessage() call at the end).

The interesting thing to me about this code is how we initialize a Worker by passing in the name of a JavaScript file. This is not a mistake; it’s how these guys are created. We do not pass functions, we pass files. I’ve heard grumbling about this, but I personally think it’s rather appropriate, at least for now.
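
Since the Worker constructor isn’t there in every browser, it’s worth checking for it before you call it. Here’s a rough sketch of how that might look. The helper names supportsWorkers and startFib are mine, not part of the download, and the synchronous fallback is just one assumption about what you’d want to do when workers are missing.

```javascript
// Hypothetical helper: true when the given global object exposes Worker.
function supportsWorkers(globalObject) {
  return typeof globalObject.Worker !== 'undefined';
}

// Hypothetical helper: hand the task to a worker when possible, otherwise
// run `fallback` directly on the calling thread (which blocks the page).
function startFib(globalObject, n, onResult, fallback) {
  if (supportsWorkers(globalObject)) {
    var worker = new globalObject.Worker('fibonacci.js');
    worker.onmessage = function (event) {
      onResult(event.data);
    };
    worker.postMessage(n);
    return 'worker';
  }
  onResult(fallback(n));
  return 'fallback';
}
```

In a page you would call something like startFib(window, 30, showResult, fibSync), where showResult and fibSync are whatever display and fallback functions you already have.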

Before we take a look at the code in that file, though, note the event handlers (onmessage, onerror). The names cannot be changed, as they are built-in hooks that the browser calls and your application implements. You need onmessage to hear anything back from the worker; onerror is optional, but without it failures inside the worker are easy to miss.

Right, on to the fibonacci.js file:

    // Main Fibonacci routine: careful with anything over 30!
    function fib(n) {
      var s = 0;

      if (n == 0) {
        return s;
      }

      if (n == 1) {
        s += 1;
        return s;
      } else {
        return fib(n - 1) + fib(n - 2);
      }
    }

    // Worker thread 'gateway' function.
    onmessage = function(event) {
      var n = parseInt(event.data, 10);

      var i;
      for (i = 0; i <= n; i++) {
        postMessage(fib(i));

        // Wasteful, but shows how the messaging system can be called.
        if (i == n) {
          postMessage('Done!');
        }
      }
    };

So the basic idea is we define an onmessage event listener to wait for our initial call. Think of this as the ‘gateway’ into the threaded function. Once this message is received by fibonacci.js, we grab parameters from the event.data argument and kick off our first iteration of the fibo cycle by issuing a call to postMessage(fib(i)).

Because fib(i) is evaluated first and returns a value, the postMessage(fib(i)) call sends that Fibonacci value back to the calling script. If we look at the first code block again (index.html), you’ll notice we defined an event handler for these postMessage() events: the function assigned to worker1.onmessage.

So just to be clear: in index.html we start our thread by calling:

worker1.postMessage(iteration);

This call is received by the event handler in fibonacci.js via onmessage(). Then, when we want to talk back to index.html from fibonacci.js, we do so once again via postMessage(). This call is then also handled by an event handler called onmessage().

In other words, both scripts have a postMessage() call, and both have event handlers for this call, onmessage().
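
The symmetry is easier to see with the browser taken out of the picture. What follows is a toy, in-memory stand-in for the two endpoints, not the real Worker API, and createChannel is a made-up name. Whatever one side passes to postMessage() shows up at the other side’s onmessage as event.data. It delivers synchronously to keep things simple, whereas real workers deliver asynchronously.

```javascript
// Toy model only: two linked endpoints that mimic the postMessage/onmessage
// pairing. `handle` plays the worker1 object in index.html; `scope` plays
// the global scope inside fibonacci.js.
function createChannel() {
  var handle = { onmessage: null };
  var scope = { onmessage: null };

  // Posting on one endpoint invokes the handler on the other.
  handle.postMessage = function (data) {
    if (scope.onmessage) { scope.onmessage({ data: data }); }
  };
  scope.postMessage = function (data) {
    if (handle.onmessage) { handle.onmessage({ data: data }); }
  };

  return { handle: handle, scope: scope };
}

var channel = createChannel();
var received = [];

// "fibonacci.js" side: answer the request, then signal completion.
channel.scope.onmessage = function (event) {
  channel.scope.postMessage('got ' + event.data);
  channel.scope.postMessage('Done!');
};

// "index.html" side: collect whatever comes back.
channel.handle.onmessage = function (event) {
  received.push(event.data);
};

// Kick things off, just like worker1.postMessage(iteration).
channel.handle.postMessage(30);
```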

This is super important to understanding how Web Workers function with respect to their inherent limitations. The long and short of it is that when we spin off one of these threads, the code in the external .js file is for all intents and purposes disconnected from your main script. The code in the Web Worker cannot change the DOM, and can only pass values back to your main script via postMessage().

Thus, in the code above, when the fib() function returns a value it does so as an argument to the postMessage() function call. This value is then received by the “main” script (the top one, index.html), which updates the DOM via:

document.getElementById('fib1').innerHTML += event.data + '<br>';

So yes we can update the DOM with results and data from our threaded functions, but not from within the Web Worker, only from the messages we send back to the calling script.

This actually brings up two important points:

  1. Safari only supports a simplified version of the messaging system. Whereas Firefox supports full-on JSON, Safari only supports simple values.
  2. Communication has a cost…
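
On the first point: if an engine only passes simple values, one workaround is to serialize structured data to a string yourself. This is a minimal sketch, assuming JSON.stringify and JSON.parse are available in both the page and the worker (older browsers may need a JSON library loaded first). packMessage and unpackMessage are names I made up for illustration.

```javascript
// Hypothetical helper: wrap a payload in a typed envelope and flatten it
// to a plain string, which any messaging implementation can carry.
function packMessage(type, payload) {
  return JSON.stringify({ type: type, payload: payload });
}

// Hypothetical helper: turn the string back into an object on receipt.
function unpackMessage(raw) {
  return JSON.parse(raw);
}

// Round trip: what the worker would send, and what the page would read.
var wire = packMessage('result', { index: 10, value: 55 });
var message = unpackMessage(wire);
```

Inside the worker this would become postMessage(packMessage('result', data)), and the page’s onmessage handler would start with unpackMessage(event.data).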

Communication has a cost

Yes it does. One of the main hassles of threaded programming is dealing with the headaches of race conditions, deadlocks, and other data dependencies. Web Workers hide much of this from us, making it quite hard to create such problems at the cost of some flexibility.

That said, to help illustrate how Web Workers can be used, I’ve created three different test cases (see the download link below):

single-thread

worker-threads

worker-threads-optimized

single-thread, as the name suggests, simply takes our fib() function and calls it twice on page load, both with 30 iterations. The defining characteristics of this approach are that we loop around 14,098,246 times, and that our web page is effectively ‘blocked’ during this time, causing the DOM to freeze until the operation finishes.

worker-threads is basically the same code as above, but now we define two Web Workers, both of which call the same fibonacci.js file. We then loop through the same code as we do in single-thread, but because each call is using its own thread, the operation finishes roughly twice as fast (provided you have a two-core machine!).
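
With two workers sharing one script, the page needs some way to know when both have reported 'Done!'. Here is one way to track that, sketched with a hypothetical createDoneCounter helper (it is not part of the download). The browser wiring at the bottom is guarded so the snippet does nothing where Worker or document is missing.

```javascript
// Hypothetical helper: returns a message handler that calls onAllDone
// once `expected` 'Done!' messages have arrived.
function createDoneCounter(expected, onAllDone) {
  var finished = 0;
  return function (event) {
    if (event.data === 'Done!') {
      finished += 1;
      if (finished === expected) {
        onAllDone();
      }
    }
  };
}

// Browser-only wiring: two workers, same file, one shared handler.
if (typeof Worker !== 'undefined' && typeof document !== 'undefined') {
  var handler = createDoneCounter(2, function () {
    document.title = 'Both workers finished';
  });
  var workers = [new Worker('fibonacci.js'), new Worker('fibonacci.js')];
  for (var w = 0; w < workers.length; w++) {
    workers[w].onmessage = handler;
    workers[w].postMessage(30);
  }
}
```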

Some characteristics of this approach are quite pleasing indeed. Yes, the operation finished twice as fast, but this is kind of a generic case anyway. For me, the real benefits are that a) our timer call actually runs independently of the main operation and provides an accurate result, and b) the DOM is no longer blocked; in fact it’s updated dynamically, which is a pretty neat effect.

Finally, we have worker-threads-optimized. In this case we give up the embarrassingly inefficient recursion-from-hell fib() function and use a far more efficient iterative version:

    function fib(length) {
      // Declare loop variables locally so they don't leak as globals.
      var l = [0, 1], i, x = 0;
      for (i = 2; i < length; i++) {
        l.push(l[x++] + l[x]);
        //postMessage(l[x]);
      }
      return l;
    }

When I first ran this code I was surprised to see just how much faster it was. With the old function anything over 35 was trouble. With this version we can easily pass 10,000,000.
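
The difference comes from the amount of work, not the answers: the recursive version recomputes the same values over and over, while the iterative one does a single addition per term. Below is a quick sanity check that the two agree on small inputs. Both functions are restated in simplified form so the snippet stands alone, and fibRecursive and fibList are my names for them.

```javascript
// Recursive form: every call fans out into two more calls.
function fibRecursive(n) {
  if (n < 2) {
    return n;
  }
  return fibRecursive(n - 1) + fibRecursive(n - 2);
}

// Iterative form: builds the whole sequence with one addition per term.
function fibList(length) {
  var list = [0, 1];
  for (var i = 2; i < length; i++) {
    list.push(list[i - 2] + list[i - 1]);
  }
  return list;
}

// Compare the first 20 terms produced by each approach.
var list = fibList(20);
var mismatches = 0;
for (var n = 0; n < list.length; n++) {
  if (list[n] !== fibRecursive(n)) {
    mismatches += 1;
  }
}
```

One caveat when pushing the iterative version to huge lengths: JavaScript numbers are doubles, so terms past roughly the 78th no longer come out exact. For a CPU-load test that doesn’t matter, but it would if you needed the actual values.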

But therein lay a problem. During testing I would set the sequence number to around 300,000 for each of my two threads and wait as I monitored CPU load. The problem was, my CPU load never went over 25% on a four-core machine: I was running single-threaded again. But how? Did I run into some limitation of what files could be run, what variables could be used, or what functions called?

I won’t lie: I pondered this for quite a while before it finally dawned on me. The worker-threads version, while computationally expensive, was doing almost nothing to tax the DOM. At 30 iterations each the DOM would receive a request to update every 150 milliseconds or so, leaving plenty of time for the Web Workers and browser to grind away. In other words, we could run at the full potential of the threaded code because the DOM update calls were infrequent enough to not matter.

But not so with the new version. The new version was absolutely pounding the DOM with hundreds of thousands of requests every second. There was no way it was going to keep up, and so in the end the browser itself started blocking the postMessage() function calls.

That’s why that postMessage() call above is commented out. So long as we’re not making that call, we use both cores as expected. Lesson learned: if you truly want parallelism, you’ll need to forgo DOM communication when the number of calls becomes too great relative to the time between each DOM update.
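
If you still want some progress on screen, one middle ground is to batch results and post once per batch instead of once per value. This is a sketch of that idea, not the code from the download. fibBatched, batchSize, and the post callback are assumptions for illustration; inside a worker, post would be postMessage.

```javascript
// Hypothetical variant: compute `length` Fibonacci terms, but hand them
// to `post` in comma-joined groups of `batchSize` to cut message traffic.
function fibBatched(length, batchSize, post) {
  var prev = 0;
  var curr = 1;
  var batch = [];

  for (var i = 0; i < length; i++) {
    batch.push(prev);

    // Advance the pair (prev, curr) to the next two terms.
    var next = prev + curr;
    prev = curr;
    curr = next;

    // Flush a full batch as a single string message.
    if (batch.length === batchSize) {
      post(batch.join(','));
      batch = [];
    }
  }

  // Flush whatever is left over.
  if (batch.length > 0) {
    post(batch.join(','));
  }
}

// Stand-in for postMessage so the sketch runs outside a worker.
var posted = [];
fibBatched(10, 4, function (text) {
  posted.push(text);
});
```

Joining each batch into a string keeps the messages as simple values, and the batch size becomes the knob for trading update frequency against throughput.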

In the end, the gateway function ended up being:

    // Worker thread 'gateway' function.
    onmessage = function(event) {
      postMessage('start!');
      fib(parseInt(event.data, 10));
      postMessage('Done!');
    };

When we first enter the function we pass a simple start message, and when done, pass a message that we’re done.

In Conclusion

What a wonderful world it could be if we had better standards in the world of web browsers. Alas, for the foreseeable future Web Workers shall remain an enigma to most, which is a shame. Used right they can provide a significantly improved user experience, as well as open the doors to a more dynamic and interesting web. Let the wait begin!

Download the scripts discussed in this post!

worker-thread-benchmarks
