Thursday, May 21st, 2009

Digg shows Multipart XMLHttpRequest prototype

Category: Ajax, JavaScript, Performance

Micah Snyder of Digg has posted about DUI.Stream, an experimental library that implements a multipart XHR technique to bundle resources into one request and then break them out at the other end:

One of the ways that high-performance websites like Yahoo suggest speeding up load times is by reducing the number of HTTP requests per page. We started thinking about what we could do to reduce HTTP overhead, and where we could get the biggest benefits from it. Well, one thing led to another and the next thing we knew we were talking about writing a generalized framework for bundling files, sending them through a single request, then separating them for use once they head down the pipe.

We call this technique MXHR (short for Multipart XMLHttpRequests), and we wrote an addition to our Digg User Interface library called DUI.Stream to implement it. Specifically, DUI.Stream opens and reads multipart HTTP responses piece-by-piece through an XHR, passing each chunk to a JavaScript handler as it loads.

Why do this? Well, DUI.Stream will allow developers to drastically improve the speed of uncached page loads by bundling most of their resources into a single HTTP request, with a single time-to-first-byte and no request throttling by the user agent. Additionally, the size of the response has no effect on the rendering time of each chunk, as the client handles each piece of the response on the fly and can inject it into the DOM for rendering immediately, in the exact order you specify. On a high traffic, high-activity site like Digg, we have to display incredible amounts of data on each permalink — typically hundreds of user images within the first 50 comment threads on a page alone, not to mention the UI chrome and actual comment data. (You can see this for yourself: notice the number of HTTP requests that queue up when you expand a page of comments). So our primary use case for DUI.Stream is turning that first long, arduous page load on an empty cache into something nearly indistinguishable from a page of data with fully cached resources.
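The core parsing idea described above can be sketched in plain JavaScript. This is not DUI.Stream's actual internals: the boundary string, the `Content-Type:` prefix convention, and the `parseMultipart` helper are all illustrative assumptions about how a bundled response might be laid out and split into per-mime-type handlers.

```javascript
// Hypothetical sketch of the MXHR idea: one multipart body arrives as a
// single string; each part is assumed to start with a "Content-Type: <mime>"
// line, and is routed to a handler registered for that mime type.
function parseMultipart(body, boundary, handlers) {
    body.split(boundary).forEach(function (part) {
        part = part.trim();
        if (!part) return;

        // Separate the assumed Content-Type header line from the payload.
        var newline = part.indexOf('\n');
        var mime = part.slice(0, newline).replace('Content-Type: ', '').trim();
        var payload = part.slice(newline + 1);

        if (handlers[mime]) handlers[mime](payload);
    });
}

// Usage: two bundled HTML chunks dispatched to one text/html handler.
var html = '';
parseMultipart(
    'Content-Type: text/html\n<p>chunk one</p>\n--BOUNDARY--\n' +
    'Content-Type: text/html\n<p>chunk two</p>\n--BOUNDARY--',
    '--BOUNDARY--',
    { 'text/html': function (payload) { html += payload; } }
);
// html now holds both chunks, in the order they were streamed.
```

In a real MXHR implementation this parsing would run repeatedly as the XHR's response text grows, so each part can be handled before the response finishes.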

You can take a look at a demo in action. Reloading it shows how much the timings vary from request to request. The demo looks like this:

javascript

var s = new DUI.Stream();
var content = '';

s.listen('text/html', function (payload) {
    content += payload;
});

s.listen('complete', function () {
    $('#stream').append('<p>Stream took: ' + ((new Date).getTime() - streamStart) + 'ms</p>' + content);

    var normalStart = (new Date).getTime();

    // Fetch the same payload as ten individual requests for comparison.
    for (var i = 0; i < 9; i++) {
        $.ajax({
            url: 'loremIpsum.html',
            async: true,
            type: 'GET',
            dataType: 'html',
            success: function (html) {
                $('#normal').append(html);
            }
        });
    }

    $.ajax({
        url: 'loremIpsum.html',
        async: true,
        type: 'GET',
        dataType: 'html',
        success: function (html) {
            $('#normal').append(html);
            $('#normal').prepend('<p>Normal took: ' + ((new Date).getTime() - normalStart) + 'ms</p>');
        }
    });
});

var streamStart = (new Date).getTime();
s.load('testStreamData.php');

How does Digg see this as a benefit?

Let’s talk a bit about the architectural benefits of implementing MXHRs with DUI.Stream. Back when the web was based largely on a page metaphor (i.e., one central document with external references), whenever you loaded the page, the page requested its images, stylesheets, etc., and then you were done. These days you’re just as often loading an application; the page progressively enhances into a stateful UI by loading extra stylesheets, scripts and a whole mess of UI chrome after the initial request. Yet we’re still using the old model flow of get markup –> render markup –> request external resources –> load and display externals.

Take our modal login dialog box for example. In order to reduce requests we bundle its JavaScript in with the rest of the page, we put its CSS up in the header with the rest of the styles, then we request only the markup for the dialog box, render it, and let it fire its own HTTP requests for the images that make up its chrome. In this broken model, HTTP connections and rendering behaviors split our UI architecture up into different parts of the page that all render at different times at the browser’s discretion. Even if we put everything into one cohesive structure and loaded the CSS link, script tag and markup together, they’d still all fire their own HTTP requests and the images would still come in afterwards on the first page load. This just won’t do.

Now, let’s rethink how our login dialog could work using DUI.Stream. We can request a Stream that contains everything needed to render and use the dialog box. As each part comes in, it gets passed through to be built, and renders immediately with no image backfill or delayed JS behavior. The DUI.Stream framework can then pass those resources back into cacheable elements for our next page load, which can happily 304 its way quickly through the rendering process. Pretty sweet, right? Right.

Posted by Dion Almaer at 7:31 am
19 Comments



hmmm… had 3 shots of the demo, all showing the normal approach is much faster.

Stream took: 1348ms
Normal took: 620ms

Stream took: 1230ms
Normal took: 130ms

Stream took: 1519ms
Normal took: 140ms

* testing on FF 3.5b4 (OS X)

Comment by bogphanny — May 21, 2009

I had similar results as bogphanny. In almost every refresh in both Safari / Mac and FF3 / Mac, the normal approach is faster. Every once in a while, the MXHR request would be a *little* faster.

Comment by nicksergeant — May 21, 2009

It’s a very clever idea indeed – but sadly the demo doesn’t do much to support it.

I’m guessing that in a real world scenario there might actually be great benefits to reap.

Comment by rasmusfl0e — May 21, 2009

It’s a shame the demo really isn’t suited to showing off this idea, in that the implementation bears no resemblance to the situations they describe in the write up. It sounds very cool.

Comment by jeromew — May 21, 2009

How are you gonna display an image you received over XHR in the page? Data URIs don’t work in IE7 and below.

Comment by Jaaap — May 21, 2009

The text example isn’t really taking advantage of this approach.

Here is a much better demo with image assets:
http://demos.digg.com/stream/imageDemo.html

However, as Jaaap says, not sure how you’d get around IE’s lack of data URL support…

Comment by russh — May 21, 2009

This technique really comes into its own when the client cannot know beforehand what the response will contain, and hence could never request the components individually.

We’ve been using it for several years and are very happy with it.

Comment by hymanroth — May 21, 2009

Well, any real difference with DWR’s call batches ? You do multiple requests at one time, then receive the multipart response and call the different callbacks associated to each part of the response.

This is fine, but far from new, and batches aren’t that easy to use because you have to know what has to be called, and that is most of the time difficult to determine dynamically.

Comment by temsa — May 21, 2009

We’ve been doing this to deliver CSS and JS files to the browser for a while, and grouping images into large sprites and using CSS to shuffle them around as background images, so I’m not really sure where the application for this comes in.

I’m almost never going to want to request a ton of images together via AJAX when I can use a sprite, and any blocks of text will be requested together anyway. Not to mention IE is funny about data URLs…

Comment by willbo — May 21, 2009

When I viewed the text demo (Chrome 2), MXHR took about 2.5x longer. For the 300 images demo, MXHR was about 15x faster.

Comment by WillPeavy — May 21, 2009

Sounds a lot like Ext.Direct.

Comment by mdmadph — May 21, 2009

I just had a look at the code, and it’s very well written. I’m sure you had a look yourselves, right?

Also, the image demo shows a difference in favor of MXHR, not the one linked to in the article.

Comment by hdragomir — May 21, 2009

Is it just me or is the timer not really accurate? If you look at the actual requests in the Firebug console, the one MXHR request takes about 300ms compared to about 900ms for all ten individual requests; this seems to reflect the advantage better.

I think it sounds like a good idea and will definitely see if it can be used on the project I’m currently involved in.

P.S – what’s with the nazi registration process for this site, oh you want my phone number do you? Perhaps I should also set up a web cam and give you a url for a 24/7 live feed. Oh and that CAPTCHA looks secure.

Comment by vitalic — May 21, 2009

Yeah, the image demo shows a difference in favor of MXHR, but what about caching? Specifically: how do you tell the browser to cache the images received through the DUI.Stream?

Without this kind of caching, the technique limits itself to very large sets of unique assets that you’ll never need to load again. Maybe digg has a use for it, but I think the vast majority of us don’t…

Comment by kilburn — May 22, 2009

Shouldn’t HTTP 1.1’s “persistent connections” handle this itself? http://en.wikipedia.org/wiki/HTTP_persistent_connections
Also, to prepare the batched answer on the server, a PHP script (http://demos.digg.com/stream/testImageData.php) is used, so on the server side this technique is also not straightforward and can create additional problems.
Anyway, it’s something interesting: on my mobile internet connection, with Firefox’s two connections per server, the imageDemo gave me “Stream took: 9461ms” versus “Normal, uncached took: 89240ms”, almost 9.5x faster, not bad.

Comment by AdrenalinMd — May 23, 2009

You guys are funny: for the responders you use PHP yourselves (http://demos.digg.com/stream/testImageData.php), but in the repository you have only put Python, Ruby, Perl and Java examples: http://github.com/digg/stream/tree/9010257b1a6e1c236374d5205f0cee108a32256a/examples ;o)

Comment by AdrenalinMd — May 23, 2009

Normal was twice as fast on every request, even the first one.

Using Firefox 3.0.2 on Windows in a corporate environment.

Comment by stephaneeybert — May 25, 2009

I can see a real-life example: a site like YouTube with a lot of video thumbnails that need to be loaded. Instead of 30+ thumbnails being loaded separately, just one request can be made.

data url support in ie:
http://www.phpied.com/mhtml-when-you-need-data-uris-in-ie7-and-under/

Comment by cates — May 29, 2009

Interesting results people are getting.

I just tested in Chrome 2.x and found that the MXHR technique was much faster (for both the plain text and images demos).

Here are my differences:

MXHR Stream
Stream took: 384ms

Normal
Normal took: 5002ms

MXHR Stream
Stream took: 1549ms

Normal
Normal, uncached took: 10031ms

Comment by vegitto — June 4, 2009
