Tuesday, November 20th, 2007

How To Build A Read/Write JavaScript API

Category: Articles, JavaScript

Rakesh Pai has written up a piece on the GData JavaScript client library (video), CrossSafe, and SubSpace.

He starts with a high-level overview, then delves into the requirements and process for getting cross-domain code working:

Here’s what you require to get cross-domain read write JavaScript APIs to work.

  • The “setup” required at the client’s end is that he should have at least one static, cacheable resource embedded in the page where he’s consuming the API, loaded from the same domain as the page. This could be a static CSS file or an image. If the page has neither, one will have to be inserted – maybe a 1px image hidden away with inline style attributes. This is usually not too much to ask, since most pages already include spacer GIFs or CSS documents loaded from the same domain. The static resource could even come from a different sub-domain within the same domain, but that might complicate the scripts slightly. If this setup is not possible at all (oh, come on!), you could still find a workaround, but I think this is the easiest way to get things up and running.

  • You will need to do some setup at your end, if you are the creator of the API. In particular, you will need to set up a “proxy” page that intercepts the requests from the JavaScript client API, conditions the data, and passes it along to the REST API. This proxy page also reads the response from the REST API, conditions the data to suit the client, and flushes it down to the JavaScript.
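
To make the first requirement concrete, here is a minimal sketch (not taken from the article) of how a client library might pick a same-domain resource out of a list of candidate URLs. The function name findSameDomainResource and the example.com URLs are hypothetical, and the check assumes an exact protocol-and-host match, so the sub-domain variant mentioned above is not covered.

```javascript
// Hypothetical helper: returns the first candidate URL that shares the
// protocol and host of the page, or null if none does.
function findSameDomainResource(candidateUrls, pageUrl) {
  const page = new URL(pageUrl);
  for (const candidate of candidateUrls) {
    // Resolve relative URLs against the page, as the browser would.
    const resolved = new URL(candidate, page);
    if (resolved.protocol === page.protocol && resolved.host === page.host) {
      return resolved.href;
    }
  }
  return null;
}

// In a browser, the candidates could be gathered from link and img tags:
//   const urls = [...document.querySelectorAll("link[href], img[src]")]
//     .map((el) => el.href || el.src);
//   const resource = findSameDomainResource(urls, location.href);
const found = findSameDomainResource(
  ["http://cdn.example.net/lib.css", "/images/spacer.gif"],
  "http://client.example.com/page.html"
);
console.log(found);
```

If the function returns null, the page has no usable resource, which is the case where the library would have to report an error.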

Now, let’s go over the process of actually orchestrating the communication.

  1. The API client library is included on the page by means of a script tag pointing to your domain (your domain being the host of the client library). This is similar to including the Google Maps API on the page.

  2. Once included, the script scans the page for the static resource mentioned above. This is done by walking the DOM looking for link or img tags, and checking the value of the href/src attribute to ensure it lies within the same domain as the calling page. The URL of this resource is stored for use later. At this point, if required, the client library can signal to the developer that it is ready for communication with the server. If the resource is not found, the client-library should throw an error and terminate.

  3. When a request needs to be made, the client library takes the request parameters and prepares the markup for a form. This form can have any method attribute value, and should have its action attribute set to the proxy page on your domain. The parameters to be sent to the server should be enumerated as hidden fields within the form. The client library also specifies the resource (in a RESTful sense) that needs to be acted upon. Also, the name of the static resource we had hunted down earlier is passed on to the server. This form is not appended to the document yet. This markup is then wrapped into <html> and <body> tags. The body tag should have onload="document.forms[0].submit();".

  4. The client library then creates a 0px x 0px iframe, without setting the src attribute, and appends it to the page’s DOM. This makes the browser think that the iframe exists in the same domain as the calling page. Then, by using the iframe document object’s open(), write() and close() methods, the markup created in the previous step is dumped into the iframe. As soon as the close method is called, the form gets submitted to the proxy page on your domain because of the onload in the body tag. Also note that this gives the server access to any cookies it might have created from within its domain, letting you do things like authentication. In this way one part of the communication is complete, and the data has been sent to the server across domains. However, the iframe’s document.domain has now switched to point to your domain. The browser’s security model now prevents any script access to most parts of the iframe.

  5. The proxy page sitting on your server now queries your REST API – basically doing its thing – and gets the response. Response in hand, the proxy is now ready to flush the response to the client.

  6. If the response is rather large in size, as might be the case with a huge GET call for instance, the proxy breaks it up into chunks of not more than say 1.5 kb.

  7. The proxy is now ready to flush the response. The response consists of iframes – one iframe for each of these 1.5 kb chunks. The iframe’s src attribute is set to the static resource we had discovered earlier. It is for exactly this purpose that we had hunted the resource down and passed on the URL to the server. At the end of each of these URLs, the proxy appends one of the chunks of the response, after a “#” symbol, so that it works as a URL fragment identifier. Also, the iframe tags are each given a name attribute, so that the client script can locate them.

  8. Meanwhile, the client-side code is where it had left off at the end of step 4 above. The script then starts polling the iframe it created to check for the existence of child iframes. This check will need to be based on the iframe name the server will be sending down. It will look something like this: window.frames[0].frames["grandChildIframeName"]. Since the static resource we have loaded into the grandchild iframe is of the same domain as the parent page, the parent page now has access to it, even though the intermediate iframe is of a different domain.

  9. The client script now reads the src attributes of the iframe, isolates the URL fragments (iframe.location.hash), and reassembles the data. This data would typically be some JSON string. This JSON can then be eval’d and passed on to a success handler. This completes the down-stream communication from the server to the client, again across domains.

  10. With the entire process complete, the client-library can now perform some cleanup actions, and destroy the child iframe it created. Though leaving the iframe around is not a problem, it is not necessary and simply adds to junk lying around in the DOM. It’s best to get rid of it.
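
The string handling in steps 3, 6, 7 and 9 can be sketched as below. This is an illustration under stated assumptions, not the article's implementation: every function name and URL is hypothetical, the chunk size counts URL-encoded characters, and JSON.parse stands in for the eval mentioned in step 9. The extractHashes helper only simulates, outside a browser, what the client would read from each grandchild iframe's location.hash.

```javascript
// Sketch only: all function names and URLs here are hypothetical.

// Escape a value for use inside a double-quoted HTML attribute.
function escapeAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Step 3 (client): markup for a form that submits itself to the proxy.
function buildRequestMarkup(proxyUrl, params) {
  const fields = Object.keys(params)
    .map((name) =>
      '<input type="hidden" name="' + escapeAttr(name) +
      '" value="' + escapeAttr(params[name]) + '">')
    .join("");
  return '<html><body onload="document.forms[0].submit();">' +
    '<form method="post" action="' + escapeAttr(proxyUrl) + '">' +
    fields + "</form></body></html>";
}

// Steps 6-7 (proxy): URL-encode the response, cut it into chunks of at
// most chunkSize characters, and emit one named iframe per chunk.
function buildResponseMarkup(responseText, resourceUrl, chunkSize) {
  if (chunkSize < 3) throw new RangeError("chunkSize must be at least 3");
  const encoded = encodeURIComponent(responseText);
  const frames = [];
  let start = 0;
  while (start < encoded.length) {
    let end = Math.min(start + chunkSize, encoded.length);
    // Never cut a %XX escape in half.
    if (encoded[end - 1] === "%") end -= 1;
    else if (encoded[end - 2] === "%") end -= 2;
    frames.push('<iframe name="chunk' + frames.length + '" src="' +
      escapeAttr(resourceUrl) + "#" + encoded.slice(start, end) +
      '"></iframe>');
    start = end;
  }
  return frames.join("");
}

// Step 9 (client): join the fragments in order and decode them. In a
// browser the hashes would be read from the grandchild iframes.
function reassemble(hashes) {
  return decodeURIComponent(
    hashes.map((hash) => hash.replace(/^#/, "")).join(""));
}

// Stand-in for the browser: pull the fragments back out of the markup.
function extractHashes(markup) {
  return Array.from(markup.matchAll(/#([^"]*)"/g), (m) => "#" + m[1]);
}

const requestMarkup = buildRequestMarkup(
  "http://api.example.org/proxy", { resource: "/notes/1", title: 'a "b"' });
const responseMarkup = buildResponseMarkup(
  '{"ok":true,"items":[1,2,3]}', "http://client.example.com/spacer.gif", 16);
const data = JSON.parse(reassemble(extractHashes(responseMarkup)));
console.log(data.items.length);
```

In a real page, requestMarkup would be written into the src-less iframe with open(), write() and close() as step 4 describes, and the polling loop of step 8 would collect the hashes once the named iframes appear.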

Posted by Dion Almaer at 8:46 am
12 Comments


Broken link -> “GData JavaScript client library”

Comment by Jack — November 20, 2007

the link to http://piecesofrakesh.blogspot.com
is getting a Blogger not found page

Comment by David Wilhelm — November 20, 2007

Here is an honest criticism that would help me read Ajaxian. These long blockquotes are somewhat annoying to deal with when you’re trying to skim through the front page and get a birds-eye view of what’s new for the day. Furthermore, I think they’re poor form because if I can read nearly the whole thing right here, why would I ever click the link to go to that poor author’s blog?

Comment by Steve — November 20, 2007

What is described here is not Subspaces/CrossSafe closure passing script tag sandboxing, but rather fragment identifier messaging (FIM), and it appears he is describing a less comprehensive form of FIM compared to SMASH and Dojo’s implementation. I believe both SMASH and Dojo have automatic chunking capabilities, and Dojo includes XHR proxying, so you are not limited to name-value sets for your POST data. Perhaps it is intentionally a simpler form of FIM; I will give him the benefit of the doubt.

Comment by Kris Zyp — November 20, 2007

Where is the article these quotes are extracted from?

Comment by Michael S. — November 20, 2007

Hi Kris Zyp.

Thanks for the info about the Dojo Iframe Proxy transport. I gave the tests a spin, and from the looks of it, it seems that the consumer of such an API would need to have the xip_client.html file served from his domain. In other words, the client will have to use Dojo itself from his domain. This means that the client will need to do that little extra bit of setup at his end to enable the technique to work. Also, the xip_client.html file seems to have a fair amount of stuff in it, making it an overhead over the wire. Though the technique I’ve suggested isn’t different in its principles, it doesn’t make any additional requests (except, of course, the one to communicate with the server), and it doesn’t require any form of setup at the client’s end.

Then again, I might not have understood Dojo’s technique correctly. Please correct me if I’m mistaken.

Also, I couldn’t find a link to the SMASH thing you mentioned. Could you please provide a link, so I can find out about what they are doing?

Comment by Rakesh Pai — November 20, 2007

Rakesh,
Indeed your approach is seemingly lighter than Dojo’s, which has its advantages. Here is a link to SMash (at least to the paper behind it, I don’t know if there is public access to the source):
http://domino.research.ibm.com/library/cyberdig.nsf/1e4115aea78b6e7c85256b360066f0d4/0ee2d79f8be461ce8525731b0009404d?OpenDocument
SMash is currently being pushed as an FIM for interframe communication for the OpenAjax hub 1.1.

And here is a link to a demo of XDDE, which is also using FIM:
http://www.samedesk.com:23460/~0.0.0/webhost/com.openspot.webhost.OResource/com/xdde/demo3/system.html

Anyway, I do appreciate your work, FIM is an important area for development and education.

Comment by Kris Zyp — November 20, 2007

Been there, done that. My way though. 1 year ago. Called it XDGate. Wrapped in ASP.NET control.

Comment by Andrew Rev — November 21, 2007

first link doesn’t work.

Comment by backdraft — November 23, 2007

Rakesh and Kris: I work on the Dojo FIM implementation. I agree with your assessments: The approach Rakesh describes requires fewer server requests, but is limited to a form submission. The Dojo implementation does require an extra resource request, but it is cacheable, and it provides a generic XHR proxy (for instance setting HTTP headers). The Dojo implementation does have some ties to Dojo, but they are minimal, and can be replaced with your own equivalents (or by inlining the functions used from Dojo). I would like to make a “no Dojo”, stand-alone version of the dojox.io.proxy code to make this easier.

Rakesh, I am curious if you notice any resource hits for using more than one iframe for the response chunks. The Dojo implementation will always use 2 iframes (3 for IE 7 due to a weird security choice in that browser). I can see where Rakesh’s approach avoids the goofiness that I am doing with the extra iframe for IE 7.

One thing I need to improve in the Dojo implementation is avoiding “leaks” in IE after destroying the iframes when the transaction is done. I’m curious if you run into a similar issue. There was a suggestion that setting the iframe src to javascript:false, then destroying the iframe a little bit after that would work, but I have not tried that yet.

Neat stuff! Thanks for taking the time to document the approach. I look forward to following any new developments.

Comment by James Burke — November 23, 2007

Hi jburke,

Firstly, to answer your questions:

No, I didn’t notice any extra resource hits to load any number of iframes. Maybe the way I was looking at the HTTP traffic was wrong, but I seriously doubt that possibility. This is because the resource I’m loading in the iframes is already loaded on the page before this iframe FIM communication started in the first place.

About the memory leaks, I haven’t measured it at all. I would be glad to find solutions to this issue, if it is a problem at all.

I tried reading up about the XHR proxy thing you mentioned, but I couldn’t understand it – especially the part about handling headers, etc. Admittedly, the technique I’ve described does a bad job about handling headers and readystate values, so I’d be very interested in understanding how you’ve handled it. If we could find a good solution to the problem, possibly one that utilizes the best of both your implementation and this technique, I guess we’ll be set. So far, what I’m thinking is that there are umpteen ways to “spoof” the headers until it reaches the proxy page. After all, the HTTP headers really matter at the proxy when it’s querying the REST API. Also, a similar technique can be used to flush the response headers from the REST response to the client. If a decent abstraction wrapper is provided to handle this, the user needn’t know the difference.

Also, I didn’t understand what you meant when you said that this technique is “limited to a form submission”. Did you mean that it is not possible to send the right headers, again? If so, the above-mentioned process should solve it. If you meant that the request is limited to name-value pairs, that’s pretty much how HTTP works. Anything that doesn’t use name-value pairs is basically a protocol built on top of HTTP (like say SOAP), and we know how that sucks. But then again, maybe I’m understanding this wrong.

Without making any promises, I’m hoping to have a working implementation of this in code ready asap. If all goes well, I should have it ready in a couple of weeks. I am hoping that this will be tested for network efficiency, handling of HTTP headers, and memory leaks, and generally making it bullet-proof across browsers / platforms.

(BTW, Dojo happens to be my toolkit of choice. If there’s any way I can help in getting things like this into dojox, I would only be glad to have contributed. Please let me know.)

Comment by Rakesh Pai — November 26, 2007

Rakesh: sorry for the delayed answers:

On the resource consumption part for multiple iframes, I was referring more to browser memory/CPU usage than HTTP requests. I have not done any conclusive analysis myself; I’ve just heard others complain about iframe memory usage.

When I talk about headers, I mean transferring HTTP headers that you can set on an XMLHttpRequest call. What I do in dojox.io.proxy is to create a “fake” XMLHttpRequest object that supports the XHR interface, so you set headers on that fake object, then when you do the send on the fake object, it serializes all the XHR info over to the other domain’s iframe via fragment IDs. That other frame then constructs a real XHR object and sets all the data using the serialized data.

Setting headers is important, for example, in some authentication schemes, where they want an HTTP authenticate header that uses a token. I’ve seen this used in some blogging APIs.

“limited to form submission” was a reference that by using a form tag, I thought that limited you to doing POST and GET HTTP operations, but I’m actually not sure on that now.

If you are interested in contributing to Dojo, see the community page. Basically, start filing tickets with patches. You can also post to a forum topic on the dojo website if you want to work out design issues.

Comment by James Burke — December 15, 2007
