Monday, February 16th, 2009

Designing a JavaScript client for a REST API

Category: Articles, JavaScript

>This is a guest post by Jared Jacobs of kaChing, an exciting new way to do your own hedge fund, the Web 2.0 way (a.k.a. don’t give it to Madoff!). I was very happy when he said he would be willing to do a post on REST APIs and what makes a good design.

So you want to write a script that sites all over the web can use to access your REST API, eh? Well, that would be pretty straightforward if it weren’t for two things:

  1. browser same-origin restrictions on XMLHttpRequests (XHRs) and inter-window/frame access
  2. the lack of wide browser support for HTML 5-style message passing between windows.

The Same-Origin Policy and its minor browser-specific variations are detailed elsewhere, so I’ll just summarize it with a few key points. I use the term window to mean window object, which can be a top-level page or reside inside a frame or iframe.

  • XHRs can only be issued to the same domain as the originating window (not the originating script).
  • Two windows can read, write, and call each other’s properties if and only if they are from the same domain (host, port, and scheme).
  • There is one notable exception to the previous statement that is consistent across browsers: A window can change (but not necessarily read) the location of its own iframes and of the top window, regardless of their domains.

Recall that native browser primitives for sending or receiving information across domains include frames, iframes, images, scripts, stylesheets, and forms.

With these constraints and possibilities in mind, let’s first consider how best to support GET requests to the REST API.

Transport for GET Requests

XHRs would be the ideal way to retrieve data, since they provide meta information about each response, such as the HTTP status code, in addition to the content. As noted above, however, other sites cannot use XHRs to access your REST API directly.

One possible workaround is for each site that uses the JS client to install a server-side proxy on its domain that proxies REST API calls to your servers. That’s a pretty big barrier to the adoption of your JS client, though. We can do better.

Another possibility is having the JS client delegate the XHRs to a helper iframe that it creates specifically for that purpose. The iframe’s page must, of course, reside on the same domain as the REST API. Then there are also the problems of 1) passing the request URI from the parent window to the helper iframe and 2) passing the response status and content from the helper iframe back to the parent window. These problems are manageable, but not worth solving for GET requests, it turns out. We’ll return to them later.

Textual data can also be loaded across domains within scripts and stylesheets, both of which are lighter-weight than iframes and free of same-origin restrictions. To use one of these techniques, you must 1) embed the data in a comment or encode it as part of a valid script or stylesheet, and 2) determine when the data has finished loading.

A popular solution that addresses these two issues nicely, sometimes called script transport or JSONP, is formatting the data in JavaScript Object Notation (JSON) and wrapping it in a JS function call. This is the route I chose for kaChing’s JS client. A JSONP request typically includes the callback name as an HTTP request parameter. You’ll want to host a simple proxy on your domain that handles these requests by taking off the callback name parameter, relaying the request to your REST API servers, then formatting the response as the appropriate JS function call. Your JS client should generate a unique callback name for each request to avoid mix-ups in the event that multiple requests overlap in time. Also make sure that your proxy can relay both success and error responses from the REST API to your JS client.
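Here is a minimal sketch of the client side of script transport, to make the unique-callback idea concrete. Everything named here is an assumption for illustration: createJsonpClient, the "callback" parameter name, the "_jsonp_cb_" prefix, and the convention that the proxy calls back with (status, data). It is not kaChing's actual code.

```javascript
// Hypothetical JSONP helper. loadScript is injected so the URL and
// callback bookkeeping can be exercised without a browser.
function createJsonpClient(proxyBaseUrl, globalObj, loadScript) {
  var counter = 0;

  return function get(path, onResponse) {
    // A unique name per request keeps overlapping responses from mixing up.
    var callbackName = '_jsonp_cb_' + (counter++);

    globalObj[callbackName] = function (status, data) {
      // Remove the global before handing off the response.
      delete globalObj[callbackName];
      onResponse(status, data);
    };

    var separator = path.indexOf('?') === -1 ? '?' : '&';
    var url = proxyBaseUrl + path + separator +
        'callback=' + encodeURIComponent(callbackName);
    loadScript(url);
    return url;
  };
}

// In a browser, loadScript could simply append a script element.
function loadScriptViaDom(url) {
  var script = document.createElement('script');
  script.src = url;
  document.getElementsByTagName('head')[0].appendChild(script);
}
```

Under these assumptions the proxy would answer a request for /users/31 with a script body along the lines of _jsonp_cb_0(200, {...}), passing the status as an argument so that error responses reach the caller too.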

Another nice property of the JSONP approach is that anyone can write a JS client and run a JSONP proxy server for any REST API, then share their JS client with anyone who trusts them.

Two weaknesses of the JSONP approach to be aware of: 1) it adds another failure point (the proxy server), and 2) detecting loading errors is harder with scripts than with XHRs.

Transport for Non-GET Requests

For HTTP methods other than GET (e.g. HEAD, POST, PUT, DELETE), script transport would require disguising the request as a GET, which has different semantics. Then there’s also the fact that a large request might not fit into a single URL. There must be a better alternative.

HTML forms can do POSTs, but that still leaves out the other HTTP methods.

What about the approach we proposed earlier—delegating XHRs to helper iframes? The JS client can construct a hidden iframe each time it needs to make a non-GET request. Iframe construction is quite fast (7-25 ms for most browsers on today’s machines) if the iframe page is small and cached. The request specification (i.e. verb, URL, and possibly parameters and content) can be passed to the helper iframe in its URL fragment.[1] The helper iframe page deserializes the request specification from its URL fragment immediately upon loading and issues the request. Simple enough. Now to relay the server’s response to the parent window. The simplest and most efficient way is HTML 5’s window.postMessage. The latest versions of some browsers support it already, but most people’s browsers don’t yet. (IE 6 & 7 don’t, but IE 8 will.) Unfortunately, the best two alternatives[2] to window.postMessage both require additional cooperation from sites using the JS client.
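As a rough sketch of the fragment hand-off, the request specification could travel as URI-encoded JSON. The wire format and the names buildHelperUrl, parseRequestSpec, and issueRequest are my assumptions for illustration, and the code presumes a JSON object is available (natively or from a JSON library).

```javascript
// Parent window side: build the helper iframe's URL.
// Hypothetical format: the whole spec as URI-encoded JSON in the fragment.
function buildHelperUrl(helperPageUrl, requestSpec) {
  return helperPageUrl + '#' + encodeURIComponent(JSON.stringify(requestSpec));
}

// Helper iframe side: recover the spec; pass in window.location.hash.
function parseRequestSpec(hash) {
  var encoded = hash.charAt(0) === '#' ? hash.slice(1) : hash;
  return JSON.parse(decodeURIComponent(encoded));
}

// Helper iframe side: issue the XHR described by the spec.
// Same-origin here because the helper page lives on the API's domain.
function issueRequest(spec, onDone) {
  var xhr = new XMLHttpRequest();
  xhr.open(spec.verb, spec.url, true);
  xhr.onreadystatechange = function () {
    if (xhr.readyState === 4) {
      onDone(xhr.status, xhr.responseText);
    }
  };
  xhr.send(spec.content || null);
}
```

Encoding the whole spec keeps characters such as "#" and "&" inside the content from being mistaken for URL syntax.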

Approach 1. Sites using the JS client must host a small HTML file that we’ll call caller.html. They should inform your JS client of its URL as an initialization step. When the XHR response arrives, the XHR iframe creates a child iframe pointing to caller.html and including a URL fragment containing the response status, content, and a callback name from the original window. caller.html simply parses its URL fragment and passes the response information to its grandparent window via a direct function call that looks something like this: parent.parent[callbackName](status, responseText).
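A sketch of what the two ends of Approach 1 might look like follows. The fragment layout (callback name, status, and response text, each URI-encoded and joined with "&") and the function names are assumptions for illustration, not the layout kaChing's xhr.html and caller.html actually use.

```javascript
// XHR iframe side: build the fragment for the caller.html child iframe.
function encodeResponseFragment(callbackName, status, responseText) {
  return encodeURIComponent(callbackName) + '&' +
      encodeURIComponent(String(status)) + '&' +
      encodeURIComponent(responseText);
}

// caller.html side: parse the fragment and call into the grandparent.
// In the page itself this would run as:
//   relayResponse(window.location.hash, parent.parent);
function relayResponse(hash, grandparent) {
  var parts = hash.replace(/^#/, '').split('&');
  var callbackName = decodeURIComponent(parts[0]);
  var status = parseInt(decodeURIComponent(parts[1]), 10);
  var responseText = decodeURIComponent(parts[2]);
  grandparent[callbackName](status, responseText);
}
```

The direct call is permitted because caller.html is served from the same domain as the original window, even though the XHR iframe sitting between them is not.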

Approach 2. Sites using the JS client must inform your JS client of the URL of any valid resource on their domain as an initialization step.[3] That’s right, it can be an image, a stylesheet, an HTML page, or anything else—anything they don’t mind you requesting repeatedly. Its content is irrelevant. It just has to exist. We’ll call it cleardot.gif. When the XHR response arrives, the XHR iframe creates a child iframe pointing to cleardot.gif and including a URL fragment containing the response status and content. The JS client will have been polling for the existence of this grandchild iframe and can decode the response from its location’s URL fragment.
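The polling half of Approach 2 could be sketched like this. The name pollForResponse, the injected readFragment function, and the optional schedule parameter are all assumptions of mine, added so the loop can be shown without tying it to a particular frame layout.

```javascript
// Polls until readFragment returns a non-empty string.
// readFragment may throw while the grandchild iframe is missing or
// still on another domain; that just means "not ready yet".
function pollForResponse(readFragment, onResponse, intervalMs, schedule) {
  schedule = schedule || setTimeout;

  (function check() {
    var fragment = null;
    try {
      fragment = readFragment();
    } catch (e) {
      // Not ready; fall through and try again later.
    }
    if (fragment) {
      onResponse(fragment);
    } else {
      schedule(check, intervalMs);
    }
  })();
}

// Possible browser usage, assuming xhrFrame is the helper iframe element:
//   pollForResponse(function () {
//     return xhrFrame.contentWindow.frames[0].location.hash;
//   }, handleResponse, 50);
```

A real client would probably also want to give up after some timeout and remove the helper iframes once the response has been read.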

For kaChing’s JS client, I chose Approach 1 because it avoids polling and feels less hacky. Asking sites to allow our JS client to reuse an existing resource on their site seems more fragile because the resource’s original purpose may change or disappear someday. I also built in future compatibility with window.postMessage. (For our implementation, see the _send method in kaChing’s client.js. The helper iframe pages are xhr.html and caller.html.)

Regardless of which approach you choose, it’s important for optimal performance that the helper files (both the ones that you are hosting and the ones that your clients are hosting) be long-term cacheable. Also consider having your JS client preload them to warm the browser cache.

The Programming Interface

Now that we’ve decided on transport mechanisms, let’s turn our attention to the JavaScript programming interface. Since our transport mechanisms are all asynchronous, callers who care about the results of their requests will need to provide callbacks.

A first stab might be to have a JS method for each (verb, resource) pair in the REST API:

kaching.getUser(31, function(user) {
  // Use user.
});

That approach is certainly clear and simple, and is probably the best pattern for requests with side effects, like POST, PUT, and DELETE.

An improvement for GET requests would be what I call the shopping list pattern, which allows callers to easily request multiple resources together with a single callback:

kaching.get(
  kaching.user(31),
  kaching.portfolio(31),
  kaching.watchlist(31),

  function(user, portfolio, watchlist) {
    // Use user, portfolio, and watchlist.
  });

In addition to the added convenience, this pattern reduces latency by allowing your JS client to request the desired resources in parallel. The two tricky parts in the implementation of this pattern are what I call the joining callback, which collects all of the expected responses before firing, and the use of Function.prototype.apply to invoke the caller’s callback. For a working example of this pattern, see kaching.fetch in kaChing’s client.js.
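To illustrate the two tricky parts, here is a small sketch of a joining callback combined with Function.prototype.apply. The names fetchAll and makeGet and the injected fetchOne(resource, done) function are hypothetical stand-ins, not the code in kaChing's client.js.

```javascript
// Requests every resource in parallel and fires callback once, with the
// results in the same order as the resources were listed.
function fetchAll(fetchOne, resources, callback) {
  var results = new Array(resources.length);
  var remaining = resources.length;

  if (remaining === 0) {
    callback.apply(null, results);
    return;
  }

  // The joining callback: each collector fills its own slot, and the
  // last one to arrive invokes the caller's callback.
  function makeCollector(index) {
    return function (result) {
      results[index] = result;
      if (--remaining === 0) {
        callback.apply(null, results);
      }
    };
  }

  for (var i = 0; i < resources.length; i++) {
    fetchOne(resources[i], makeCollector(i));
  }
}

// Wraps fetchAll in the shopping-list signature:
//   get(resourceA, resourceB, ..., callback)
function makeGet(fetchOne) {
  return function () {
    var args = Array.prototype.slice.call(arguments);
    var callback = args.pop();
    fetchAll(fetchOne, args, callback);
  };
}
```

Using apply is what lets the caller declare function(user, portfolio, watchlist) rather than unpacking a results array by hand.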

Other best practices for JavaScript libraries to consider:

  • Avoid polluting the global namespace. Define only a single global name (e.g. kaching).
  • Make the loading of your library idempotent, in case it gets included in a page multiple times.
var kaching = kaching || ...;
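One way the idempotent-loading pattern might be fleshed out is shown below. The function body and its members (registerCallback, pendingCount) are purely hypothetical stand-ins for whatever the library really initializes.

```javascript
// If the script is included twice, the second inclusion finds the
// existing object and skips initialization, so private state survives.
var kaching = kaching || (function () {
  var pendingCallbacks = {};  // hypothetical private state
  var nextId = 0;

  return {
    registerCallback: function (fn) {
      var name = 'cb' + (nextId++);
      pendingCallbacks[name] = fn;
      return name;
    },
    pendingCount: function () {
      var count = 0;
      for (var key in pendingCallbacks) {
        if (pendingCallbacks.hasOwnProperty(key)) {
          count++;
        }
      }
      return count;
    }
  };
})();
```

Because everything else hangs off this one object, the pattern also satisfies the single-global-name guideline.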

Authentication

Authentication issues are beyond the scope of this post. If you’re looking for more information on this topic, you might consider starting with the Google Data APIs Authentication Overview.

Your Prize

For making it this far, you’ve earned an invitation to:

The API Garage Event

1:00pm Sat Feb 21 at kaChing HQ

Come eat, drink, celebrate, meet the kaChing team, and hack on our API.

Oh, and we’re hiring!

Footnotes

[1] We brush up against a limit on request size here—namely, the browser’s upper bound on URL length—but it’s not insurmountable. If you anticipate large requests, you can create additional helper iframes that each accept a portion of the request data in their URL fragments along with a sequence number and then pass them to the XHR iframe via a direct function call that looks something like this: parent.frames[xhrFrameName].acceptRequestData(data, seqNumber).
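A sketch of the chunking idea in this footnote follows. It assumes the XHR iframe learns the total number of chunks some other way (for example, from its own URL fragment); splitIntoChunks and createRequestAssembler are hypothetical names, while acceptRequestData mirrors the call shown above.

```javascript
// Parent side: cut the (already encoded) request data into pieces that
// each fit in a URL fragment.
function splitIntoChunks(data, maxChunkLength) {
  var chunks = [];
  for (var i = 0; i < data.length; i += maxChunkLength) {
    chunks.push(data.slice(i, i + maxChunkLength));
  }
  return chunks;
}

// XHR iframe side: returns an acceptRequestData(data, seqNumber) function
// that reassembles chunks in order, whatever order they arrive in.
function createRequestAssembler(totalChunks, onComplete) {
  var received = new Array(totalChunks);
  var remaining = totalChunks;

  return function acceptRequestData(data, seqNumber) {
    if (received[seqNumber] !== undefined) {
      return;  // ignore a duplicate delivery
    }
    received[seqNumber] = data;
    if (--remaining === 0) {
      onComplete(received.join(''));
    }
  };
}
```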

[2] Here I thought I’d catalog a few other ways of relaying the server’s response to the parent window that I considered or tried and why I nixed them.

Attempt: When the response arrives, set the parent window’s location fragment.
Outcome: Doing this adds an annoying new browser history entry. Plus, the new location fragment is visible in the browser’s location bar.

Attempt: Before issuing the request, create a helper iframe with no src attribute in the parent window; then when the response arrives, set its location to that of the parent window plus a fragment.
Outcome: In IE 7 this works, but it results in the browser loading the containing page in the helper iframe, a potentially large waste of resources. Permissions errors in Firefox and Safari.

Attempt: Before issuing the request, create a helper iframe with src="about:blank" in the parent window; then when the response arrives, add a URL fragment.
Outcome: Permissions errors again while adding the fragment in Firefox and Safari. Permissions error when the parent tried to read the fragment in IE 7.

Attempt: Before issuing the request, create a helper iframe with src="/favicon.ico" in the parent window; then when the response arrives, add a URL fragment.
Outcome: This works in IE for sites that have favicons, but it’s not a very robust solution. Plus, if the site’s favicon isn’t very cacheable, your JS client could end up re-downloading it with each API call.

Attempt: Instead of adding the XHR iframe directly to the page, create an intermediate iframe with no src attribute and document.write its content (just the XHR iframe); when the response arrives, the XHR iframe sets the URL fragment of its parent, the intermediate iframe.
Outcome: I thought this might work because the intermediate iframe inherits the document.domain of its parent, so the parent should be able to poll its location. The problem I encountered was that the intermediate iframe’s window had no location property to set (presumably since that iframe had no src attribute).

[3] A variation on Approach 2 is not to ask sites to inform your JS client of the URL of a cacheable resource, but instead have your JS client probe for a favicon and/or walk the DOM of its host page to identify a suitable image or stylesheet. While this could work in many cases, I’d advocate being up front about the requirements of your JS client and avoiding making unnecessary assumptions.

Posted by Dion Almaer at 9:01 am

8 Comments »

See also http://softwareas.com/cross-domain-communication-with-iframes – there are some basic demos of both the approaches described here (“URL polling” and “Marathon”).

Comment by Michael Mahemoff — February 16, 2009

JSONP doesn’t necessarily require a proxy, that is completely up to whoever implements it.

Comment by vikingstad — February 16, 2009

Excellent article.

Talking to other domains is a pain point I’ve been fortunate enough to avoid so far but I’ll refer back to this when my luck runs out. Thanks!

Comment by EdSpencer — February 16, 2009

Great article!
Same could apply to developing bookmarklets, like (warning, blatant self promotion coming up) my own http://www.mapanui.com , which could use some improvements, like what you described here.

Comment by halans — February 16, 2009

this is an excerpt on the topic of “CRUD”-style RESTful interfaces; originally posted to http://jjinux.blogspot.com/2009/02/rest-restful-shopping-carts.html
where you can also read a nice discussion on the subject.
.
i’ve said it before, and i’ll say it again, 10³ times if necessary: if RESTful means you go from GET and POST to GET, POST, PUT, DELETE—then it’s a bad idea.
.
RESTful thinking is a great tool to clear up in your mind essential principles of the way the web works. GET, POST, PUT, DELETE are not so hot. GET and POST are great, they are what is actually implemented in browsers, and they’re sufficient. to put CRUD (Create, read, update and delete) into the HTTP method is bad.
.
any semantics that go beyond the ‘underlying request modalities’ of HTTP communication do NOT belong into the HTTP method, they belong into the URL. One very obvious piece of evidence to corroborate this is the fact that in a RESTful app, you cannot specify both your object and your intent in one URL.
.
that there are even more HTTP ‘verbs’ (HEAD, TRACE, OPTIONS, CONNECT) that are not really talked about within the RESThype makes me suspicious. i want to argue that just as there are HTTP methods that are seldom used (at least not by clients, only under the hood), so there are HTTP methods that got designed once, but are as superfluous as e.g. the POP and the FTP protocols (they got invented when HTTP was less prevalent than today: one protocol for a single purpose. they can’t do anything that HTTP can’t do).

DELETE and PUT are just as obsolete as the [font/] tag—they looked like a good idea at the time, and when they got replaced by something (much) better, no-one looked back.

Comment by loveencounterflow — February 17, 2009

more to the point: “For HTTP methods other than GET (e.g. HEAD, POST, PUT, DELETE), script transport would require disguising the request as a GET, which has different semantics. Then there’s also the fact that a large request might not fit into a single URL. There must be a better alternative.” YES, there is a better alternative. do not disguise a DELETE request as a GET request. just cut it out. you want to DELETE http://example.com/product/42 ? fine. go to http://example.com/delete/product/42 and you’ll be fine.
.
while we’re at it: RESTfolks always try to sell DELETE and PUT—what if i want to BUY product/42? will i have to issue an HTTP BUY request to http://example.com/product/42 ? what if it’s time to checkout? do i have to issue an HTTP CHECKOUT to http://example.com/shop ? what if i want to pay? issue HTTP PAY http://example.com/cashregister/$7.00 ??
.
this is hilarious.

Comment by loveencounterflow — February 17, 2009

Another option for the cross-domain XHR, if you don’t want to hack around with a bunch of timers and iframe creation (very memory intensive and time consuming for clients) is to use flXHR [ http://flxhr.flensed.com/ ] which is a javascript+flash re-implementation of the identical native XHR object’s API, but since it uses an invisible flash component, it can make cross-domain calls if the server opt-in policy allows it. Since the API is identical, flXHR drops in as a simple replacement without *any* further coding changes, even with frameworks and existing code that know how to speak to an XHR object.

Comment by shadedecho — February 17, 2009

@jared: understood, the flash “barrier” is not something so trivial as to not think about or take for granted. Not to hijack this thread too much, but to address your flash concern, I did have a couple of thoughts (as I’m very opinionated and passionate about this particular topic).
.
With *so* much flash out there, and with Adobe reporting numbers in excess of 99% global penetration, and with those same numbers even showing *huge* (>50% over 3 months) rates of adoption of their latest 10.x plugin version, I think we’re getting to the point where flash is arguably a ubiquitous technology. It is certainly more ubiquitous than Java, Silverlight, or even any one single browser.
.
And, in the same way that you need to provide a “graceful degradation” (or, depending on your perspective, “graceful enhancement”) for those with or without javascript support (etc), I think the same can be true of flXHR and flash based solutions.
.
For instance, you can use the underlying javascript toolset (CheckPlayer, built on top of SWFObject) to easily upfront detect if flash is enabled for the end-user, and of the right version, and then fork your logic at that point to handle it however you see fit.
.
One way (what I think is the best user experience) is to use Adobe’s express-install (which flXHR fully exposes) to automatically and in-line, in the same browser page/instance, provide users a way to update flash as necessary. With well over 99% of people on the internet having *some* flash installed, that’s an incredibly high number of people that will be able to simply upgrade their plugin and use your site as you intended with minimal user-experience impact.
.
On the other hand, you could have that same flash (version) detection logic simply decide that if they don’t have the flash installed, then dynamically you load a simple “shim” script like the one for doing the dynamic-script-tags, or another one which does dynamic iframes, to take the place of flXHR.
.
If you choose/design that “shim” in the right way (that is, with a compatible API to normal native XHR, like flXHR did), or at least some wrapper thereof, it’ll be a graceful fallback for non-flash users and the rest of your code logic is happily unaware.
.
The reason I advocate choosing flash (flXHR) as the primary (rather than fallback) method is that there are important benefits to be gained in terms of security (the full server policy model leveraging) and efficiency (one small flash instance as opposed to several memory-hogging invisible iframe instances). Also, the communication part of flXHR uses no timers (or ugly JSONP callbacks), so in general it should be quicker and more direct to get data to and from the server. In performance tests, it performed admirably and competitively with several other popular cross-domain solutions.
.
My point is, a few years ago, the argument against flash as mission critical app technology was compelling. But nowadays, I think it’s much less so, and really it’s only a few stalwart holdouts who refuse to embrace the parts of flash that can genuinely be helpful in UI technology and better UX.

Comment by shadedecho — February 17, 2009
