Monday, June 14th, 2010
JSonduit: Turn the Web into a JSON feed
Chris Winberry recently built a node-htmlparser library that we posted on. Now we know why he built that library. He has released JSonduit.com: Any data, anywhere.
JSonduit is a service that can turn practically anything on the web into a JSON feed that any website may consume. A JSON conduit, if you will.
Feeds are created by specifying one or more source URLs and a custom transform, written in JavaScript, that can manipulate the data before the feed is served.
JSonduit also provides a hosting service for web widgets so that any site can easily display JSonduit feeds on its pages. In fact, those recent/popular lists you see below are actual widgets served by the JSonduit service; all done in a couple of lines of JavaScript (go ahead, view the page source).
To see what it is like to query the world in this manner, check out the most popular feed, a view on Hacker News:

var result = [];

var items = getElements(
    {
        class: "title"
    },
    data[0]
);

items.forEach(function(item){
    var links = getElementsByTagName('a', item);
    links.forEach(function(link) {
        // Remove the "more" link.
        if(link['attribs']['rel'] == 'nofollow') return;

        result.push({'title': link['children'][0]['data'], 'link': link['attribs']['href']});
    });
});

This returns something like:

{
    "error": null,
    "result": [
        {
            "title": "U.S. Discovers Est. $1 Trillion of Minerals in Afghanistan",
            "link": "http://www.nytimes.com/2010/06/14/world/asia/14minerals.html"
        },
        // ..

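To try a transform like this outside the service, here is a minimal sketch. It assumes that `data[0]` is a node-htmlparser-style DOM (nodes carrying `type`, `name`, `attribs`, `children` and `data`), and it uses hypothetical stand-ins for `getElements` and `getElementsByTagName`; the real service supplies its own helpers, which may behave differently. The `walk` and `transform` functions and the `sampleDom` data are made up for illustration.

```javascript
// Hypothetical helper: visit every node in a DOM tree (array or single node).
function walk(nodes, visit) {
  [].concat(nodes || []).forEach(function (node) {
    visit(node);
    walk(node.children, visit);
  });
}

// Stand-in for the service's getElements: match tags by attribute values.
function getElements(options, dom) {
  var found = [];
  walk(dom, function (node) {
    var attribs = node.attribs || {};
    var matches = Object.keys(options).every(function (key) {
      return attribs[key] === options[key];
    });
    if (node.type === 'tag' && matches) found.push(node);
  });
  return found;
}

// Stand-in for the service's getElementsByTagName.
function getElementsByTagName(name, dom) {
  var found = [];
  walk(dom, function (node) {
    if (node.type === 'tag' && node.name === name) found.push(node);
  });
  return found;
}

// The Hacker News transform from above, wrapped in a function for testing.
function transform(data) {
  var result = [];
  getElements({ 'class': 'title' }, data[0]).forEach(function (item) {
    getElementsByTagName('a', item).forEach(function (link) {
      if (link.attribs.rel == 'nofollow') return; // skip the "more" link
      result.push({ title: link.children[0].data, link: link.attribs.href });
    });
  });
  return result;
}

// Made-up sample input: one story cell and one "More" cell.
var sampleDom = [
  { type: 'tag', name: 'td', attribs: { 'class': 'title' }, children: [
    { type: 'tag', name: 'a', attribs: { href: 'http://example.com/story' },
      children: [{ type: 'text', data: 'Example story' }] }
  ] },
  { type: 'tag', name: 'td', attribs: { 'class': 'title' }, children: [
    { type: 'tag', name: 'a', attribs: { href: '/more', rel: 'nofollow' },
      children: [{ type: 'text', data: 'More' }] }
  ] }
];

// Wrap the result in the same envelope the feed output uses.
var feed = { error: null, result: transform([sampleDom]) };
console.log(JSON.stringify(feed, null, 2));
```

Only the story link ends up in `feed.result`; the "More" link is dropped by the `rel` check.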
If this looks a touch familiar though... remember that the awesome YQL gives you access to the Web from a simple query language too.
Sweeeeeeet. I really like the look of this. Maybe SQL-style (a la YQL) is better for querying data, but writing custom transforms in JavaScript feels a lot more flexible, and much, much more fun :)
Doubt this will scale. Also, there must be a huge amount of IP law being violated here. I would not use this for any real business.
sounds a lot like what YQL does.
@leptons – It has already been exercised to 1000s req/sec per instance and the fetch queue is keyed so that 10,000 requests for an uncached feed still only result in one source URL request.
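The keyed fetch queue described in that reply can be pictured as request coalescing: concurrent requests for the same key share one in-flight fetch. The following is a minimal sketch of that general idea, not JSonduit's actual code; `coalescedFetch`, `inFlight` and `fakeFetch` are hypothetical names, and the source request is simulated with a timer.

```javascript
// One pending promise per key; later callers reuse it instead of refetching.
var inFlight = new Map();

function coalescedFetch(key, fetchSource) {
  if (inFlight.has(key)) return inFlight.get(key);
  var pending = Promise.resolve()
    .then(function () { return fetchSource(key); })
    // Forget the entry once settled so a later request can fetch again.
    .finally(function () { inFlight.delete(key); });
  inFlight.set(key, pending);
  return pending;
}

// Simulated source request that counts how often it is really called.
var sourceRequests = 0;
function fakeFetch(url) {
  sourceRequests += 1;
  return new Promise(function (resolve) {
    setTimeout(function () { resolve('body of ' + url); }, 10);
  });
}

// 10,000 concurrent requests for the same uncached key.
var callers = [];
for (var i = 0; i < 10000; i++) {
  callers.push(coalescedFetch('http://example.com/feed-source', fakeFetch));
}

var done = Promise.all(callers).then(function (bodies) {
  console.log(bodies.length + ' callers, ' + sourceRequests + ' source request(s)');
  return sourceRequests;
});
```

A real service would presumably also cache the fetched body for some time; this sketch only covers the in-flight sharing.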