Friday, May 22nd, 2009

JSPlacemaker – Geo data extraction in pure JavaScript

Category: Examples, JavaScript

Content extraction is still a hot topic on the web. We have lots of great text content but not much clue as to what those texts are about. To make that more obvious, we use term extraction for tagging and geo location extraction to give the text some spatial reference.

A fairly new web service that does this for us is Yahoo’s Placemaker. What it does is analyze a text (or the document defined by an HTML or feed URL) and give you back all the geographical locations that are mentioned in it. Pretty awesome, but the problem is that the API only allows for POST values and has either XML or RSS output. This means you can’t do it in simple XHR because of the cross-domain problem and you can’t use generated script nodes as there is no JSON output. You’d have to use a server-side proxy service. This is pretty easy with PHP and cURL as explained in this blog post but can be annoying, too.

That’s why I wrote JS-Placemaker, a small JavaScript wrapper that gives you access to the Placemaker service from JS. I am not hosting a proxy for you; all you need to do is get your own application ID for Placemaker and use the JavaScript, which you can host yourself if you want to.

Analyzing a text using JS-Placemaker is as easy as this:

  Placemaker.config.appID = 'YOUR-APP-ID';
  Placemaker.getPlaces('Hi I am Chris, I live in London. Originally I am from Germany',
    function(o){
      console.log(o);
    },
    'en-uk');

The console output is an object or an array of places the service returned from the text:

[Screenshot: JS-Placemaker - geolocate texts in JavaScript]

The first parameter is the text you want to analyze (this could be a pointer to the innerHTML of a DOM element, for example), the second is the callback function and the third the locale of the text – the demo page shows that Placemaker groks several languages.
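Since the callback can receive either a single object or an array of places, it can help to normalise the result before looping over it. The following is a minimal sketch, not part of JS-Placemaker itself: toPlaceList is a hypothetical helper name, and it makes no assumption about which properties each place carries.

```javascript
// Hypothetical helper (not part of JS-Placemaker): always hand back an array,
// whether the callback received one place, several places, or nothing.
function toPlaceList(result) {
  if (result === null || result === undefined) {
    return [];
  }
  return Array.isArray(result) ? result : [result];
}

// Possible usage inside the callback (browser, with JS-Placemaker loaded):
// Placemaker.getPlaces(document.getElementById('content').innerHTML,
//   function(o){
//     var places = toPlaceList(o);
//     for (var i = 0; i < places.length; i++) {
//       console.log(places[i]);
//     }
//   },
//   'en-uk');
```

The element ID 'content' in the usage comment is only a placeholder for whatever element holds your text.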

Under the hood, JS-Placemaker uses YQL to work around the issue of proxying the request. YQL allows you to define your own data tables and even allows for doing JavaScript conversion of data on the server-side before sending it on. YQL has JSON output, so I was able to use that to access Placemaker in JavaScript.

The geo.placemaker Open Table was built by Balaji Narayanan and Tom Hughes-Croucher and can be used in YQL directly. Say you want to get the geo location data from the Slashdot homepage in JavaScript. You can do this with the following statement in YQL:

  select * from geo.placemaker where documentURL="http://slashdot.org" and documentType="text/html" and appid="...the app id..."

You can choose JSON as the output and you get the data as a nice object. Define a callback method and you can use it directly in a script node.
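To make the script node idea concrete, here is a rough sketch of how such a call could be put together. The endpoint URL, the env value for community Open Tables and the function names buildYqlUrl and loadYql are my assumptions for illustration, not taken from JS-Placemaker, so check them against the YQL documentation before relying on them.

```javascript
// Assumed public YQL endpoint and community-table environment; verify both before use.
var YQL_ENDPOINT = 'http://query.yahooapis.com/v1/public/yql';
var YQL_ENV = 'store://datatables.org/alltableswithkeys';

// Build a URL that asks YQL for JSON wrapped in a call to callbackName.
// callbackName has to be the name of a globally reachable function.
function buildYqlUrl(statement, callbackName) {
  return YQL_ENDPOINT +
    '?q=' + encodeURIComponent(statement) +
    '&format=json' +
    '&env=' + encodeURIComponent(YQL_ENV) +
    '&callback=' + encodeURIComponent(callbackName);
}

// Browser only: append a generated script node so the response
// invokes the named callback with the result object.
function loadYql(statement, callbackName) {
  var script = document.createElement('script');
  script.src = buildYqlUrl(statement, callbackName);
  document.getElementsByTagName('head')[0].appendChild(script);
}

// Hypothetical usage:
// function showPlaces(data) { console.log(data); }
// loadYql('select * from geo.placemaker where documentURL="http://slashdot.org" ' +
//   'and documentType="text/html" and appid="YOUR-APP-ID"', 'showPlaces');
```

Keeping the URL building separate from the DOM work means the first function can be checked on its own, without a browser.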

Have a play with the YQL console using the Open Table, but you had better get your own AppID before this one exceeds its daily limit.

2 Comments »

holy cow, i was thinking of inventing this kind of thing this summer. now i don’t have to!

Comment by jwlrs — May 22, 2009

Apologies for that. Didn’t anticipate the link being tidied up that way. Anyway, bookmarklet available if you follow the “post” link.

Comment by tschaub — May 28, 2009
