Tuesday, June 1st, 2010

Two JS solutions to run SQL-like statements on Arrays and Objects

Category: JavaScript, Library

<>p>Thomas Frank released SQLike a script that allows you to filter arrays and objects in JavaScript with a SQL-like syntax.

Query:

javascript
< view plain text >
  1. SQLike.q(
  2.     {
  3.         Select: ['firstName','|count|','firstName','|sum|','salary','|min|','salary','|max|','salary','|avg|','salary'],
  4.         From: dataArray,
  5.         GroupBy: ['firstName'],
  6.         Having: function(){return this.count_firstName>1},
  7.         OrderBy: ['sum_salary','|desc|']
  8.    }
  9. )

Results:

javascript
< view plain text >
  1. [{"firstName":"Stuart", "count_firstName":3, "sum_salary":180000, "min_salary":41000, "max_salary":90000, "avg_salary":60000}, {"firstName":"Vicki", "count_firstName":2, "sum_salary":163000, "min_salary":79000, "max_salary":84000, "avg_salary":81500}]

All of this happens in-memory so this is not a contender for YQL which offers the same functionality as a web service.

Another, very similar solution is LINQ to JavaScript by Chris Pietschmann:

javascript
< view plain text >
  1. var myList = [
  2.   {FirstName:"Chris",LastName:"Pearson"},
  3.   {FirstName:"Kate",LastName:"Johnson"},
  4.   {FirstName:"Josh",LastName:"Sutherland"},
  5.   {FirstName:"John",LastName:"Ronald"},
  6.   {FirstName:"Steve",LastName:"Pinkerton"}
  7. ];
  8. var exampleArray = JSLINQ(myList)
  9.                    .Where(function(item){ return item.FirstName == "Chris"; })
  10.                    .OrderBy(function(item) { return item.FirstName; })
  11.                    .Select(function(item){ return item.FirstName; });

Personally I like the LINQ syntax better, what do you think?

Related Content:

Posted by Chris Heilmann at 4:13 am
23 Comments

+++--
3 rating from 3 votes

23 Comments »

Comments feed TrackBack URI

Personally, I’ve never liked SQL, and I find it very hard to understand why people would spend a lot of time porting SQL to other paradigms.

I can certainly admire the effort, but ultimately I prefer to use something that is more native to the platform I am working on… let’s say the Array methods of ECMAScript rev. 5.

- filter
- forEach
- map
- some
- reduce
etc

https://developer.mozilla.org/En/Core_JavaScript_1.5_Reference/Objects/Array

Comment by MorganRoderick — June 1, 2010

I’m not sure why you would like this:
.OrderBy(function(item) { return item.FirstName; })
over this:
OrderBy: ['sum_salary','|desc|']

I think I’use SQLike quite soon

Comment by bfred — June 1, 2010

And why exactly is “SQL” needed for in-memory data? When dealing with in-memory data, you choose the correct data structure and everything else falls into place.

Comment by khs4473 — June 1, 2010

Something I build recently and uses localStorage and sessionStorage, with a straightforward SQL kind of interface:
http://code.google.com/p/dom-storage-query-language/

It stores data in structured JSON and uses basic SQL syntax to manipulate it.

Comment by PeteB — June 1, 2010

Shame that YQL is not open source fully, so one cannot run it on own server.

As of LINQ I would like to suggest http://smp.if.uj.edu.pl/~mkol/il2js/ as it fully supports LINQ to run it in the browser.

Comment by movax — June 1, 2010

Thank you for covering the release of SQLike!

I haven’t compared it much to other tools like LINQ or dom-storage-query-language. However that could be interesting: What features do the different engines support (JOINS, GROUP BY, aggregate functions etc)? How fast/slow are they performing the same sort of queries on the same data sets/arrays?

I also fully agree with MorganRoderick that the built in data manipulation possibilites in a programming language are oftimes enough. However I think there might be a place in JavaScript and ActionScript for both XPath/JSONPath-like query engines used on deep structures and SQL-like query engines used on table like structures.

When I wrote SQLike (an 2-3 day exercise) I thought it was great fun to write the more hairy parts like different types of JOINs. I also had fun getting the code to run in identical format in both JS, AS2 and AS3.

Comment by ThomasFrank — June 1, 2010

Suggestions:

- remove pipes, use reserved words like count,avg, etc.
- single item lists can be replaced by a string like groupby:’firstname’
- avoid functions in Having.
- better order by: ['name desc','date asc'] split by space and apply

I like SQLike

Comment by Ajaxerex — June 1, 2010

Like this:

SQLike.q({
  Select : ‘firstName as fn,count(firstName) as n,sum(salary) as s’,
  From : ‘data’,
  GroupBy: ‘fn’,
  Having : ‘n>1′,
  OrderBy: ‘s desc’
})

Or even passing the whole sql as a string…

Comment by Ajaxerex — June 1, 2010

@Ajaxerex:
I have to agree with you ;-) to a certain degree.
Like using strings instead of arrays for specifying fields and stop using pipes – the hell with it: reserve a few words :-D
I’ll certainly look into those!

However one thing I think is sweet is the fact is that the where argument is a function, since this gives you the whole of the JavaScript language to express your where clause….

The way I have implemented “having” is actually as a “recursion” of “where” on a rough result set. (Does this explanation make any sense?) So in the current implementation/logic: if the argument to where is a function so is the argument to where…

Comment by ThomasFrank — June 1, 2010

sorry slipped on a word at the end: if the argument to “where” is a function so is the argument to “having”

Comment by ThomasFrank — June 1, 2010

I think the thing people miss when they set out to build something “like LINQ” in environments other than .NET is that they don’t realize that LINQ is a native language construct within the .NET environment. It doesn’t depend on interpreting strings as various parts of a subquery. LINQ provides three very real benefits that any also-ran can’t achieve without an actual language construct:

1. Proper syntax highlighting of the grammar within a LINQ query.
2. Much greater flexibility in debugging.
3. The ability to properly structure a complex query (for instance, with subqueries) in its natural order without ambiguity about what will come out.

I should note that I’m not a .NET developer, and I have no interest in pushing .NET as a development environment. Instead, I think it would be good to push for better data querying in the languages I use most (currently Javascript and PHP).

Comment by eyelidlessness — June 1, 2010

@eyelidlessness:
I agree – I’m not a .NET protaganist, however I can be a .NET developer it the pay is good ;-), and I’ll like to see better data querying in ECMA-script (JavaScript and ActionScript). I think we will get there eventually and then the need for solutions like SQLike will disappear. But until then… And having said this I didn’t know about LINQ when I wrote SQLike ;-D

@ajaxerex: One thought that struck me after reading your comments once more: Why not implement a new part of SQLike that is a string parser converting a pure string into the input object that I currently use. This wouldn’t even break the current API – since I could detect string type input. However it would limit where and having clauses to some extent… The hardest part about this is actually to make it continue to work in AS2 – since not regexp parsing could be used. On the other hand: I don’t know how important AS2 support is – I just thought it was fun that it could be accomplished for those of us with legacy code bases :-D

Comment by ThomasFrank — June 1, 2010

I can see writing these libraries as an ‘in memory adapter’ for a sql library, but using these as stand alone libraries to query an array seems… crazy. Not saying they’re not fun to implement, but I would follow the first commenter’s advice and use the ECMA5 functions like map, some, and filter to accomplish this in a native (fast) way. I use underscore.js to provide fallbacks for older browsers, or you could extend the Array prototype directly if you wanted to http://documentcloud.github.com/underscore/

The LINQ library was pretty cleanly but it’d be interesting to see a version that acts more like LINQ in that it builds up a query relation object that doesn’t actually get executed till you start to access the data.

Sorry Thomas, but I felt like I was reading fortran, or compressed javascript looking at sqlike. ???:

var a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,obj2,d2,toSort,sortParas,ohaving,ot,coMax,co,labels, …

Comment by ratbeard — June 1, 2010

@ratbeard:

I like powerful native array functions as well. However they don’t exactly perform joins yet do they?

Sorry about all the variable initialization at the beginning ;-)
One thing that becomes painfully clear when writing something for AS3 as well as JS is that ECMA-script is that the local variable scope is function wide. I for one tend/tended to declare “var i” for each loop, thus declaring it several times within the same variable scope…

I wanted to write all code inside a function, since this was an easy way to run identical code in JS and AS… I just moved all var declarations to the top to get rid of all AS3 compiler warnings “variable already declared”.

Also I wasted a few bytes declaring vars for all single letters – since I know myself too well by now: I tend to use short var names from time to time…

After this I think the code clears up a bit, divided into blocks corresponding to a SQL key word, with some comments.

But yes: I will probably have to refactor the code a bit with more descriptive variable names at some point… I hope this will make it clearer to read.

Comment by ThomasFrank — June 1, 2010

There are a lot of unimpressed comments in here, and just wanted to throw my marbles in the pit: I think it’s fantastic. We’re currently working on a web app which deals with hundreds of objects client side and the size of the objects make hash-tables a little unrealistic. A simple structure to allow selecting any group by any property will speed development time tremendously! This combined with the HTML5 local storage could make for some pretty awesome apps…

Comment by jamiewilson — June 1, 2010

@Thomas
You’re right, there isn’t a native join method. I guess its something I haven’t really needed, though a groupBy function I have found very useful and would be a nice addition to JS. There’s a simple property-based one for underscore below. I’m a little hazy on my SQL, but I think this is a join implementation for 2 collections as well:

_.group_by = function (collection, prop) {
return _.reduce(collection, {}, function (group, item) {
var value = item[prop];
(group[value] = group[value] || []).push(item);
return group;
});
};

_.join = function (collection, prop, other_collection) {
var other_map = _.group_by(other_collection, prop);
return _.reduce(collection, [], function (results, item) {
var value = item[prop];
_.each(other_map[value], function (other_item) {
results.push(_.extend({}, item, other_item));
});
return results;
});
};

// 100% test coverage:
a = [{1:1, 2:2}, {1:11, 2:22}]
b=[{1:1, yeah:'hi'}, {1:1, 5:55}, {1:11, x:5}]

JSON.stringify(_.join(a, 1, b))
[{"1":1,"2":2,"yeah":"hi"},{"1":1,"2":2,"5":55},{"1":11,"2":22,"x":5}]

It seems like a join might be a leaky abstraction for a language like javascript. Since a resultSet object will be mutable, you’d need to make a tradeoff between safety and performance as to whether you do a deep or a shallow copy when constructing each row. I think this is the problem the sealed property and other stuff in ECMA5 is tackling – returning an object thats read-only.

Comment by ratbeard — June 1, 2010

@ratbeard:
Interesting code. Mind if I borrow it ;-)
Yes – when to copy and when to reference… I hope that can be tackled in ECMA5 for less leakyness and also I’m hoping for a native “deep copy”.

Comment by ThomasFrank — June 2, 2010

I have a same solution: http://code.google.com/p/kombai-js/

Comment by mnx2012 — June 2, 2010

And some example code:

//create a new database
var mydata = K.create.database({
fieldName: ["name", "phone"],
primaryKey: “phone”
});
// insert in to database
mydata.Insert(["ohay", "0987654321"]);
mydata.Insert(["oh", "0987654322"]);
mydata.Insert(["ha", "0987654323"]);
mydata.Insert(["hy", "0987654324"]);

mydata.Insert({“phone” : “0987654324″, “name” : “hy”});
mydata.Insert({“phone” : “0987654325″, “name” : “yo”});

//update
mydata.Update({“phone” : “232093430″, where: “name like ‘o%’”});
mydata.Update({“name”: “none”, where: “phone == ’0987654325′”});

//select
mydata.Select({fieldName: “phone”, where: “name like ‘%o%’”, orderBy: “phone”, Limit: 3});

//delete
mydata.Delete({Where: “name == ‘ohay’”});

//drop database:
mydata.Drop();

//create from a collection;
vnjs.data = K.create.database({
fromCollection:[
{id:2,title:"Fontyukle",code:"fontyukle"},
{id:6,title:"Full text search",code:"full text search"},
{id:56,title:"Helloworld extension",code:"helloworld extension"},
{id:57,title:"FFont",code:"ffont"},
{id:58,title:"Sadf",code:"sadf"},
{id:59,title:"Dragdrop demo",code:"dragdrop demo"}
]
});

Comment by mnx2012 — June 2, 2010

@Jamie Wilson:
Thanks for the praise. Good luck with using SQLike. Keep me posted if you encounter problems or/and when you release your project!

@mnx2012:
I think you mean you have a similar solution :-D That’s great.

However SQLike seems to do a few things other SQL-like libraries mostly don’t – like group by, joins and set operators (union etc), so I hope it might find a natural place in some people’s toolboxes.

During the weekend I will start up sqlike.org and collect some of the opionions expressed here and elsewhere as to how to continue development – focusing on query speed and syntax. I’ll let you know when the site is up and running.

Comment by ThomasFrank — June 3, 2010

Some personal experience with these JS-based ORM wrappers:

Was looking for a good ORM for jqGrid (jQuery data grid) in-memory storage implementation and stumbled against same two contenders plus a few more.

Personally found “magic property name parsing” of the SQLike (this.count_firstName > 1) hair-raising. The key to majority of “where” statements is dynamic assembly or string concatenation. Would rather see where statements like (this._count(‘column_name’) > 1) so the function could be pulled from hash and the argument from html query. All-in-all a terrible put off.

Found the “LINQ to JavaScript” tremendously verbose and still limiting in comparison to other “linq-like” libs.

In the end stuck with http://www.hugoware.net/Projects/jLinq (Linq for jQuery)
It’s much more succinct and is more applicable to real-time projects. It has a few follies – some sorting control oddities and it does not work well with deeply nested data objects. I had to write my own crude sorting code for deeply nested data structures. (http://www.accentsolution.com/static/jqgrid_localstore.html)

jqGrid guys saw some good in jLinq and are picking it up and improving it for local storage manipulation for their next version. If all goes well, and they happen to contribute their deep-object-sorting and other code improvements back to jLinq, i’d say there’ll be no Link-like JS lib better than jLinq.

Comment by ddotsenko — June 14, 2010

@ddotsenko:
Why don’t you do
this["count_"+fieldName]>1
Is that dynamic?

Well I’m currently working on a major upgrade to SQLike and it might put you off even more. The new syntax is qoing to be very SQL like – basically just a standard SQL query sent to SQLike as a string :-D

Would that help or hurt in your opinion?

Comment by ThomasFrank — June 22, 2010

Some similar javascript-sql proyect:

http://javascriptsql.sourceforge.net

some demo:.
http://javascriptsql.sourceforge.net/ARCH/JSBD/demo.html

http://javascriptsql.sourceforge.net/ARCH/JSBD/demo.html?sql=+Select+NombreProducto%2C+CantidadPorUnidad%2C+UnidadesEnExistencia+from+productos+where+IdCategoria%3D%3D2+%26%26+NombreProducto.match%28%2Fsalsa%2Fi%29!%3Dnull

(only in spanish for now)

Comment by joaquinbecerra — August 24, 2011

Leave a comment

You must be logged in to post a comment.