Monday, January 11th, 2010

Jison: Build parsers in JavaScript

Category: JavaScript

If you have ever wanted to create your own “language” (or DSL, if you want to play 2008 buzzword bingo), then you may have delved into the worlds of yacc/bison (and their siblings, lex/flex) to get this done in a more declarative manner.

Jison lets you play in this world thanks to Zach Carter:

Jison generates bottom-up parsers in JavaScript. Its API is similar to Bison’s, hence the name. It supports many of Bison’s major features, plus some of its own. If you are new to parser generators such as Bison, and Context-free Grammars in general, a good introduction is found in the Bison manual.

A brief warning before proceeding: the API is ridiculously unstable right now. The goal is to mirror Bison where it makes sense, but we’re not even there yet. Also, optimization has not been a main focus as of yet.

Briefly, Jison takes a JSON encoded grammar specification and outputs a JavaScript file capable of parsing the language described by that grammar specification. You can then use the generated script to parse inputs and accept, reject, or perform actions based on the input.

At the end of the day, you can then declare your language like this:

javascript

{
    "comment": "Parses and executes mathematical expressions.",

    "lex": {
        "rules": [
            ["\\s+",                    "/* skip whitespace */"],
            ["[0-9]+(?:\\.[0-9]+)?\\b", "return 'NUMBER';"],
            ["\\*",                     "return '*';"],
            ["\\/",                     "return '/';"],
            ["-",                       "return '-';"],
            ["\\+",                     "return '+';"],
            ["\\^",                     "return '^';"],
            ["\\(",                     "return '(';"],
            ["\\)",                     "return ')';"],
            ["PI\\b",                   "return 'PI';"],
            ["E\\b",                    "return 'E';"],
            ["$",                       "return 'EOF';"]
        ]
    },

    "operators": [
        ["left", "+", "-"],
        ["left", "*", "/"],
        ["left", "^"],
        ["left", "UMINUS"]
    ],

    "bnf": {
        "S" :[[ "e EOF",   "print($1); return $1;"  ]],

        "e" :[[ "e + e",   "$$ = $1+$3;" ],
              [ "e - e",   "$$ = $1-$3;" ],
              [ "e * e",   "$$ = $1*$3;" ],
              [ "e / e",   "$$ = $1/$3;" ],
              [ "e ^ e",   "$$ = Math.pow($1, $3);" ],
              [ "- e",     "$$ = -$2;", {"prec": "UMINUS"} ],
              [ "( e )",   "$$ = $2;" ],
              [ "NUMBER",  "$$ = Number(yytext);" ],
              [ "E",       "$$ = Math.E;" ],
              [ "PI",      "$$ = Math.PI;" ]]
    }
}
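To get a feel for what the "lex" section describes, here is a rough, hand-written sketch of a tokenizer driven by the same rule table. It is not Jison's generated lexer: the `tokenize` helper and the token objects are made up for illustration, and the sketch simply assumes that rules are tried in order at the current position and that the first one to match wins.

```javascript
// Lexer rules copied from the grammar above, as [pattern, token name].
// A null token name means "skip this text" (whitespace).
var rules = [
    ["\\s+",                    null],
    ["[0-9]+(?:\\.[0-9]+)?\\b", "NUMBER"],
    ["\\*",                     "*"],
    ["\\/",                     "/"],
    ["-",                       "-"],
    ["\\+",                     "+"],
    ["\\^",                     "^"],
    ["\\(",                     "("],
    ["\\)",                     ")"],
    ["PI\\b",                   "PI"],
    ["E\\b",                    "E"]
];

// Hypothetical helper (not part of Jison): turns a string into an
// array of { type, text } tokens, ending with an EOF token.
function tokenize(input) {
    var compiled = rules.map(function (rule) {
        // Anchor each pattern so it can only match at the current position.
        return { re: new RegExp("^(?:" + rule[0] + ")"), type: rule[1] };
    });
    var tokens = [];
    var rest = input;
    while (rest.length > 0) {
        var matched = false;
        for (var i = 0; i < compiled.length; i++) {
            var m = compiled[i].re.exec(rest);
            if (m && m[0].length > 0) {
                if (compiled[i].type !== null) {
                    tokens.push({ type: compiled[i].type, text: m[0] });
                }
                rest = rest.slice(m[0].length); // consume the matched text
                matched = true;
                break;                          // assumption: first match wins
            }
        }
        if (!matched) {
            throw new Error("Unrecognized text: " + rest);
        }
    }
    // The "$" rule in the grammar corresponds to end of input.
    tokens.push({ type: "EOF", text: "" });
    return tokens;
}
```

With this sketch, `tokenize("4 * 5")` produces the token types NUMBER, *, NUMBER and EOF, which is the stream the "bnf" rules then work on.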

and then use it, even within the world of CommonJS:

javascript

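// mymodule.js
// Load the parser exported by the generated calculator module.
var parser = require("./calculator").parser;

// Parse (and, with the actions above, evaluate) an input string.
function exec (input) {
    return parser.parse(input);
}

var twenty = exec("4 * 5");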
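If you are wondering what to expect back from exec for less obvious inputs, the "operators" table in the grammar is what decides. The sketch below is a hand-written precedence-climbing evaluator, not the bottom-up parser Jison generates, and `evaluate` and `BINARY` are names invented here. It assumes the usual yacc/Bison convention that later entries in the table bind tighter, and it mirrors the table's choice to make every operator, including ^, left-associative.

```javascript
// Binary operators mirroring the grammar's "operators" list:
// a higher prec binds tighter, and every operator is left-associative.
var BINARY = {
    "+": { prec: 1, apply: function (a, b) { return a + b; } },
    "-": { prec: 1, apply: function (a, b) { return a - b; } },
    "*": { prec: 2, apply: function (a, b) { return a * b; } },
    "/": { prec: 2, apply: function (a, b) { return a / b; } },
    "^": { prec: 3, apply: function (a, b) { return Math.pow(a, b); } }
};

// Hypothetical stand-in for parser.parse: evaluates an expression string.
function evaluate(input) {
    // Simplification: anything that is not a known token is ignored here.
    var tokens = input.match(/[0-9]+(?:\.[0-9]+)?|PI|E|[-+*\/^()]/g) || [];
    var pos = 0;

    // Operand: number, constant, parenthesized expression, or unary minus.
    function operand() {
        var tok = tokens[pos++];
        if (tok === "-") { return -operand(); } // UMINUS binds tightest
        if (tok === "(") {
            var value = expression(1);
            if (tokens[pos++] !== ")") { throw new Error("Expected ')'"); }
            return value;
        }
        if (tok === "PI") { return Math.PI; }
        if (tok === "E") { return Math.E; }
        if (/^[0-9]/.test(tok || "")) { return Number(tok); }
        throw new Error("Unexpected token: " + tok);
    }

    // Precedence climbing: consume operators whose precedence is >= minPrec.
    function expression(minPrec) {
        var left = operand();
        while (pos < tokens.length &&
               BINARY[tokens[pos]] &&
               BINARY[tokens[pos]].prec >= minPrec) {
            var op = BINARY[tokens[pos++]];
            // Left associativity: the right side may only hold tighter operators.
            var right = expression(op.prec + 1);
            left = op.apply(left, right);
        }
        return left;
    }

    var result = expression(1);
    if (pos !== tokens.length) {
        throw new Error("Unexpected token: " + tokens[pos]);
    }
    return result;
}
```

Under these assumptions, `evaluate("2 + 3 * 4")` gives 14, while `evaluate("2 ^ 3 ^ 2")` gives 64 rather than 512, because a left-associative ^ groups as (2 ^ 3) ^ 2. Likewise `evaluate("-2 ^ 2")` gives 4, since the unary minus binds before the exponent.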

Zach scratched an itch and created a real-world example: an Orderly parser.

Posted by Dion Almaer at 6:35 am
5 Comments


Wow. This is going to open up a lot of exciting possibilities!

Comment by Skilldrick — January 11, 2010

For those who prefer top down parsing, I believe there’s a pretty good JavaScript back-end for Antlr. I haven’t tried it yet, but Antlr itself is great!

Comment by genericallyloud — January 11, 2010

I also wanted to say that this project looks pretty cool too. I wasn’t trying to knock it! I’m just doing a project with Antlr now and thought I would mention it.

Comment by genericallyloud — January 11, 2010

I’ve played a bit with the ANTLR JavaScript port, which needs ~100kb of JavaScript just for the runtime (!!), and created a parser framework based on jQuery which is much smaller: have a look at
http://github.com/lgersman/jquery.orangevolt-parse. The project includes AST rebuilding/optimization and a bunch of tests (including a calculator grammar, a JS expression resolver and some more).

Go to the bottom of the http://github.com/lgersman/jquery.orangevolt-parse/raw/master/tests/parser.html test file to see the calculator grammar.

Hmmm, time to make an official introduction to the project … :-)

Comment by lgersman — January 11, 2010

For building parsers in JavaScript, see also OMeta and OMetaJS (with a Narwhal module).

Comment by veged — January 11, 2010
