Tutorial: Lexer written in JavaScript?


I have a project where a user needs to define a set of instructions for a UI that is written entirely in JavaScript. I need to parse a string of instructions and then translate them into instructions the UI can execute. Are there any parsing libraries out there that are 100% JavaScript? Or a parser generator that generates JavaScript? Thanks!


Something like http://jscc.phorward-software.com/, maybe?

JS/CC is the first available parser development system for JavaScript and ECMAScript derivatives. It has been developed both with the intention of building a productive compiler development system and with the intention of creating an easy-to-use academic environment for people interested in how parse table generation is done in general in bottom-up parsing.

The platform-independent software combines two tools: a regular-expression-based lexical analyzer generator, which matches individual tokens from the input character stream, and an LALR(1) parser generator, which computes the parse tables for a given context-free grammar specification and builds a stand-alone, working parser. The context-free grammar fed to JS/CC is defined in a Backus-Naur-Form-based meta language, and allows the insertion of individual semantic code to be evaluated on a rule's reduction.

JS/CC itself has been entirely written in ECMAScript so it can be executed in many different ways: as platform-independent, browser-based JavaScript embedded on a Website, as a Windows Script Host Application, as a compiled JScript.NET executable, as a Mozilla/Rhino or Mozilla/Spidermonkey interpreted application, or a V8 shell script on Windows, *nix, Linux and Mac OSX. However, for productive execution, it is recommended to use the command-line versions. These versions are capable of assembling a complete compiler from a JS/CC parser specification, which is then stored to a .js JavaScript source file.


If you want to build JavaScript parsers and code generators, check out the MetaII implementation in Javascript.

A MetaII Compiler tutorial walks you through building a completely self-contained compiler system that can translate itself and other languages:

MetaII Compiler Tutorial

This is all based on an amazing little 10-page technical paper by Val Schorre: META II: A Syntax-Oriented Compiler Writing Language from honest-to-god 1964. The MetaII compiler complete self-description is about 30 lines! I learned how to build compilers from this back in 1970. There's a mind-blowing moment when you finally grok how the compiler can regenerate itself....

The tutorial explains MetaII, how it works, and implements MetaII compiling MetaII into JavaScript. You can easily modify this compiler to parse other languages and produce different JavaScript.

I know the website author from my college days, but have nothing to do with the website.


Jison is probably the best and most active lexer & parser generator out there for Javascript. It mimics Bison and Yacc.

Jison: http://zaach.github.io/jison/

If you just want a lightweight lexer (~100 SLOC), you can take a look at Lexed.js: https://github.com/tantaman/lexed.js
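To give a feel for what a lexer of roughly that size does, here is a minimal sketch of a regex-driven tokenizer. It is not the API of Lexed.js or of any other library mentioned in this thread; the token types, the rule table and the function name `tokenize` are all made up for illustration, and the instruction syntax is assumed.

```javascript
// Hypothetical token rules for an imaginary instruction language.
// Each pattern uses the sticky flag ("y") so it can only match exactly
// at the position stored in lastIndex.
const TOKEN_RULES = [
  { type: "whitespace", pattern: /\s+/y, skip: true },
  { type: "number", pattern: /\d+(?:\.\d+)?/y },
  { type: "string", pattern: /"(?:[^"\\]|\\.)*"/y },
  { type: "identifier", pattern: /[A-Za-z_][A-Za-z0-9_]*/y },
  { type: "punctuation", pattern: /[(),;=]/y },
];

// Turn a source string into an array of { type, text, offset } tokens.
function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    // Try the rules in order; the first one that matches at pos wins.
    for (const rule of TOKEN_RULES) {
      rule.pattern.lastIndex = pos;
      const match = rule.pattern.exec(input);
      if (match !== null) {
        if (!rule.skip) {
          tokens.push({ type: rule.type, text: match[0], offset: pos });
        }
        pos += match[0].length;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new SyntaxError(
        `Unexpected character ${JSON.stringify(input[pos])} at offset ${pos}`
      );
    }
  }
  return tokens;
}
```

Calling `tokenize('move(box, 10);')` yields seven tokens (whitespace is skipped), which a hand-written or generated parser can then consume.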


For simple parsing tasks I'm quite fond of using a variant of Pratt's Top Down Operator Precedence parser. While Pratt wrote the original paper using an old Lisp dialect, the same concepts can easily be used in most any language. In fact, Douglas Crockford wrote an excellent article on Top Down Operator Precedence parsing in JavaScript, which might be just what you need.
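As a rough illustration of the idea (not Crockford's code, and not Pratt's original formulation), the sketch below parses arithmetic expressions with binding powers. The names `BINDING_POWER`, `parseExpression` and `evaluate`, the operator set, and the AST shape are all assumptions chosen for this example.

```javascript
// Binding powers for infix operators: a higher number binds tighter.
const BINDING_POWER = { "+": 10, "-": 10, "*": 20, "/": 20, "^": 30 };
const RIGHT_ASSOCIATIVE = new Set(["^"]);

// Tiny lexer: numbers, operators and parentheses.
// Any other character is silently ignored in this sketch.
function lexExpression(source) {
  return source.match(/\d+(?:\.\d+)?|[-+*\/^()]/g) || [];
}

function parseExpression(source) {
  const tokens = lexExpression(source);
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  // Prefix position: how a token behaves at the start of an expression.
  function parsePrefix() {
    const token = next();
    if (token === undefined) throw new SyntaxError("Unexpected end of input");
    if (token === "(") {
      const inner = parse(0);
      if (next() !== ")") throw new SyntaxError("Expected ')'");
      return inner;
    }
    // Unary minus binds tighter than * and /, but looser than ^.
    if (token === "-") return { type: "negate", operand: parse(25) };
    if (/^\d/.test(token)) return { type: "number", value: Number(token) };
    throw new SyntaxError(`Unexpected token ${token}`);
  }

  // Core loop: keep absorbing infix operators that bind tighter than minPower.
  function parse(minPower) {
    let left = parsePrefix();
    for (;;) {
      const op = peek();
      const power = BINDING_POWER[op];
      if (power === undefined || power <= minPower) break;
      next();
      // Lowering the right-hand power by one makes the operator right-associative.
      const rightPower = RIGHT_ASSOCIATIVE.has(op) ? power - 1 : power;
      left = { type: "binary", op, left, right: parse(rightPower) };
    }
    return left;
  }

  const tree = parse(0);
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token ${peek()}`);
  return tree;
}

// Walk the tree produced by parseExpression and compute its value.
function evaluate(node) {
  if (node.type === "number") return node.value;
  if (node.type === "negate") return -evaluate(node.operand);
  const l = evaluate(node.left);
  const r = evaluate(node.right);
  switch (node.op) {
    case "+": return l + r;
    case "-": return l - r;
    case "*": return l * r;
    case "/": return l / r;
    case "^": return l ** r;
    default: throw new Error(`Unknown operator ${node.op}`);
  }
}
```

For example, `evaluate(parseExpression("1 + 2 * 3"))` gives 7, because `*` has a higher binding power than `+`. Adding a new operator is mostly a matter of adding an entry to the table, which is what makes the technique pleasant for small instruction languages.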


Depending on the design of the 'set of instructions', you may be able to use JavaScript's built-in eval function, which parses and runs JavaScript source: write a simple translator that converts the instructions to JavaScript code, then evaluate the result.

By the way, be very careful about XSS holes.
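A minimal sketch of such a translator follows. Everything about the instruction format is assumed for illustration: the commands `show`, `hide` and `label`, the `ui` object they map onto, and the function names `translate` and `runInstructions` are hypothetical. To limit injection, the translator only emits calls from a fixed whitelist and passes every argument through `JSON.stringify`, so user text always ends up inside a string literal.

```javascript
// Hypothetical instruction set: instruction name -> target function and
// the number of arguments it takes.
const COMMANDS = {
  show: { fn: "ui.show", arity: 1 },
  hide: { fn: "ui.hide", arity: 1 },
  label: { fn: "ui.setLabel", arity: 2 },
};

// Translate one instruction per line into JavaScript source text.
function translate(script) {
  const output = [];
  for (const rawLine of script.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    // Words are separated by whitespace; "double quoted text" is one word.
    const [name, ...args] = line.match(/"[^"]*"|\S+/g);
    // hasOwnProperty guards against names like "constructor".
    if (!Object.prototype.hasOwnProperty.call(COMMANDS, name)) {
      throw new SyntaxError(`Unknown instruction: ${name}`);
    }
    const command = COMMANDS[name];
    if (args.length !== command.arity) {
      throw new SyntaxError(`${name} expects ${command.arity} argument(s)`);
    }
    const literals = args.map((arg) => {
      const quoted = arg.length >= 2 && arg.startsWith('"') && arg.endsWith('"');
      // JSON.stringify escapes the text so it stays inside a string literal.
      return JSON.stringify(quoted ? arg.slice(1, -1) : arg);
    });
    output.push(`${command.fn}(${literals.join(", ")});`);
  }
  return output.join("\n");
}

// Run the generated code with `ui` as its only parameter.
// new Function is used here in place of eval; like eval, it executes
// arbitrary source, so it is only as safe as translate() is strict.
function runInstructions(script, ui) {
  new Function("ui", translate(script))(ui);
}
```

With these assumptions, `translate('show panel')` produces the source text `ui.show("panel");`, and an attempted injection such as `show x);alert(1` is emitted as the harmless string argument `"x);alert(1"`.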


If you're really looking for just a lexer, try prettify.


Here is an example of a parser for a "pseudo" natural language of instructions, which was implemented in pure JavaScript with Chevrotain Parsing DSL:


This example even includes support for multiple natural languages (English & German) using grammar inheritance.

Chevrotain falls under the category of "libraries out there for parsing that are 100% javascript" as it performs no code generation. Using Chevrotain is similar to "hand crafting" a recursive descent parser, only without most of the headaches, such as:

  • Lookahead function creation (deciding which alternative to take)
  • Automatic error recovery
  • Left recursion detection
  • Ambiguity detection
  • Position information
  • ...

as Chevrotain handles that automatically.


ANTLR version 4.5 now has a JavaScript target.


I was looking for something similar that wouldn't have any security holes and I came across two resources. They don't parse the script, but actually run it in a "safe" environment - something you can't guarantee when using the eval function. So, I don't know if it's exactly what you are looking for but take a look:

  1. jsandbox - Javascript sandbox
  2. Google Caja - virtual iframe.


If you want a lexer and nothing but a lexer then take a look at this: https://github.com/aaditmshah/lexer

It's a pure JavaScript lexer with lots of powerful features written in just a few lines of code.
