Tutorial: How can I parse a list of name=value pairs in a parser generator (ANTLR, YACC, etc.)?



Question:

I want to parse a list of (whitespace separated) pairs in the form of

name1=value1 name2=value2 ...  

where:

  • NAME can contain anything except whitespace and equal sign
  • VALUE can contain anything except whitespace (including equal signs!)

The problem is getting the parser to match input like

name1=value1  

as separate 'NAME EQUALS VALUE' tokens, not as a single 'VALUE' token.

PS. I know this is trivial to code directly, but I need this in the context of a larger parser.


Solution:1

Here is an ANTLR grammar that parses the following input:

a=b=c=d c=d e=f  

This may not be everything you need, but it should be the core.

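grammar NameValuePairs;

// WS is skipped by the lexer, so the parser rules never see it.
pairs         : namevaluepair+ ;
namevaluepair : name '=' value ;
name          : ID ;
value         : ID ('=' ID)* ;

WS : ' ' { skip(); } ;
EQ : '=' ;
ID : ~(' ' | '=')+ ;   // '+' rather than '*' so ID cannot match the empty string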


Solution:2

I think you may run into trouble if VALUE can contain the equal sign. If possible, it would be better to make the equal sign a reserved character, or to use a different reserved character as the name/value separator.

I'm not sure if this would work in the context of your larger parser, but you could split the input on whitespace, giving you an array (or whatever data structure your language uses) of 'NAME=VALUE' pairs. Then loop through the array and split each element on the reserved character you are using for '='. If you can't change or reserve '=', you could use a regex that matches only the first '=' in each element. Hope I'm not way off base!
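
A minimal sketch of that split-then-split idea in Python, for illustration only. The function name parse_pairs is hypothetical, and rejecting an empty name or an empty value is an assumption, since the question does not say whether they are allowed. Splitting at the first '=' only is what lets the value keep any later equal signs.

```python
def parse_pairs(text):
    """Split on whitespace, then split each chunk at its first '='."""
    pairs = []
    for chunk in text.split():
        # partition() splits at the FIRST '=' only, so any further
        # '=' characters stay inside the value.
        name, sep, value = chunk.partition("=")
        if not sep or not name or not value:
            # Assumption: empty names and empty values are errors.
            raise ValueError(f"not a name=value pair: {chunk!r}")
        pairs.append((name, value))
    return pairs


print(parse_pairs("a=b=c=d c=d e=f"))
# prints [('a', 'b=c=d'), ('c', 'd'), ('e', 'f')]
```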


Solution:3

You don't need a full parser for name=value pairs; a regex is sufficient. Unless you have some contextual or nested structure, this 'job' belongs in the lexer, not the parser :)
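
As a rough sketch of the regex route, the Python below emits the NAME, EQUALS and VALUE tokens the question asks for. The names PAIR and tokenize are made up for this example, and the token representation (a list of tuples) is an assumption; a real lexer would hand over whatever token objects the surrounding parser expects. Note that this version silently skips text that does not look like a pair, which may or may not be what you want.

```python
import re

# NAME: one or more characters that are neither whitespace nor '='.
# VALUE: one or more non-whitespace characters ('=' allowed).
PAIR = re.compile(r"(?P<name>[^\s=]+)=(?P<value>\S+)")


def tokenize(text):
    """Return a flat list of (token_type, lexeme) tuples."""
    tokens = []
    for m in PAIR.finditer(text):
        tokens.append(("NAME", m.group("name")))
        tokens.append(("EQUALS", "="))
        tokens.append(("VALUE", m.group("value")))
    return tokens


print(tokenize("name1=value1"))
# prints [('NAME', 'name1'), ('EQUALS', '='), ('VALUE', 'value1')]
```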

