
Version 1 (modified by christof, 16 years ago)

--

Formal Grammar for UCUM

This grammar was derived from a parser project that uses ANTLR as an LL(*) parser generator. Note that this is only one of many ways to express a formal grammar for UCUM; alternatively, the grammar could be written for LR parser generators (e.g. yacc). Lexer rules have been omitted for clarity. The objective is to create UCUM parsers from a declarative definition of the grammar.

ucumExpr   :  DIVIDE expr
           |  expr 
           ;
multiply   :  TIMES term
           |  DIVIDE term
           ;
expr       :  term (multiply)*  
           ;
term       :  element (exponent)? (ANN)*
           ;
element    :  simpleUnit
           |  LPAREN expr RPAREN
           |  ANN
           ;
exponent   :  (SIGN)? DIGITS // allow zero exponent?
           ;
simpleUnit :  prefix metricAtom // prefix is token from lexer
           |  metricAtom
           |  nonMetricAtom // token from lexer
           |  DIGITS // allow zero?
           ;
metricAtom :  baseUnit // token from lexer
           |  derivedMetricAtom // token from lexer
           ;
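
To illustrate how the rules above compose, here is a minimal hand-written recursive-descent recognizer in Java that mirrors them one method per rule. It is only a sketch, not part of the original project: the class name UcumRecognizer, the Tok enum, and the recognizes helper are hypothetical, and because the lexer rules are omitted here, the recognizer takes an already tokenized input whose token kinds are simply named after the grammar's terminals.

```java
final class UcumRecognizer {
    // Token kinds named after the grammar's terminals (assumed; the lexer is not shown in the text).
    enum Tok { DIVIDE, TIMES, LPAREN, RPAREN, ANN, SIGN, DIGITS,
               PREFIX, BASE_UNIT, DERIVED_METRIC_ATOM, NON_METRIC_ATOM, EOF }

    private final Tok[] toks;
    private int pos = 0;

    private UcumRecognizer(Tok[] toks) { this.toks = toks; }

    // Past the end of the input, behave as if an EOF token were present.
    private Tok peek() { return pos < toks.length ? toks[pos] : Tok.EOF; }

    // Consume the next token if it has the given kind.
    private boolean accept(Tok t) {
        if (peek() == t) { pos++; return true; }
        return false;
    }

    private void expect(Tok t) {
        if (!accept(t)) {
            throw new IllegalStateException("expected " + t + " but found " + peek() + " at token " + pos);
        }
    }

    // ucumExpr : DIVIDE expr | expr ;
    private void ucumExpr() { accept(Tok.DIVIDE); expr(); }

    // expr : term (multiply)* ;   multiply : TIMES term | DIVIDE term ;
    private void expr() {
        term();
        while (peek() == Tok.TIMES || peek() == Tok.DIVIDE) { pos++; term(); }
    }

    // term : element (exponent)? (ANN)* ;
    private void term() {
        element();
        if (peek() == Tok.SIGN || peek() == Tok.DIGITS) exponent();
        while (accept(Tok.ANN)) { /* skip trailing annotations */ }
    }

    // element : simpleUnit | LPAREN expr RPAREN | ANN ;
    private void element() {
        if (accept(Tok.LPAREN)) { expr(); expect(Tok.RPAREN); }
        else if (accept(Tok.ANN)) { /* a bare annotation is a valid element */ }
        else simpleUnit();
    }

    // exponent : (SIGN)? DIGITS ;
    private void exponent() { accept(Tok.SIGN); expect(Tok.DIGITS); }

    // simpleUnit : prefix metricAtom | metricAtom | nonMetricAtom | DIGITS ;
    private void simpleUnit() {
        if (accept(Tok.PREFIX)) metricAtom();
        else if (accept(Tok.NON_METRIC_ATOM) || accept(Tok.DIGITS)) { /* done */ }
        else metricAtom();
    }

    // metricAtom : baseUnit | derivedMetricAtom ;
    private void metricAtom() {
        if (!accept(Tok.BASE_UNIT) && !accept(Tok.DERIVED_METRIC_ATOM)) {
            throw new IllegalStateException("expected a metric atom but found " + peek() + " at token " + pos);
        }
    }

    // Returns true if the whole token sequence matches ucumExpr followed by end of input.
    static boolean recognizes(Tok... toks) {
        UcumRecognizer r = new UcumRecognizer(toks);
        try {
            r.ucumExpr();
            r.expect(Tok.EOF);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }
}
```

For example, a lexer might plausibly turn kg.m/s2 into PREFIX BASE_UNIT TIMES BASE_UNIT DIVIDE BASE_UNIT DIGITS, which this sketch accepts, while a sequence ending in a dangling TIMES is rejected because term finds no element.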

The following is an original code snippet from a working project using the ANTLR parser generator. The embedded actions build a UnitExpr object while the input is parsed:

//...
startRule returns [UnitExpr u=null]
    :   u=ucumExpr EOF  // or EOL 
    ;

ucumExpr returns [UnitExpr u=null]
	:	DIVIDE u=expr { u.invert(); }
	|	u=expr 
	;

multiply[UnitExpr a] returns [UnitExpr u=null]
	:	TIMES u=term { u=a.multiply(u); }
	|	DIVIDE u=term { u.invert(); u=a.multiply(u); }
	;
	
expr returns [UnitExpr u=null]
	:	u=term (u=multiply[u])*  
	;
	
term returns [UnitExpr u=null]
	{ int exp = 1; }
	:	u=element (exp=exponent)? (ANN)* { u.setExponent(exp); }
	;

element returns [UnitExpr u=null]
	:	u=simpleUnit
	|	LPAREN u=expr RPAREN
	|	ANN 				{ u = new UnitExpr();}
	;

exponent  returns [int exp=1]
	:	(s:SIGN)? e:DIGITS	// allow zero?
		{
			exp = Integer.parseInt(e.getText());
			if(s != null && s.getText().equals("-") ) exp *= -1;
		}
	;

simpleUnit returns [UnitExpr u=null]
	{ double p=0; }
	:	p=prefix u=metricAtom { u.setPrefix(p); }
	|	u=metricAtom
	|	u=nonMetricAtom
	|	d:DIGITS { u = new UnitExpr(Integer.parseInt(d.getText())); }	// allow zero?
	;

metricAtom returns [UnitExpr u=null]
	: 	u=baseUnit 
	|	u=derivedMetricAtom 
	;
//... lexer definitions follow
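
The UnitExpr class used by the actions is not shown in the snippet. Purely as an illustration of what invert, multiply, setExponent and setPrefix could do, here is a hypothetical minimal version in Java. It assumes a unit expression is a numeric factor plus a map from atom symbol to integer exponent; the String constructor and the accessors factor() and power() are additions for this sketch and do not appear in the original code.

```java
import java.util.Map;
import java.util.TreeMap;

final class UnitExpr {
    private double factor;                                        // numeric scale, e.g. 1000 for the prefix k
    private final Map<String, Integer> powers = new TreeMap<>();  // atom symbol -> exponent

    UnitExpr() { this(1); }                  // the unity, as created for a bare annotation
    UnitExpr(int n) { this.factor = n; }     // an integer literal used as a unit
    UnitExpr(String atom) {                  // hypothetical: a single unit atom such as "m"
        this.factor = 1;
        powers.put(atom, 1);
    }

    // Scale by a prefix value; called before any exponent is applied.
    void setPrefix(double p) { factor *= p; }

    // Raise the whole expression, including its factor, to an integer power.
    void setExponent(int exp) {
        factor = Math.pow(factor, exp);
        powers.replaceAll((atom, e) -> e * exp);
        powers.values().removeIf(e -> e == 0);
    }

    // Reciprocal: the same as raising to the power -1.
    void invert() { setExponent(-1); }

    // Product of this expression and another, returned as a new object.
    UnitExpr multiply(UnitExpr other) {
        UnitExpr r = new UnitExpr();
        r.factor = this.factor * other.factor;
        r.powers.putAll(this.powers);
        other.powers.forEach((atom, e) -> r.powers.merge(atom, e, Integer::sum));
        r.powers.values().removeIf(e -> e == 0);
        return r;
    }

    double factor() { return factor; }
    int power(String atom) { return powers.getOrDefault(atom, 0); }

    @Override
    public String toString() { return factor + " " + powers; }
}
```

Under these assumptions, the order of the actions matters: simpleUnit applies the prefix first and term applies the exponent afterwards, so an input such as km2 would come out as (1000 m) squared rather than 1000 times m squared.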