SEMANTIC ACTIONS

Each terminal and nonterminal may be associated with its own type of semantic value. For example, in a simple calculator using , the type associated with exp and INT might be int; the other tokens would not need to carry a value. The type associated with a token must, of course, match the type that the lexer returns with that token. For a rule AB C D, the semantic action must return a value whose type is the one associated with the nonterminal A. But it can build this value from the values associated with the matched terminals and nonterminals B, C, D.

RECURSIVE DESCENT

In a recursive-descent parser, the semantic actions are the values returned by parsing functions, or the side effects of those functions, or both. For each terminal and nonterminal symbol, we associate a type (from the implementation language of the compiler) of semantic values representing phrases derived from that symbol. is a recursive-descent interpreter for part of . The tokens ID and NUM must now carry values of type string and int, respectively. We will assume there is a lookup table mapping identifiers to integers. The type associated with E; T; F; etc., is int, and the semantic actions are easy to implement.Recursive-descent interpreter for part of .

class Token {int kind; Object val;
 Token(int k, Object v) {kind=k; val=v;}
 }
final int EOF=0, ID=1, NUM=2, PLUS=3, MINUS=4, ...
int lookup(String id) { ... }
int F_follow[] = { PLUS, TIMES, RPAREN, EOF };
int F() {switch (tok.kind) {
 case ID: int i=lookup((String)(tok.val)); advance(); return i;
 case NUM: int i=((Integer)(tok.val)).intValue();
 advance(); return i;
 case LPAREN: eat(LPAREN);
 int i = E();
 eatOrSkipTo(RPAREN, F_follow);
 return i;
 case EOF:
 default: print("expected ID, NUM, or left-paren");
 skipto(F_follow); return 0;
 }}
int T_follow[] = { PLUS, RPAREN, EOF };
int T() {switch (tok.kind) {
 case ID:
 case NUM:
 case LPAREN: return Tprime(F());
 default: print("expected ID, NUM, or left-paren");
 skipto(T_follow);
 return 0;
 }}
int Tprime(int a) {switch (tok.kind) {
 case TIMES: eat(TIMES); return Tprime(a*F());
 case PLUS:
 case RPAREN:
 case EOF: return a;
 default: ...
 }}
void eatOrSkipTo(int expected, int[] stop) {
 if (tok.kind==expected)
 eat(expected);
 else {print(...); skipto(stop);}
}

The semantic action for an artificial symbol such as T′ (introduced in the elimination of left recursion) is a bit tricky. Had the production been TT * F, then the semantic action would have been

int a = T(); eat(TIMES); int b=F(); return a*b;

With the rearrangement of the grammar, the production T′ → *FT′ is missing the left operand of the *. One solution is for T to pass the left operand as an argument to T′, as shown in .

AUTOMATICALLY GENERATED PARSERS

A parser specification for JavaCC consists of a set of grammar rules, each annotated with a semantic action that is a Java statement. Whenever the generated parser reduces by a rule, it will execute the corresponding semantic action fragment. shows how this works for a variant of . Every INTEGER_CONSTANT terminal and every nonterminal (except Start) carries a value. To access this value, give the terminal or nonterminal a name in the grammar rule (such as i in ), and access this name as a variable in the semantic action.JavaCC version of a variant of .

void Start() :
{ int i; }
{ i=Exp() <EOF> { System.out.println(i); }
}
int Exp() :
{ int a,i; }
{ a=Term()
 ( "+" i=Term() { a=a+i; }
 | "-" i=Term() { a=a-i; }
 )*
 { return a; }
}
int Term() :
{ int a,i; }
{ a=Factor()
 ( "*" i=Factor() { a=a*i; }
 | "/" i=Factor() { a=a/i; }
 )*
 { return a; }
}
int Factor() :
{ Token t; int i; }
{ t=<IDENTIFIER> { return lookup(t.image); }
| t=<INTEGER_LITERAL> { return Integer.parseInt(t.image); }
| "(" i=Exp() ")" { return i; }
}

SableCC, unlike JavaCC, has no way to attach action code to productions. However, SableCC automatically generates syntax tree classes, and a parser generated by SableCC will build syntax trees using those classes. For JavaCC, there are several companion tools, including JJTree and JTB (the Java Tree Builder), which, like SableCC, generate syntax tree classes and insert action code into the grammar for building syntax trees.