sYNTAX-DIRECTED DEFINITIONS (Chapter 5 Review) For parsers to be useful, something must happen during parsing - Construction of parse tree - Type checking and code generation - Even execute program (e.g., simple calculator inputs) Theoretically, solution is to associate ``attributes'' with each grammar symbol - Attributes are described by equations based on only: + intrinsic attributes of symbol (e.g., '+') + attributes of children + attributes of parent - Practically, the solution is to allow C code to execute at certain points during parsing (which can create these attributes or do anything else) - Attributes for a grammar symbol have names and can be treated like the fields of a record that is embedded in the parse tree: E.type - There are two types of attributes + Synthesized: value for a grammar symbol computed from its children - result of x + y depends if one of x or y is real E ::= E:e1 '+' E:e2 {: RESULT = c.combine(e1.type, e2.type); :} E (t1, real) / | \ (x, int) E + E (y, real) | | ID ID (x, int) (y, real) + Inherited: value for a grammar symbol computed from its parent (or from other other children of its parent) - type of identifier comes from type: int x, y, z D ::= T:t IDLIst:L {: L.in := t.type; :} IDList ::= ID:id {: c.addType(id.string, RESULT.in); :} | IDList:L COMMA ID:id {: L.in := RESULT.in; c.addType(id.string, RESULT.in); :} + first rule evaluates between T and IDList + second rule evaluates before ID + third rule evaluates first statement before IDList, second statement after ID D / \ T IDL / | \ (int) (x) | \ ID ... (y) Examples: calculator, syntax trees L : E E : E + T E : T T : T * F T : F F : digit Bottom-up evaluation of attributes - The complexity of the dependences among attributes can profoundly affect the performance of computing an attribute's value. An attribute that can be computed solely in terms of its children is called a a bottom-up evaluation. It can be naturally handled by a parser that recognizes bottom-up, such as CUP or YACC. - How does this work? Parser stack is augmented with a value stack (show with calculator, syntax tree, types) | | | | F | 5 | 4 (top) 1 + 3 * 5 | * | * | 3 (top-1) ^ | T | 3 | 2 (top-2) | | + | + | 1 | E | 1 | 0 +----+----+ state val depth L : E print(val[top]) T : T + F val[newtop] := val[top-2] + val[top] - name notation is just a more convenient notation for accessing the stack T ::= T:t + F:f { RESULT := t + f; } - Note that we don't have to have history of entire parse to evaluate these attributes, just immediate children! This allows an efficient implementation--only a simple stack is required. L-Attributed Definitions - Limiting to purely bottom-up evaluation is too limiting + Can't even declare variables int x, y; // not bottom-up wrt assignment statement x = x + y; - Not all sets of grammar rules can be handled by an LALR(1) parser like CUP or YACC - An L-Attributed definition is one in which the provided expressions can be correctly executed in depth-first order and also accommodate cases like the above - An LALR(1) parser executes a post-order traversal, giving a left-to-right, bottom-up traversal + although "left" is not directly accessible through references to children, they are guaranteed to parse/execute first, so the results of those can be accessed through what amounts to global variables (e.g., the symbol table) - But how are those inherited attributes handled? + only children of parse tree stored on stack, not parents + but there are often nodes on the stack that correspond to parents - in calculator example, + is logically the parent of 1 and 3 (actually, its the parent of 1 and *, that's the problem) - More on this later