CSE131B Compiler Project I -- Semantic Analysis
CSE 131B -- Winter 2003
Compiler Project #1 -- Semantic Analysis
Due: Midnight on Sunday, May 4, 2003

Note about Turn-in

Please refer to the turn-in procedure document for instructions on the turn-in procedure.

Background

In this assignment we will augment an existing Oberon syntax and scope checker with a semantic checker component. Semantic checking involves storing information in a symbol table (mostly during a declaration) and then accessing that information (mostly during a statement) in order to verify that the input conforms to Oberon semantic rules.

The rules below are divided into three phases. The intent here is to give you an indication of (roughly) how long it should take for parts of the project to be completed, as well as show the grading breakdown. It is implied that you implement the early phases before the later ones.

Error Messages

The error messages are precisely specified. These messages must be printed exactly as specified or points will be deducted. A Java class and string literal definitions are provided in ErrorMsg.java order to assist conforming to the required format.

Note that all error messages below are to be printed to standard output, NOT standard error. All error messages will be preceded by a line of text specifying current filename and line number on which the error occurs:

        Error, "file", line NN:
	message
Note that the file and line are on a separate (preceding) line from the message.

What We're Not Testing or Using in Testcases

The Assignment

Your task for this assignment is to implement the following semantic checks, reporting the appropriate error message (from ErrorMsg.java) with a line number that is correct within 1 line.

Note that frequently the terms "equivalent" and "assignable" are used -- these will be discussed in lecture. They are also defined in the lecture notes and in the oberon spec. For convenience we generally use the term "equivalent" in place of Oberon's more cumbersome "equal types".

When a check fails, your compiler is expected to compute a result at the point of the check that will permit the compiler to keep checking subsequent expressions or statements. 5% of your grade will be allocated to this. Of course, the initial error may result in misleading error messages in subsequent checks. However, your compiler should not crash or terminate at the point of the check.

Note that the time periods for the various phases are estimates to help you plan your time -- they are not due dates.


Phase I (40% -- 1 1/2 weeks)

Declarations, statements and expressions consisting of variables and literals of basic type (integer/boolean/real).

Check #1: Detect a type conflict in an expression -- that is, expressions of

        X OP Y
where the types of either X or Y are incompatible with OP. The valid types (and resultant types) are as follows: In Phase I, the operands are restricted to simple variables of basic type (integer, real, boolean) and literals (this will be extended in the later phases).

Check #2: Detect an illegal assignment -- that is, an assignment of the form

        A := Expr
where A is either a type, a procedure, or a constant (i.e., anything other than a variable). Note that this is an elaboration of a check from an earlier assignment. The old message should be replaced with this one.

Check #3: Detect a type conflict in an assignment -- that is, an assignment of the form

        X := Y
where the type of Y is not assignable to the type of X.

Check #4 Detect a type conflict in an IF/WHILE/REPEAT statement, that is, statements of the form

        IF x THEN ... END;
        WHILE y + 1 DO ... END;
        REPEAT ... UNTIL z;
where the expression is not equivalent to boolean.

Check #5 Detect a type conflict in a FOR statement. Given a statement such as

        FOR index := StartExpr TO EndExpr BY IncrementExpr DO ... END;
the types of (in this case) index, StartExpr, EndExpr, and IncrementExpr must be equivalent to integer. Additionally, index must be a variable.

Phase II (40% -- 2 weeks)

Procedures (global, not type-bound), variables of array type and string literals, in addition to Phase I objects.

Check #6 Detect an illegal procedure call. Errors should be generated if

Note that in Phase II only global (non-type-bound) procedures are tested.

Check #7a Detect an illegal RETURN statement -- that is, a statement of the form

        RETURN
that is within the body of a procedure that has a declared return type. (in other words, RETURN can appear in a "pure" procedure, but not a function). Note that for the purposes of 131B, our main program is considered a procedure (with no return type) and thus a RETURN; is legal there.

Check #7b Detect an illegal RETURN statement -- that is, a statement of the form

        RETURN expr
that is within the body of a procedure that has no declared return type. (in other words, this RETURN can appear in a function, but not a "pure" procedure, and not in the main program). Additionally, an error should be generated if the type of expr is not assignable to the return type of the enclosing procedure.

Check #8 Detect a missing RETURN statement. If a procedure is declared as a function (i.e., has a declared return type), there must be at least one RETURN statement (of the form in Check 7b) somewhere in the procedure body.


Check #9 Detect an illegal constant declaration -- that is, a declaration of the form

        CONST C = Expr;
where the value of Expr is not known at compile time.

Check #10 Detect an illegal array declaration. Given a type declaration such as

        VAR List : ARRAY Index OF Type;
an error should be generated if

Note the difference between an array of array (tested) and a two-dimensional array (not tested). Consider the following declarations:

        VAR array1 : ARRAY 10 OF ARRAY 10 OF INTEGER;  (* tested *)
        VAR array2 : ARRAY 10, 10 OF INTEGER;          (* not tested *)


Check #11 Detect an illegal array usage. Given a designator such as

        MyList [nIndex]
an error should be generated if

Check #12 Extend Check #1 through Check #6 to allow for operands consisting of

(In other words, the rules for Checks #1 through #6 are the same, but in Phase II we allow more complex expression operands)

Check #13 Detect an illegal EXIT statement -- that is, a statement of the form

        EXIT
that is not within the body of a loop (WHILE, REPEAT, LOOP, FOR).

Phase III (20% -- 1 week)

Records, pointers, type-bound procedures, NIL.

Check #14 Detect an illegal pointer type usage -- that is, a declaration containing a usage of the form

        POINTER TO Type
where Type is not a record type.

Check #15 Detect an illegal record usage. Given a designator such as

        MyStruct . SomeField
an error should be generated if Note that the type of a record field can be a pointer to the record's type (e.g, linked list). See ``What We're Not Testing'' at the top of this document for an example.

Check #16 Detect an duplicate record field -- that is, the same identifier twice in the same record declaration. Note that type-bound procedures are essentially the same as fields.


Check #17 Detect an illegal receiver variable. The type of a receiver variable in a type-bound procedure must be of pointer type. (See also check #14.)


Check #18 Detect an illegal NEW statement -- that is, a statement of the form

        NEW ( x )
where x is not a variable of type pointer.

NOTE: you may need to add the following rule to your grammar:

        NewStmt -> T_NEW T_LPAREN Designator T_RPAREN


Check #19 Extend Check #1 to allow for operands consisting of variables of pointer type, for the following operators: