This handout is not perfect. Corrections may be made.
Updates will appear here or on the discussion board.
Be sure to check these as often as you can.
Description
In this project you will generate assembly code for a subset of the Oberon-2
language that results in an executable program.
You will be adding code to the actions in your compiler to output Sparc
assembly language code.
In particular, your compiler should, given an Oberon-2 program as
input, write the corresponding assembly code to a file
named oberon.s.
At that point, oberon.s can be fed to the C compiler (which accepts assembly code as well as C code) to generate an executable named a.out.
We will run the a.out executable and check that it produces the expected output.
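As a minimal sketch of what "writing assembly to a file" can look like, the C function below emits a complete but empty Sparc program (a main that returns 0). The function name emit_empty_program, the 96-byte frame size, and the choice of C as the implementation language are assumptions made for illustration only; your compiler's actions may be organized quite differently.

```c
#include <stdio.h>

/*
 * Hypothetical helper: writes a minimal Sparc assembly program to `out`,
 * consisting of a main that simply returns 0.
 * Returns 0 on success and -1 if the write failed.
 */
int emit_empty_program(FILE *out)
{
    int written = fprintf(out,
        "\t.section\t\".text\"\n"
        "\t.align\t4\n"
        "\t.global\tmain\n"
        "main:\n"
        "\tsave\t%%sp, -96, %%sp\t! assumed minimal stack frame\n"
        "\tmov\t0, %%i0\t\t! return value 0\n"
        "\tret\n"
        "\trestore\t\t\t! executes in the delay slot of ret\n");
    return written < 0 ? -1 : 0;
}
```

A caller would open oberon.s with fopen("oberon.s", "w"), pass the stream to this function, and close it; the resulting file can then be handed to the C compiler as described above.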
The features are assigned in phases, each worth a
percentage of the grade.
Phase I Features: 40% (1.5 weeks)
- Integer literals, constants, and variables.
- Integer arithmetic expressions containing + and -.
- WRITE of integer expressions (no newline is appended to the output) and of the string constant NL, which is output as a newline.
- IF statements of the form: IF expression THEN statements END. The expression will be limited to the form X < Y, where X and Y are integer expressions. For Phase I, the only boolean operator you need to deal with is less-than (<).
- Procedures and functions with no parameters.
- RETURN statements using integer expressions.
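The Phase I IF statement can be translated with one comparison and one conditional branch around the body. The sketch below is one possible shape, not a required design: the helper names next_label and emit_if_less, the .Lend label scheme, and the assumption that X and Y have already been loaded into %l0 and %l1 are all hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: returns a fresh label number on each call. */
static int next_label(void)
{
    static int counter = 0;
    return counter++;
}

/*
 * Writes Sparc code for "IF X < Y THEN <body> END" into buf.
 * Assumes, for illustration only, that X is in %l0 and Y is in %l1,
 * and that `body` already holds the assembly text for the statements.
 * Returns the number of characters snprintf would have written.
 */
int emit_if_less(char *buf, size_t size, const char *body)
{
    int end = next_label();
    return snprintf(buf, size,
        "\tcmp\t%%l0, %%l1\t! compare X with Y\n"
        "\tbge\t.Lend%d\t\t! skip the body when X >= Y\n"
        "\tnop\t\t\t! branch delay slot\n"
        "%s"
        ".Lend%d:\n",
        end, body, end);
}
```

Branching on the negated condition (bge for <) keeps the body in straight-line order, and a fresh label per IF keeps multiple or nested statements from colliding.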
Phase II Features: 40% (2 weeks)
- Integer arithmetic expressions containing *, DIV, and MOD.
- Boolean literals, constants, and variables.
- Boolean expressions containing >=, <=, >, <, =, #, &, OR, and ~.
- String literals and constants (only for use with WRITE).
- WRITE of boolean expressions and string literals. Booleans should be output as TRUE or FALSE.
- IF statements with optional ELSE parts (no ELSIF).
- WHILE statements.
- EXIT statements.
- RETURN statements using boolean expressions.
- Procedures and functions with parameters, both value and VAR parameters. Parameter types are restricted to integer and boolean.
- READ of integer variables. READ behaves the same as scanf("%d"), and newlines in the input are ignored (treated as separators).
- Arrays:
  - Arrays will have constant bounds. No bounds checking is required on run-time array access.
  - The base type of an array may be either integer or boolean.
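The run-time behavior expected of WRITE and READ can be modeled in C as follows. This is only a behavioral sketch under the rules listed above; the function names are hypothetical, and your generated assembly would typically reach the same effect by calling printf and scanf directly.

```c
#include <stdio.h>

/* WRITE of an integer expression: no newline is appended. */
void write_int(FILE *out, int value)
{
    fprintf(out, "%d", value);
}

/* WRITE of a boolean expression: printed as TRUE or FALSE. */
void write_bool(FILE *out, int value)
{
    fputs(value ? "TRUE" : "FALSE", out);
}

/* WRITE NL: the only way a newline reaches the output. */
void write_nl(FILE *out)
{
    fputc('\n', out);
}

/*
 * READ of an integer variable: behaves like scanf("%d"), so leading
 * whitespace, including newlines, is skipped. Returns 1 on success.
 */
int read_int(FILE *in, int *target)
{
    return fscanf(in, "%d", target);
}
```

With input consisting of 12, a blank line, and 34, two successive calls to read_int store 12 and then 34, because the newlines act only as separators.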
Phase III Features: 20% (1 week)
The group of advanced features to be implemented includes:
- Pointers to records. Fields may be of type integer, boolean, array, or pointer to record.
- NEW and NIL.
- Type-bound procedures. Remember that the receiver of a type-bound procedure must be a pointer to record.
- Procedure value parameters of type pointer. VAR parameters of type array or pointer to record. RETURN statements of type pointer.
- Combinations: arrays of pointers, records with arrays, arrays of arrays.
What We Aren't Doing
As a clarification, here are the kinds of things we are not doing at all.
- Constructs omitted in the previous assignment.
- Real variables or real arithmetic.
- String expressions (string literals are OK, of course).
- FOR loops, LOOP, or REPEAT constructs.
- Nested functions or procedures.
Makefile change for target language
Since the code you generate will be dependent on the conventions of
the C compiler you are using for assembly, you must also add the
following rule in your Makefile:
CC="C-compiler-that-you-are-mimicking"
compile:
	$(CC) oberon.s
where the variable CC is bound to the compiler on ieng9 whose
conventions you are imitating (the recipe line must begin with a tab character).
This lets us compile your assembly output the way you intend with
the command "make compile", producing an a.out file for testing.
Notes/Hints/Etc.
- Only syntactically and semantically legal Oberon-2 programs will be given to your compiler (furthermore, they will contain only the aforementioned mini-Oberon-2 statements). This doesn't mean you can scrap all your code from the last project; some of it will be needed later on.
- For procedure and type-bound procedure calls, we will not give test cases that define procedures that (a) pass more arguments than can be passed on the stack, or (b) allocate more local memory than can be indexed in a simple base+offset load calculation.
- We may define two type-bound procedures with the same name.
- The C compiler is your friend. This cannot be overemphasized. A wealth of knowledge can be gained by simply seeing how the C compiler does things and emulating it. (How do you think we learned Sparc assembly language?!) In most cases, the assembly code generated by cc will be similar to what you want your compiler to produce. You may also look at gcc, but it is much less straightforward. However, you should emulate only one compiler, since the techniques you use have to be internally consistent. In deciding which to use, think about which one produces the simpler code (in your opinion).
- To see how the C compiler works, write a C program (make it small!) and compile it with the -S option: cc -S program.c. This produces program.s, which contains the assembly code.
- Outputting assembly language comments has been found to be very helpful in debugging.
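The cc -S hint above can be tried with a probe as small as the one below. It exercises the Phase I constructs (integer addition, a less-than test, and a return value) so that the generated program.s shows how the compiler handles each one. The file contents and the function name less_than_sum are only an illustrative assumption; any small program will do.

```c
/* program.c: a tiny probe for studying the C compiler's assembly output. */

/* Adds x and y, then subtracts 1 from the sum when x < y. */
int less_than_sum(int x, int y)
{
    int z = x + y;      /* integer arithmetic with + */
    if (x < y) {        /* the only comparison needed in Phase I */
        z = z - 1;      /* integer arithmetic with - */
    }
    return z;           /* integer RETURN */
}
```

Because -S stops before linking, the file does not need a main. Keeping each probe to one or two constructs makes it easier to match lines of program.s to the source statement that produced them.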