# Mini-project

The goal of the mini-project is to gain experience in using theorem provers, and in encoding statements about programs formally. The definitions on this web page are left intentionally informal -- part of what you have to do is formalize these definitions.

## I. Getting started (setup and playing around)

You should get started as soon as possible with installing the software and trying it out.

### Simplify

• Documentation
• Here is a sample file that I wrote showing you some of the features of Simplify. You should read it carefully. To run it, simply execute "Simplify examples.simp".
• More details can be found in a man page; there is also a description of the input syntax. Note that these two links point to a web archive because the original web page doesn't exist anymore.

## II. A simple C-like programming language: CMPL (pronounced "seemple")

We will be proving theorems about a very simple programming language. This language is defined by the following grammar, with entry symbol Prog:

```Prog       ::= "" | Stmt ";" Prog
Stmt       ::= "skip" | Var "=" Expr
Expr       ::= IntLiteral | Var | Var "+" Var
Var        ::= identifier
IntLiteral ::= integer literal```

In the above grammar, strings in brackets are terminals, whereas strings outside of brackets are non-terminals. The last two lines of this grammar are described informally: identifier stands for variable names, and integer literal stands for integer constants (such as 0, 245, -345).

Here is an example of a program in this language:

```skip;
a = 0;
b = 1;
c = a + b;
d = b + c;
e = c + d;
f = d + e;```

The skip statement is simply a noop; it doesn't do anything. An assignment statement X = E has the usual C-like semantics: the expression E is evaluated, and the resulting value is stored in the variable X.

Expression evaluation works as you would expect: an integer literal evaluates to its own value, a variable X evaluates to the value of X in memory, and an expression A + B evaluates to the sum of the values in memory of variables A and B.

## III. The properties you have to prove

Before getting to the theorems you will prove, we have define a few terms.  First, a program state represents the state a CMPL program during its execution. Typically, the program state has two components: the memory state, which captures the contents of memory (in our case the values of variables), and the control state, which tells us what statement the program is going to execute next. For the mini-project, we will ignore the control state of a program, and so the program state will just contain the contents of memory. In particular, a program state will simply be a data structure that maps variables to values.

Furthermore, we also have the following three definitions:

• We say that a statement defines a variable X if the statement is an assignment whose LHS is X.
• We say that a statement uses a variable if the statement is an assignment whose RHS has a reference to X. Notice that these notions are completely syntactic. Determining whether or not a statement defines or uses a variable is determined simply by looking at the structure of the statement.
• We say that two program states are equal up to a variable X if, for any variable Y different from X, the two program states agree on the value of Y.

Finally, now we are ready to look at the theorems you are going to prove. Here they are:

Theorem 1 (preservation of variables). If a statement S does not define X, then the value of X after executing S will be the same as the value of X before.

Theorem 2 (preservation of equal-up-to). If a statement S does not use a variable X, and two program states PS1 and PS2 are equal up to X, then the two program states produced by executing S in program states PS1 and PS2 are equal up to X.

You should convince yourself that these two theorems in fact hold of the CMPL language. Your task will be to prove these two theorems using the Simplify theorem prover and the ACL2 theorem prover. For Simplify, you are required to prove both of these. For ACL2, you are only required to prove theorem 1 -- you can prove theorem 2 for bonus points (I've proven theorem 2 in ACL2, and it's very doable, but it is a lot harder than the other ones)

## IV. Discussion questions

You'll document your experiences in a short report that will be about 2 pages. In this report, you should (at least) answer the following questions:

• What was difficult, and what was easy? What did each theorem prover make easy/hard?
• Discuss the similarities and differences between the two theorem provers.
• Do you see any ways of improving either one of these theorem provers?
• Describe any other issues that came up that you think are interesting.

### Improving CMPL

• Suppose you had to add other arithmetic operators to CMP, like -, * and /. How would you go about doing this?
• Suppose you had to model arithmetic overflow. How would you go about doing this?
• Suppose you had to add pointers to CMPL, for example statements of the form *x := y, and x := *y. How would you model such statements? Would you still be able to express the notion of defines (as in a statement defines a variable) by just looking at the statement? Explain why or why not.