CSE130 LECTURE NOTES
October 18, 1999
ADMINISTRATION
The first project is due Friday between 5pm and 6pm; see the course
description for exact submission instructions.
We will distribute the description of the second project Wednesday this
week.
TYPES IN ML
The ML type system is quite elegant. The principal builtin
primitive types are int, real, string, and bool.
There are operators that create new types from old types:
-
-> creates a function type
-
* creates a tuple type
ML infers types automatically:
- fun d2 x = 2.0*x;
val d2 = fn : real -> real
Deduction: 2.0 is a real, so * must be the function of type real
* real -> real, so x must be a real, so d2 must be of type
real
-> real.
NOTATION IN THE "ML" LANGUAGE FOR UNIONS AND PRODUCTS
ML has two important differences from traditional languages:
-
components of products (i.e. tuples) are not named, unlike components of
records
-
a union must be tagged by an enumeration type, which is given in the union
type definition.
For example:
-
Here is a type that is the tagged union of int and real:
datatype
mynumber = exact of int | approx of real;
-
Here is a variable named num whose type is mynumber:
val
num = approx 2.5;
-
Here is a variable named trunc that is initialized to a value
determined by a case operation that looks at the tag of num:val
trunc = case num of exact i => i | approx x => floor(x+0.5)
The righthand side of the = above is an expression, not a command.
However it is similar to a switch command in C. Each line
in the ML case expression has a pattern such as exact i on the
lefthand side of the => symbol.
RECURSIVE TYPES IN ML
In ML recursion is allowed with the datatype notation for tagged unions.
So we can write
datatype intlist = nil
| cons of int * intlist;
Remember that you have to choose a tag value for each type being unioned:
we choose nil and cons which is short for "construct".
So in ML we could write for example
val doubles = cons (2,
cons (4, cons (8, cons (16, nil))));
For binary trees, in ML we would write
datatype tree = leaf
of int | node of tree * int * tree;
Consider this tree with two internal nodes and three leaves:
4
/ \
2 5
/ \
1 3
It would be written node (node (leaf 1, 2, leaf 3), 4, leaf 5)
Remember that in ML parentheses are used to indicate tuples.
RECURSIVE TYPES IN TRADITIONAL LANGUAGES
In C and Pascal and other similar languages, the programmer must implement
a recursive type using pointers. Recursive types are also implemented
with pointers in ML. The difference is that the compiler generates the
code for manipulating pointers automatically. This is one justification
for calling ML "higher-level". (Prolog is similar to ML in that
pointers are completely hidden from the programmer. In Lisp pointers
are mostly hidden, and in Java they are slightly hidden.)
The advantages of using recursive types directly are:
-
Code is less verbose, hence probably more readable.
-
The programmer doesn't have to do memory management: allocating and deallocating
space when a list grows or shrinks.
-
Pointer types and the types that use them are automatically associated.
It's harder to create spaghetti-like pointer structures.
Probably the most important advantage of recursive types is that the programmer
cannot make pointer manipulation mistakes.
CONCRETE VERSUS ABSTRACT SYNTAX
Syntax as a field of study in (programming) linguistics is about the appearance
and structure of expressions. (We'll use the word expression to include
any self-contained fragment of a program, for example a subroutine definition.)
Deep structure is more important than surface appearance because knowing
the structure of an expression is a first step towards knowing its meaning.
A concrete syntax specification defines precisely which strings of symbols
are legal expressions.
For example, 0.0 may be allowed but not 0. or 00.0
A grammar specifies (1) conceptual structure and (2) lexical detail.
We are interested in understanding and representing (1), which is called
"abstract" syntax because it leaves out the irrelevant details (2).
Copyright (c) by Charles Elkan, 1999.