Functional Programming

Functional languages have lot of theoretical attraction. They have developed semantically clean ways to denote objects, functions, and function application. (Functional languages are also known as applicative languages.) Functional constructs denote objects, in contrast to procedural constructs which specify methods for constructing objects. The formal basis for functional languages is lambda calculus.

Evaluation Model

An expression can contain variables. When an expression is evaluated, it returns a value. The values for the variables are taken from an environment which holds (<name>, <value>) pairs. Ideally, the evaluation of an expression should not change the environment. This property is called referential transparency. The expressins also said to be side-effect free. The C expression a++ changes the value of the environment and hence is not referentially transparent.

Functional lanuguages have the facility for function creation. Such functions are associated with two environments: definition environment and activation environment. So a function should always carry its definition environment. The pair (function, defintion environment) is called a closure. Consider the following code.

- val a=10;
val a = 10 : int
- fun timesA(x) = a*x; // "a" is a "global" variable.
val timesA = fn : int -> int
- timesA(1);
val it = 10 : int
- val a=20.0;
val a = 20.0 : real
- timesA(1);
val it = 10 : int  // old value of "a" is not forgotten
                   // but just buried in "closure"
-

Expressions vs. statements

An expression is evaluated mainly for the value it returns.

A statement is evaluated mainly for the change of binding it effects. So a statement is evaluated for the side effect. Conside the assignment in ML.

val x=10;

Generally the value returned is not of interest. What is of interest is the binding created. ML minimizes the severity of side effects by making assignment non-destructive.

Types of Evaluation

Functional programming is all about expression evaluation. There are two ways of evaluating expressions. In the innermost evaluation rule, a function application <name> <actual parameter> is calculated as follows:

evaluate the expression represented by <actual parameter>
substitute the result for the formal in the function body.
evaluate the body
return the result

Under the outermost evaluation rule, a function application is computed as

substitute the actual for the formal in the function body
evaluate the body
return its value as the answer

Both types of evaluation produce the same result under "normal" circumstances. Also, under normal circumstances, the outermost evaluation is inefficient. But examples can be constructed to highlight the differences.

For example, functions simulating short-curcuit evaluation can be written if outermost evaluation is used. In such cases, the innermost evaluation may never terminate. Outermost evaluation is also called lazy evaluation since an expression is not evaluated unless it is required. Compiler can prevent duplicate evaluations. So the programmer is free from such concerns.

Characteristic features of FP

Some of the characterizing features of functional languages are:

functions as first-class objects
conditional expressions instead of conditional statements
repetition implicitly with recursion instead of explicit looping
absence of destructive assignments
absence of statement sequencing
lazy evaluation of parameters

Assignment elimination

Destructive assignment causes semantic problems because it creates a time- and sequence-dependent change in the meaning of a name. Write-once variables enhance program readability and efficiency without compromising with the above aim. Assignment is used primarily for

to store the result of computing a subexpression
in conjunction with conditional statements
with a loop variable, to mark the progress of a loop
to build complex data structures

ML provides let to store the values of subexpressions. let creates a local environment in which the names are bound to the values.

Conditional statements are replaced by conditional expressions.

Recursion is used instead of explicit loops.

The languages provide notational support constructing and accessing data structres. Good notation compromises on efficiency since a new data structure has to be created after each change, however small. The basic trick is that destructive assignment is done by the compiler under certain circumstances.

Statement sequences

A sequence of statements that does not to destructive assignment can be emulated by let expression.

Another way is to lazy evaluation.

Continuations

Functional language need the equivalent of a break statement. Reason: readability and efficiency. The notion of continuations is used for this. A continuation is a function which "acts like" the remainder of the program which is still to be executed. Consider an example which calculates the value of pi and prints it. This has two parts.

Calculate the value of pi
a print routine which receives a number and prints it

We can think of the second step as a continuation of the first - "rest of the program still to be executed".

Another way of thinking about continuations as checkpoints. You checkpoint where you want to return and start a recursion. If you want to break out of recursion, you return to the checkpoint. A glorified goto? Yes.

Formally, a continuation is a function with a sigle argument. A continuation carries an environment with it. When called, it discards the current environment and restores the one which it is carrying. It also returns its argument as the return value.

How to create continuations? Compiler support is neeed. What does the continuation contain? the program counter, the stack, the environment, etc. In Scheme, the function call-with-current-continuation. It takes another function f as input. It

creates a one argument function cc from the current continuation
calls f with argument cc

f is written by the programmer which may or may not call cc. If cc is called, it restores the saved environment and returns its argument.

The following code illustrates continuations in ML syntax, but ML does not support continuations.

fun find (key lst) =
  call-with-current-continuation (fn cc => search(key, lst, cc));

fun search(_, nil, _) = nil
   | search(a, a::rest, whatNext) = whatNext(rest) // illegal ML syntax!
   | search(a, _::rest, whatNext) = search(a, rest, whatNext);

- find(3, [1, 2, 3, 4]);
   [3, 4]

- find(5, [1, 2, 3, 4]);
   nil

Efficiency issues

Pure functional notation can lead to inefficient code, unless the compiler makes special efforts. So some form of imperative notation is indispensable, like let expression.

The compiler can do the following optimizations.

spotting tail recursion: Tail recursion is a recursive call to a function in such a position within the function that it will always be the last thing to be executed before the function return. Examples of tail recursive functions are printList, binarySearch on arrays, etc.
identifying and evaluating subexpressions
optimizing lazy evaluations
changing data structures

Note that what is prohibited for the programmer is sometimes done by the compiler. This is an important lesson in language design: Do it with the compiler. Note that the underlying machine does the destructive assignment. jumps, etc. The aim of a PL is to hide the underlying machine. But the compiler need not be hidden from the underlying machine. Also since the compiler is written "once", it can be tested extensively. So language design and compiler technology go hand in hand.

Side effects

Side effects are unavoidable mainly in I/O. Another convenient use of expressions with side-effect is in creating bindings.

Polymorphism

ML supports a kind of polymorphism. Consider the definition of a function which calculates the length of a list. A single function takes lists of any type. This type of polymorphism is called parametric polymorphism. In C++, templates can be used to get similar behavior. Note that these two are not equivalent since the C++ compiler generates code for all the types that are used.

Most popular languages support a form of polymorphism called overloading. This type of polymorphism is called ad-hoc polymorphism.

Constraints on syntax

Expression evaluation causes some restriction on if-then-else expressions and mutually recursive function defintions.

Moral

In functional languages, difficult things (higher order functions, recursion, etc.) have become easy, but easy things (breaking out of recursion, sequencing, etc.) have become difficult. The research issues are to find clean constructs for sequencing, etc. Other languages can use the results of previous research like higher order functions, recursive data structures, etc.

A fallout of research is that compilers can be made to handle much of the complexity efficiently. This technology can be used by other languages.