Functional Programming
Functional languages have lot of theoretical attraction. They have developed
semantically clean ways to denote objects, functions, and function
application. (Functional languages are also known as applicative
languages.) Functional constructs denote objects, in contrast to
procedural constructs which specify methods for constructing objects.
The formal basis for functional languages is lambda calculus.
Evaluation Model
An expression can contain variables. When an expression is evaluated, it
returns a value. The values for the variables are taken from an
environment which holds (<name>, <value>)
pairs.
Ideally, the evaluation of an expression should not change the
environment. This property is called referential transparency.
The expressins also said to be side-effect free. The C expression
a++
changes the value of the environment and hence is not
referentially transparent.
Functional lanuguages have the facility for function creation.
Such functions are associated
with two environments: definition environment and
activation environment. So a function should always carry its
definition environment. The pair (function, defintion environment) is
called a closure.
Consider the following code.
- val a=10;
val a = 10 : int
- fun timesA(x) = a*x; // "a" is a "global" variable.
val timesA = fn : int -> int
- timesA(1);
val it = 10 : int
- val a=20.0;
val a = 20.0 : real
- timesA(1);
val it = 10 : int // old value of "a" is not forgotten
// but just buried in "closure"
-
Expressions vs. statements
An expression is evaluated mainly for the value it returns.
A statement is evaluated mainly for the change of binding it effects.
So a statement is evaluated for the side effect. Conside the assignment
in ML.
val x=10;
Generally the value returned is not of interest. What is of interest is the
binding created. ML minimizes the severity of side effects by making
assignment non-destructive.
Types of Evaluation
Functional programming is all about expression evaluation. There are
two ways of evaluating expressions. In the innermost evaluation rule,
a function application <name> <actual parameter>
is calculated as follows:
- evaluate the expression represented by
<actual parameter>
- substitute the result for the formal in the function body.
- evaluate the body
- return the result
Under the outermost evaluation rule, a function application is
computed as
- substitute the actual for the formal in the function body
- evaluate the body
- return its value as the answer
Both types of evaluation produce the same result under "normal"
circumstances. Also, under normal circumstances, the outermost evaluation is
inefficient. But examples can be constructed to highlight the differences.
For example, functions simulating short-curcuit evaluation can
be written if outermost evaluation is used. In such cases, the innermost
evaluation may never terminate. Outermost evaluation is also called
lazy evaluation since an expression is not evaluated unless it is
required. Compiler can prevent duplicate evaluations. So the programmer
is free from such concerns.
Characteristic features of FP
Some of the characterizing features of functional languages are:
- functions as first-class objects
- conditional expressions instead of conditional statements
- repetition implicitly with recursion instead of explicit looping
- absence of destructive assignments
- absence of statement sequencing
- lazy evaluation of parameters
Assignment elimination
Destructive assignment causes semantic problems because it creates a
time- and sequence-dependent change in the meaning of a name. Write-once
variables enhance program readability and efficiency without compromising
with the above aim. Assignment is used primarily for
- to store the result of computing a subexpression
- in conjunction with conditional statements
- with a loop variable, to mark the progress of a loop
- to build complex data structures
ML provides let
to store the values of subexpressions.
let
creates a local environment in which the names are
bound to the values.
Conditional statements are replaced by conditional expressions.
Recursion is used instead of explicit loops.
The languages provide notational support constructing and accessing
data structres. Good notation compromises on efficiency
since a new data structure has to be created after each change, however
small. The basic trick is that destructive assignment is done by the
compiler under certain circumstances.
Statement sequences
A sequence of statements that does not to destructive assignment can
be emulated by let
expression.
Another way is to lazy evaluation.
Continuations
Functional language need the equivalent of a break
statement. Reason: readability and efficiency. The notion of
continuations is used for this. A continuation is a function
which "acts like" the remainder of the program which is still to be
executed. Consider an example which calculates the value of pi and
prints it. This has two parts.
- Calculate the value of pi
- a print routine which receives a number and prints it
We can think of the second step as a continuation of the first -
"rest of the program still to be executed".
Another way of thinking about continuations as checkpoints. You
checkpoint where you want to return and start a recursion. If you
want to break out of recursion, you return to the checkpoint.
A glorified goto? Yes.
Formally, a continuation is a function with a sigle argument. A
continuation carries an environment with it. When called, it discards
the current environment and restores the one which it is carrying. It
also returns its argument as the return value.
How to create continuations? Compiler support is neeed. What does the
continuation contain? the program counter, the stack, the environment, etc.
In Scheme, the function call-with-current-continuation
.
It takes another function f
as input. It
- creates a one argument function
cc
from the current
continuation
- calls
f
with argument cc
f
is written by the programmer which may or may not call
cc
. If cc
is called, it restores the saved
environment and returns its argument.
The following code illustrates continuations in ML syntax, but ML does
not support continuations.
fun find (key lst) =
call-with-current-continuation (fn cc => search(key, lst, cc));
fun search(_, nil, _) = nil
| search(a, a::rest, whatNext) = whatNext(rest) // illegal ML syntax!
| search(a, _::rest, whatNext) = search(a, rest, whatNext);
- find(3, [1, 2, 3, 4]);
[3, 4]
- find(5, [1, 2, 3, 4]);
nil
Efficiency issues
Pure functional notation can lead to inefficient code, unless the compiler
makes special efforts. So some form of imperative notation is indispensable,
like let
expression.
The compiler can do the following optimizations.
- spotting tail recursion: Tail recursion is a recursive call to a
function in such a position within the function that it will always
be the last thing to be executed before the function return. Examples of
tail recursive functions are
printList
,
binarySearch
on arrays, etc.
- identifying and evaluating subexpressions
- optimizing lazy evaluations
- changing data structures
Note that what is prohibited for the programmer is sometimes done by
the compiler. This is an important lesson in language design:
Do it with the compiler. Note that the underlying machine
does the destructive assignment. jumps, etc. The aim of a PL is to
hide the underlying machine. But the compiler need not be hidden from
the underlying machine. Also since the compiler is written "once", it
can be tested extensively. So language design and compiler technology
go hand in hand.
Side effects
Side effects are unavoidable mainly in I/O. Another convenient use
of expressions with side-effect is in creating bindings.
Polymorphism
ML supports a kind of polymorphism. Consider the definition of a function
which calculates the length of a list. A single function takes lists of
any type. This type of polymorphism is called parametric polymorphism.
In C++, templates can be used to get similar behavior. Note that these
two are not equivalent since the C++ compiler generates code for all the
types that are used.
Most popular languages support a form of polymorphism called
overloading. This type of polymorphism is called ad-hoc
polymorphism.
Constraints on syntax
Expression evaluation causes some restriction on if-then-else
expressions and mutually recursive function defintions.
Moral
In functional languages, difficult things (higher order functions,
recursion, etc.) have become easy, but easy things (breaking out of
recursion, sequencing, etc.) have become difficult. The research issues
are to find clean constructs for sequencing, etc. Other languages
can use the results of previous research like higher order functions,
recursive data structures, etc.
A fallout of research is that compilers can be made to handle much of
the complexity efficiently. This technology can be used by other languages.