The path towards what eventually became known as Prolog starts with Alan Robinson, who around 1965 developed his resolution method (which includes unification) as a computational approach to theorem proving. The next step occurred in the early 1970s, when a group headed by Robert Kowalski in the AI Department of the University of Edinburgh was using interpreters for logic in experiments on natural language understanding. Although this work was very interesting, the programs were too inefficient to be practical. A little later, Alain Colmerauer in France built a much more efficient interpreter, and called the system Prolog (for "programming in logic"); he also introduced some features which departed significantly from pure logic, such as the "cut". In the 1980s, David Warren at SRI designed an efficient abstract machine which became the basis for all subsequent Prolog compilers. Logic programming was the focus of the Japanese Fifth Generation Project, which aimed to build highly efficient machines for logic programming languages, and use them for applications that required much greater intelligence than was then available. The fact that the Fifth Generation Project was a very expensive failure contributed to a decline in interest in logic programming for large applications. However, logic programming remains one of the best approaches for natural language processing, as well as a very interesting adventure in programming language design. Note that all these projects were AI oriented; as usual, this cultural context helps to understand the design decisions that went into Prolog, such as its reuse of the Lisp data structure for lists.

As a footnote, Alfred Horn, for whom Horn clauses are named, had nothing to
do with logic programming; he was a professor of logic at UCLA who in 1951
wrote paper using the sentences that now bear his name for reasons having
little to do with computer science. As a second footnote, it seems to me
rather misleading to call Prolog a "logic programming" language, since it
departs rather far from logic; I would rather have had it called a
"*relational programming*" language, because it is the use and
manipulation of relations that is most characteristic of its programming
style.

BinProlog is the "official" version of Prolog for this class. You should read the short "getting started" Notes on BinProlog, where many important and useful facts are given. However, if you prefer to use a different version of Prolog for exercises, that is certainly permissible.

We may divide programming languages into procedural languages, and
**non-procedural languages**, where the first includes imperative languages
(where assignment is fundamental, as in C, Java, etc.) and functional
languages (where functions are first class citizens, as in ML, LISP, etc.).
Non-procedural languages are not just what's left over, but are supposed to
have **declarative** flavor, i.e., to (more or less) allow programers to
state a problem without stating how to solve it. Most non-procedural
languages are **logic languages**, in which the syntax and semantics are
supposed to be based on some form of logic. For example, Prolog is
(partially) based on Horn clause logic, and OBJ is based on equational logic.
However, SETL is based on set theory, which is not a form of logic, so it is a
non-logical and non-procedural language.

The following may help you to convert from thinking in the imperative
programming paradigm to that of the logic paradigm; this material was adapted
by Fox Harrell from **The Art of Prolog**, by Leon Sterling and Ehud
Shapiro.

- This is a sample thought process for writing a recursive procedure in
Prolog; in this view, the key to writing the Prolog program is being able to
switch back and forth between procedural and declarative thought. The example
involves writing a
`delete`

procedure which removes all occurrences of some element from a list.

- Specify the meaning of the arguments of the relation. In this case there
are three: the element X to be deleted, the element L1 which may contain
occurrences of X, and the element L2 which does not contain any occurrences of
X. The input will have the form
delete(L1,X,L2). - Think of an example query, such as
`?- delete([a,b,c,b],b,X)`

, and its answer, which in this case should be`X = [a,c]`

.

- Begin by thinking procedurally about the first recursive element. The
recursive form for a list is
`[X|Xs]`

. There are two possibilities.- X is the element to be deleted.
- It is not.

delete([X|Xs],X,Ys) :- delete(Xs,X,Ys). Now we should take a step back and look at this declaratively. The declarative reading of this rule is "The deletion of X from [X|Xs] is Ys if the deletion of X from Xs is Ys."In the second case since the head of the list is not equal to X we want to keep it around. In this case an appropriate rule is

delete([X|Xs],Z,[X|Ys]) :- X != Z, delete(Xs,Z,Ys). The declarative reading of this rule is: "The deletion of Z from [X|Xs] is [X|Ys] if Z is not equal to X and deleting Z from Xs is Ys."

- So are we done? No! We need a base case to stop the recursion (after
you get used to writing Prolog programs you may want to start coming up with a
base case before the other rules). The base case is straightforward here. No
elements can be removed from the empty list. The rule for this is
delete([],X,[]). Now we are done, and here is the final program:delete([],X,[]). delete([X|Xs],X,Ys) :- delete(Xs,X,Ys). delete([X|Xs],Z,[X|Ys]) :- X != Z, delete(Xs,Z,Ys).

- Specify the meaning of the arguments of the relation. In this case there
are three: the element X to be deleted, the element L1 which may contain
occurrences of X, and the element L2 which does not contain any occurrences of
X. The input will have the form

There is an interesting reason why it is difficult to explain cut well:
whereas Prolog is supposed to be a non-procedural language, cut cannot be
understood in non-procedural terms; i.e. *cut is procedural*, and
thus is in conflict with the basic design philosophy of the language; i.e.,
cut is a kind of kludge, and kludges are usually difficult to explain.
Also, one can do some rather surprising things with cut, which at first may
seem counter-intuitive, and are easy to get wrong.

Cut is a nullary predicate, denoted **!**, which always succeeds when
the search reaches it, but which also has a rather drastic side effect on the
search. I like to think of a cut in a clause as a way for the programmer to
tell Prolog something like the following:

- Trust me - if you get this far in this clause, there's no need to backtrack and try another clause for proving its goal, or to try any other way of satisfying any of the subgoals that were already proved for this goal.

`p1(X)`

and `p2(X)`

, which are mutually
exclusive, i.e. exactly one of them is true for each value of `X`

;
an example is predicates for even and odd integers. Now suppose we have the
following:
`X`

is even, we should check `a(X)`

, but there
is no need to check the second `q(X)`

clause, because we know it
will fail. Therefore it is safe to write the following, which is also more
efficient:
`X`

is even, the first literal of the first clause succeeds,
so the cut prevents backtracking on `q(X)`

or
`even(X)`

, and the second clause will not be tried.
Here is another example, which is a little more complex, and which also illustrates the Prolog search strategy:

`a(X)`

, the first clause will match, giving us
a tentative value `1`

for `X`

, provided the body
`b`

of the clause succeeds; next, the third clause matches
`b`

, and raises the goal `c`

, which fails; because of
the cut, Prolog will not undo its commitment to the third clause for
`b`

, and so it will never find the fifth clause, which would have
succeeded; therefore it backtracks all the way to the top, undoes the first
clause, and tries the second, now with a tentative value `2`

for
`X`

and a subgoal `e`

, which then succeeds. So in this
case, the cut causes Prolog to miss a solution. Of course, without the cut,
Prolog will find both solutions. (Note that the fifth clause would not be
needed for some other versions of Prolog, which automatically fail if there is
no clause for some literal.) Here is a link to the
BinProlog output for this example, which includes a trace.
Now let's look at a couple of stranger things that one can do with cut. The first is something that rather excited researchers when they first discovered it, a way of introducing exceptions into Horn clause rule sets. This was considered important because exceptions are common in knowledge representation, and knowledge representation was supposed to be one of the most important applications of Prolog. This example says that all birds fly, except for penguins, which are birds, but do not fly (the version of this example on page 201 of Stansifer is wrong):

`fly(sparrow)`

will succeed in the usual
way, but `fly(penguin)`

will fail due to the fourth clause, and the
cut will prevent to fifth clause from being tried, so penguins will not be
claimed to fly, even though they are birds. The query `bird(X)`

will find all three birds, but the query `fly(X)`

does not work
correctly (at least, not in BinProlog); this is due to the conflict between
the logic and procedural features.
Another odd looking use of cut is to implement so-called **negation by
failure**, the code for which is as follows:

`not`

is `fly(penguin)`

. The above
query succeeds by showing that `fly(penguin)`

fails. Here's how it
works: the first clause for `not`

will try to make
`fly(penguin)`

succeed; if it did succeed, then the subsequent cut
and `fail`

would make that clause fail without trying the next
`not`

clause. However, since `fly(penguin)`

fails, the
second `not`

clause `not(fly(penguin))`

to succeed. On the other hand, the query
We can look at **unification** as solving systems of equations, using
a process rather like what is called Gaussian elimination in linear algebra.
Let's do some simple examples to illustrate the idea. Suppose the two terms
to be unified are

`s`

such that
`X, Y, Z`

such that
`f`

, so this is OK. Then in order for
the two terms to be equal, the following equalities have to hold:
`Y`

in terms of
`X`

, while the second is solved in the same way that we started
this process, checking that the head functors are equal, and then noting that
for the two terms to be equal, we must have
`Y = g(X)`

, from the second of these
equations we get
`X = Z`

, which we have already
deduced. So putting everything together, we get
`X`

, and then the values of `Y, Z`

are
uniquely determined from that. In fact, the above is a The above procedure is essentially the unification algorithm devised by Montanari and Martelli; it is linear in the size of the problem. Notice that if the problem had been slightly different, there would have been no solution. For example, if the original terms had been

Of course, unification can also fail in simpler ways, for example, if the heads of subterms fail to match, as in

`f(X, Y)`

and `h(Y, g(Z))`

cannot be
unified because `f`

is unequal to `h`

.
The algorithm used for solving certain queries in BinProlog actually goes
beyond unification, in that it introduces new variables, and searches for
values for them; this is called **narrowing**. Examples of this can be
given using the the code in pl/addn.html, such as

**Difference lists** are an important idiom in Prolog programming; they
are a tricky way of representing lists that allow for certain kinds of
computation on lists to go much faster, including parsing. First, the
intuition behind this idiom is to represent a list as the "difference" of two
other lists. As an analogy, the number `6`

can be represented as
the difference `9 - 3`

, or `13 - 7`

, or more generally,
`(6 + Y) - Y`

for any number `Y`

. Similarly, we can
represent the list `[1, 2, 3]`

as the difference between ```
[1,
2, 3, 4, 5]
```

and `[4, 5]`

, or between ```
[1, 2, 3, a, b,
c]
```

and `[a, b, c]`

, or more generally between ```
[1, 2, 3
| Y]
```

and `Y`

, for any list `Y`

. Symbolically, we
may write

`dl`

is particularly common. Thus, the above representations of
`[1, 2, 3]`

would be written
One very nice example of the use of this idiom is to get a very fast version of append for difference lists; the definition is also very simple.

`X`

and `Y`

with the difference of
`Y`

and `Z`

is the difference of `X`

and
`Z`

. In the analoguous arithmatic notation, we may write this as
It is common to use difference lists for parsing in Prolog, instead of
using append, because difference lists can be made to do appends implicitly,
which is much more efficient. This is illustrated in the linked Prolog code for parsing a simple English language
grammar; and here is a link to its output. A
good reference for difference lists and their use in parsing is **The Art of
Prolog**, by Leon Sterling and Ehud Shapiro (MIT Press).

There is a small bug on page 458, on the 4th line from the bottom, "L = [a]" should be "X = [a]" (or else previous X's should be L's).

To CSE 230 notes page

To CSE 230 homepage

Maintained by Joseph Goguen

© 2000, 2001, 2002, 2003 Joseph Goguen

Last modified: Thu Mar 6 11:21:39 PST 2003