The path towards what eventually became known as Prolog starts with Alan Robinson, who around 1965 developed his resolution method (which includes unification) as a computational approach to theorem proving. The next step occurred in the early 1970s, when a group headed by Robert Kowalski in the AI Department of the University of Edinburgh was using interpreters for logic in experiments on natural language understanding. Although this work was very interesting, the programs were too inefficient to be practical. A little later, Alain Colmerauer in France built a much more efficient interpreter, and called the system Prolog (for "programming in logic"); he also introduced some features which departed significantly from pure logic, such as the "cut". In the 1980s, David Warren at SRI designed an efficient abstract machine which became the basis for all subsequent Prolog compilers. Logic programming was the focus of the Japanese Fifth Generation Project, which aimed to build highly efficient machines for logic programming languages, and use them for applications that required much greater intelligence than was then available. The fact that the Fifth Generation Project was a very expensive failure contributed to a decline in interest in logic programming for large applications. However, logic programming remains one of the best approaches for natural language processing, as well as a very interesting adventure in programming language design. Note that all these projects were AI oriented; as usual, this cultural context helps to understand the design decisions that went into Prolog, such as its reuse of the Lisp data structure for lists.
As a footnote, Alfred Horn, for whom Horn clauses are named, had nothing to do with logic programming; he was a professor of logic at UCLA who in 1951 wrote paper using the sentences that now bear his name for reasons having little to do with computer science. As a second footnote, it seems to me rather misleading to call Prolog a "logic programming" language, since it departs rather far from logic; I would rather have had it called a "relational programming" language, because it is the use and manipulation of relations that is most characteristic of its programming style.
We may divide programming languages into procedural languages, and non-procedural languages, where the first includes imperative languages (where assignment is fundamental, as in C, Java, etc.) and functional languages (where functions are first class citizens, as in ML, LISP, etc.). Non-procedural languages are not just what's left over, but are supposed to have declarative flavor, i.e., to (more or less) allow programers to state a problem without stating how to solve it. Most non-procedural languages are logic languages, in which the syntax and semantics are supposed to be based on some form of logic. For example, Prolog is (partially) based on Horn clause logic, and OBJ is based on equational logic. However, SETL is based on set theory, which is not a form of logic, so it is a non-logical and non-procedural language.
6.1 It is annoying that Stansifer reverses the usual convention for capitalization for variables and relations (and function symbols, which are discussed later); these notes will use the same conventions that most Prolog systems use, including BinProlog, which is the "official" version of Prolog for this class. You should be sure to read the short "getting started" Notes on BinProlog, where many important and useful facts are given.
The following may help you to convert from thinking in the imperative programming paradigm to that of the logic paradigm; this material was adapted by Fox Harrell from The Art of Prolog, by Leon Sterling and Ehud Shapiro.
deleteprocedure which removes all occurrences of some element from a list.
?- delete([a,b,c,b],b,X), and its answer, which in this case should be
X = [a,c].
[X|Xs]. There are two possibilities.
In the second case since the head of the list is not equal to X we want to keep it around. In this case an appropriate rule is
6.1.2 On page 195, Stansifer says that one of the examples there has an infinite number of solutions. This is false; what is true is that the Prolog search procedure goes into an infinite loop for this example. Aslo, in Figure 6.4 on page 199, the arrow that loops back to the top should begin from "6" not "5" since it is rule 6 that allows the leftmost "F" of "F,A,E" to be eliminated, yielding "A,E" (thanks for Rober Bharath for this).
6.2 The discussion of cut in Stansifer is not very good; I will try to do better. Actually, there is an interesting reason why it is difficult to explain cut well: whereas Prolog is supposed to be a non-procedural language, cut cannot be understood in non-procedural terms; i.e. cut is procedural, and thus is in conflict with the basic design philosophy of the language; i.e., cut is a kind of kludge, and kludges are usually difficult to explain. Also, one can do some rather surprising things with cut, which at first may seem counter-intuitive, and are easy to get wrong.
Cut is a nullary predicate, denoted !, which always succeeds when the search reaches it, but which also has a rather drastic side effect on the search. I like to think of a cut in a clause as a way for the programmer to tell Prolog something like the following:
p2(X), which are mutually exclusive, i.e. exactly one of them is true for each value of
X; an example is predicates for even and odd integers. Now suppose we have the following:
Xis even, we should check
a(X), but there is no need to check the second
q(X)clause, because we know it will fail. Therefore it is safe to write the following, which is also more efficient:
Xis even, the first literal of the first clause succeeds, so the cut prevents backtracking on
even(X), and the second clause will not be tried. (This example is on page 201 of Stansifer, but it is not explained clearly.)
Here is another example, which is a little more complex, and which also illustrates the Prolog search strategy:
a(X), the first clause will match, giving us a tentative value
X, provided the body
bof the clause succeeds; next, the third clause matches
b, and raises the goal
c, which fails; because of the cut, Prolog will not undo its commitment to the third clause for
b, and so it will never find the fifth clause, which would have succeeded; therefore it backtracks all the way to the top, undoes the first clause, and tries the second, now with a tentative value
Xand a subgoal
e, which then succeeds. So in this case, the cut causes Prolog to miss a solution. Of course, without the cut, Prolog will find both solutions. (Note that the fifth clause would not be needed for some other versions of Prolog, which automatically fail if there is no clause for some literal.) Here is a link to the BinProlog output for this example, which includes a trace.
Now let's look at a couple of stranger things that one can do with cut. The first is something that rather excited researchers when they first discovered it, a way of introducing exceptions into Horn clause rule sets. This was considered important because exceptions are common in knowledge representation, and knowledge representation was supposed to be one of the most important applications of Prolog. This example says that all birds fly, except for penguins, which are birds, but do not fly (the version of this example on page 201 of Stansifer is wrong):
fly(sparrow)will succeed in the usual way, but
fly(penguin)will fail due to the fourth clause, and the cut will prevent to fifth clause from being tried, so penguins will not be claimed to fly, even though they are birds. The query
bird(X)will find all three birds, but the query
fly(X)does not work correctly (at least, not in BinProlog); this is due to the conflict between the logic and procedural features.
Another odd looking use of cut is to implement so-called negation by failure, the code for which is as follows:
fly(penguin). The above query succeeds by showing that
fly(penguin)fails. Here's how it works: the first clause for
notwill try to make
fly(penguin)succeed; if it did succeed, then the subsequent cut and
failwould make that clause fail without trying the next
notclause. However, since
fly(penguin)fails, the second
notclause is tried; it matches, and causes
not(fly(penguin))to succeed. On the other hand, the query
6.4 We can look at unification as solving systems of equations, using a process rather like what is called Gaussian elimination in linear algebra. Let's do some simple examples to illustrate the idea. Suppose the two terms to be unified are
X, Y, Zsuch that
f, so this is OK. Then in order for the two terms to be equal, the following equalities have to hold:
Yin terms of
X, while the second is solved in the same way that we started this process, checking that the head functors are equal, and then noting that for the two terms to be equal, we must have
Y = g(X), from the second of these equations we get
X = Z, which we have already deduced. So putting everything together, we get
X, and then the values of
Y, Zare uniquely determined from that. In fact, the above is a substitution, and moreover, it is the most general substitution which is a solution to the original unification problem, in the sense that any other solution is a substitution instance of this one. The existence of such solutions when there are any solutions at all, is the main result about unification, first proved by Alan Robinson.
The above procedure is essentially the unification algorithm devised by Montanari and Martelli; it is linear in the size of the problem. Notice that if the problem had been slightly different, there would have been no solution. For example, if the original terms had been
Of course, unification can also fail in simpler ways, for example, if the heads of subterms fail to match, as in
h(Y, g(Z))cannot be unified because
fis unequal to
The algorithm used for solving certain queries in BinProlog actually goes beyond unification, in that it introduces new variables, and searches for values for them; this is called narrowing. Examples of this can be given using the the code in pl/addn.html, such as
Difference lists are an important idiom in Prolog programming; they
are a tricky way of representing lists that allow for certain kinds of
computation on lists to go much faster, including parsing. First, the
intuition behind this idiom is to represent a list as the "difference" of two
other lists. As an analogy, the number
6 can be represented as
9 - 3, or
13 - 7, or more generally,
(6 + Y) - Y for any number
Y. Similarly, we can
represent the list
[1, 2, 3] as the difference between
2, 3, 4, 5] and
[4, 5], or between
[1, 2, 3, a, b,
[a, b, c], or more generally between
[1, 2, 3
| Y] and
Y, for any list
Y. Symbolically, we
dlis particularly common. Thus, the above representations of
[1, 2, 3]would be written
One very nice example of the use of this idiom is to get a very fast version of append for difference lists; the definition is also very simple.
Ywith the difference of
Zis the difference of
Z. In the analoguous arithmatic notation, we may write this as
It is common to use difference lists for parsing in Prolog, instead of using append, because difference lists can be made to do appends implicitly, which is much more efficient. This is illustrated in the linked Prolog code for parsing a simple English language grammar; and here is a link to its output. A good reference for difference lists and their use in parsing is The Art of Prolog, by Leon Sterling and Ehud Shapiro (MIT Press).