An Essay on Comparative Programming Linguistics
by Joseph Goguen

A common pastime of computer scientists is to argue about which programming language is the best; people outside the field would probably be surprised to see how emotional, sometimes even violent, these discussions can become. This essay takes the point of view that such arguments are generally based on a serious misunderstanding of the subject of programming languages, which has been fostered by (1) the way that programming languages are generally taught, and (2) some inherent properties of human beings. I will also argue (3) that social factors must play a key role in any serious understanding of programming languages.

To begin with point (2), mastering a programming language represents a considerable investment, which one is naturally reluctant to lose, and even more reluctant to have to repeat with another language, and then another, ... So some inertia is a good thing here, or we would spend much of our time learning new languages, with little time left for using them. Less obviously, but at least as importantly, each language comes along with a certain mindset or worldview, that is, a certain way of thinking, which can be hard to change, because one is not even aware of having it - it just "seems natural" and goes without question. In the 1950s, assembly language programmers had a hard time accepting FORTRAN; later, FORTRAN and COBOL programmers had a hard time learning the then new so-called "high level" languages like ALGOL 60; and even today, people reared on C may have trouble with "strange" languages like Prolog, Lisp and ML. This kind of analysis is essentially cognitive, having to do with the psychology of individuals.

Passing to (3) and a social level of analysis, I believe it is not possible to properly evaluate a programming language without examining its cultural and historical context, which includes who created it, for what reasons, with what constraints, and also who uses it, how they use it, and for what purposes. Languages have communities of users, and these communities have different needs, different skills, and different goals. When looked at carefully from this point of view, all of the currently popular languages are "good" for certain communities, for certain purposes. There is no such thing as a "best" language!

For example, C is great for systems programming. It gives you access to low level details, bits even, along with high level features like procedures. If you have to write, say, a device driver, this is just what you want. The disadvantage is that the low level details are much more likely to be machine dependent, so that if you use these features, you lose portability. But for something like a device driver, this is only to be expected. So what is best about C is also what is worst about it. C was developed for systems programming, and its first major application was Unix. (In terms of the actual history, C is derived from BCPL, "Basic CPL," which was a simplified subset of CPL, the "Combined Programming Language" (jokingly, "Christopher's Programming Language"), designed largely by Christopher Strachey and colleagues at Cambridge and London. Although it was never fully implemented, this little known language introduced many important features that are in common use today.)

FORTRAN is great for numerical programs, and that of course is what it was originally designed for, by John Backus and others at IBM, starting in the early 1950s. It was also designed to run fast, so that assembly programmers could not complain that the language was useless. These two facts explain many seemingly arbitrary design choices, such as fixed sizes for arrays, and the huge effort that went into the optimizer in the compiler. Again, exactly what makes the language good for what it was designed for may make it poor for other, different purposes. Of course, later versions of FORTRAN have many more features than the original version did, now including even certain aspects of object orientation.

COBOL may look dumb, but it was designed around 1960 for bookkeeping applications, where readability by relatively untrained people can be more important than it is in (say) scientific applications. COBOL was designed under the sponsorship of the US Department of Defense, and its design was strongly shaped by Grace Murray Hopper of the Navy, who later became a rear admiral; it is still widely used in business, e.g., by banks.

ALGOL 60 was originally designed, from the late 1950s into the 1960s, by an international committee that later became IFIP Working Group 2.1, and which met largely in Europe. It was intended particularly for expressing algorithms, and it later became popular for education. John Backus, Peter Naur, and John McCarthy were some of the important committee members. Pascal and the Modula languages are part of the same tradition, and were largely designed and implemented by Niklaus Wirth. Important features found in many other languages today were introduced by ALGOL 60.

LISP was the first language to support symbolic computation and recursion. It was designed by John McCarthy and others at MIT in the late 1950s to support research in Artificial Intelligence, which needed exactly these features, plus (they thought at the time) the possibility of self-modifying programs. These goals help explain seemingly strange features like S-expressions, which can represent both data and code. LISP was also the first interactive language, the first language to be supported by an interpreter, the first to have garbage collection, and the first to have higher order functions.

Simula 67 emerged in the late 1960s from the Norwegian Computing Center, as a language for doing simulations; its designers were Kristen Nygaard and Ole-Johan Dahl. Simula is the source of all object oriented languages; its features include objects, inheritance, messages, and more. Smalltalk popularized these ideas, but was not necessarily a step forward. C++ is hampered by the requirement that it be compatible with C, but it has features, including inheritance and modules, that make it better than C for large programs.

Prolog was designed by AI researchers interested in logic, who were especially concerned with natural language processing, and indeed, it is probably easier to write parsers and semantic analyzers in Prolog than in any other language. The early design work was done in the early 1970s at Edinburgh by Robert Kowalski and others, and the first interpreter was written in Marseille, France, by Philippe Roussel and Alain Colmerauer.

Ada was designed for embedded real time systems, and it also has a module system that supports code reuse better than any previous language. Ada was designed during the late 1970s and early 1980s in France by a team headed by Jean Ichbiah, funded by the US Department of Defense, and the concerns of the military can be seen in the goals of the language. Bernd Krieg-Brückner designed the module system. In terms of its general features, Ada is in the ALGOL family. It is named after Lady Ada Augusta Lovelace, the world's first programmer, and daughter of the great English poet Lord Byron; she worked closely with Charles Babbage, and wrote programs for his gear-driven mechanical computers.

Java emerged in the mid 1990s, and was designed with the world wide web in mind. And again, features that make it great for that purpose make it less suitable for some other purposes. For example, byte code running on an interpreter is not a very good idea for an operating system. Similarly, security concerns dictated leaving out many powerful features that other languages have, such as pointers, and direct access to the user's file system.

ML was developed at Edinburgh during the 1970s and early 1980s by Robin Milner, Mike Gordon, Dave MacQueen and others, as the metalanguage (hence "ML") for a theorem prover; it then gradually developed into a programming language in its own right. Its systematic use of types for polymorphic higher order functions is one of its main innovations; its sophisticated type checker uses unification. These features were developed to ensure that things of type "theorem" really did have complete and correct proofs. The current compiler and module system are largely due to Dave MacQueen, who got his PhD under Rod Burstall at Edinburgh.
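
To make this concrete, here is a small illustrative sketch in OCaml, a language in the ML family (the example is mine, not from the original LCF system): with no type annotations at all, the compiler infers the most general polymorphic types, shown in the comments, by solving type equations with unification.

    let compose f g x = f (g x)
    (* inferred: ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)

    let twice f x = f (f x)
    (* inferred: ('a -> 'a) -> 'a -> 'a *)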

OBJ is not really a programming language, but rather an algebraic specification language with an executable subset. It was originally developed, first at UCLA and then at SRI, beginning in the mid 1970s, for mechanizing algebraic semantics, but it later found several other uses, including theorem proving, specifying systems, and writing programs, especially prototypes. Parts of its design, especially ideas for its powerful parameterized module system, come from an earlier specification language called Clear, developed by Goguen and Burstall. These ideas, in the concrete form that they take in OBJ, have influenced the module systems of Ada, C++, ML, Lotos, and Modula-2 and -3; ML even uses some of the terminology of algebraic semantics for its module system, which provides the best support for code reuse of any current generation language.

Obviously, a great deal more could be said on this topic, going into much more detail on particular languages, discussing additional languages, and comparing their features in greater depth; indeed, I am aware of having oversimplified certain points. But the main point should be clear by now: a purely technical analysis of programming languages can never be adequate for understanding their diversity, their features, their uses, or the arguments that people get into over them; it is essential to look at the context of their use, in order to understand them in any real depth; it is also helpful to take account of their history. Of course, I am not arguing for neglect of the technical features of languages, but rather for understanding them in their context. I believe that the almost exclusive emphasis on technical aspects is a serious problem with the way that programming is often taught today. Such an approach is even more harmful when an instructor pushes his or her own favorite language as the "best".


After that introduction, I would like to look at the overall trajectory of the historical development of programming languages from a view that combines social and technical aspects. A first observation is that new programming languages have provided increasingly abstract features over time. The very first languages were machine languages, perhaps even binary. After that came assembly languages, which at first just allowed symbolic names for binary sequences, but which later became more sophisticated, for example, in allowing symbolic names (labels) for lines.

Next come the so-called "high level" languages, which allow users to write at least numerical expressions and to declare at least arrays of numbers (e.g. FORTRAN). There has been a lot of subsequent evolution in this category, including procedures, blocks, and types with ALGOL 60, and then classes with inheritance for the object oriented languages. Modules allow an even higher level of abstraction, especially if they can be parameterized. Finally, we see the emergence of languages that are more directly based on ideas from logic, namely functional languages like ML, and so-called "logic" languages like Prolog.

We can make some further observations about this evolution. First, all of these ways of achieving greater abstractness have been inspired by developments from mathematics, though their realization in programming does not usually have the same concise form and semantics as in mathematics. For example, the idea of having local declarations within procedures and blocks mimics the conventions of mathematical exposition (e.g., the scope of "let n be an integer" within a part of a proof). Second, features that support greater levels of abstraction affect larger amounts of program text: variables are tiny, expressions are small, procedures are larger, and modules can be quite large. Current research concerns software architectures and architecture patterns, which are even larger.
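
As a tiny illustration of the first observation (the fragment below is OCaml, and purely hypothetical), a local declaration plays exactly the role of "let pi be ..." in a mathematical argument, with a scope limited to the enclosing definition:

    let area r =
      let pi = 3.14159 in   (* pi is visible only within this definition *)
      pi *. r *. r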

Now let us inquire into the causes of this rapid evolution. Two of these seem the most important: (1) the ongoing exponential growth in computer power; and (2) the demand from society for ever more complex functionality. Point (1) is perhaps most famously expressed in "Moore's Law," which says that processor power roughly doubles every 18 months. Point (2) comes from people perceiving new needs, wanting to extend existing systems, wanting to take advantage of new technologies, etc. A current example is the DoD's new programs to combat terrorism using advanced computer technology. The Japanese Fifth Generation project is a famous older example. Point (1) has made it possible to run ever more complex systems, and point (2) has led to wanting ever more complex systems. So programming languages have been forced to evolve to meet the demands of writing ever larger programs, and that has meant providing ever higher levels of abstraction, so that it is easier to read, write, and maintain them.

Next, let's consider programming paradigms, where a programming paradigm is defined to be a logic that underlies a distinctive style of programming. For example, the imperative paradigm is built on the logic of assignment and sequencing. In this light, object oriented programming is not a separate paradigm, but rather a refinement of the imperative paradigm, extending the logic from single cells (for program variables) to objects (which can hold multiple data items), and generalizing assignment to methods and attributes. Both cells and assignment come from the Turing model of computation, which inspired the von Neumann computer architecture on which these languages typically run.
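
As an illustration, here is a sketch in OCaml (which supports both styles; the names are invented for this example). The first fragment is the bare imperative paradigm, a cell updated by assignment and composed by sequencing; the second packages several cells into an object, whose methods generalize assignment to structured updates.

    (* the imperative core: a cell, assignment, and sequencing (the ";") *)
    let counter = ref 0
    let bump () =
      counter := !counter + 1;   (* assign to the cell *)
      !counter                   (* then read it back *)

    (* object orientation as a refinement: several cells grouped together,
       with methods that generalize assignment *)
    class account (initial : int) = object
      val mutable balance = initial
      method deposit x = balance <- balance + x
      method balance = balance
    end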

It is interesting to notice that the syntactic features that support assignment and sequencing are typically extremely simple in imperative languages, e.g., the equality sign for assignment and the semicolon for sequencing; and in some languages, there isn't any symbol at all for sequential composition - the empty symbol "space" is used. This is an instance of a very general observation from linguistics, that the more important something is, the shorter it is. For example, among the names of colors, the most important are the shortest (red, blue, green, black, white) while the more esoteric have longer names (chartreuse, vermilion, magenta, cappuccino cream, Arizona white, ...).

Functional programming is another distinctive paradigm, built on the logic of computable functions, either in the form of Church's lambda calculus, or else Curry's combinatory calculus. There is a nice theorem of logic that the functions computable in this paradigm are the same as those computable in the imperative paradigm; one usually says that both paradigms are Turing complete in the sense of being able to compute all functions computable by a Turing machine. As a special kind of functional programming, I should also mention equational programming, an area in which I have worked, where functions are defined by first order equations; again, it is known that this approach, even with the first order restriction, is Turing complete, and OBJ is one of several rather efficient implementations.
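
To make the contrast with the imperative sketch above concrete, here is a minimal functional and equational example, again in OCaml and again purely illustrative: each clause of the definition can be read as a first order equation, and there is no assignment or sequencing anywhere.

    let rec fact n = match n with
      | 0 -> 1                   (* fact 0 = 1 *)
      | n -> n * fact (n - 1)    (* fact n = n * fact (n - 1) *)

    (* higher order functions are just as direct *)
    let rec map f xs = match xs with
      | [] -> []
      | x :: rest -> f x :: map f rest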

Another quite distinctive paradigm is so-called logic programming, which really would better be called relational programming, since relations are the essence of its programming style, or perhaps "Horn clause programming," since Horn clauses are the syntactic form of its logical basis. Again there is a nice theorem which says that the functions computable in this paradigm are the same as those of the other two paradigms, i.e., it too is Turing complete. It is again interesting to notice the extreme simplicity of the two most basic syntactic constructs in Prolog's Horn clause notation, namely ":-" for "if" and "," for "and".
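
To suggest what a Horn clause means computationally, here is a toy sketch in OCaml rather than Prolog (and it is only a propositional forward chainer, not Prolog's actual resolution with unification): each clause is read "head :- body", and an atom is derivable exactly when some clause for it has a fully derivable body.

    type atom = string
    type clause = { head : atom; body : atom list }   (* head :- body *)

    (* repeatedly fire clauses whose bodies are already known, until nothing
       new can be added; the result is the set of derivable atoms *)
    let derivable (clauses : clause list) : atom list =
      let rec step known =
        let newly =
          clauses
          |> List.filter (fun c ->
               List.for_all (fun a -> List.mem a known) c.body
               && not (List.mem c.head known))
          |> List.map (fun c -> c.head)
        in
        if newly = [] then known else step (newly @ known)
      in
      step []

    (* example: rain.  cold.  wet :- rain.  slippery :- wet, cold. *)
    let example =
      [ { head = "rain"; body = [] };
        { head = "cold"; body = [] };
        { head = "wet"; body = ["rain"] };
        { head = "slippery"; body = ["wet"; "cold"] } ]
    (* derivable example yields all four atoms, including "slippery" *)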

Pure functional and logic programming do not have assignment statements (although most actual languages make some provision for assignment, as a concession to the efficiency resulting from better use of the von Neumann processor architecture).

Both functional and logic programming are directly inspired by certain logics, whereas with imperative programming, one has to struggle a bit to see the underlying logic. Notice also that modularity is not part of any paradigm, and it may or may not be present in any one of them; so this feature is orthogonal to the three main paradigms. Prolog and Lisp do not support modularity, whereas ML provides very good support.
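
To make the modularity point a little more concrete, here is a small sketch of the kind of parameterized module found in the ML family (written in OCaml, with invented names): a functor builds a new module from any module that matches a given interface, which is the basis of the code reuse mentioned earlier in connection with Clear, OBJ, and ML.

    module type ORDERED = sig
      type t
      val compare : t -> t -> int
    end

    (* a module parameterized by any ordered type *)
    module MakeSortedSet (O : ORDERED) = struct
      type elt = O.t
      type t = elt list                (* kept sorted, without duplicates *)
      let empty : t = []
      let rec add x s = match s with
        | [] -> [x]
        | y :: rest ->
            let c = O.compare x y in
            if c = 0 then s
            else if c < 0 then x :: s
            else y :: add x rest
    end

    (* instantiating the parameter gives a concrete module *)
    module IntSet = MakeSortedSet (struct type t = int let compare = compare end)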

We should also mention some other developments, which in my opinion do not qualify as distinct paradigms. These include real time computing, distributed computing, and parallel computing. In fact, the main languages for each of these can be seen as refinements of the imperative paradigm, although there are also interesting refinements within other paradigms.

Finally, I would like to briefly summarize some concepts from linguistics that partially inspired the approach taken in this essay. Diachronic linguistics is the study of trends across time, that is, of the historical development of languages, whereas synchronic linguistics is the comparative study of languages (including dialects, etc.) at the same time. So this course has largely taken a synchronic approach to programming languages, being mainly a comparative study of currently popular languages. However, we found (just as in the study of natural languages) that it was very revealing to also consider the historical development of languages, and in particular, to relate languages to the particular culture in which they were developed; without this, it is impossible to understand why languages have many of the otherwise strange looking features that they do have.

The programming language paradigms that we have studied, including the imperative, functional, and logic (or relational) paradigms, are similar to the largest families of natural languages, such as Indo-European. Just as with natural languages, programming languages often have several dialects, which are spoken within particular subcultures, often for very particular social, cultural and historical reasons. (For example, consider the different dialects of Java promoted by Microsoft and by Sun Microsystems.)

One can also find examples of the natural language concept of an idiom in programming languages. An idiom is a phrase or way of using language that is difficult or impossible to understand except as a whole; its meaning cannot be easily (or at all) determined from its parts. An example from English is the phrase "How do you do?", which sounds like an inquiry about health, or luck, or some such, but really just means "Hello." That it is treated as a whole is shown by the fact that it is contracted to "Howdy". An example of an idiom from Prolog is difference lists, usually indicated by a functor named dl, and used to speed up certain algorithms in a very particular way that cannot be easily guessed from any single line of code.
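
The difference list idiom itself really lives in Prolog, where it exploits an unbound list tail; but the analogous idiom in a functional language conveys the same point, so here is a hedged sketch in OCaml: a list is represented by the function that prepends it, so that appending two lists becomes constant-time function composition rather than a linear walk, which is not something one could guess from any single line that uses it.

    type 'a dlist = 'a list -> 'a list           (* a list, as "what it prepends" *)
    let empty : 'a dlist = fun rest -> rest
    let singleton x : 'a dlist = fun rest -> x :: rest
    let append (d1 : 'a dlist) (d2 : 'a dlist) : 'a dlist = fun rest -> d1 (d2 rest)
    let to_list (d : 'a dlist) : 'a list = d []  (* back to an ordinary list *)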

Another important distinction in linguistics is that between a syntagmatic focus and a paradigmatic focus. The former is concerned with syntax as such, while the latter considers the structure of items that can be substituted for each other within certain contexts; we could say that syntagmatics is concerned with the horizontal structure of language, whereas paradigmatics is concerned with the vertical structure of language. In linguistics, the paradigmatic approach is a bit of a kludge, because it tries to address semantic issues with syntactic tools. For programming languages, paradigmatic considerations bring us into areas of semantics, such as the values associated with types, run time errors, and run time types. For example, the "paradigm" of a variable consists of all the values that can be substituted for it, which amounts to the set of data items that have the same type as that variable.

For further discussion of the above natural language concepts (and many others which are also of considerable interest), I would particularly recommend the following:

Aspects of Language, by Dwight Bolinger, Harcourt Brace Jovanovich, 1975.
It is out of print, but available in libraries, and used copies are still for sale; the second edition is the latest.


Thanks to Prof. Gunter Rote for some corrections.