CSE 230: Principles of Programming Languages
Notes on Chapter 2 of Stansifer (syntax, grammar, and Post systems)

2.1 Stansifer's ideas on natural language seem to have come mainly from formalists like Chomsky, rather than from linguists who study what real people actually write and say. For example, it is easy to write a short play about painters redoing a trading room in a bank, where desks are named "one desk", "two desk", "FX desk", etc., and where one of the painters has the line, "Painted two desk" in response to his boss's asking what he did. A number of disgruntled empirical linguists have written little poems that end with the line "colorless green ideas sleep furiously", meaning something like "Chomsky's uninteresting untried theories do nothing much after a lot of effort". (It is an interesting exercise to try this yourself.) Similarly, it is easy to imagine a Star Trek episode in which some creatures called "time flies" have affection for a certain arrow. We may conclude that almost anything can be made meaningful, given the right context.

The important point here is that in natural language, context determines whether or not something makes sense; formal syntax and semantics are very far from adequate, and indeed, the distinction among syntax, semantics and pragmatics does not hold up under close examination. On the other hand, the formal linguists' way of looking at syntax and semantics works rather well for programming languages, because we can define things to work that way, and because traditionally, programming language designers want to achieve as much independence from context as possible (though this might change in the future).

2.1.1 These principles are really important; please think about them, and the examples that are given. Also, notice that the situation for natural language is very different.

2.2 If we denote the empty set by "{}" and the empty string by the empty character (i.e., by writing nothing at all), then there will not be any way to tell the difference between the empty set and the set containing the empty string. So this is a bad idea. Instead, we can use the Greek letter epsilon for the empty string, and the Danish letter "O-with-slash" for the empty set, as is usual in mathematics. Sometimes I like to write "[]" for the empty string, while Stansifer sometimes writes "" for it, and some other people use the Greek letter lambda! I am afraid that you will have to get used to there being a lot of notational variation in mathematics, just as there is a lot of notational variation for programming languages; in fact, I am afraid that notational variation will be an ongoing part of the life of every computer science professional for the foreseeable future, so that developing a tolerance for it should be part of your professional training. But to help out a bit, I will write the epsilon for the empty string and for set membership in different ways; the set epsilon will be much bigger, while the string epsilon will be (insofar as I can do it) slanted a bit.
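To see the distinction concretely, here is a quick sketch in Python (the variable names are mine), where the empty set and the set containing the empty string really are different values:

```python
# The empty set versus the set containing the empty string: two
# genuinely different values, matching the distinction made above.
empty_set = set()          # the empty set (O-with-slash in the text)
just_epsilon = {""}        # the set whose only element is the empty string

print(len(empty_set))      # 0
print(len(just_epsilon))   # 1
print("" in empty_set)     # False
print("" in just_epsilon)  # True
```

Note that in Python the literal {} denotes an empty dictionary, not an empty set, which is yet another instance of the notational variation discussed above.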

Notice that the definition of "language" in section 2.2.1 is not suitable for natural language, where context determines whether something is acceptable or not, and where even then, there may be degrees of acceptability, and these may vary among different subcommunities, and even individuals, who speak a given language; moreover, all of this changes (usually slowly) over time. For example, "yeah" is listed as a word in some dictionaries but not in others; perhaps it will gradually become more and more accepted, as have many other words over the course of time. At the level of syntax, English as spoken in black Harlem (a neighborhood of New York City) differs in some significant ways from "standard English" (if there is such a thing), in its lexicon, its syntax, and its pronunciation; this dialect has been studied under the name "Black English" by William Labov and others, who have shown that in some ways it is more coherent than so called "standard English".

It may be interesting to know that the first (known) grammar was given by Panini for Sanskrit, a sacred language of ancient India, more than 2,500 years ago. This work included very sophisticated components for phonetics, lexicon, and syntax, and was in many ways similar to modern context free grammars. The motivation for this work was to preserve the ancient sacred texts of Hinduism.

If we let R be the set of regular expressions over a finite alphabet A, then the "meaning" or denotation for R is given by a function

   [[_]] : R  ->  2 ** A* , 
where ** indicates exponentiation (sorry about that - HTML is lousy for formulae), so that 2 ** X indicates the set of all subsets of X, and A* is the set of all finite strings over A. Notice that the semantics given by Stansifer for regular expressions is compositional, in the sense that the denotation of each expression is computed from the denotations of its parts.
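As an unofficial illustration, here is a sketch in Python of such a compositional denotation function. Since [[r*]] is in general an infinite set, the sketch only computes the strings of length up to a given bound; the tuple encoding of regular expressions ("sym", "alt", "cat", "star", etc.) is my own, not Stansifer's notation:

```python
# A sketch of the compositional semantics [[_]] : R -> 2 ** A*,
# restricted to strings of length <= bound so that star stays finite.
# Each case computes the denotation of an expression from the
# denotations of its parts, which is exactly what compositional means.

def denote(r, bound):
    """Return the set of strings of length <= bound denoted by regex r."""
    tag = r[0]
    if tag == "empty":                       # [[empty]] = {}  (the empty set)
        return set()
    if tag == "eps":                         # [[eps]] = the set { "" }
        return {""}
    if tag == "sym":                         # [[a]] = { "a" } for a in A
        return {r[1]}
    if tag == "alt":                         # [[r + s]] = [[r]] union [[s]]
        return denote(r[1], bound) | denote(r[2], bound)
    if tag == "cat":                         # [[r s]] = pairwise concatenation
        return {u + v
                for u in denote(r[1], bound)
                for v in denote(r[2], bound)
                if len(u + v) <= bound}
    if tag == "star":                        # [[r*]] = union of all powers of [[r]]
        result, frontier = {""}, {""}
        while frontier:
            frontier = {u + v
                        for u in frontier
                        for v in denote(r[1], bound)
                        if len(u + v) <= bound and u + v not in result}
            result |= frontier
        return result
    raise ValueError("unknown regex constructor: " + repr(tag))

# (0 + 1)* over A = {0, 1}: all binary strings up to the bound
bits = ("star", ("alt", ("sym", "0"), ("sym", "1")))
print(sorted(denote(bits, 2)))   # ['', '0', '00', '01', '1', '10', '11']
```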

Although there are clever ways to give a compositional semantics the property that the meaning of a part depends on the context within which it occurs, that is, on the other parts that surround it, a mathematical semantics for a programming language is not going to give us what we usually intuitively think of as the meaning of programs written in it. For example, the fact that a certain FORTRAN program computes a certain formula does not tell us that it gives the estimated yield of a certain variety of corn under various weather conditions, let alone that this figure is only accurate under certain assumptions about the soil, and even then it is only accurate to within about 5 percent. Yet all this is an important part of the meaning of the program.

2.3 Stansifer's exposition of attribute grammars can seem pretty difficult to understand. However, the special case of grammars with just synthesized attributes is much easier; it can be explained without too much notation, and it can also be illustrated rather simply. For instance, here is a context free grammar for a simple class of expressions:

   S -> E
   E -> 0
   E -> 1
   E -> (E + E)
   E -> (E * E) 
where N = {S, E} and T = {0, 1, (, ), +, *}. We can find the value of any such expression by using an attribute grammar with just one (synthesized) attribute, val, by associating the following equations with the above rules:
   S.val = E.val
   E.val = 0
   E.val = 1
   E1.val = E2.val + E3.val
   E1.val = E2.val * E3.val 
Then a parse tree for the expression (1 + 1) * (1 + 0) looks as follows
              S
              |
              E
            / | \
           /  |  \
          /   *   \
         /         \
        E           E
      / | \       / | \
     E  +  E     E  +  E
     |     |     |     |
     1     1     1     0  
where for simplicity the parentheses are left out. Then the synthesized attribute will percolate up the tree, by applying the equations from the bottom up, producing the values that are shown here
              2
              |
              2
            / | \
           /  |  \
          /   *   \
         /         \
        2           1
      / | \       / | \
     1  +  1     1  +  0  
where we see that the final value is 2.
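To make the bottom-up evaluation concrete, here is a small sketch in Python; the tuple encoding of parse trees is my own convenience, not Stansifer's notation:

```python
# A bottom-up evaluator for the synthesized attribute val, following
# the attribute equations above.  Parse trees are nested tuples:
# the leaves 0 and 1 correspond to the rules E -> 0 and E -> 1, and
# ("+", l, r) and ("*", l, r) to E -> (E + E) and E -> (E * E).

def val(tree):
    """Compute the synthesized attribute val by structural recursion."""
    if tree == 0 or tree == 1:      # E.val = 0 and E.val = 1
        return tree
    op, left, right = tree
    if op == "+":                   # E1.val = E2.val + E3.val
        return val(left) + val(right)
    if op == "*":                   # E1.val = E2.val * E3.val
        return val(left) * val(right)
    raise ValueError("unknown operator: " + repr(op))

# (1 + 1) * (1 + 0), the example whose parse tree is drawn above
print(val(("*", ("+", 1, 1), ("+", 1, 0))))   # 2
```

The recursion visits the tree bottom up, just as the values percolate up the drawn tree, and produces the same final value 2.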

For another illustration, we can do the binary digits example in section 2.3.1 of Stansifer in a simpler way. The grammar here is

   B -> D
   B -> DB
   D -> 0
   D -> 1 
and the equations associated with these four rules are
   B.pos = 0              B.val  = D.val
   B1.pos = B2.pos + 1    B1.val = D.val *(2 ** B1.pos) + B2.val
   D.val = 0
   D.val = 1 
It is now a good exercise to compute the value of the binary number 1010 by first writing its parse tree and then computing how the values of the two (synthesized) attributes percolate up the tree.
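For checking your answer to the exercise, here is a sketch in Python of the same computation; encoding a binary numeral as a string of digits is my own convenience, but the equations are exactly those associated with the four rules above:

```python
# The two synthesized attributes pos and val for the binary grammar.
# Each call returns the pair (B.pos, B.val) for the nonterminal B
# deriving the digit string b, using the equations given above.

def attrs(b):
    """Return (B.pos, B.val) for a nonempty binary string b."""
    d = int(b[0])                      # D.val = 0  or  D.val = 1
    if len(b) == 1:                    # B -> D:   B.pos = 0,  B.val = D.val
        return 0, d
    pos2, val2 = attrs(b[1:])          # attributes of B2 in B1 -> D B2
    pos1 = pos2 + 1                    # B1.pos = B2.pos + 1
    return pos1, d * 2 ** pos1 + val2  # B1.val = D.val * (2 ** B1.pos) + B2.val

print(attrs("1010"))   # (3, 10)
```

So the binary numeral 1010 has pos 3 (it has four digits) and val 10, as expected.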

We will see later that the parse trees of a context free grammar form a very nice little algebra, such that the values of synthesized attributes are given by a (unique) homomorphism into another algebra of all possible values.

There is a slight inconsistency in Stansifer about whether In(S) should be empty; my preference is that it need not be, because S can occur on the right side of rules, as in S -> ASA on page 55. Stansifer is also not very forthright about the evaluation of attributes; the fact is that it is possible for a definition to be inconsistent, so that some (or even all) attributes do not have values; however, it is rather complex to give a precise definition for consistency. I also note that the diagrams for attribute evaluation are much more effective if they are drawn in real time using several colors; there are many other cases where live presentation works much better than reading a book; this should be good motivation for your coming to class!

2.4 The first two occurrences of the word "list" in section 2.4.1 (page 62) should be replaced by "set". Stansifer uses the phrase "tally notation" for the notation that represents 0, 1, 2, 3, ... by the empty string, |, ||, |||, ..., but it is a variant of what we will later call Peano notation. There is a typo on page 64, where it says that || + || = |||, i.e., 2 + 2 = 3! For future reference, there is an OBJ version of the Post system for the propositional calculus. Also, the production

     xax
     ---
     xxx  
on page 65 is called an "axiom" but it isn't.
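Since the corrected equation is || + || = ||||, it may help to see a tiny derivation engine for tally addition in the Post-system style; the axiom scheme and rule below are my own guess at a natural such system, not necessarily the one Stansifer gives:

```python
# A sketch of a Post system for tally-notation addition.  Theorems are
# triples (x, y, z) read as "x + y = z", with tally numerals as strings
# of bars.  Axiom scheme:  eps + y = y.  Rule:  from x + y = z, infer
# x| + y = z|.  We derive all theorems with numerals up to max_bars bars.

def tally_theorems(max_bars):
    """Return the set of derivable triples (x, y, z) meaning x + y = z."""
    thms = set()
    for n in range(max_bars + 1):
        x, y, z = "", "|" * n, "|" * n      # instance of the axiom scheme
        while len(z) <= max_bars:
            thms.add((x, y, z))
            x, z = x + "|", z + "|"         # one application of the rule
    return thms

# the corrected fact || + || = |||| is derivable; the typo is not
print(("||", "||", "||||") in tally_theorems(4))   # True
print(("||", "||", "|||") in tally_theorems(4))    # False
```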

The "proof" on page 68 does not really deserve to be called a proof, because it only sketches one direction and it completely omits the other direction, which turns out to be much harder than what is sketched. It is remarkable that the term "theorem" appears at three different levels: (1) a theorem of the predicate calculus, i.e., some x for which Th x is provable; (2) a theorem of the Post system for the predicate calculus, which means a derivable term of that system, which includes some terms of the form Th x, others of the form P x, etc.; and (3) a theorem of mathematics, the proof of which is discussed in the previous sentence.

2.5 I don't know why Stansifer is so dismissive about issues of concrete syntax in this section; he should be pleased that he did such a good job discussing them, and motivating various formalisms earlier in the chapter.

In algebraic terms, destructors are left inverses of constructors. For example,

  FirstOfBlock(Block(W1, W2))  =  W1
  SecondOfBlock(Block(W1, W2)) =  W2
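In Python, these equations might be sketched as follows; Block, FirstOfBlock and SecondOfBlock are the hypothetical operations from the equations above, not part of any particular language:

```python
# Destructors as left inverses of a constructor: applying a destructor
# to a constructed value recovers the corresponding component.

def Block(w1, w2):
    """Constructor: build a block from its two components."""
    return ("Block", w1, w2)

def FirstOfBlock(b):
    """Destructor: FirstOfBlock(Block(W1, W2)) = W1."""
    return b[1]

def SecondOfBlock(b):
    """Destructor: SecondOfBlock(Block(W1, W2)) = W2."""
    return b[2]

b = Block("declarations", "statements")
print(FirstOfBlock(b))    # declarations
print(SecondOfBlock(b))   # statements
```

Note that the destructors are only left inverses: they recover the components from a block, but nothing constrains their behavior on values that were not built by Block.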


Maintained by Joseph Goguen
© 2000, 2001, 2002 Joseph Goguen
Last modified: Wed Feb 13 21:47:46 PST 2002