Sethi first defines context free grammars using the notation of Backus-Naur
form (BNF), and then later also discusses extended BNF (EBNF) and syntax
charts. These three formalisms are **equivalent**, in the sense that
exactly the same set of languages can be defined using any one of them as
using the others. There are several other formalisms that are equivalent
to these, including context free grammars in the sense of Chomsky, which
replace the symbol `::=` by `->`, and do not use the
`|` symbol. In addition, there are several other kinds of grammar,
including recursive grammars and Post systems, which are more expressive than
context free grammars, in the sense that the set of languages that can be
defined using them is strictly larger than the set that can be defined using
context free grammars; and there are also grammars that are less expressive
than the context free grammars, including regular expressions.

Perhaps it will help to be really formal about some of this, since Sethi is not very formal about BNF. So here we go:

Mathematically, a **language** is a set of (finite) strings, where a
**string** is a sequence of letters, from another set that is usually
called the **alphabet** of the language. Formally, we write `A*`
for the set of all finite strings from the alphabet `A`, including the
empty string, which we will usually write `[]` - but beware, because
many other notations are used for it, including the Greek letter lambda and
the Greek letter epsilon, and in Sethi, `<empty>`.

For programming, context free languages are especially important, and as
remarked above, BNF is just one kind of grammar for defining these languages.
Here is Chomsky's original version: A **context free grammar** or
**CFG** is a 4-tuple `(N,T,P,S)`, where: `N` is the set of
**nonterminal symbols** which serve as names for the grammatical
categories; `T` is the set of **terminal symbols** (the ones that
will end up in the sentences of the language), such that `T` and
`N` are disjoint; `P` is a finite set of **productions** or
grammatical **rules**, each of which has the form

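    N_0 -> w_0 N_1 w_1 N_2 ... w_{n-1} N_n w_n

where each `N_i` is a nonterminal in `N` and each `w_i` is a (possibly empty) string of terminal symbols, that is, a string in `T*`; and `S` is a nonterminal in `N`, called the **start symbol**.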

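Given a CFG `G`, let `X` be a nonterminal, `p` a
production `X -> r`, and `e` some string in `(N U T)*`. Then we
write

    e => e' with p

to indicate that some single instance of `X` in `e` has been replaced by `r` to yield `e'`. We write

    e =*=> e'

when `e'` can be obtained from `e` by zero or more such steps, each using some production in `P`. The language defined by `G` is then the set of strings of terminals that can be derived from the start symbol `S`:

    L(G) = { t | S =*=> t and t in T* } .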
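The definitions above can be tried out in a few lines of Python. The following is only a sketch under stated assumptions: symbols are single characters, the grammar (`S -> ab`, `S -> aSb`) is a hypothetical example that does not come from Sethi, and the names `step` and `language_up_to` are invented here. Because `L(G)` is usually infinite, the sketch only collects sentences up to a length bound, and that pruning is sound only when no production shortens a string.

```python
from collections import deque

# A hypothetical CFG (N, T, P, S); every symbol is a single character.
N = {"S"}
T = {"a", "b"}
P = [("S", "ab"), ("S", "aSb")]   # S -> ab, S -> aSb
S = "S"

def step(e, productions):
    """Yield every e' with e => e': one occurrence of a left side replaced."""
    for lhs, rhs in productions:
        i = e.find(lhs)
        while i != -1:
            yield e[:i] + rhs + e[i + len(lhs):]
            i = e.find(lhs, i + 1)

def language_up_to(max_len, nonterminals, productions, start):
    """Collect the terminal strings t with start =*=> t and len(t) <= max_len.

    Assumes no right side is shorter than its left side, so strings longer
    than max_len can be discarded safely.
    """
    seen, queue, sentences = {start}, deque([start]), set()
    while queue:
        e = queue.popleft()
        if not any(c in nonterminals for c in e):
            sentences.add(e)          # only terminals left: e is in L(G)
            continue
        for e2 in step(e, productions):
            if len(e2) <= max_len and e2 not in seen:
                seen.add(e2)
                queue.append(e2)
    return sentences

print(sorted(language_up_to(6, N, P, S), key=len))
```

Each call to `step` corresponds to one use of `=>`, and the breadth-first search explores `=*=>` from the start symbol.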

**Regular Expressions** were invented by the logician Stephen Kleene to
describe a certain simple class of languages, the **regular languages**,
which we now know are also those defined by regular grammars and accepted by
finite state automata; the "`*`" operation in regular expressions is
called the "Kleene star" in his honor. We can give the following BNF grammar
for regular expressions:

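    <rexp> ::= Ø
    <rexp> ::= (<rexp> <rexp>)
    <rexp> ::= (<rexp> + <rexp>)
    <rexp> ::= <rexp>*
    <rexp> ::= a        for each a in A

where `<rexp>` is the nonterminal for regular expressions. In the first line, "Ø" is a Danish letter, standing for the empty set, as usual in mathematics. The second line has an implicit binary infix operator, that is usually called **concatenation**. For example, the regular expression

    (a b)(a b)*

defines the language consisting of finite nonempty strings of repeated `ab`'s, that is, `ab`, `abab`, `ababab`, and so on. Similarly,

    (a* b)(a* b)* + (b* a)(b* a)*

defines the language of all nonempty strings of `a`'s and `b`'s: its first summand gives the strings that end in `b`, and its second summand gives those that end in `a`.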
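These two examples can be checked mechanically. The sketch below uses Python's standard `re` module, which writes union as `|` rather than `+` and writes concatenation without spaces; the helper `strings_up_to` is a name invented here. It simply tests every string over `{a, b}` up to a small length against each expression.

```python
import re
from itertools import product

# (a b)(a b)*  and  (a* b)(a* b)* + (b* a)(b* a)*  in Python's syntax.
first = re.compile(r"(ab)(ab)*")
second = re.compile(r"(a*b)(a*b)*|(b*a)(b*a)*")

def strings_up_to(alphabet, max_len):
    """Yield all strings over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# fullmatch succeeds only if the whole string is in the language.
matches_first = [s for s in strings_up_to("ab", 6) if first.fullmatch(s)]
matches_second = [s for s in strings_up_to("ab", 3) if second.fullmatch(s)]

print(matches_first)        # the repetitions of "ab" of length at most 6
print(len(matches_second))  # how many strings of length at most 3 are accepted
```

The first list contains only repetitions of `ab`, while the second expression accepts every nonempty string of `a`'s and `b`'s and rejects the empty string.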

Many computer scientists' ideas about natural language seem to have been influenced by concepts like the above, which take a formal view along the lines of Noam Chomsky, rather than that of linguists who study what real people actually write and say. An important point is that in natural language, context determines whether or not something makes sense; formal syntax and semantics are very far from adequate, and indeed, the distinction among syntax, semantics and pragmatics does not hold up under close examination. On the other hand, the formal linguists' way of looking at syntax and semantics works well for programming languages, because we can define things to work that way, and because traditionally, programming language designers want to achieve as much independence from context as possible (though this might change for future computer systems!).

It follows that the above definition of "language" is not suitable for natural language, where context determines whether something is acceptable or not, and where even then, there may be degrees of acceptability, and these may vary among different subcommunities, and even individuals, who speak a given language; moreover, all of this changes (usually slowly) over time. For example, "yeah" is listed as a word in some dictionaries but not in others; perhaps it will gradually become more and more accepted, as have many other words over the course of time. At the level of syntax, English as spoken in black Harlem (a neighborhood of New York City) differs in some significant ways from "standard English" (if there is such a thing), in its lexicon, its syntax, and its pronunciation; this dialect has been studied under the name "Black English" by William Labov and others, who have shown that in some ways it is more coherent than so-called "standard English".

It may be interesting to note that the first (known) grammar was given by Panini for Sanskrit, a sacred language of ancient India, more than 2,500 years ago. This work included very sophisticated components for phonetics, lexicon, and syntax, and was in many ways similar to modern context free grammars. The motivation for this work was to preserve the ancient sacred texts of Hinduism.

If we denote the empty set by "`{}`" and the empty string by the
empty character, then there will be no way to tell the difference between the
empty set and the set containing the empty string. So this is a bad idea, and
it helps to explain why we use `[]` or maybe the Greek letter epsilon
for the empty string, and Ø for the empty set. I am afraid you will
have to get used to there being a lot of notational variation in mathematics,
just as there is a lot of notational variation for programming languages; in
fact, I am afraid that notational variation will be an ongoing part of the
life of every computer science professional for the foreseeable future, so
that developing a tolerance for it should be part of your professional
training.
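The difference between the empty set and the set containing the empty string is easy to see in a programming language, where the two have different notations. The following Python sketch represents a language as a `set` of `str` values; the helper `concat` is a name invented here for concatenating two languages string by string.

```python
empty_language = set()       # Ø: contains no strings at all
only_empty_string = {""}     # { [] }: contains exactly one string, the empty one

def concat(A, B):
    """All strings ab with a taken from A and b taken from B."""
    return {a + b for a in A for b in B}

L = {"a", "ab"}

print(len(empty_language), len(only_empty_string))
print(concat(L, empty_language) == set())   # Ø wipes out any language
print(concat(L, only_empty_string) == L)    # { [] } leaves it unchanged
```

So the two sets behave quite differently: concatenating with the empty set always gives the empty set, while concatenating with the set containing only the empty string changes nothing.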

A favorite question of philosophers is "What is meaning?" and a favorite
way to try to answer it is to provide a **denotation** for some class of
sentences; this approach is called **denotational semantics**. As one
simple example, we can define the "meanings" or **denotations** of regular
expressions with a **denotation function**, which maps the expression to
the language that it defines. If we let `R` be the set of regular
expressions over a finite alphabet `A`, then the denotation for `R`
is given by a function

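    [[_]] : R -> P(A*) ,

where `P(A*)` is the set of all subsets of `A*`, that is, the set of all languages over `A`, defined by the equations

    [[Ø]] = Ø
    [[E E']] = [[E]] º [[E']]
    [[E + E']] = [[E]] U [[E']]
    [[E*]] = [[E]]*
    [[a]] = {a}        for each a in A

where the operations `º` and `*` on languages are defined as follows (in these two definitions, `A` and `B` stand for arbitrary languages rather than for the alphabet):

    A º B = { ab | a in A and b in B }

    A* = { [] } U { a_1 a_2 ... a_n | a_i in A, n > 0 }

Using all these equations, we can compute the denotation of any regular expression, just by applying them recursively until all `[[_]]` pairs have been eliminated. Notice that this semantics is **compositional**, in the sense that the denotation of an expression is computed from the denotations of its parts. For example,

    [[ (a b) (a b)* ]] = [[ (a b) ]] º [[ (a b)* ]]
                       = ([[ a ]] º [[ b ]]) º [[ (a b) ]]*
                       = ({ a } º { b }) º ({ a } º { b })*
                       = { ab } º { ab }*
                       = { (ab)^n | n > 0 }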
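The recursive computation described above can be sketched in Python. Since most denotations are infinite sets, this sketch makes a simplifying assumption: it computes only the strings of length at most `k` in each denotation. The tuple encoding of regular expressions (`"empty"`, `"sym"`, `"cat"`, `"alt"`, `"star"`) and the function names are invented here for illustration.

```python
def concat(A, B, k):
    """A º B, restricted to strings of length at most k."""
    return {a + b for a in A for b in B if len(a + b) <= k}

def star(A, k):
    """A*, restricted to strings of length at most k."""
    result = {""}                  # the empty string is always in A*
    frontier = {""}
    while frontier:                # stop once no new strings appear
        frontier = concat(frontier, A, k) - result
        result |= frontier
    return result

def denote(e, k):
    """The strings of length at most k in [[e]], one case per equation."""
    tag = e[0]
    if tag == "empty":             # [[Ø]] = Ø
        return set()
    if tag == "sym":               # [[a]] = {a}
        return {e[1]} if len(e[1]) <= k else set()
    if tag == "cat":               # [[E E']] = [[E]] º [[E']]
        return concat(denote(e[1], k), denote(e[2], k), k)
    if tag == "alt":               # [[E + E']] = [[E]] U [[E']]
        return denote(e[1], k) | denote(e[2], k)
    if tag == "star":              # [[E*]] = [[E]]*
        return star(denote(e[1], k), k)
    raise ValueError(f"unknown regular expression: {e!r}")

ab = ("cat", ("sym", "a"), ("sym", "b"))
example = ("cat", ab, ("star", ab))        # (a b)(a b)*
print(sorted(denote(example, 6), key=len))
```

Each branch of `denote` calls itself only on the parts of the expression, which is what makes the semantics compositional. The length bound does not lose anything below `k`, because a string of length at most `k` in a concatenation can only be built from pieces that are themselves no longer than `k`.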

Although there are clever ways to give a compositional semantics the property that the meaning of a part depends on the context within which it occurs, in the sense of the other parts of which it is a part, a mathematical semantics for a programming language is not going to give us what we usually intuitively think of as the meaning of programs written in it. For example, the fact that a certain FORTRAN program computes a certain formula does not tell us that it gives the estimated yield of a certain variety of corn under various weather conditions, let alone that this figure is only accurate under certain assumptions about the soil, and even then it is only accurate to within about 5 percent. Yet all this is an important part of the meaning of the program.

For another example, just a little more complex, here is a context free grammar G for a very simple class of expressions:

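    E -> 0
    E -> 1
    E -> (E + E)
    E -> (E * E)

where `N = {E}`, `T = {0, 1, (, ), +, *}`, and `E` is the start symbol. We can define a denotation function

    [[_]] : L(G) -> NAT

for these expressions, where `NAT` is the set of natural numbers, by the equations

    [[ 0 ]] = 0
    [[ 1 ]] = 1
    [[ E + E' ]] = [[ E ]] + [[ E' ]]
    [[ E * E' ]] = [[ E ]] * [[ E' ]]

Here the symbols inside `[[_]]` are pieces of syntax, while those on the right sides are numbers and the usual arithmetic operations on them. From this, we can compute, for example, that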

[[ (1 + 1) * (1 + 1) ]] = 4
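This computation can be reproduced with a small Python sketch. It makes two assumptions: expressions are represented as nested tuples such as `("+", 1, 1)`, and the input string is fully parenthesized, as the grammar `G` requires, so the example is written with its outer parentheses. The names `parse` and `denote` are invented here.

```python
def parse(text):
    """Parse a sentence of G into 0, 1, or a tuple (op, left, right)."""
    tokens = [c for c in text if not c.isspace()]
    pos = 0

    def expect(allowed):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in allowed:
            raise ValueError(f"expected one of {allowed!r} at token {pos}")
        pos += 1
        return tokens[pos - 1]

    def expr():
        tok = expect("01(")
        if tok != "(":              # E -> 0  or  E -> 1
            return int(tok)
        left = expr()               # E -> (E + E)  or  E -> (E * E)
        op = expect("+*")
        right = expr()
        expect(")")
        return (op, left, right)

    tree = expr()
    if pos != len(tokens):
        raise ValueError("unexpected input after the expression")
    return tree

def denote(e):
    """[[e]]: one case per semantic equation."""
    if isinstance(e, int):          # [[ 0 ]] = 0 and [[ 1 ]] = 1
        return e
    op, left, right = e
    if op == "+":                   # [[ E + E' ]] = [[ E ]] + [[ E' ]]
        return denote(left) + denote(right)
    return denote(left) * denote(right)   # [[ E * E' ]] = [[ E ]] * [[ E' ]]

print(denote(parse("((1 + 1) * (1 + 1))")))
```

The parser follows the four productions directly, and `denote` follows the four equations, so the meaning of an expression is again computed from the meanings of its parts.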

**Attribute grammars** were invented by Don Knuth. They provide
another way to give meanings to expressions. Although they can be pretty
difficult to understand, the special case of grammars with just "synthesized
attributes" is much easier, and can be explained without too much notation,
and can also be illustrated rather simply. For example, we can define the
values of expressions over the above grammar. But first, it is convenient to
add to the grammar one non-terminal, `S`, so that now `N = {S,
E}`, and one more production, `S -> E`. The attribute grammar has
just one (synthesized) attribute, `val`, and associates the following
equations with the productions of the grammar:

    S -> E          S.val = E.val
    E -> 0          E.val = 0
    E -> 1          E.val = 1
    E -> (E + E)    E1.val = E2.val + E3.val
    E -> (E * E)    E1.val = E2.val * E3.val

Here `E1` refers to the `E` on the left side of a production, and `E2`, `E3` to the first and second `E` on its right side. Then a parse tree for the expression `(1 + 1) * (1 + 0)` is

                S
                |
                E
              / | \
             /  |  \
            /   *   \
           /         \
          E           E
        / | \       / | \
       E  +  E     E  +  E
       |     |     |     |
       1     1     1     0

where for simplicity the parentheses are left out. Then the synthesized attribute will percolate up the tree, by applying the equations from the bottom up, producing the values that are shown here

                2
                |
                2
              / | \
             /  |  \
            /   *   \
           /         \
          2           1
        / | \       / | \
       1  +  1     1  +  0

where we see that the final value is 2.
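The bottom-up evaluation of a synthesized attribute can be sketched in Python. This is an illustrative sketch, not a general attribute grammar evaluator: the `Node` class and the helpers `E`, `t` and `attribute` are names invented here, the tree is built by hand rather than by a parser, and parentheses are left out of the tree as in the picture.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    symbol: str                                  # "S", "E", or a terminal
    children: List["Node"] = field(default_factory=list)
    val: Optional[int] = None                    # the synthesized attribute

def E(*children):
    return Node("E", list(children))

def t(symbol):
    return Node(symbol)                          # terminal leaf, no attribute

def attribute(node):
    """Fill in val bottom-up: children first, then the node's own equation."""
    for child in node.children:
        attribute(child)
    kids = [c.symbol for c in node.children]
    if node.symbol == "S" and kids == ["E"]:             # S.val = E.val
        node.val = node.children[0].val
    elif node.symbol == "E" and kids == ["0"]:           # E.val = 0
        node.val = 0
    elif node.symbol == "E" and kids == ["1"]:           # E.val = 1
        node.val = 1
    elif node.symbol == "E" and kids == ["E", "+", "E"]:  # E1.val = E2.val + E3.val
        node.val = node.children[0].val + node.children[2].val
    elif node.symbol == "E" and kids == ["E", "*", "E"]:  # E1.val = E2.val * E3.val
        node.val = node.children[0].val * node.children[2].val
    elif node.children:
        raise ValueError(f"no production {node.symbol} -> {' '.join(kids)}")
    return node.val

one_plus_one = E(E(t("1")), t("+"), E(t("1")))
one_plus_zero = E(E(t("1")), t("+"), E(t("0")))
tree = Node("S", [E(one_plus_one, t("*"), one_plus_zero)])

print(attribute(tree))
```

Because `val` depends only on the attributes of a node's children, a single post-order traversal is enough; attributes that flow downward or sideways would need a more careful evaluation order.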

For another illustration, we can do the semantics of binary digits; the grammar here is

    B -> D
    B -> DB
    D -> 0
    D -> 1

and the equations associated with these four rules are

    B -> D      B.pos = 0               B.val = D.val
    B -> DB     B1.pos = B2.pos + 1     B1.val = D.val *(2 ** B1.pos) + B2.val
    D -> 0      D.val = 0
    D -> 1      D.val = 1

It is now a good exercise to compute the value of a binary number using these equations.
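One way to check an answer to such an exercise is to let a program apply the equations. The Python sketch below is an illustration with invented names: the function `attributes` returns the pair `(pos, val)` for the nonterminal `B` that derives a given string of binary digits, and the input `101` is an arbitrary example chosen here.

```python
def attributes(bits):
    """Return (B.pos, B.val) for the B deriving `bits`, using the equations."""
    if not bits or any(c not in "01" for c in bits):
        raise ValueError("expected a nonempty string of binary digits")
    d_val = int(bits[0])                  # D -> 0 gives 0, D -> 1 gives 1
    if len(bits) == 1:                    # B -> D
        return 0, d_val                   # B.pos = 0, B.val = D.val
    pos2, val2 = attributes(bits[1:])     # B -> D B, attributes of B2 first
    pos1 = pos2 + 1                       # B1.pos = B2.pos + 1
    val1 = d_val * (2 ** pos1) + val2     # B1.val = D.val * (2 ** B1.pos) + B2.val
    return pos1, val1

print(attributes("101"))
```

Both `pos` and `val` are synthesized here, since each is computed from the attributes of the subtree below; `pos` records the position of the leading digit, counting from 0 at the right.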

Maintained by Joseph Goguen

© 2000 - 2004 Joseph Goguen

Last modified: Tue Jan 13 14:15:14 PST 2004