CSE 230: Principles of Programming Languages
Notes on Internet Languages

This note examines some of the languages that have been spawned by the explosion of interest in the internet. Among these, the most important at present may be Java, HTML, JavaScript, Perl, and XML. One interesting observation about these languages is that they differ greatly from the classical programming languages that are traditionally studied in courses like CSE 130 and 230, and of course this is because they serve different purposes. Actually, HTML and XML are not programming languages at all, and JavaScript is only marginally such a language, but we will discuss them anyway, to get a more complete picture of the internet language scene. A fuller treatment would go into more detail on each of these languages, and would also cover CGI, Python, web ontology languages, and more. It may be helpful to view the collection of all these languages as a whole, since web professionals have to work with all of them together (and more, e.g. SQL); in my opinion, this situation is a mess, symptomatic of rapid, unplanned evolution, and I hope that it will improve in the future.

Let's start with Java. Security issues have probably been addressed to a greater extent in Java than in any other programming language, and many of its unusual design decisions are due to security concerns. Platform independence (i.e. portability) was another major force driving the design of Java, and both concerns motivate the decision to implement it by interpretation on an abstract machine (the Java Virtual Machine). The concerns with security and portability are of course motivated by the use of the language over the internet, as is the use of threads for (pseudo-)concurrent execution. The use of standard APIs allows portability without sacrificing functionality, and in particular provides extensive support for interactive graphics, which of course is motivated by the way that the web is used.

The "ML" in HTML stands for "Markup Language," not "Meta Language" as in the ML programming language. HTML is not a programming language, but a language for describing multimedia content. Originally it did so in a way that was independent of the display device to be used, though later evolution of the language introduced many features that allow graphic designers to produce more pleasing layouts for specific browsers. It would be interesting to survey all the effects that commercial competition had on HTML, but let it suffice to note that both Microsoft and Netscape introduced non-standard features in an attempt to lock in customers.

Although HTML is not a programming language, some programming language features are often desirable in writing content for display on web pages. For example, one wants simple procedures for buttons, menus, etc., rather than having to code them up from scratch. Sometimes one also wants functionality where simple programming language features would come in handy, such as counting the number of mouse clicks. JavaScript is a low-power programming language designed for just such purposes; it is relatively simple, but has a lot of "widgets" to support interactive graphics. One would not want to use JavaScript for general-purpose programming, e.g., for writing a compiler, but it could be done.

Perl is a language that fills a small but important niche in the internet world; it has many features that make it unsuitable for general-purpose programming, such as lacking static types and having weak modularity. But it is ideal for quickly writing relatively small translators, for example, into SQL, and it has been called "the duct tape of the internet." It is also notable that Perl is an open source effort, and has very high quality implementations and documentation. See Perl: The first postmodern computer language, by Larry Wall, the designer of Perl, for an amusing discussion. What is most interesting about this paper is not so much its content (which is a bit shallow and self-serving) as its style, which reflects a culture that is radically different from that associated (for example) with Ada or COBOL, which were designed and built with military sponsorship. Wall is part of what we might call "hacker culture," which tends towards doing "stuff 'cause it's cool," i.e., for fun, as opposed to defense contractor culture, which by necessity is more serious, since it must build software for systems that can kill people.

As a linguistic note, the word "hacker" has been around at least since the 1960s with a meaning like that discussed above, but was "hijacked" in the 1990s by the media and given the radically different meaning of someone who does illegal and/or unethical programming. The original usage grew out of the MIT AI Lab, along with many other amusing words and phrases, during a time of great excitement, exploration, wealth, and productivity. For example, "frobnicator" is a rough synonym for "hacker" (though without the connotation of high skill), derived from the verb "frob," which means to fool around with something, having no particular goal in mind. For more (much more!) on hacker language, see the Jargon File.

I find it encouraging that there is a healthy computer science subculture that is freedom-oriented, public spirited, and somewhat rebellious, promoting open source software, in contrast to the culture of profit, of which Microsoft is perhaps an extreme case.
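
To make the "small translator into SQL" idea mentioned above concrete, here is a minimal sketch. It is written in Python rather than Perl, and it is only an illustration of the kind of glue code meant, not anything taken from a real system. The function names (quote_sql, csv_to_inserts), the table name, and the sample data are all hypothetical, and the sketch assumes that every value can be treated as a SQL string literal.

```python
import csv
import io


def quote_sql(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def csv_to_inserts(text: str, table: str) -> list:
    """Translate CSV text (first row = column names) into SQL INSERT statements.

    Assumption: table and column names are trusted identifiers; only the
    data values are quoted.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]  # skip blank lines
    header, data = rows[0], rows[1:]
    columns = ", ".join(header)
    statements = []
    for row in data:
        values = ", ".join(quote_sql(v) for v in row)
        statements.append(f"INSERT INTO {table} ({columns}) VALUES ({values});")
    return statements


if __name__ == "__main__":
    # Hypothetical sample input, made up for this illustration.
    sample = "name,language\nO'Brien,Perl\nSmith,Java\n"
    for statement in csv_to_inserts(sample, "people"):
        print(statement)
```

A script like this is typical "duct tape": a few lines that turn one textual format into another, with little need for the typing and modularity that larger programs depend on.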

Finally, XML serves as a kind of meta-language for HTML (the "ML" is again for "Markup Language," and the "X" is for "extensible"). Like HTML, XML is simplified from SGML, but unlike HTML, it enables users to define their own new tags. The impetus for developing this language comes primarily from B2B (business-to-business) applications, where it is expected to be used very extensively. However, it is also of interest for applications in the sciences, and of course in computer science. In fact, we have used it in the Kumo system being developed in my own lab. (This system also uses HTML and JavaScript, of course.)
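
The point about user-defined tags can be illustrated with a small sketch, here in Python using the standard xml.etree.ElementTree module. The tag names (order, customer, item), the attributes, and the data are all invented for this example, in the spirit of a B2B purchase order; they are not taken from Kumo or from any real XML vocabulary.

```python
import xml.etree.ElementTree as ET

# A hypothetical B2B-style document. None of these tags exist in HTML;
# the author of the document format defines them.
ORDER_XML = """<order id="42">
  <customer>Example Corp</customer>
  <item sku="A1" quantity="2">widget</item>
  <item sku="B7" quantity="5">gadget</item>
</order>"""


def total_quantity(xml_text: str) -> int:
    """Sum the 'quantity' attribute over every <item> element in the document."""
    root = ET.fromstring(xml_text)
    return sum(int(item.get("quantity")) for item in root.iter("item"))


def item_names(xml_text: str) -> list:
    """Return the text content of each <item> element, in document order."""
    root = ET.fromstring(xml_text)
    return [item.text for item in root.iter("item")]


if __name__ == "__main__":
    print(total_quantity(ORDER_XML))
    print(item_names(ORDER_XML))
```

Because the tags carry application-specific meaning rather than display instructions, a program on the receiving end can extract the data directly, which is what makes the format attractive for exchanging information between businesses.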

I suggest that you now re-read the Preliminary Essay on Comparative Programming Linguistics, for its discussion of how intent and social context affect design. A major part of the social context for these languages is the commercial use of the World Wide Web, and the competition between Microsoft and nearly everyone else to provide software to support this. Another clash of note is that between the open source movement and the business culture.

For an overview which includes some historical information as well as the usual technical information, one good source is Programming the World Wide Web, by Robert Sebesta, Addison Wesley, 2002; there is nothing very deep or very original here, but you can get a feel for the current state of play.


Maintained by Joseph Goguen
© 2000, 2001, 2002 Joseph Goguen
Last modified: Fri Mar 15 14:47:10 PST 2002