This chapter on the web is, as usual, somewhat vague and general, with scatterings of useful information. The web is moving so fast that much of the chapter is out of date, but it is still useful to read, because it makes explicit how user interface design issues are important for the web, and in particular, it links to material discussed earlier in the book. Most students of computer science should already know much of the content, but not these links.
The early historical bits, on Vannevar Bush, Ted Nelson, and Douglas Engelbart, are good, but I would certainly give much more emphasis to Engelbart's work, because he had already implemented all of the major features of the (so called) personal computer revolution in the mid-60s; Nelson's main contribution seems to have been colorful terminology. It is sobering to realize that the first proposal for the world wide web (Tim Berners-Lee) was only in 1989, and that most of the growth has been in the last 2 years. I like the term hyperchaos (p.556) for what bad hyperlink design can deliver to users.
Maybe I'm misunderstanding him, but it seems to me that two of Shneiderman's "Golden Rules of Hypertext" (p.556) are obviously wrong:
The short list of poor design practices on p.556 is good:
too many links, long chains of links, too many long or dull pages, and inadequate tables of contents (or other overviews).
The list of features for web authoring is also good (but dull and heterogeneous). The design "tips" on pp.558-9 are very useful. It is very important not to forget (p.557) that
You are not a good judge of your own design.
Shneiderman's remark (p. 561) that much web authoring advice is incomplete, site specific, misleading, or outdated, but still useful, is illustrated by the Karp piece we read earlier, as well as his own book! It would certainly be useful to have a lot more work on the genres of websites, but I suppose the categorizations given in Section 16.4 are better than nothing. We should never forget the importance of
Identifying the user's tasks. (p.566)
The fact that the web can gracefully support such a wide variety of tasks is good news, but also bad news because it makes design more difficult. It is really amazing that you can find very specific facts (such as popular song lyrics), browse large areas for an overview (e.g. genetics), order any book in print, get bombarded with advertisements, meet new friends, wander at random into areas you never even knew existed, catch up on TV soap opera plots, get the latest headlines, find your homework assignments, make airline and hotel reservations, lose all your money in day trading, order a pizza, and more, all in the same medium (as usual, Shneiderman doesn't quite say all this). The demographic information on p.566 is out of date, due to the incredibly rapid growth of the web.
Perhaps Shneiderman's "OAI model" is of some value, but it seems very limited to me and I tend to get irritated when he pushes it yet again (p.567). The list of information aggregation methods (p.568) is good, as is the list of metaphors for interface objects (p.570). We will have more to say on metaphor later, but the following quote is worthy of thought in connection with semiotic morphisms, in part because I think it is not quite right:
The metaphor needs to be useful in presenting high-level concepts, appropriate for expressing middle-level objects, and effective in suggesting pixel-level details (p.570).
I do like the main points in Section 16.6.5 on webpage design. The remark (p.575) that breadth is usually better than depth for a tree organization of information can be very helpful. The "traditional graphic design rules" listed on p.578 are also really good; it's safe for web designers to rely on these since they are already familiar to users. Section 16.6.6 again emphasizes the need to know your users. An important point that Shneiderman should have made in this section is that it can be really a lot of work to maintain a website; this effort should not be underestimated when thinking about setting up a site.
The Practitioner's Summary (p.580) is well worth reading more than twice. The negative remarks about controlled experiments on p.581 are refreshingly candid, especially compared to some of what we have seen before.
This paper discusses a highly innovative approach to designing multimedia systems, based on concepts from the area called dynamical systems theory. The mathematics is more or less along the same lines as that found in physics and mechanical engineering, but some details are different, and the applications are completely, and intriguingly, different: concepts like phase space, potential field, gradient, attractor and chaos are being used to tell a story, and to convey values and information. In fact, these concepts are on the cutting edge of science and technology in several important areas, one of which is sensors: it turns out that adding a little noise of the right kind can actually make a sensor more sensitive, by perching it on "the edge of chaos" (this is a technical term). Andersen's exciting approach has several significant benefits, perhaps the most important of which is avoiding pre-programmed linear sequences, such as are found in nearly all current authored products.
I would not promise you that multimedia user interface designers of the future will be using dynamical systems theory, but I do feel confident that interactive multimedia systems, roughly along the lines of Andersen's Viking Museum, will be important in the future; I would guess that there will be home players, in the form of VR rooms, for "playing" interactive multimedia "texts", probably downloaded over the internet, where users can experience many different things, like today's "home theatres" but much more flexible and interesting, perhaps with smell, motion and haptic feedback, in addition to sound and sight. Perhaps some future designers of programs for such devices will be media superstars, like Michael Jackson and Madonna today.
More technically, we can distinguish four levels of description for Andersen's system. The hardware level is at the bottom, with lighting, slide projectors, speakers, amplifiers, and the large video interface (the "Eye of Wodan") with its input devices (which seem to be a mouse and maybe some buttons). Next there is a software level, basically an object oriented program, using C++ and standard Apple multimedia applications, or slightly more technically, an event oriented program with some slightly exotic device drivers. The third level is that of dynamical systems, where we see potential fields over the phase space changing over time, moving the point that describes the state of the room. The fourth level is the most abstract and interesting, because it contains the most human elements, namely narratives, conflicts, values, and of course information about old Viking life.
The conflicts are important for making the experience interesting to users; as Aristotle said more than two thousand years ago, "drama is conflict." This is one of the most fundamental facets of Western culture; you can see it on TV (ads, sitcoms, even the news), in movies, newspapers, magazines, etc., etc. Not all cultures have this same value system; for example, classical Balinese narratives get their "kick" from a return to their starting point, as can be clearly heard in the cyclic nature of classical Balinese music, e.g., that for classical shadow plays. Andersen's way of using phase space dynamics to bring out conflicts in interactive multimedia systems is (in my opinion) brilliant; see his paper for several interesting examples. Values are sometimes conveyed in an interestingly implicit manner. For example, the fact that the Vikings valued adventurousness is conveyed by rewarding users for being adventurous, e.g., giving them displays, which might be bird sounds, story fragments, bits of information, pictures of artifacts, etc.
The programming level is not especially innovative, and in fact, it fits a familiar genre of object oriented programming called event oriented programming (or sometimes, event driven programming); but it seems that Andersen and his team were not familiar with this literature. There are also some interesting connections with semiotics that will be discussed later. What I would especially highlight about Andersen's approach is that the story lines are not preprogrammed, but arise from the activation of events when their potential energy gets high enough, through a combination of author's programming and user interactions with the system. In fact, it is quite possible for entirely unexpected conjunctions and sequences to occur, some of which might be very interesting and appropriate, others less so. A very nice way to talk about this is through the satisfaction of elastic constraints, which can be "pushed against" with greater and greater effort as they become stronger, and eventually may become strict constraints, but meanwhile, allow great freedom of choice.
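To make the idea of potential-triggered events a little more concrete, here is a minimal sketch in Python. It is not Andersen's implementation: the event names, thresholds, drift rates, and the step function are all hypothetical, and "potential" is modeled as a single number per event rather than a field over a phase space. The only point is that the order in which events fire is not written down anywhere; it emerges from the author's settings combined with what the user does.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str               # hypothetical event label
    threshold: float        # potential at which the event fires
    drift: float = 0.0      # author-programmed rise in potential per step
    potential: float = 0.0  # current potential


def step(events, user_boost):
    """Advance one time step and return the names of the events that fire.

    user_boost maps event names to extra potential contributed by
    user interaction during this step (an assumption of this sketch).
    """
    fired = []
    for e in events:
        e.potential += e.drift + user_boost.get(e.name, 0.0)
        if e.potential >= e.threshold:
            fired.append(e.name)
            e.potential = 0.0  # the event discharges once it has fired
    return fired


events = [
    Event("bird_sounds", threshold=1.0, drift=0.25),
    Event("saga_fragment", threshold=2.0, drift=0.125),
]

# The user interacts only at step 1, nudging the saga fragment.
history = [
    step(events, {"saga_fragment": 0.5} if t == 1 else {})
    for t in range(4)
]
```

Changing when or how much the user interacts changes which event fires first, without any change to the program text.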
In the final section, there are what appear to be some excuses, from which I would conclude that the museum was not entirely a success from the point of view of those who paid for it and those who visit it. An "educated guess" on my part says that some users may be confused when they walk in and see that nothing much is happening, and if they are (say) a bit shy about technology, they may not interact with the system enough to get it to do anything, and so will fail to learn anything about Vikings from their visit, and therefore be disappointed, perhaps even angry.
One instructive and important class of applications for semiotic morphisms is the construction of overviews, summaries, or (to use the currently hot term) visualizations for bodies of information, especially scientific information. Here the source sign system is for information structured in some particular way; relevant examples include books, source code for programs, and websites. The approach that I recommend is to first determine the highest priority constructors and selectors of the top level; then design a target sign system to display this information in the required medium; and finally build a morphism to preserve the selected structure and omit the rest; if more information is desired, then consider other levels with their highest priority constructors and selectors. Remember that the whole point is to delete most of the information while preserving enough to give an overview; this is where Principle F/C comes in, saying that when something must be sacrificed, it is more important to preserve form than content. It is very likely that some experimentation with the output will be required to achieve good results; user trials, interviews, etc. are recommended.
Let's first consider books. If you look at the physical structure of a book, you will see that it has the most important information printed on its front cover and spine, and that inside it has pages; basic information on the cover and spine includes title, author, edition number, and publisher; this information also appears on the title page inside, along with the date of publication (or possibly this is on the back of the title page - publishers may want to make the date harder to find, with the hope that users may not realize that a book has become old). Looking at the contents, you will see that chapters are (usually) the main structuring device, and that their main selectors give a chapter number, a title, and a page number; chapters are (usually) divided into sections, and possibly subsections, each of which also has a number, title, and page. I think that these are the most obvious things that anyone would see, even if they were not already familiar with books. Taking them as constructors and selectors, with target medium a small number of printed pages, yields exactly the very familiar form known as an outline. Notice that the entire content of the book has been lost (unless you count titles as content).
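The outline construction can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Section class, the outline function, and the sample book are all made up, and a real sign system would carry more than three selectors. What the sketch does show is the shape of the morphism: the nesting of chapters and sections (form) is preserved, along with the number, title, and page selectors, while the body text (content) is dropped.

```python
from dataclasses import dataclass, field


@dataclass
class Section:
    number: str   # selector: chapter or section number
    title: str    # selector: title
    page: int     # selector: starting page
    body: str = ""  # the content, which the morphism discards
    subsections: list = field(default_factory=list)  # constructor: nesting


def outline(sections, depth=0, max_depth=1):
    """Map a book structure to outline lines.

    Keeps number, title, and page; omits body; ignores anything
    nested more deeply than max_depth.
    """
    lines = []
    for s in sections:
        lines.append(f"{'  ' * depth}{s.number} {s.title} ... {s.page}")
        if depth < max_depth:
            lines.extend(outline(s.subsections, depth + 1, max_depth))
    return lines


# A made-up two-chapter book, for illustration only.
book = [
    Section("1", "Introduction", 1, "long text", [
        Section("1.1", "Motivation", 2, "more text"),
    ]),
    Section("2", "Methods", 10, "even more text"),
]

for line in outline(book):
    print(line)
```

Setting max_depth to 0 gives a chapter-only overview, which corresponds to considering only the top level of constructors.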
Another example is source code. Surely the most important structuring device is the division into files (if there is more than one file). To see what attributes (i.e., what selectors) are important, we can see what the Unix ls command can display under various options. For example, ls -lat will display the files in a directory, with their name, owner, size, and date of last modification. If we preserve these and display them in a natural way in a color graphics window, taking account of human perceptual capabilities, we get essentially the display of Plate B1 (just after p.514) in Shneiderman, which has chosen to display the age of a file using color, and its size using length. Notice that the entire content of the program, i.e., all of its code, has been lost.
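Here is a rough Python sketch of the same idea. The function names, the dictionary keys, the color buckets, and the 40-unit maximum bar length are my own assumptions, not a description of how Plate B1 was actually produced. The first function reads the selectors that ls -l reports (using the numeric user id for the owner), and the second maps size to length and age to color.

```python
import time
from pathlib import Path


def file_overview(directory):
    """Collect the selectors of interest for each regular file."""
    now = time.time()
    rows = []
    for p in sorted(Path(directory).iterdir()):
        if p.is_file():
            st = p.stat()
            rows.append({
                "name": p.name,
                "owner": st.st_uid,  # numeric user id
                "size": st.st_size,  # bytes
                "age_days": (now - st.st_mtime) / 86400,
            })
    return rows


def to_visual(rows, max_len=40):
    """Map size to bar length and age to a color bucket.

    The bucket boundaries and color names are arbitrary choices
    made for this sketch.
    """
    biggest = max((r["size"] for r in rows), default=0) or 1
    marks = []
    for r in rows:
        length = round(max_len * r["size"] / biggest)
        if r["age_days"] < 1:
            color = "red"
        elif r["age_days"] < 30:
            color = "yellow"
        else:
            color = "blue"
        marks.append((r["name"], length, color))
    return marks
```

Calling to_visual(file_overview(".")) gives one (name, length, color) triple per file in the current directory; nothing of the files' contents survives the mapping.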
It is easy to find many similar examples. The conclusion seems to be that thinking about properties of the source and target sign systems, and preserving and representing what is most important in the source sign system, is an excellent guideline for good design of information overviews, summaries and visualizations. In fact, the preservation principles for semiotic morphisms can be used just as effectively for lower level design issues, such as the size, color and location of sliders, but here a great deal of guidance is available (any reasonably good bookstore will have several volumes on such subjects), whereas it is much more difficult to find useful guidance for the highest level design decisions.
Among these new topics, the following seem particularly relevant to this course: metaphors and blending (as discussed below) in the field called cognitive linguistics; the structure and analysis of multi-sentence units in the field called discourse analysis; speech act theory and conversation analysis (in the sense of ethnomethodology), which we have already discussed; and of course semiotics, which today is dominant in studies of film, literature, and media in many academic departments.
Let's discuss metaphors first, following some brilliant work by George Lakoff, a linguist at UC Berkeley (and by way of full disclosure, I should also say that he is an old and close friend of mine). The usual idea of metaphor is that we speak of one thing in terms of another, often using the words "like" or "as". For example, someone might say
Word 97 is like a maze. There are so many choices, and it is very easy to get lost. Also sometimes I can't figure out how to backtrack and undo a choice.
Once the basic scheme has been set up with the first sentence, new material can be added that will be interpreted in the same framework, thus enriching our understanding of the speaker's experience, as we constantly refer back to what we already know about mazes.
It is easy to see examples of this kind in terms of semiotic morphisms. Here the source sign system is for mazes and the target sign system is for Word 97. Of course, we do this in a way that is only semi-formal, since no one in their right mind would want to write a complete formal sign system for Word 97! On the other hand, it is easy to give a completely formal sign system for mazes: they are just directed graphs with a given start and finish node; so there are sorts for nodes and edges, a constructor that attaches directed edges to nodes, and constants for the start and finish (i.e., goal) nodes. We then see that the start node of the maze maps nicely to the START icon in the lower left corner of the Windows display, and that choices of edges in the maze map to choices of menu items (or keys on the keyboard) in Word 97. It now follows that paths in a maze map to sequences of actions in Word 97. All this is completely natural, and readers of the above quote are able to make these connections in mere milliseconds, of course without doing any of the mathematics that we are sketching here; as a result, they can easily understand the use of maze language in further talk about Word 97.
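The maze sign system and its translation into Word 97 talk can be sketched as follows. Everything specific here is hypothetical: the three-node maze, the class and function names, and especially the action labels, which are invented stand-ins rather than a transcription of the real Word 97 menus. The sketch only shows that once edges are mapped to actions, paths are automatically mapped to sequences of actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Maze:
    nodes: frozenset  # sort: nodes
    edges: frozenset  # sort: directed edges, as (source, target) pairs
    start: str        # constant: start node
    finish: str       # constant: finish (goal) node


def is_path(maze, path):
    """True if path is a list of consecutive edges beginning at start."""
    here = maze.start
    for src, dst in path:
        if src != here or (src, dst) not in maze.edges:
            return False
        here = dst
    return True


maze = Maze(
    nodes=frozenset({"s", "a", "g"}),
    edges=frozenset({("s", "a"), ("a", "g"), ("a", "s")}),
    start="s",
    finish="g",
)

# Hypothetical translation of maze edges into user actions.
edge_to_action = {
    ("s", "a"): "open a menu",
    ("a", "g"): "choose a menu item",
    ("a", "s"): "cancel the menu",
}


def translate(path):
    """Map a maze path to the corresponding sequence of actions."""
    return [edge_to_action[edge] for edge in path]


route = [("s", "a"), ("a", "s"), ("s", "a"), ("a", "g")]
if is_path(maze, route):
    print(translate(route))
```

The route above backtracks once before reaching the goal, which is the kind of experience the speaker of the quote is complaining about.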
The duality between sign systems, which provide languages for talking about signs of a certain kind, and their models shows up in an interesting way in this example. A model for the Word 97 sign system would be the trace of some particular task, such as writing a short business letter that has some bold face characters in it. The goal is then to print the letter, and this goal lies at the end of a long path through a maze of menu choices and keyboard strokes. Our semiotic morphism maps this path, which begins at the Windows START icon, to a much more abstract path through a graph of nodes and edges whose significance in terms of documents has been lost. That is, a semiotic morphism maps the language of its source sign system into the language of the target sign system, and as a result, maps models of the target sign system into models of the source sign system; it is typical that some information is lost under the mapping of models.
It is also interesting and important to notice that there is more going on here than these simple mathematical transformations. Mazes have a connotation as well as a literal mathematical description. Scholars will know that the original "maze" was an actual physical structure on the island of Crete in ancient times, with a dangerous beast in it, called the Minotaur; in this maze, if you got lost, you might also get killed! And today, even non-scholars know that mazes have an associated feeling-tone of being bad, unpleasant, and perhaps even dangerous. For this reason, the above quotation is also a rhetorical gesture, having the effect, which is not explicitly stated, of imparting a negative connotation to Word 97. In fact, imparting connotations is often the real purpose of using a metaphor, and the word "rhetoric" refers to this aspect.
It should not be thought that such connotations lie outside of the semiotic framework that we have been developing. For the conceptual space of mazes is much richer than the simple graph sign system discussed above, and in particular, it includes the Minotaur, and anything else that is generally known about mazes in our culture. For example, the above quotation can easily be extended with the following sentence:
For me, the weird INSERT menu is the Minotaur lurking in the maze of Word 97. The whole thing has been a very painful experience for me. I thought I would die.
Since the negative emotional connotation is part of the conceptual space of mazes, it is therefore automatically available to be carried over into talk about Word 97. This is easily formalized by adding some simple relations to the source sign system.
However, it is not really typical that an extended metaphorical discourse involves just one source sign system; very often there are two, or even more. For example, the word "weird" in the above quote hints at some kind of occult influence, and this hint could easily be expanded and incorporated into the discussion, for example, as follows:
Perhaps a voodoo doll of Bill Gates would have saved me, or at least given me some satisfaction.
To understand this kind of language, we need to blend two different metaphors. And in fact, even the original quotes are perhaps better understood in terms of blending the language of mazes with that of Word 97. For example, the sentence "Also sometimes I can't figure out how to backtrack and undo a choice" in the first quotation uses the word "undo" which comes from the computer world as well as the word "choice" from the maze world and the word "backtrack" which could be from either.
All this should serve as motivation to carefully read the material on blending on pages 18-22 of An Introduction to Algebraic Semiotics, with Applications to User Interface Design, where some other applications are discussed, including finding the meanings of compound words such as "boathouse". Something not in that paper that it is interesting to think about is applying the quality criteria to determine the relative naturalness of the four blends given there for the two words "boat" and "house"; in fact, this ordering corresponds precisely to our intuition about which of the concepts are more "far out."