CSE 271: User Interface Design: Social and Technical Issues
8. The Web and Multimedia Interfaces

Web design has become one of the major career paths for computer science students. But the web is evolving so fast that parts of Chapter 16 of Shneiderman are out of date. The chapter is still impressive, considering that the book was published in 1997, and it is certainly worth reading, because it makes explicit how important user interface design issues are for the web, and it connects well to material discussed earlier in the book and in this course. Most computer science students already know a lot about the web, but may not know much about these important connections.

The early historical information on Vannevar Bush, Ted Nelson, and Douglas Engelbart is good, but I would certainly give much more emphasis to Engelbart's work, because he had already implemented all the major features of the (so called) personal computer revolution in the mid-60s; Nelson's main contribution seems to have been colorful terminology. It is sobering to realize that the first proposal for the world wide web (by Tim Berners-Lee) was only in 1989, and that by far the greatest growth has been recent. I like the term hyperchaos (p.556) for what bad hyperlink design can deliver to users (see Towards a Theory of Ethical Linking, by Jeff White, for an example of this).

Maybe I'm misunderstanding him, but it seems to me that two of Shneiderman's "Golden Rules of Hypertext" (p.556) are obviously wrong:

In fact, the website for this class is a counterexample to both those rules, is it not?

The short list of poor design practices on p.556 is good:

too many links, long chains of links, too many long or dull pages, and inadequate tables of contents (or other overviews).

The list of features for web authoring is also good (though dull and heterogeneous). The design "tips" on pp.558-9 are very useful. It is very important not to forget (p.557) that

You are not a good judge of your own design.

Shneiderman's remark (p. 561) that much web authoring advice is incomplete, site specific, misleading, or outdated, but still useful, is illustrated by Shneiderman's own book, as well as by a strange piece by Karp that I found, from the early days of the web.

It would certainly be useful to have a lot more work on the genres of websites, but the categorizations given in Section 16.4 are better than nothing. We should never forget the importance of

Identifying the user's tasks. (p.566)

The fact that the web can effectively support such a wide variety of tasks is good news, but also bad news because it makes design more difficult. It is really amazing that you can find very specific facts (such as popular song lyrics), browse large areas for an overview (e.g. genetics), order any book in print, get bombarded with advertisements, meet new friends, wander at random into areas you never even knew existed, catch up on TV soap opera plots, get the latest headlines, find your homework assignments, make airline and hotel reservations, lose all your money in day trading, order a pizza, and more, all in the same medium. The demographic information on p.566 is out of date, due to the incredibly rapid growth of the web.

As he says, Shneiderman's "OAI model" (p.567) is rather limited, but useful, and it is interesting to note that it can be seen as highlighting certain aspects of a semiotic morphism, namely the objects and actions in the source sign system, the "handles" (which are perceived affordances, in the sense of Norman) in the target sign system, and the metaphors of the morphism itself. Shneiderman mentions structure in the source objects, and later gives a brief list of possible structures for websites, but it is very limited, and moreover, any combination of them could potentially occur; we know that sign systems allow for such possibilities. Shneiderman also gives a list of possible metaphors, but these too are only a sampling of what is possible, and of course, they too can be described by semiotic morphisms.

The list of information aggregation methods (p.568) is good, as is the list of metaphors for interface objects (p.570). Due to the work we have done in this class, we can say much more about metaphor, but the following quote is worthy of thought in connection with semiotic morphisms, in part because I think it is not quite right (especially the last phrase):

The metaphor needs to be useful in presenting high-level concepts, appropriate for expressing middle-level objects, and effective in suggesting pixel-level details (p.570).

I like the main points in Section 16.6.5 on webpage design. The idea of query preview based on a table of contents is very good (p.573). The remark (p.575) that breadth is usually better than depth for a tree organization of information can be very helpful, under the assumption that the information does not already have such a fixed structure that you have no choice about how to organize it. (There is a nice way to explain the recommendation based on the preservation properties of semiotic morphisms.)
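
As a rough illustration of the breadth-versus-depth remark, the sketch below counts how many levels a user must descend to reach any one of n pages when every index page offers the same number of links. The function name levels_needed and the idealized, perfectly balanced tree are assumptions made here for illustration; they do not come from Shneiderman.

```python
def levels_needed(n_pages: int, branching: int) -> int:
    """Smallest depth d such that branching**d >= n_pages (balanced tree assumed)."""
    if n_pages < 1 or branching < 2:
        raise ValueError("need n_pages >= 1 and branching >= 2")
    depth = 0
    reach = 1  # number of pages reachable at the current depth
    while reach < n_pages:
        reach *= branching
        depth += 1
    return depth


# Compare narrow-and-deep against broad-and-shallow for the same site size.
for links_per_page in (2, 4, 8, 16):
    print(links_per_page, levels_needed(4096, links_per_page))
```

For 4096 pages, two links per page require 12 levels, while sixteen links per page require only 3. The sketch ignores the cost of scanning a longer menu, which is the other side of the tradeoff.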

The "traditional graphic design rules" listed on pp.577-578 are also really good, and it's quite safe for web designers to rely on these since they are already familiar to users. Section 16.6.6 again emphasizes the need to know your users. An important point that Shneiderman should have emphasized more in this section is that maintaining a website can be a great deal of work; this effort should not be underestimated when thinking about setting up a site (as happened to me when I undertook to provide these class notes!).

The Practitioner's Summary (p.580) is well worth reading more than twice. The negative remarks about the value of controlled experiments on p.581 are refreshingly candid.

Media and the Ubiquity of Interfaces.

User interface issues are everywhere. A coffee cup is an interface between the coffee and the user; questions like how thick the cup should be, what its volume should be, and whether it should have a handle, are all user interface issues. A book can be considered a user interface to its content; note that a book is interactive, because users turn the pages, and can go to any indicated page they want; they also use indices, glossaries, etc., in an interactive manner. Buildings can be seen as providing interfaces for getting to a certain room, e.g. by using a directory in the lobby, buttons outside and inside elevators, "EXIT" signs, doorknobs, stairways, and even corridors (you make choices with your body - not your mouse). Returning to the obvious, medical instruments have user interfaces (for doctors, nurses, and even patients) that can have extreme consequences if badly designed. By perhaps stretching your mind a bit, almost anything can be seen as a user interface, having its own issues of design and representation. Certainly this is how Andersen views his museum.

Of course, all this is quite parallel to what semiotics says about signs, and indeed such issues can be considered a part of semiotics, although the notion of semiotic morphism is often needed in making the translation. The basic idea is to consider an object, such as a cup or a building, as a composite sign. Here are some further concepts that are useful:

Andersen's museum is a multimedia interactive system (and so is any other museum, though in a more prosaic sense). I would like to emphasize that the notion of genre is social, whereas that of medium is technical. In discussing real examples, it is difficult (or impossible) to separate media from genres, i.e., to separate the technical from the social, in particular because a medium without conventions for its use will be difficult or impossible to use.

I would also mention that every genre embodies values and ethics. For example, detective novels typically reinforce the values of truth and justice, and more generally, by its very nature, a genre emphasizes certain things at the expense of others, i.e., it expresses values.

Notes on Andersen's Multimedia Phase-spaces

This paper discusses a very innovative approach to designing multimedia systems, based on concepts from the area called dynamical systems theory. The mathematics is more or less along the same lines found in physics and mechanical engineering, but some details are different, and the applications are completely, and intriguingly, different: concepts like phase space, potential field, gradient, attractor and chaos are being used to tell a story, and to convey values and information. In fact, dynamical systems concepts are on the cutting edge of science and technology in several important areas, one of which is sensors: it turns out that adding a little noise of the right kind can actually make a sensor more sensitive, by perching it on "the edge of chaos" (this is a technical term). Andersen's approach has several significant benefits, one of the most important of which is avoiding pre-programmed linear sequences, such as are found in nearly all current authored products.

I would not promise you that multimedia user interface designers of the future will be using dynamical systems theory, but I do feel confident that interactive multimedia systems, roughly along the lines of Andersen's Viking Museum, will be important in the future; I would guess that there will be home players, in the form of VR rooms, for "playing" interactive multimedia "texts", probably downloaded over the internet, where users can experience many different things, like today's "home theatres" but much more flexible and interesting, perhaps with smell, motion and haptic feedback, in addition to sound and sight. Perhaps some future designers of programs for such devices will be media superstars, like Michael Jackson and Madonna today.

More technically, we can distinguish four levels of description for Andersen's system. The hardware level is at the bottom, with lighting, slide projectors, speakers, amplifiers, and the large video interface (the "Eye of Wodan") with its input devices (which seem to be a mouse and maybe some buttons). Next there is a software level, basically an object oriented program, using C++, standard Apple multimedia applications, and custom code generators, or slightly more technically, an event oriented program with some slightly exotic device drivers. The third level is that of dynamical systems, where we see potential fields over the phase space changing over time, moving the point that describes the state of the room. The fourth level is the most abstract and most interesting, because it contains the most human elements, namely narratives, conflicts, values, and of course information about old Viking life.

The conflicts are important for making the experience interesting to users; as Aristotle said more than two thousand years ago, "drama is conflict." This is one of the most fundamental facets of Western culture; you can see it on TV (ads, sitcoms, even the news), in movies, newspapers, magazines, etc., etc. Not all cultures have this same value system; for example, classical Balinese narratives get their "kick" from a return to their starting point, as can be clearly heard in the cyclic nature of classical Balinese music, e.g., for classical shadow plays. Andersen's way of using phase space dynamics to bring out conflicts in interactive multimedia systems is (in my opinion) brilliant; see his paper for several interesting examples. Values are sometimes conveyed in an interestingly implicit manner. For example, the fact that the Vikings valued adventurousness is conveyed by rewarding users for being adventurous, e.g., giving them displays, which might be bird sounds, story fragments, bits of information, pictures of artifacts, etc.

The programming style is not especially innovative; in fact, it fits a familiar genre of object oriented programming called event oriented programming (or sometimes, event driven programming); but it seems that Andersen and his team were not familiar with this literature. There are also some interesting connections with semiotics that will be discussed later. What I would especially highlight about Andersen's approach is that the story lines are not preprogrammed, but arise from the activation of events when their potential energy gets high enough, through a combination of author's programming and user interactions with the system. In fact, it is quite possible for entirely unexpected conjunctions and sequences to occur, some of which might be very interesting and appropriate, others less so. A very nice way to talk about this is through the satisfaction of elastic constraints, which can be "pushed against" with greater and greater effort as they become stronger, and eventually may become strict constraints, but meanwhile allow various amounts of freedom of choice.
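
The activation mechanism just described can be sketched in a few lines. This is only a minimal illustration, not Andersen's code: the class Event, the threshold values, and the event names are all hypothetical, and resetting the potential after an event fires is an assumption made here.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    threshold: float        # potential needed before the event fires (hypothetical values)
    potential: float = 0.0  # accumulated "potential energy"

    def push(self, amount: float) -> bool:
        """Add potential; return True if the event fires (its potential is then reset)."""
        self.potential += amount
        if self.potential >= self.threshold:
            self.potential = 0.0
            return True
        return False


def run(events, pushes):
    """Apply (event_name, amount) pushes in order; return the names of fired events."""
    by_name = {e.name: e for e in events}
    fired = []
    for name, amount in pushes:
        if by_name[name].push(amount):
            fired.append(name)
    return fired


events = [Event("bird_song", 1.0), Event("saga_fragment", 2.0)]
pushes = [
    ("bird_song", 0.5),      # contribution from the author's programming
    ("saga_fragment", 1.0),  # contribution from a user interaction
    ("bird_song", 0.5),      # another user interaction
    ("saga_fragment", 1.0),  # another author contribution
]
order = run(events, pushes)
print(order)
```

No sequence of displays is written down anywhere in this sketch; the order in which events fire depends entirely on how the author's and the user's pushes happen to interleave.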

It would have helped a lot, I think, if Andersen had included the following equation in the text:

       v(t+1) = v(t) + a + u

where v is the point (vector) in phase space, t is the time, a is the increment provided by the author, and u is the increment provided by the user, noting that both these increments are also computed at time t, and that their values depend on the current state of the system.
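
A minimal sketch of this update rule in a two-dimensional phase space follows. Everything beyond the equation itself is an assumption made for illustration: the attractor at (1.0, 0.0), the gain of 0.5, the way the user's push is damped by the current state, and all the function names are hypothetical and are not taken from Andersen's paper.

```python
def author_increment(v):
    """Hypothetical author field: pull the state halfway toward an attractor."""
    attractor = (1.0, 0.0)  # assumed location, for illustration only
    return tuple(0.5 * (target - x) for x, target in zip(v, attractor))


def user_increment(v, action):
    """Hypothetical user push along the second axis, weaker as v[1] grows."""
    return (0.0, action * (1.0 - v[1]))


def step(v, action):
    """One application of v(t+1) = v(t) + a + u, with a and u computed from v(t)."""
    a = author_increment(v)
    u = user_increment(v, action)
    return tuple(x + da + du for x, da, du in zip(v, a, u))


v0 = (0.0, 0.0)
v1 = step(v0, action=1.0)  # the user interacts
v2 = step(v1, action=0.0)  # the user is idle; only the author's field acts
print(v1, v2)
```

Both increments are functions of the current state v(t), as the text requires; when the user does nothing, the author's field alone moves the point.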

In the final section, there are what appear to be some excuses, from which one might conclude that the museum was not entirely a success from the point of view of those who paid for it and those who visit it. An "educated guess" says that some users may be confused when they walk in and see that nothing much is happening, and if they are (say) a bit shy about technology, they may not interact with the system enough to get it to do anything, and so will fail to learn anything about Vikings from their visit, and therefore be disappointed, perhaps even angry. Can you suggest a social solution to this (possible) problem?

It is interesting to notice that many video games employ similar techniques, though the designers do not use the same sophisticated terminology.

Remarks on Linguistics

Nearly all work on linguistics is concerned with grammar, and insofar as meaning is considered at all, it is usually literal meaning that is treated. In fact, there has not been a lot of progress in grammar during the approximately 2,500 years since Panini's classical grammar for Sanskrit was written. However, an important revolution is now occurring, in which many of the more human - and I would say more important - aspects of language are being explored, with fascinating new results and important new applications. Of course grammar is still important, but because researchers in other fields could only make rather limited use of grammar for their applications, they are eagerly adopting the new paradigms, even though the profession of linguistics has been rather slow to respond to this challenge.

Among these new topics, the following seem particularly relevant to this course: metaphors and blending (as discussed below) in the field called cognitive linguistics; the structure and analysis of multi-sentence units in the field called discourse analysis; speech act theory and conversation analysis (in the sense of ethnomethodology), which we have already discussed; and of course semiotics, which today is a dominant theoretical language in studies of film, literature, and media in many academic departments - indeed, semiotics has been called the "mathematics of the humanities" by Peter Bøgh Andersen.

Let's discuss metaphors first, following some brilliant work by George Lakoff, a linguist at UC Berkeley (and by way of full disclosure, I should also say that he is an old and close friend of mine). The usual idea of metaphor is that we speak of one thing in terms of another, often using the words "like" or "as". For example, someone might say

Word is like a maze. There are so many choices, and it is very easy to get lost. Also sometimes I can't figure out how to backtrack and undo a choice.

Once the basic scheme has been set up with the first sentence, new material can be added that will be interpreted in the same framework, thus enriching our understanding of the speaker's experience, as we constantly refer back to what we already know about mazes.

It is easy to see examples like this in terms of semiotic morphisms. Here the source sign system is for mazes and the target sign system is for Word. Of course, we do this in a way that is only semi-formal, since no one in their right mind would want to write a complete formal sign system for Word! On the other hand, it is easy to give a completely formal sign system for mazes: they are just directed graphs with a given start and finish node; so there are sorts for nodes and edges, a constructor that attaches directed edges to nodes, and constants for the start and finish (i.e., goal) nodes. We then see that the start node of the maze maps nicely to the START icon in the lower left corner of the Windows display, and that choices of edges in the maze map to choices of menu items (or keys on the keyboard) in Word. It now follows that paths in a maze map to sequences of actions in Word. All this is completely natural, and readers of the above quote are able to make these connections in mere milliseconds, of course without doing any of the mathematics that we are sketching here; as a result, they can easily understand the use of maze language in further talk about Word.
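
The maze sign system and the mapping into Word can be sketched as follows. This is a semi-formal illustration only: the class Maze, the node and edge names, and the particular Word actions in edge_to_action are hypothetical choices made here, not part of any published formalization.

```python
from dataclasses import dataclass, field


@dataclass
class Maze:
    """A directed graph with distinguished start and goal nodes."""
    start: str
    goal: str
    edges: dict = field(default_factory=dict)  # edge name -> (from_node, to_node)

    def add_edge(self, name, src, dst):
        """Constructor that attaches a directed edge to two nodes."""
        self.edges[name] = (src, dst)

    def is_path(self, edge_names):
        """True if the edges chain together from start to goal."""
        here = self.start
        for name in edge_names:
            src, dst = self.edges[name]
            if src != here:
                return False
            here = dst
        return here == self.goal


maze = Maze(start="start", goal="goal")
maze.add_edge("e1", "start", "n1")
maze.add_edge("e2", "n1", "goal")
maze.add_edge("e3", "n1", "start")  # an edge that leads back to the start

# Hypothetical translation of maze signs into Word signs.
node_to_word = {"start": "START icon"}
edge_to_action = {
    "e1": "open the File menu",
    "e2": "choose Print",
    "e3": "press Undo",
}


def translate(path, mapping):
    """Map a path in the maze to a sequence of actions in Word."""
    return [mapping[edge] for edge in path]


print(translate(["e1", "e2"], edge_to_action))
```

Because edges map to actions one at a time, a path of edges automatically maps to a sequence of actions, which is the point made in the paragraph above.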

The duality between sign systems, which provide languages for talking about signs of a certain kind, and their models shows up in an interesting way in this example. A model for the Word sign system would be the trace of some particular task, such as writing a short business letter that has some bold face characters in it. The goal is then to print the letter, and this goal lies at the end of a long path through a maze of menu choices, mouse movements (including mouse buttons), and keyboard strokes. Our semiotic morphism maps this path, which begins at the Windows START icon, to a much more abstract path through a graph of nodes and edges whose significance in terms of documents has been lost. That is, a semiotic morphism maps the language of its source sign system into the language of the target sign system, and as a result, maps models of the target sign system into models of the source sign system; it is typical that some information is lost under the mapping of models.
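
The loss of information under the mapping of models can be made concrete with a small sketch. The traces, the table choice_to_edge, and the function to_maze_path are all hypothetical, invented here for illustration: each step of a Word trace records the choice made and its effect on the document, and the mapping keeps only the choice.

```python
# Hypothetical table: each kind of choice in Word corresponds to one maze edge.
choice_to_edge = {
    "Start > Word": "e0",
    "type text": "e1",
    "Format > Bold": "e2",
    "File > Print": "e3",
}


def to_maze_path(trace):
    """Map a Word trace to an abstract maze path, forgetting document effects."""
    return [choice_to_edge[choice] for choice, _effect in trace]


# Two different business letters, written with the same sequence of choices.
trace_a = [
    ("Start > Word", "blank document opened"),
    ("type text", "Dear Ms. Lee, ..."),
    ("Format > Bold", "'Dear' set in bold"),
    ("File > Print", "letter printed"),
]
trace_b = [
    ("Start > Word", "blank document opened"),
    ("type text", "Dear Mr. Ortiz, ..."),
    ("Format > Bold", "'Ortiz' set in bold"),
    ("File > Print", "letter printed"),
]

print(to_maze_path(trace_a))
print(to_maze_path(trace_a) == to_maze_path(trace_b))
```

The two traces are different models of the Word sign system, yet they are sent to the same maze path, so the original letters cannot be recovered from the path alone.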

It is also interesting and important to notice that there is more going on here than these simple mathematical transformations. Mazes have a connotation as well as a literal mathematical description. Scholars will know that the original "maze" was an actual physical structure on the island of Crete in ancient times, with a dangerous beast in it, called the Minotaur; in this maze, if you got lost, you might also get killed! And today, even non-scholars know that mazes have an associated feeling-tone that is rather bad, unpleasant, and perhaps even dangerous. For this reason, the above quotation is also a rhetorical gesture, having the effect, which is not explicitly stated, of placing a negative connotation on Word. In fact, imparting connotations is often the real purpose of using a metaphor, and the word "rhetoric" refers to this aspect.

It should not be thought that such connotations lie outside of the semiotic framework that we have been developing. For the conceptual space of mazes is much richer than the simple graph sign system discussed above, and in particular, it includes the Minotaur, and anything else that is generally known about mazes in our culture. For example, the above quotation can easily be extended with the following sentences:

For me, the weird INSERT menu is the Minotaur lurking in the maze of Word. The whole thing has been a very painful experience. I thought I would die.

Since the negative emotional connotation is part of the conceptual space of mazes, it is therefore automatically available to be carried over into talk about Word. This is easily formalized by adding some simple relations to the source sign system.

However, it is not really typical that an extended metaphorical discourse involves just one source sign system; very often there are two, or even more. For example, the word "weird" in the above quote hints at some kind of occult influence, and this hint could easily be expanded and incorporated into the discussion, for example, as follows:

Perhaps a voodoo doll of Bill Gates would have saved me, or at least given me some satisfaction.

To understand this kind of language, we need to blend two different metaphors. And in fact, even the original quotes are perhaps better understood in terms of blending the language of mazes with that of Word. For example, the sentence "Also sometimes I can't figure out how to backtrack and undo a choice" in the first quotation uses the word "undo", which comes from the computer world, as well as the word "choice" from the maze world and the word "backtrack", which could be from either.

All this should serve as motivation to carefully read the material on blending on pages 18-22 of An Introduction to Algebraic Semiotics, with Applications to User Interface Design, where some other applications are discussed, including finding the meanings of compound words such as "boathouse"; see also the Formal Notation for Conceptual Blending.

Maintained by Joseph Goguen
© 2000 - 2003 Joseph Goguen, all rights reserved.
Last modified: Sun Jun 1 11:20:21 PDT 2003