CSE 171: User Interface Design: Social and Technical Issues
8. Multimedia Interfaces and the Web

8.1 Media and the Ubiquity of Interfaces

User interface issues are everywhere. A coffee cup is an interface between the coffee and the user; questions like how thick the cup should be, what its volume should be, and whether it should have a handle, are all user interface issues. A book can be considered a user interface to its content; note that a book is interactive, because users turn the pages, and can go to any page they want; they also use indices, glossaries, etc., in an interactive manner. Buildings can be seen as providing interfaces for getting to a certain room, e.g., by using a directory in the lobby, buttons outside and inside elevators, "EXIT" signs, doorknobs, stairways, and even corridors (you make choices with your body - not your mouse). Returning to the obvious, medical instruments have user interfaces (for doctors, nurses, and even patients) that can have extreme consequences if badly designed. By perhaps stretching your mind a bit, almost anything can be seen as a user interface, having its own issues of design and representation. Certainly this is how Andersen views his museum.

Of course, all this is quite parallel to what semiotics says about signs, and indeed such issues can be considered a part of semiotics. The basic idea is to consider an object, such as a cup or a building, as a composite, structured sign. The guidelines for semiotic analysis in Section 7.4 of the class notes apply perfectly well to these less computational examples: You can look at a number of different coffee (and tea) cups, and use their systematic differences and similarities to interpolate the source space and morphisms, and then use that to expose the values of various classes of cup users. You can do the same with buildings. In fact, my claim is that this method applies to anything that humans build and/or use, and that there is no firm distinction between web design and general design.

Here are some further concepts that are useful:

Andersen's museum is a multimedia interactive system (and so is any other museum, though in a more prosaic sense). I would emphasize that the notion of genre is social, whereas that of medium is technical. In discussing real examples, it is difficult (or impossible) to separate media from genres, i.e., to separate the technical from the social, in particular because a medium without conventions for its use will be difficult or impossible to use.

I would also mention that every genre embodies values and ethics. For example, detective novels typically reinforce the values of truth and justice, and more generally, by its very nature, a genre emphasizes certain things at the expense of others, i.e., it expresses values.

8.2 Notes on Andersen's Multimedia Phase-spaces

This paper discusses a very innovative approach to designing multimedia systems, based on concepts from the area called dynamical systems theory. The mathematics is more or less along the same lines found in physics and mechanical engineering, but some details are different, and the applications are completely, and intriguingly, different: concepts like phase space, potential field, gradient, attractor and chaos are being used to tell a story, and to convey values and information. In fact, dynamical systems concepts are on the cutting edge of science and technology in several important areas, one of which is sensors: it turns out that adding a little noise of the right kind can actually make a sensor more sensitive, by perching it on "the edge of chaos" (this is a technical term). Andersen's approach has several significant benefits, one of the most important of which is avoiding pre-programmed linear sequences, such as are found in nearly all current authored products.

I would not promise you that multimedia user interface designers of the future will be using dynamical systems theory, but I do feel confident that interactive multimedia systems, roughly along the lines of Andersen's Viking Museum, will be important in the future; I would guess that there will be home players, in the form of VR rooms, for "playing" interactive multimedia "texts", probably downloaded over the internet, where users can experience many different things, like today's "home theatres" but much more flexible and interesting, perhaps with smell, motion and haptic feedback, in addition to sound and sight. Perhaps some future designers of programs for such devices will be media superstars, like Michael Jackson and Madonna today.

More technically, we can distinguish four levels of description for Andersen's system. The hardware level is at the bottom, with lighting, slide projectors, speakers, amplifiers, and the large video interface (the "Eye of Wodan") with its input devices (which seem to be a mouse and maybe some buttons). Next there is a software level, basically an object oriented program, using C++, standard Apple multimedia applications, and custom code generators, or slightly more technically, an event oriented program with some slightly exotic device drivers. The third level is that of dynamical systems, where we see potential fields over the phase space changing over time, moving the point that describes the state of the room. The fourth level is the most abstract and most interesting, because it contains the most human elements, namely narratives, conflicts, values, and of course information about old Viking life.

The conflicts are important for making the experience interesting to users; as Aristotle said more than two thousand years ago, "drama is conflict." This is one of the most fundamental facets of Western culture; you can see it on TV (ads, sitcoms, "American Idol," even the news), in movies, newspapers, magazines, etc. Not all cultures have this same value system; for example, classical Balinese narratives get their "kick" from a return to their starting point, as can be clearly heard in the cyclic nature of classical Balinese music, e.g., for classical shadow plays. Andersen's way of using phase space dynamics to bring out conflicts in interactive multimedia systems is (in my opinion) brilliant; see his paper for several interesting examples. Values are sometimes conveyed in an interestingly implicit manner. For example, the fact that the Vikings valued adventurousness is conveyed by rewarding users for being adventurous, e.g., giving them new output, which might be bird sounds, story fragments, bits of information, pictures of artifacts, etc.

The programming style is not especially innovative; in fact, it fits a familiar genre of object oriented programming called event oriented programming (or sometimes, event driven programming); but it seems that Andersen and his team were not familiar with this literature. There are also some interesting connections with semiotics that will be discussed later. What I would especially highlight about Andersen's approach is that the story lines are not preprogrammed, but arise from the activation of events when their potential energy gets high enough, through a combination of author programming and user interaction with the system. In fact, it is quite possible for entirely unexpected conjunctions and sequences to occur, some of which might be very interesting and appropriate, others less so. A very nice metaphor for talking about this is the satisfaction of elastic constraints, which can be "pushed against" with greater and greater effort as they become stronger, and eventually may become strict constraints, but meanwhile allow various amounts of freedom of choice.

It would have helped a lot, I think, if Andersen had included the following equation in his paper:

       v(t+1) = v(t) + a + u

where v is the point (vector) in phase space, t is the time, a is the increment provided by the author, and u is the increment provided by the user, noting that both these increments are also computed at time t, and that their values depend on the current state of the system.
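As a rough illustration of the equation above (this is my own sketch, not code from Andersen's system), the update can be written in a few lines of Python. The names step, author_pull, user_push and trajectory are hypothetical, as are the particular increments chosen; the only thing taken from the equation is that a and u are computed from the current state v(t) and the time t, and then added to v(t).

```python
from typing import Callable, List, Tuple

Vector = Tuple[float, ...]
# An increment is computed from the current state v(t) and the time t.
Increment = Callable[[Vector, int], Vector]


def step(v: Vector, t: int, author: Increment, user: Increment) -> Vector:
    """One application of v(t+1) = v(t) + a + u."""
    a = author(v, t)  # increment provided by the author
    u = user(v, t)    # increment provided by the user
    return tuple(vi + ai + ui for vi, ai, ui in zip(v, a, u))


def author_pull(v: Vector, t: int) -> Vector:
    # Hypothetical author increment: pull the state halfway toward the origin.
    return tuple(-0.5 * x for x in v)


def user_push(v: Vector, t: int) -> Vector:
    # Hypothetical user increment: a constant push along the first axis.
    return (1.0,) + (0.0,) * (len(v) - 1)


def trajectory(v0: Vector, steps: int) -> List[Vector]:
    """States v(0), ..., v(steps) under the two increments above."""
    states = [v0]
    for t in range(steps):
        states.append(step(states[-1], t, author_pull, user_push))
    return states
```

With these particular (assumed) increments and the starting point (2.0, 4.0), the first coordinate stays at 2.0, where the user's push exactly balances the author's pull, while the second coordinate decays toward 0. A real system would of course use much richer increments, derived from potential fields over the phase space.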

In the final section, there are what appear to be some excuses, from which one might conclude that the museum was not entirely a success from the point of view of those who paid for it and those who visit it. An "educated guess" says that some users may be confused when they walk in and see that nothing much is happening, and if they are (say) a bit shy about technology, they may not interact with the system enough to get it to do anything, and so will fail to learn anything about Vikings from their visit, and therefore be disappointed, perhaps even angry; the affordances are not perceived affordances. Exercise: Suggest a social solution to this (possible) problem.

It is interesting to notice that many video games employ similar techniques, though their designers do not use the same sophisticated terminology as Andersen.

8.3 On Cognitive Linguistics

Nearly all work on linguistics is concerned with grammar, and insofar as meaning is considered at all, it is usually literal meaning that is treated. In fact, there has not been a lot of progress in grammar during the approximately 2,500 years since Panini's classical grammar for Sanskrit was written; Panini's grammatical formalism is very similar to Chomsky's. However, an important revolution is now occurring, in which many of the more human - and I would say more important - aspects of language are being explored, with fascinating new results and important new applications. Of course grammar is still important, but because researchers in other fields could only make rather limited use of grammar for their applications, they are eagerly adopting the new paradigms, even though the profession of linguistics has been rather slow to respond to this challenge.

Among these new topics, the following seem particularly relevant to this course: metaphors and blending (as discussed below) in the field called cognitive linguistics; the structure and analysis of multi-sentence units in the field called discourse analysis; speech act theory; conversation analysis (in the sense of ethnomethodology), which we have already discussed; and of course semiotics, which today is a dominant theoretical language in studies of film, literature, and media in many academic departments - indeed, semiotics has been called the "mathematics of the humanities" by Peter Bøgh Andersen.

Let's discuss metaphors first, following some brilliant work by George Lakoff, a linguist at UC Berkeley (and by way of full disclosure, I should also say that he is an old and close friend of mine). The usual idea of metaphor is that we speak of one thing in terms of another, often using the words "like" or "as". For example, someone might say

    Word is like a maze. There are so many choices, and it is very easy to get lost. Also sometimes I can't figure out how to backtrack and undo a choice.

Once the basic scheme has been set up with the first sentence, new material can be added that will be interpreted in the same framework, thus enriching our understanding of the speaker's experience, as we constantly refer back to what we already know about mazes.

It is easy to see examples like this in terms of semiotic morphisms. Here the source sign system is for mazes and the target sign system is for Word. Of course, we do this in a way that is only semi-formal, since no one in their right mind would want to write a complete formal sign system for Word! On the other hand, it is easy to give a completely formal sign system for the structural aspect of mazes, as directed graphs with a given start and finish node; so there are sorts for nodes and edges, a constructor that attaches directed edges to nodes, and constants for the start and finish (i.e., goal) nodes. We then see that the start node of the maze maps nicely to the START icon (or some other way to invoke Word) in the lower left corner of the Windows display, and that choices of edges in the maze map to choices of menu items (or keys on the keyboard) in Word. It now follows that paths in a maze map to sequences of actions in Word. All this is completely natural, and readers of the above quote are able to make such connections in mere milliseconds, of course without doing any of the mathematics that we are sketching here; as a result, they can easily understand the use of maze language in further talk about Word. (But see below for discussion of connotation, etc.)
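The structural part of this discussion can be made concrete with a small Python sketch. It is only an illustration under my own assumptions: the class Maze, the function map_path, the tiny example maze, and the particular Word actions in edge_map are all hypothetical, and only a fragment of the morphism is given (nobody should try to write out all of Word). What it does show is the point made above: once edges are mapped to actions, paths in the maze are automatically mapped to sequences of actions in Word.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Tuple

Edge = Tuple[str, str]


@dataclass(frozen=True)
class Maze:
    """Structural sign system for mazes: a directed graph with start and finish."""
    nodes: FrozenSet[str]
    edges: FrozenSet[Edge]
    start: str
    finish: str

    def is_path(self, path: List[str]) -> bool:
        """A path starts at the start node and follows directed edges."""
        return (
            len(path) > 0
            and path[0] == self.start
            and all((a, b) in self.edges for a, b in zip(path, path[1:]))
        )


def map_path(maze: Maze, path: List[str], edge_map: Dict[Edge, str]) -> List[str]:
    """Extend a map on edges to a map from maze paths to action sequences."""
    if not maze.is_path(path):
        raise ValueError("not a path from the start node of the maze")
    return [edge_map[(a, b)] for a, b in zip(path, path[1:])]


# A tiny, made-up maze.
maze = Maze(
    nodes=frozenset({"entrance", "junction", "dead_end", "exit"}),
    edges=frozenset({
        ("entrance", "junction"),
        ("junction", "dead_end"),
        ("junction", "exit"),
    }),
    start="entrance",
    finish="exit",
)

# Hypothetical fragment of the morphism: maze edges map to choices in Word.
edge_map = {
    ("entrance", "junction"): "open the File menu",
    ("junction", "dead_end"): "choose Page Setup",
    ("junction", "exit"): "choose Print",
}

actions = map_path(maze, ["entrance", "junction", "exit"], edge_map)
```

Here actions is the list of the two Word choices corresponding to the successful route through the maze. Note that nothing in this graph structure captures the feeling of being lost; that belongs to the connotations discussed below.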

The duality between sign systems, which provide languages for talking about signs of a certain kind, and their models, shows up in an interesting way in this example. A model for the Word sign system would include traces of particular tasks, such as writing a short business letter that has some bold face characters in it. The goal is then to print the letter, and this goal lies at the end of a long path through a maze of menu choices, mouse movements (including mouse buttons), and keyboard strokes. Our semiotic morphism maps this path, which begins at the Windows START icon, to a much more abstract path through a graph of nodes and edges whose significance in terms of documents has been lost. That is, a semiotic morphism maps the language of its source sign system into the language of the target sign system, and as a result, maps models of the target sign system into models of the source sign system; it is typical that some information is lost under the mapping of models.

It is also interesting and important to notice that there is much more going on here than these simple mathematical transformations. Mazes have connotations as well as an abstract mathematical description. Scholars will know that the original "maze" was an actual physical structure on the island of Crete in ancient times, with a dangerous beast in it, called the Minotaur; in this maze, if you got lost, you might also get killed! And today, even non-scholars know that mazes have an associated feeling-tone that is rather bad, unpleasant, and perhaps even dangerous. For this reason, the above quotation is also a rhetorical gesture, having the effect, which is not explicitly stated, of placing a negative connotation on Word. In fact, imparting connotations is often the real purpose of using a metaphor, and the word "rhetoric" refers to this aspect.

It should not be thought that such connotations lie outside of the semiotic framework that we have been developing. For the semiotic space (called a conceptual space in the cognitive linguistics literature) of mazes can be much richer than the simple graph sign system discussed above, and in particular, it can "recruit" the Minotaur, and anything else that is generally known about mazes in our culture. For example, the above quotation can easily be extended with the following sentences:

    For me, the weird INSERT menu is the Minotaur lurking in the maze of Word. The whole thing has been a very painful experience. I thought I would die.

Since the negative emotional connotation is part of the conceptual space of mazes, it is therefore automatically available to be carried over into talk about Word. This is easily formalized by adding some simple relations to the source sign system.

However, it is not really typical that an extended metaphorical discourse involves just one source sign system; very often there are two, or even more. For example, the word "weird" in the above quote hints at some kind of occult influence, and this hint could easily be expanded and incorporated into the story, for example, as follows:

    Perhaps a voodoo doll of Bill Gates would have saved me, or at least given me some satisfaction.

To understand this kind of language, we need to include another metaphor and another space, for "occult" entities. In fact, this quote, and even the original ones, are better understood in terms of blending the space of mazes with that of Word. For example, the sentence "Also sometimes I can't figure out how to backtrack and undo a choice" in the first quotation uses the word "undo," which comes from the computer world, as well as the word "choice" from the maze world and the word "backtrack," which could be from either. Moreover, the story has constructed several hybrid entities, including Word-as-maze, INSERT-as-Minotaur, and Gates-as-doll, which do not belong in either the maze space or the Word space. So where do they belong?

The blending theory of Gilles Fauconnier and Mark Turner provides an answer: there is a blend space that contains the hybrid entities mentioned above; the mathematical theory of graphs is included in a generic space, consisting of those things shared by the input spaces, which here are for mazes, Word, and the occult. Then the mapping from one input space to another in the Lakoff theory arises as a side-effect of blending, just by seeing which entities from the input spaces get identified in the blend space. Note that these conceptual spaces are not all inclusive "knowledge domains," but are considered to contain just the minimal information needed to understand the situation at hand; however, they are also dynamic, in that they can grow as new language recruits new conceptual content.
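A very crude, set-level sketch in Python may help fix the idea; it is my own simplification, not the formalism of Fauconnier and Turner, and it ignores relations, the occult space, and the dynamic growth of spaces. All names here (blend, generic, recruited, and the entities in the two input spaces) are hypothetical. The generic space is represented by abstract roles, each projected into both input spaces; entities that are identified are fused into hybrids in the blend space, and the Lakoff-style mapping between the inputs is read off afterwards from those identifications.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]


def blend(space_a: Set[str], space_b: Set[str], identified: Set[Pair]) -> Set[str]:
    """Fuse identified entities into hybrids; carry the rest over unchanged."""
    for a, b in identified:
        if a not in space_a or b not in space_b:
            raise ValueError(f"({a}, {b}) is not a pair from the input spaces")
    fused = {f"{b}-as-{a}" for a, b in identified}
    rest_a = space_a - {a for a, _ in identified}
    rest_b = space_b - {b for _, b in identified}
    return fused | rest_a | rest_b


# Minimal (assumed) input spaces for the maze and Word.
maze_space = {"maze", "path", "choice", "Minotaur"}
word_space = {"Word", "action sequence", "menu item", "INSERT", "undo"}

# Generic space: shared graph-like roles, projected into each input space.
generic: Dict[str, Pair] = {
    "whole": ("maze", "Word"),
    "route": ("path", "action sequence"),
    "branch": ("choice", "menu item"),
}

# An identification recruited by the discourse rather than by graph structure.
recruited: Set[Pair] = {("Minotaur", "INSERT")}

identified = set(generic.values()) | recruited
blend_space = blend(maze_space, word_space, identified)

# The mapping from the maze space to the Word space, as a side effect.
induced_mapping = dict(identified)
```

In this toy version, blend_space contains the hybrids Word-as-maze and INSERT-as-Minotaur along with the unfused entity undo, and induced_mapping sends maze to Word and Minotaur to INSERT.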

I hope that all this will encourage you to carefully read the material on blending on pages 18-22 of An Introduction to Algebraic Semiotics, with Applications to User Interface Design, where some other applications are discussed, including finding the meanings of compound words such as "boathouse"; see also the Formal Notation for Conceptual Blending.

8.4 Some General Remarks

Web design has become one of the major career paths for computer science students. One reason is that web design is more difficult to outsource, because interviews, testing, etc. need to be done on site. Another factor is that web technology is evolving enormously quickly, so that recent grads are well positioned because of having been recently educated in the latest ideas, such as the rise of groupware, distributed applications, and CSCW, and the importance of understanding the sociology of users, not just their psychology and physiology. It is sobering to realize that the first proposal for the web (from Tim Berners-Lee) was only in 1989, and that by far the greatest growth has been very recent. Therefore, much of the older web authoring advice is incomplete, site specific, misleading or badly outdated, as is well illustrated by a strange piece by Karp that I found, from the early days of the web. It is also notable that it can be really a lot of work to maintain a website; this effort should not be underestimated when thinking about setting up a site (as happened to me when I undertook to provide these class notes!).

It is interesting to look at the pre-history of the web, which includes early ideas by Vannevar Bush, Ted Nelson, and Douglas Engelbart. Engelbart's work is particularly important, because he had already implemented all the major features of the (so called) personal computer revolution in the mid-60s, including the mouse (on which he holds a patent), windows, menus, and remote connections. Nelson's main contribution seems to have been enthusiasm and colorful terminology, especially "hyperlink" for what we now call links; Vannevar Bush had this idea much earlier. I like the term hyperchaos for what bad hyperlink design can deliver to users (see Towards a Theory of Ethical Linking, by Jeff White, for an example of this).

The fact that the web can effectively support such a wide variety of tasks is good news, but also bad news because it makes design more difficult. It is really amazing that you can find very specific facts (such as popular song lyrics), browse large areas for an overview (e.g. genetics), order any book in print, get bombarded with advertisements, meet new friends, wander at random into areas you never even knew existed, catch up on TV soap opera plots, get the latest headlines, find your homework assignments, make airline and hotel reservations, lose all your money in day trading, order a pizza, and more, all in the same medium. Solid research on web genres could be a valuable resource for designers.

Maintained by Joseph Goguen
© 2000 - 2005 Joseph Goguen, all rights reserved.
Last modified: Fri May 20 14:46:32 PDT 2005