This chapter on direct manipulation is perhaps a bit less boring than previous chapters. Shneiderman says direct manipulation is characterized by the following features: (1) analogical representation; (2) incremental operation; (3) reversibility; (4) physical action instead of syntax; (5) immediate visibility of results; and (6) graphic form (p.229). Regarding the limitation to visibility in (5) and graphics in (6), it seems to me that representations can involve other senses than just sight.
I would especially emphasize points (1) and (4); see also Shneiderman's discussion of analogy in the last paragraph of section 6.3.1 (p.205). Leibniz, who was no doubt thinking of mathematical notation, puts it very well (quoted on p.185 of Shneiderman):
In signs, one sees an advantage for discovery that is greatest when they express the exact nature of a thing briefly and, as it were, picture it; then, indeed, the labor of thought is wonderfully diminished.

A good example is the difference in plane geometry between doing proofs with diagrams and doing them with axioms (see p.203).
For this class, it is very interesting to notice that semiotics gives a deeper insight into the nature of direct manipulation, seeing it as an indexicality of motion, often reinforced by a specific kind of iconicity, called diagrammatic iconicity by Peirce, in which the geometric structure of the sign corresponds to the structure of its object. Good examples are geographic information systems, where the structure of a computer graphics map corresponds to the structure of some part of the surface of the earth. In this chapter, Shneiderman systematically confuses the essentially semiotic nature of direct manipulation with the technologies (or in more semiotic terms, "media") that may be used to implement it. Having a clear semiotic conception of direct manipulation allows us to avoid this error. For example, it is perfectly possible to have a virtual reality interface to plain old DOS, complete with a haptic clicking keyboard and a virtual VT100 screen with bright glowing green characters floating in space in front of you!
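The GIS example can be made concrete with a small sketch. The idea of diagrammatic iconicity is that the map is a structure-preserving image of the territory: relations among icons on the screen mirror relations among places on the ground. The following toy code (my own illustration; the place names, coordinates, and function names are all invented, not from Shneiderman) maps geographic coordinates to screen pixels by a uniform similarity transform, so that ratios of distances are preserved and "nearer on the map" reliably means "nearer on the ground":

```python
def to_screen(lon, lat, scale=10.0, origin=(-120.0, 30.0)):
    """Map (longitude, latitude) to (x, y) screen pixels by a similarity
    transform; uniform scaling preserves the geometric structure."""
    x = (lon - origin[0]) * scale
    y = (lat - origin[1]) * scale
    return (x, y)

# Three hypothetical landmarks, as (longitude, latitude) pairs.
places = {
    "harbor":  (-117.20, 32.70),
    "campus":  (-117.23, 32.88),
    "airport": (-117.19, 32.73),
}

# The "sign": the layout of icons on the screen.
screen = {name: to_screen(*coord) for name, coord in places.items()}

def dist(p, q):
    """Euclidean distance between two points."""
    return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5

# Diagrammatic iconicity: ratios of distances in the sign equal ratios of
# distances in the object, so spatial inference on the map is sound.
geo = dist(places["harbor"], places["airport"]) / dist(places["harbor"], places["campus"])
scr = dist(screen["harbor"], screen["airport"]) / dist(screen["harbor"], screen["campus"])
assert abs(geo - scr) < 1e-9
```

A non-uniform or discontinuous transform would break the assertion above; that is exactly the point at which a display stops being diagrammatically iconic and becomes merely symbolic.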
The remark (p.197) that in computer games, machine generated messages are likely to be more annoying than helpful, and that users prefer feedback like a high score display, is suggestive, and can be explained by CSCW principles. Slider controls are mentioned in several places (p.202, p.214) and probably are not well enough known. The principles of virtuality and transparency (p.202) are both important. The list of problems with direct manipulation on p.204 is very important; please read it at least three times. The "tatami" project in my lab ran into some of these problems with a direct manipulation interface that we built for proofs; it turned out that displaying the proof tree was useless for large proofs, because of the size and homogeneity of the display.
Piaget's work is in many ways outdated, so the material on page 207 should be taken with some grains of salt. The discussion of WIMP (p.207) is amusing but not very substantive. The guidelines for icon design (pp.208-9) are superficial but suggestive. Some of the remarks about emacs do not do justice to that amazingly flexible tool (p.210).
The last paragraph of section 6.5 (p.213) is interesting in connection with Lanier's piece; please reread Lanier, and see whether you think Shneiderman might not be saying the same thing in his very different way. The remark about notations for representing relations among residents of a home (p.217) seems a bit off the wall, but is interesting to think about from an ethical point of view; consider Bill Gates's house. The material on virtual environments may sound far out now, but I believe it should be taken seriously because it will become increasingly important (pp.221-228). It is interesting to note that the term "virtual reality" and the data glove (p.227) were invented by Lanier. The contrast between "immersive" and "looking-at" experience (bottom p.222) is interesting. Augmented reality (p.225) already has important industrial applications (e.g., at Boeing) and no doubt will have more. Situatedly aware shopping carts (p.225) do not appeal to me.
This short chapter contains several statements that strongly support the CSCW position, such as
The introversion and isolation of early computer users has given way to lively online communities of busily interacting dyads and bustling crowds of chatty users. (p.478)

The first sentence above is the first sentence of this chapter, and the second is the first sentence of the practitioner's summary. But Shneiderman does not seem to have realized how far-reaching the implications of such statements really are. Moreover, the first sentence is a very incomplete characterization of how computers are used today; consider for example the very rapidly expanding applications to commerce.
Computing has become a social process. (p.502)
Teams of people often work together and use complex shared technology. (p.494)
Nevertheless, there are valuable things in this chapter. The four-element table on p.481 can be a basic starting point for considering any communication system. The short list of things that can cause cooperative systems to fail (p.481) should be noted, and read several times, though it is incomplete:
disparity between who does the work and who gets the benefit; threats to existing political power structures; insufficient critical mass of users who have convenient access; violation of social taboos; and rigidity that counters common practice or prevents exception handling.

Today there are many case studies supporting these (and other) points, and much could and should be written.
The Coordinator (by Flores, Winograd, et al.) is mentioned on p.484, but without citing the CSCW papers that analyze why it failed. One major reason is that people very often want to mitigate speech acts in order not to seem rude (see the online notes on speech acts), and having explicit speech act labels on messages forces them either to seem rude or to risk being misunderstood. In fact, many users really hated the system, despite having undergone extensive "training" in its use.
"The dangers of junk electronic mail, ...users who fail to be polite, ... electronic snoopers, ... calculating opportunists" are mentioned (p.485) but the importance of these difficult social issues are far from being sufficiently emphasized. For synchronous distributed systems, ownership and control are noted as important concerns (p.488) but without much analysis. Much more should be said on both of these topics. MUDs are mentioned (p.490) but their growing routine use in business settings is not noted. The potential problems with video on p.491 should be noted: slow session startup and exit; difficulty of identifying speakers (the issue here is not just "distracting audio"); difficulty of making eye contact; changed social status; small image size; and potential invasion of privacy. The importance of a good audio channel is discussed on p. 493, but without saying why it is important. In fact, audio seems to provide the context within which video is interpreted, as can be verified by observing how sound tracks are used in movies.
The table on p.503 is very useful, but should have been discussed in some detail; also, a lot is missing. The pluses and minuses of anonymity in various situations are discussed in several places and can be very important; much more should have been said here.
One reason this chapter finally comes to seem rather unsatisfying is suggested by some sentences at the very end (p.504), one of which is:
Although user-interface design of applications will be a necessary component, the larger and more difficult research problems lie in studying the social processes.

Thus we see that Shneiderman considers user interface design separate from the social processes that it is supposed to support. This is precisely the disastrous attitude that has led to so many failures in developing large and/or complex systems.
One of the most important insights of Peirce, which does not often seem to be emphasized in the literature that discusses his work, is that meaning is relational, not just denotational, and in particular is generally highly dependent on context. There is also an important insight of Saussure that, in my opinion, has not been sufficiently emphasized, that sign systems are organized by systematic differences among signs; we can relate this to a famous saying of Gregory Bateson, that "information is a difference that makes a difference." Algebraic semiotics attempts to combine these two major insights (among others) into a precise formalism that can be applied to the practical engineering of sign systems, e.g., in user interface design.
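The two insights above can be illustrated with a deliberately simple sketch (my own toy example, not Goguen's actual formalism of algebraic semiotics): meaning is relational, so interpretation takes both a sign and a context; and a sign is informative only by virtue of its systematic differences from the other signs in the system. All the signs, contexts, and meanings below are invented for illustration:

```python
# A small sign system: the signs only mean anything through their
# mutual contrasts (Saussure), and interpretation is context-dependent
# (Peirce: meaning is relational, not just denotational).
SYSTEM = {"red", "yellow", "green"}

def interpret(sign, context):
    """Meaning depends on the (sign, context) pair, not on the sign alone."""
    meanings = {
        ("red",    "traffic"): "stop",
        ("yellow", "traffic"): "caution",
        ("green",  "traffic"): "go",
        ("red",    "finance"): "loss",
        ("green",  "finance"): "gain",
    }
    return meanings.get((sign, context), "undefined")

def informative(sign):
    """Bateson: information is a difference that makes a difference --
    a sign with no contrasting alternatives carries no information."""
    return sign in SYSTEM and len(SYSTEM) > 1

assert interpret("red", "traffic") == "stop"
assert interpret("red", "finance") == "loss"   # same sign, different context
assert informative("green")
```

The same sign ("red") receives different interpretations in different contexts, and deleting all but one sign from SYSTEM would leave the survivor uninformative; both points are trivial here, but they are exactly what a precise formalism for sign systems has to capture at scale.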
One very basic insight that this formalism incorporates is that signs need not be the simple little things that we usually call "signs," but instead can be very complex, such as a book, or a series of books, or even a whole library; or a movie or series of movies. A further insight that algebraic semiotics pursues is the importance of studying errors, that is, badly designed sign systems, in order to better understand what it means for a sign system to be well designed.