CSE 271: User Interface Design: Social and Technical Issues
Notes for Eighth Meeting (25 Feb 98)
Notes on Chapter 15 of Shneiderman

This chapter mainly discusses database interfaces, with an emphasis on graphics and dynamic queries (cf p.518), a form of direct manipulation. (The kind of visualization done in scientific and engineering applications like aerodynamic flow over a wing is not discussed at all.) His (implicit) criticisms (pp.515ff) of the popular web search engines are interesting and insightful (and perhaps commercially motivated), and the pictures are nice, especially the color pictures. The concise summary of database terminology (p.512) is good, especially if you never had a database course. As usual with Shneiderman's taxonomies, the classification of task actions (pp.512-3) is suggestive but vague. However, I do like his four-phase framework (p.516), and he's also right about history buffers (p.518).

Some of these data visualization schemes seemed pretty obscure to me; did anyone understand Figures 15.5, 15.11 or 15.16? The parallel coordinates idea is great - but only if there aren't too many lines (Fig 15.9); I also like hyperbolic trees (Fig 15.14) and treemaps (Fig 15.15). Shneiderman's explanation of why AND and OR are hard for users is right on the mark (p.542). What he calls "collaborative filtering" is part of what Ackerman does much more nicely in his Answer Garden paper. I was really surprised to see what may be a joke (however lame), on p.523.


Notes on Answer Garden 2: Merging Organizational Memory with Collaborative Help by Mark Ackerman

I'd like to note that this paper is an example of an important phenomenon, namely integrating user communities with their computer systems. The old view of databases envisions a single user with a well formed query about a well understood and well structured collection of data. The new view envisions a community of users with (at least some) shared goals, who add information, share information, restructure information, and help each other in various ways to deal with a possibly huge, poorly structured and badly understood mass of data. Designing such a system is far from purely technical, because it requires understanding the user community's structure and needs. Moreover, we now know that successful systems are going to evolve, because their users' needs will evolve (among other things); therefore systems should be designed from the beginning to support such evolution. Iterative prototyping, user scenarios (more generally "use cases"), usability testing, interviews, etc. should be used. And this is exactly what Ackerman did in moving from the first to the second version of Answer Garden.

Not only that, but the same kind of expansion of horizon is happening in many other areas of computer science, as people come more and more to realize that systems exist and must function within a social context, and that they can draw on that context to improve their operation in various ways. Ackerman's reconceptualization of a help system as a collective memory system illustrates the kind of rethinking that is going on in many areas. This trend is highly consistent with recent trends in software engineering, including generic architectures, modularization, libraries of reusable code, plug-ins, etc.

Some specific techniques used in AG2 include taking advantage of locality, an answer escalation hierarchy, anonymization, an engine to find experts, a statistics collection service, and support for collaborative authoring. All of these have applications in many other areas, though they are far from exhausting the range of useful devices.


Notes on Chapter 6 of Latour

This chapter focuses much more on technical issues, and software finally makes its entrance; it is interesting that there is also significantly more drama in the writing (e.g., Norbert goes on sick leave after encountering Frankenstein and his monster in a dream) and that the jokes tend more towards literary allusions. We get some good reminders of past techniques:

Follow the actors. (p.204)

Although charged by humanists with the sin of being "simply" efficient ..., "totally" devoid of goals, mechanisms nevertheless absorb our compromises, our desires, our spirit, and our morality - and silence them. (p.206)

Technology is sociology extended by other means. (p.210)

(The last line is a humorous variation on a famous quote from von Clausewitz's book on war.) The explanation for the paired cars finally appears (p.206), with much other interesting engineering detail, such as the adjustable mobile sector or CMD (p.209), the design document classification scheme (p.216), redundant coding (p.225), details of the CET track (p.228) and onboard control (p.229), overall system design (p.231), how speed and distance are determined with reliability (pp.234-46), communication protocols (pp.237-9), and more, all interlaced with an astonishing identification of the student with Aramis, and a grumpy dictatorial personification of Norbert as a controller of the system. Did anyone figure out what "phonic wheels" are? The bad dream is pp.248-50, and I suppose the document on p. 248 is a joke of sorts (at the expense of computer scientists, as so often happens). There's also more on love at the end of this chapter; I'm beginning to think he might mean it.

Returning to the sociology of technology, there is more on how objects only exist when they hold humans and nonhumans together on pp.212-3, and more about how spirit and matter mingle on pp.222-3. There is a wonderful discussion of metaphors (called "projections") on pp.225-7.

If you haven't read Mary Shelley's Frankenstein, I hope this chapter will motivate you to do so; it really is a classic, foreshadowing much of our late 20th century sensibility.


Notes on Class Discussion

A paper can be viewed as a user interface. As such, my paper Semiotic Morphisms was designed for a completely different user community than those who are taking CSE 271, so we will all have to do some translation work to make this communication succeed.

Let's first go over the parts of Definition 1, of a sign system, using some simple examples, starting with the very simple time of day. For computer scientists it may help to think of this as an abstract data type having just one sort, namely time, and just two constructors, one a constant time 0 (for midnight), and the other a successor operation s, where for a time t, s(t) is the next minute. There are no subsorts, data sorts, levels, or priorities. But there is an important axiom,

     s ^1440 (M) = M 
where ^1440 indicates 1440 applications of s, or more prosaically,
     s ^1440 (0) = 0 .
This axiom captures the cyclic nature of time over days. Any reasonable representation for time of day must respect it. Let's denote this sign system S1.
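For those who think better in code, here is a minimal Python sketch of one concrete interpretation of S1. It assumes that the term consisting of n applications of s to 0 is modelled by the integer n; the names MINUTES_PER_DAY, ZERO, s and apply_n are mine, chosen only for illustration, and are not notation from the paper.

```python
# Sketch of the time-of-day sign system S1 (illustrative names only).
# The term s(s(...s(0)...)) with n s's is modelled by the integer n.
MINUTES_PER_DAY = 1440

ZERO = 0  # the constant 0, standing for midnight


def s(t: int) -> int:
    """Successor constructor: the next minute, wrapping at the end of the day."""
    return (t + 1) % MINUTES_PER_DAY


def apply_n(n: int, t: int) -> int:
    """Apply s to t exactly n times."""
    for _ in range(n):
        t = s(t)
    return t


# The axiom: 1440 applications of s to 0 give 0 again.
assert apply_n(MINUTES_PER_DAY, ZERO) == ZERO
```

The wrap-around in s is what makes the axiom hold; a successor that simply added 1 forever would model a different theory.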

Another example is a 24 x 80 display for a simple line-oriented text editor. The main sorts of interest here are: char (for character), line, and screen. The sort char has two important subsorts: alphanum (alphanumeric) and spec (special); and alphanum has subsorts alpha and num. This gives the following graph for the subsort relation,

              char
             /    \
            /      \
       alphanum   spec
        /   \
       /     \
     alpha   num
where of course alpha and num are also subsorts of char. These sorts also have levels in a natural way: screen is the most important and therefore has level 1, line has level 2, char has level 3, alphanum and spec have level 4, and alpha and num have level 5 (or we could give all subsorts of char level 4, or even 3 - this is a bit arbitrary until we have a clear application in mind). In addition, there are data sorts for data types that do not change but are needed to describe signs: these include at least the natural numbers, and possibly colors, fonts, etc., depending on the capability we want to give this system.

The constructors for this sign system let a line be any concatenation of up to 80 characters, and let a screen be any concatenation of up to 24 lines. (Priority is not an issue because there is just one constructor for each level.) We can define a length function on each sort having level above the characters, and then express the constraints "up to 24" and "up to 80" with axioms. Let us denote this sign system S2.
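The constructors and length axioms of S2 can be sketched in Python as follows. This is only one possible reading: lines are modelled as Python strings and screens as lists of strings, and the names make_line, make_screen and sort_of_char are hypothetical, not taken from the paper.

```python
# Sketch of the screen sign system S2 (illustrative names only).
MAX_COLS = 80   # a line is a concatenation of up to 80 characters
MAX_ROWS = 24   # a screen is a concatenation of up to 24 lines


def make_line(chars: str) -> str:
    """Constructor for sort line; enforces the axiom length(line) <= 80."""
    if len(chars) > MAX_COLS:
        raise ValueError(f"line has {len(chars)} characters, limit is {MAX_COLS}")
    return chars


def make_screen(lines: list[str]) -> list[str]:
    """Constructor for sort screen; enforces the axiom length(screen) <= 24."""
    if len(lines) > MAX_ROWS:
        raise ValueError(f"screen has {len(lines)} lines, limit is {MAX_ROWS}")
    return [make_line(line) for line in lines]


def sort_of_char(c: str) -> str:
    """Most specific subsort of char for one character: alpha, num or spec."""
    if c.isalpha():
        return "alpha"
    if c.isdigit():
        return "num"
    return "spec"
```

Here the axioms are enforced by the constructors themselves, so an ill-formed screen can never be built; stating them as separate checks would be an equally reasonable design.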

If we want to display texts on this screen, we also need to define a sign system for texts. The sorts here would be char, word, sentence, and text, in addition to data sorts and the subsorts of char as in S2 above. Text will be level 1, sentence level 2, word level 3, and character level 4. There are different choices for the constructors, but one simple one is to allow any concatenation of alphanum characters to be a word, any concatenation of words to be a sentence, and any concatenation of sentences to be a text. Let us denote this sign system S3. Clearly there will be many different ways to display texts on a screen; each will be a different semiotic morphism; we will come back to this later.

A somewhat different example of a sign system is given by simple parsed sentences, i.e. sentences with their "parts of speech" explicitly given. The most familiar way to describe these is probably with a context free grammar, where we would have rules like

     S  -> NP VP
     NP -> N
     NP -> Det N
     VP -> V
     VP -> V PP
     PP -> P NP
     .....
The "parts of speech" S, NP, VP, etc. are the sorts of this sign system, and the rules are its constructors. For example, the first rule says that an S can be constructed from an NP and a VP. There should also be some constants of the various sorts, such as
     N -> time
     N -> arrow
     V -> flies
     Det -> an
     Det -> the
     P -> like
     ......
Viewed as operations, we get the following:
     NP VP -> S
     N     -> NP
     Det N -> NP
     V     -> VP
     V PP  -> VP
     P NP  -> PP
     .....
     time  -> N
     .....  
which "construct" something from its parts. Notice that it would really be a better use of the machinery that we have available to regard N as a subsort of NP and V as a subsort of VP, than to have these monadic operations N -> NP and V -> VP. Let's call this sign system S4. It is quite abstract, giving what computer scientists call "abstract syntax" for sentences, without saying how they are to be represented.

Nevertheless, we can get sentences, such as "time flies like an arrow". However, this linear way of representing a sentence fails to show its "syntactic structure". This is traditionally done with trees, as in

               S
              / \
            NP   VP
           /    /  \
          N    V    PP
         /     |    / \
       time flies  P   NP
                  /   /  \
               like Det   N
                     |    |
                    an  arrow

So-called "bracket notation" can also be used to show the syntactic structure of sentences, as illustrated in Semiotic Morphisms.
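To make the difference between abstract syntax and its representations concrete, here is a small Python sketch. It assumes that terms of S4 are modelled as nested tuples of the form (sort, part, part, ...), with words as plain strings; the functions node, brackets and leaves are hypothetical names, and the bracket style shown is one common convention that may differ in detail from the one in the paper.

```python
# Sketch of S4 terms as nested tuples: (sort, child, child, ...).
# Leaves are plain strings (constants such as "time" or "flies").

def node(sort, *children):
    """Generic constructor: build a term of the given sort from its parts."""
    return (sort, *children)


# The parse of "time flies like an arrow" under the rules above.
tree = node("S",
            node("NP", node("N", "time")),
            node("VP",
                 node("V", "flies"),
                 node("PP",
                      node("P", "like"),
                      node("NP", node("Det", "an"), node("N", "arrow")))))


def brackets(term) -> str:
    """Render a term in labelled bracket notation."""
    if isinstance(term, str):
        return term
    sort, *children = term
    return "[" + sort + " " + " ".join(brackets(c) for c in children) + "]"


def leaves(term) -> list[str]:
    """Read off the words of a term from left to right (the linear form)."""
    if isinstance(term, str):
        return [term]
    words = []
    for child in term[1:]:
        words.extend(leaves(child))
    return words


print(brackets(tree))
# [S [NP [N time]] [VP [V flies] [PP [P like] [NP [Det an] [N arrow]]]]]
print(" ".join(leaves(tree)))
# time flies like an arrow
```

The same term yields both outputs: brackets keeps the syntactic structure, while leaves throws it away and keeps only the words.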

Before moving on to semiotic morphisms, let's talk just a bit more about Definition 1. You might have expected that the definition of sign system would call for giving some set of signs for each sort; instead, it gives a language for talking about such sets. A logician would say that our sign systems are theories rather than models. The distinction involved is that a theory provides a language, while a model of a theory provides interpretations for the things in the language: sorts are interpreted as sets; constant symbols are interpreted as elements; constructors are interpreted as functions, etc. This allows a lot of flexibility for what can be a model - but we do need to exclude models where two different terms (terms are compositions of constructors) denote the same thing; otherwise two different times of day might be the same! This requirement is called the no confusion condition. Confusion is definitely not allowed - but see the clock example below, where it does seem to happen!

Now let's consider semiotic morphisms, Definition 2 of the paper Semiotic Morphisms; recall that the purpose of semiotic morphisms is to provide a way to describe representations, which are ways of mapping signs in one system into signs in another system. These are supposed to include metaphors as well as representations in the more familiar user interface design sense. Just as we defined sign systems as theories rather than models, so their mappings translate from the language of one sign system to the language of another, instead of just translating the concrete signs in a model. If this sounds a bit indirect, well, it is; but it has some advantages over a model based approach to representations.

A good semiotic morphism should preserve as much of the structure in its source sign system as possible. Certainly it should map sorts to sorts, subsorts to subsorts, data sorts to data sorts, constants to constants, constructors to constructors, etc. But it turns out that in many real world examples, not everything can be preserved. So these should all be partial maps. Axioms should be preserved - but again in practice, sometimes not all axioms are preserved. The extent to which things are preserved provides a way of comparing the quality of semiotic morphisms (this is what Definition 3 is about).
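As a rough illustration of partiality, the following Python sketch records a map on sorts from the text system S3 to the screen system S2 as a dictionary, and then checks what it leaves undefined and which subsort relations it preserves. The particular map (text to screen, with word and sentence left without an image) is my own assumption for illustration, as are the names SORT_MAP, undefined_on and preserved; the paper's Definitions 2 and 3 are more general than this.

```python
# Subsort pairs (sub, super) shared by S2 and S3, as described above.
SUBSORTS = {("alphanum", "char"), ("spec", "char"),
            ("alpha", "alphanum"), ("num", "alphanum")}

S3_SORTS = {"text", "sentence", "word",
            "char", "alphanum", "spec", "alpha", "num"}

# A hypothetical partial map on sorts from S3 to S2.
# "word" and "sentence" have no counterpart sort in S2, so they are omitted.
SORT_MAP = {"text": "screen", "char": "char", "alphanum": "alphanum",
            "spec": "spec", "alpha": "alpha", "num": "num"}


def undefined_on(sorts, sort_map):
    """Source sorts on which the partial map is not defined."""
    return sorted(s for s in sorts if s not in sort_map)


def preserved(pairs_src, pairs_tgt, sort_map):
    """Source subsort pairs whose images are subsort pairs in the target."""
    return {(a, b) for (a, b) in pairs_src
            if a in sort_map and b in sort_map
            and (sort_map[a], sort_map[b]) in pairs_tgt}


print(undefined_on(S3_SORTS, SORT_MAP))   # ['sentence', 'word']
print(preserved(SUBSORTS, SUBSORTS, SORT_MAP) == SUBSORTS)   # True
```

Counting what is undefined or not preserved gives a crude way to compare two candidate maps, in the spirit of the quality comparison mentioned above.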

As a first example of a semiotic morphism, suppose we want to represent time-of-day (S1) in the little screen (S2). Clearly there are many ways to do this; all of them must map the sort time to the sort screen, map the constructor 0 to some string of (less than 25) strings of (less than 81) characters, and map the constructor s to a function sending each such string of strings to some other string of strings. There isn't anything else to preserve in this very simple example except the axiom, which however is very important.

Recall that the items of abstract syntax in S1 are strings of up to 1439 s's followed by a single 0. One simple representation just maps these strings directly to strings of strings of s's plus a final 0, such that the total number of s's is the same. Let N(t) be the number of s's in some t from S1. Let Q(t) and R(t) be the quotient and remainder after dividing N(t) by 80. Then there will be Q(t) lines of 80 s's followed by one line of R(t) s's and a final 0. This is guaranteed to fit on our screen because Q(1439) = 17, so at most 18 lines are needed, which is less than 25, and the last line has R(t) + 1 < 81 characters. For humans, this representation is so detailed that it is more or less analog: I think after getting familiar with it, a user would have a "feel" for the approximate number of (these strange 80 minute) hours and of minutes.
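This representation is easy to sketch in Python. The sketch assumes the term with n s's is passed in as the integer n, and that a screen is a list of strings; the function name unary_screen is hypothetical.

```python
# Sketch of the "unary" representation of S1 on the 24 x 80 screen S2.
COLS, ROWS = 80, 24


def unary_screen(n: int) -> list[str]:
    """Map the term with n s's (0 <= n <= 1439) to Q lines of 80 s's
    followed by one line of R s's and a final 0."""
    if not 0 <= n <= 1439:
        raise ValueError("n must be between 0 and 1439")
    q, r = divmod(n, COLS)
    return ["s" * COLS] * q + ["s" * r + "0"]


# Worst case: 23:59, i.e. 1439 s's.
worst = unary_screen(1439)
assert len(worst) <= ROWS
assert all(len(line) <= COLS for line in worst)
```

For n = 1439 this gives 17 full lines plus a last line of 79 s's and the final 0, so both limits of the screen are respected.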

One obvious representation just displays N(t) in decimal notation, giving a string of 1 to 4 decimal digits. This is very different from our usual representations; but we could imagine some strange culture that divides its days into 15 "hours" each having 100 "minutes", except the last, which has 40. (Actually, this is no stranger than what we do with our months!) Here N(0) is 0, and s just adds 1, except that s(1439) = 0.

A more familiar representation is constructed as follows: Let N1 and N2 be the quotient and remainder of N divided by 60, both in base 10, with 0's added in front if necessary so that each has exactly 2 digits. Now form the string of characters "N1 : N2". This is the so-called "military" representation of time; let's denote it by M. Then M(0) = 00:00, and of course you know how s goes. The spoken variant of military time has the form "N1 hundred N2 hours" (unless N2 = 00, in which case it is omitted). The use of "hundred" and "hours" may seem odd here, because it isn't hundreds and it isn't hours! - but at least it's clear - and that's the point. Notice that this representation has been defined as a composition of N with a re-representation of S1 to itself.
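A short Python sketch of M and its spoken variant follows. As before it assumes a time is given by its count n of s's; the names military, s_military and spoken are mine. The string is written without spaces around the colon, to agree with M(0) = 00:00.

```python
# Sketch of the "military" representation M of time of day.

def military(n: int) -> str:
    """M: map the term with n s's to the string "N1:N2"."""
    n1, n2 = divmod(n % 1440, 60)
    return f"{n1:02d}:{n2:02d}"


def s_military(m: str) -> str:
    """The image of the constructor s, acting directly on "N1:N2" strings."""
    hours, mins = (int(part) for part in m.split(":"))
    return military(hours * 60 + mins + 1)


def spoken(n: int) -> str:
    """Spoken variant: "N1 hundred N2 hours", omitting N2 when it is 00."""
    n1, n2 = divmod(n % 1440, 60)
    if n2 == 0:
        return f"{n1:02d} hundred hours"
    return f"{n1:02d} hundred {n2:02d} hours"


# The axiom is preserved: 1440 steps from 00:00 lead back to 00:00.
m = military(0)
for _ in range(1440):
    m = s_military(m)
assert m == "00:00"
```

Note that s_military never needs to know about the original terms; it works entirely within the target representation, which is what it means for the constructor s to have an image there.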

I think you can now see for yourself how to construct other representations of time as semiotic morphisms, including even the "analog" representation of a clock face. Here 0 has both hands up; satisfaction of the axiom follows because something even stronger is true, namely s(719) = 0, which is built into the circular nature of this geometrical representation. This is an example where the no confusion condition seems to fail - but does it really? Think about why representation modulo 720 works, but (say) modulo 120 or modulo 300 do not work; what about modulo 360? Why? Hint: Think about the context of use of the representation.
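The purely formal side of this question can be checked mechanically. The Python sketch below assumes a "clock" with a given modulus represents the time with n s's by n modulo that modulus; respects_axiom and confused_times are hypothetical names. It says nothing about context of use, which is the part you still have to think about.

```python
# Which clock moduli respect the day axiom, and how many times get confused?
DAY = 1440  # minutes in a day


def respects_axiom(modulus: int) -> bool:
    """True if 1440 applications of 'add 1 modulo modulus' to 0 give 0."""
    return DAY % modulus == 0


def confused_times(modulus: int) -> int:
    """Number of times of day that share their image with an earlier time."""
    images = {n % modulus for n in range(DAY)}
    return DAY - len(images)


for m in (1440, 720, 360, 300, 120):
    print(m, respects_axiom(m), confused_times(m))
```

Running this shows that 720, 360 and 120 all satisfy the axiom while 300 does not, and that every modulus below 1440 identifies some distinct times. So the axiom alone cannot explain why the ordinary clock face is acceptable and the others are not.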

Now let's imagine someone who has to design a text editor based on the screen S2; their job is to construct a semiotic morphism E from S3 to S2. One issue to be addressed is the correct placement of spaces, so that words are always separated, and sentences end with a period followed by two spaces. The designer will also have to decide what to do about words that want to go past the end of a line, e.g., to wrap around, hyphenate, or fill in with extra spaces. The limit of 24 lines will also pose problems.
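Here is one possible E, sketched in Python under several assumptions of my own: a text is a list of sentences, a sentence is a list of words, the wrap-around policy is chosen (a word that does not fit starts the next line), and a text that needs more than 24 lines is simply rejected. The name layout is hypothetical, and other designers would make other choices.

```python
# Sketch of one morphism E from texts (S3) to the 24 x 80 screen (S2).
COLS, ROWS = 80, 24


def layout(text: list[list[str]]) -> list[str]:
    """Lay out a text (a list of sentences, each a list of words) on the screen."""
    # Flatten the text into (token, separator that follows it) pairs.
    tokens = []
    for sentence in text:
        for i, word in enumerate(sentence):
            last = i == len(sentence) - 1
            # A sentence ends with a period followed by two spaces.
            tokens.append((word + "." if last else word, "  " if last else " "))

    lines, current, pending_sep = [], "", ""
    for token, sep in tokens:
        if len(token) > COLS:
            raise ValueError(f"word too long for one line: {token!r}")
        if current and len(current) + len(pending_sep) + len(token) > COLS:
            lines.append(current)   # wrap around: the word starts a new line
            current = token
        else:
            current = current + pending_sep + token if current else token
        pending_sep = sep
    if current:
        lines.append(current)
    if len(lines) > ROWS:
        raise ValueError("text does not fit on a 24-line screen")
    return lines


print(layout([["time", "flies"], ["an", "arrow", "does", "too"]]))
# ['time flies.  an arrow does too.']
```

Rejecting long texts outright is the crudest answer to the 24-line limit; scrolling or paging would be a different morphism, not a fix to this one.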

If instead of S3 we have S4 available for input to S2, then a designer can consider some more sophisticated issues, such as automatic insertion of commas. It may also be interesting to think about representations from S4 to S3. The most obvious and familiar one just gives the unparsed form of the parsed sentence; but again we can imagine adding some punctuation.

After this, we discussed blends. A blend involves two semiotic morphisms with a common source, called the generic space, whose targets are called the input spaces, together with two more semiotic morphisms from the two input spaces to a blend space. In the diagram below, G is the generic space, I1 and I2 are the input spaces, and B is the blend space.



                  B
                 / \
                /   \
              I1     I2
                \   /
                 \ /
                  G 

We discussed examples like house boat, road kill, artificial life, and computer virus. Then we discussed how so-called "oxymorons" (like military intelligence) are really a form of humor that is funny because there are two different blends, one of which is in some sense contradictory. Notice that "boat house" comes out having a different meaning than house boat!

It is an amazing fact about the human mind that we do a lot of our reasoning in everyday life using metaphors (which are semiotic morphisms) and blends of metaphors, rather than anything that would be recognized as formal logic. (This is a recent finding of cognitive science.)


10 March 1998