CSE 171: User Interface Design: Social and Technical Issues
7. Database Interfaces

Despite its title, Chapter 15 of Shneiderman is mainly about user interfaces for databases, with an emphasis on direct manipulation graphical interfaces. The kind of visualization done in scientific and engineering applications, such as aerodynamic flow over a wing, is not discussed, although this is a very hot topic today; indeed, scientific visualization tools are part of a revolution in how science is being done, to such an extent that the very notion of a scientific model is changing. It may sound a bit far out now, but virtual reality interfaces to large databases could well become important in the future. Communication among databases, another topic not discussed by Shneiderman, is certainly also becoming extremely important.

Shneiderman's (mostly implicit) criticisms (pp.515ff) of the interface designs of popular web search engines are interesting and insightful, and the pictures in this chapter are nice, especially the color pictures. The concise summary of database terminology (p.512) is good, especially if you never had a database course. Shneiderman's classification of task actions (pp.512-3) is suggestive but a bit vague. I like his four-phase framework (p.516 and the box on page 517); also his idea of organizing the discussion of interfaces by their underlying data structures (listed in the box on page 524) is genuinely helpful, though I don't think the organization is the best. Animation (i.e., changing the display over time) is not mentioned, though it is certainly important today, e.g., in scientific visualization, where it is of course used for the temporal dimension, but is also used for other dimensions. It is important to note that what Shneiderman classifies as 1D (i.e., linear) structures can have a great deal of additional structure attached to them, as his examples (p.524) clearly show. In particular, each of textual documents, program source code, and alphabetical lists of names can be seen (in some ways more) profitably as having a tree structure (e.g., programs have parse trees). Also, so-called geographical databases have become important, and combine several structures in interesting and complex ways.

Shneiderman's point about history (p.518) is good, though not always applicable; for example, it is probably not much use for web search engines. The idea he calls "dynamic queries" is almost too obvious to deserve a name. (I was surprised to see what may be a joke (however lame), on p.523.) The "mantra" on page 523,

Overview first, zoom and filter, then details on demand.
might also seem obvious, but as Shneiderman says, it has wide applicability and is easy to forget; on the other hand, there are also many situations where it does not apply. Please note that an overview is the image of a semiotic morphism from a source space of all the data, and that zooming, filtering, and selecting details are each manipulations of the semiotic morphism, that refine it to better approximate what the user wants; i.e., the slogan calls for designing not just a semiotic morphism, but a tool for refining semiotic morphisms.
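
As a rough illustration of the mantra as a family of refinements, the following Python sketch treats the overview as a function from the full dataset to a small summary, and treats zooming, filtering, and details on demand as functions that narrow the data before it is displayed again. The record fields (title, year, topic) and all function names are assumptions made up for this example; they are not taken from Shneiderman.

```python
from collections import Counter
from typing import Callable, Dict, List

Record = Dict[str, object]  # hypothetical record: one dict per data item

def overview(records: List[Record], key: str) -> Counter:
    """Overview first: collapse the whole dataset to counts per value of `key`."""
    return Counter(r[key] for r in records)

def zoom(records: List[Record], key: str, low, high) -> List[Record]:
    """Zoom: keep only records whose `key` lies in the range [low, high]."""
    return [r for r in records if low <= r[key] <= high]

def filter_by(records: List[Record],
              predicate: Callable[[Record], bool]) -> List[Record]:
    """Filter: keep only records that satisfy an arbitrary predicate."""
    return [r for r in records if predicate(r)]

def details(records: List[Record], key: str, value) -> List[Record]:
    """Details on demand: return the full records selected by the user."""
    return [r for r in records if r[key] == value]

# Hypothetical data, invented for illustration only.
data = [
    {"title": "A", "year": 1995, "topic": "hci"},
    {"title": "B", "year": 1999, "topic": "db"},
    {"title": "C", "year": 2001, "topic": "hci"},
    {"title": "D", "year": 2002, "topic": "db"},
]

summary = overview(data, "topic")                      # counts per topic
recent = zoom(data, "year", 1999, 2002)                # narrow the range
recent_hci = filter_by(recent, lambda r: r["topic"] == "hci")
chosen = details(recent_hci, "title", "C")             # full record
```

Each step leaves the source data untouched and changes only what is mapped to the display, which is one way to read the remark that the slogan calls for a tool for refining morphisms rather than a single fixed one.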

Some of the data visualization schemes may seem obscure; did anyone understand Figures 15.11 or 15.16? The parallel coordinates idea is great; it can easily reveal interesting relationships, but only if there aren't too many lines or dimensions (Fig 15.9); it also needs a capability for rearranging the order of the dimensions. I also like hyperbolic trees (Fig 15.14) and treemaps (Fig 15.15). Shneiderman's explanation of why AND and OR are hard for users is right on the mark (p.542), and helps explain why boolean queries are not (obviously) supported by commercial web search engines (they are implicit); note that his argument is linguistic, not technical.

What Shneiderman calls collaborative filtering is part of what Ackerman considers more thoroughly in his Answer Garden paper. Except for the brief discussion of collaborative filtering, the social dimension is largely absent from this chapter, which is one reason for reading Ackerman's Answer Garden piece. Note that collaborative filtering applies a social process to refine a semiotic morphism.

Methodological Notes on Semiotic Morphisms and Visualization

One instructive and important class of applications for semiotic morphisms is the construction of overviews, summaries, or visualizations for large bodies of information. Here the source sign system is for information structured in some particular way; examples include books, source code for programs, websites, and scientific data. The following semiotic method for synthesis is recommended:

  1. determine the highest priority constructors and selectors for the top level sort;
  2. design a target sign system to display this information in the required medium;
  3. design a morphism to preserve the selected structure and omit the rest;
  4. if more information is desired, then consider other levels with their highest priority constructors and selectors.
Quite likely you will want to iterate these steps, interleaved with some user testing (initially perhaps only with yourself or other design team members). Keep things as simple as you can, and remember that the semiotic structure of the source space may be quite different from its implementation structure, and that the point of an overview of a large dataset is to delete most of the information while preserving enough to survey the key points; principle F/C is very relevant here (see below). It is very likely that some experimentation with the output will be required to achieve good results; user trials, interviews, etc. are recommended. It may also be a good idea to create more than one initial design, and then to iterate through these steps for all designs simultaneously, using the quality orderings to compare them, and feed the most successful ideas back into the next iteration. Eventually of course you should eliminate all but one design.

Let's first consider books. If you look at the physical structure of a book, you will see that it has the most important information printed on its front cover and spine, and that inside it has pages; basic information on the cover and spine includes the title, author, edition number, and publisher; this information also appears on the title page inside, along with the date of publication (or possibly this is on the back of the title page - publishers may want to make the date harder to find, with the hope that users may not realize that a book has become old). Looking at the contents, you will see that chapters are (usually) the main structuring device, and that their main selectors give a chapter number, a title, and a page number; chapters are (usually) divided into sections, and possibly subsections, each of which also has a number, title, and page. I think these are the things that anyone would discover, even if they were not already familiar with books. Taking them as constructors and selectors, with target medium a small number of printed pages, yields exactly the very familiar form known as an outline. Notice that the entire content of the book has been lost (unless you count titles as content).
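
The book example can be sketched in Python as follows. This is only a minimal illustration under stated assumptions: the class names Book and Section, the field names, and the max_depth parameter are all hypothetical, chosen to mirror the constructors (chapter, section, subsection) and selectors (number, title, page) named above.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Section:
    """A chapter, section, or subsection; names here are hypothetical."""
    number: str
    title: str
    page: int
    body: str = ""  # the content, which the outline morphism drops
    subsections: List["Section"] = field(default_factory=list)

@dataclass
class Book:
    title: str
    author: str
    chapters: List[Section] = field(default_factory=list)

def outline(book: Book, max_depth: int = 2) -> List[str]:
    """Map a book to its outline: keep number, title, and page for each
    unit down to max_depth levels, and omit every body."""
    lines = [f"{book.title} / {book.author}"]

    def walk(sec: Section, depth: int) -> None:
        if depth > max_depth:
            return
        lines.append(f"{'  ' * depth}{sec.number} {sec.title} ... {sec.page}")
        for sub in sec.subsections:
            walk(sub, depth + 1)

    for chapter in book.chapters:
        walk(chapter, 1)
    return lines

# Hypothetical sample book, invented for illustration only.
sample = Book(
    "Sample Book", "A. Author",
    [Section("1", "Intro", 1, "chapter text",
             [Section("1.1", "Scope", 3, "section text")])],
)
```

Calling outline(sample) yields one line per chapter and section, while outline(sample, max_depth=1) stops at chapters; in both cases the body strings never reach the output, which is the sense in which the content is lost.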

Another example is source code. Surely the most important structuring device is the division into files (if there is more than one file). To see what attributes (i.e., selectors) are important, we can check what the unix ls command displays under various options. For example, ls -lat will display the files in a directory, with their name, owner, size, and date of last modification. If we preserve these and display them in a natural way in a color graphics window, taking account of human perceptual capabilities, we can get something much like the display of Plate B1 (just after p.514) in Shneiderman, which displays the age of a file using color, and its size using length. Notice that the entire content of the program, i.e., all of its code, has been lost (however, it can be viewed with the zoom feature that this system also provides).
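
A text-only stand-in for such a display might look like the Python sketch below. It is not the system shown in Plate B1; the color buckets, the age thresholds, the bar width, and all function names are assumptions made for illustration, and the owner attribute is omitted for brevity.

```python
import os
import time
from typing import List, NamedTuple, Optional

class FileInfo(NamedTuple):
    """The selectors kept from each file: name, size, modification time."""
    name: str
    size: int
    mtime: float

def scan(directory: str) -> List[FileInfo]:
    """Collect name, size in bytes, and last-modified time of regular files."""
    infos = []
    with os.scandir(directory) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()
                infos.append(FileInfo(entry.name, st.st_size, st.st_mtime))
    return infos

def age_color(mtime: float, now: float) -> str:
    """Map file age to a color name; the thresholds are arbitrary assumptions."""
    days = (now - mtime) / 86400
    if days < 7:
        return "red"
    if days < 90:
        return "yellow"
    return "blue"

def size_bar(size: int, largest: int, width: int = 40) -> str:
    """Map file size to a bar whose length is proportional to the size."""
    if size <= 0 or largest <= 0:
        return ""
    return "#" * max(1, round(width * size / largest))

def overview(files: List[FileInfo], now: Optional[float] = None) -> List[str]:
    """One row per file, newest first; file contents never appear."""
    now = time.time() if now is None else now
    largest = max((f.size for f in files), default=0)
    ordered = sorted(files, key=lambda f: f.mtime, reverse=True)
    return [f"{f.name:<20} {age_color(f.mtime, now):<6} {size_bar(f.size, largest)}"
            for f in ordered]
```

For example, print("\n".join(overview(scan(".")))) would list the files of the current directory, with the color name standing in for the hue that a graphical display would use.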

It is easy to find many other examples. Please recall also the detailed analysis of scrollbars that was given in class (and is briefly summarized in section 6 of the class notes). The conclusion is that algebraic semiotics is a powerful tool for designing and evaluating user interfaces in general, and database interfaces in particular.

The following four principles summarize some significant contributions that algebraic semiotics can make to the design process:

  1. Sort preservation: The most important sorts should be preserved, where their importance is given by the level ordering.
  2. Constructor preservation: The most important constructors should be preserved, where their importance is given by the priority orderings.
  3. Axiom preservation: The most important axioms should be preserved.
  4. F/C: When something must be sacrificed, it is preferable to preserve form at the expense of content.
Note that the first three principles correspond exactly to three main elements of semiotic spaces. These principles can be seen as refining step 3. of the semiotic method for design described above. Determining which axioms are most important can be difficult, because it is possible that some consequences of the given axioms are more important than the axioms themselves; moreover, it is possible to confuse unimportant properties of the way data in the source space is presented with important intrinsic properties of what the data means. (The ordering in principle 1. can be constructed as the lexicographic product of the "preserves at least as much as" orderings on sorts, indexed by levels; the ordering for principle 2. is similar, but more complex.)
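
The lexicographic construction mentioned in the parenthetical remark can be sketched in Python. The representation is an assumption made for this example: a morphism is summarized by a list of sets, one per level with the top level first, where each set holds the names of the sorts it preserves at that level; "preserves at least as much as" is then read as set inclusion.

```python
from typing import FrozenSet, List, Optional

# Hypothetical summary of a morphism: preserved sorts per level, top level first.
Preserved = List[FrozenSet[str]]

def compare(m1: Preserved, m2: Preserved) -> Optional[int]:
    """Lexicographic product of the per-level inclusion orderings.

    Returns 1 if m1 preserves more, -1 if m2 preserves more, 0 if they
    preserve the same sorts at every level, and None if they are
    incomparable. The first level at which they differ decides."""
    if len(m1) != len(m2):
        raise ValueError("both morphisms must be described over the same levels")
    for s1, s2 in zip(m1, m2):
        if s1 == s2:
            continue          # tie at this level, look at the next one
        if s1 > s2:           # proper superset
            return 1
        if s1 < s2:           # proper subset
            return -1
        return None           # neither contains the other
    return 0

# Hypothetical examples using sorts from the book discussion.
m_a = [frozenset({"book"}), frozenset({"chapter", "section"})]
m_b = [frozenset({"book"}), frozenset({"chapter"})]
m_c = [frozenset(), frozenset({"chapter", "section", "paragraph"})]
```

Here compare(m_a, m_b) favors m_a because the two agree at the top level and m_a keeps more at the next one, while compare(m_a, m_c) also favors m_a even though m_c keeps more lower-level sorts, since the top level takes precedence.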

The following outlines a recommended semiotic method for analysis of an interface:

  1. identify as much as is relevant of the social context of the interface and its system, including the goal of the interface, and the nature of the user community;
  2. identify key properties of the source and target sign systems, e.g. the major affordances of the source objects;
  3. identify the semiotic morphism involved;
  4. think about what is preserved by the morphism;
  5. consider whether elements of the display are symbolic, indexical, iconic, or diagrammatic iconic, and/or
  6. involve significant sensory-motor (image) schemas, and/or
  7. involve some blending.
It is fun to go around looking at the world using these tools as a lens; if you do this well, you will notice many interesting things that no one else has thought about. This can be very exciting! Perhaps it could even change your life, if you take it very seriously ....

Notes on Answer Garden 2: Merging Organizational Memory with Collaborative Help by Mark Ackerman

Perhaps the deepest point about this paper is signified by the numeral "2" in its title: the system described here is the result of an iterative design process, in which an earlier system, named Answer Garden, was subjected to a careful evaluation, based on the experience of actual users, and then the results of this evaluation were carefully analyzed, pinpointing certain weaknesses in some underlying assumptions, such as a sharp distinction between experts and ordinary users, leading to new assumptions and a new design based upon them. The result is a typical example of what ethnographers call situatedness, and what designers call site specificity: one learns important and interesting things by evaluating a system in the context of a particular community, but those things may not generalize to other communities. For example, distinguishing experts from users might be valid in a different context, and the escalation hierarchy that Ackerman designed for this particular community might well not work for a different community. On the other hand, software tools like those developed by Ackerman are valuable for building systems for other communities.

This paper addresses the important problem of integrating user communities with their computer systems. The old view of databases envisions a single user with a well formed query about a well understood and well structured collection of data. Just as HCI has evolved from a technical ergonomic level through psychology to a social collaborative level, so databases are evolving towards taking better account of the communities in which they are embedded, including their shared goals and potential conflicts. The new view emphasizes the social side of sharing, restructuring and distilling information, and helping users help each other in various ways. Such tasks are especially important for huge databases of badly understood, poorly structured data. Thus, designing database systems is far from purely technical, and can easily fail from a lack of understanding of the user community's structure and needs. Moreover, successful systems will evolve, because their users' needs (and many other things) will evolve; therefore they should be designed from the beginning to support evolution. Iterative prototyping, user scenarios (more generally "use cases"), usability testing, interviews, etc. are needed. All this was done by Ackerman in moving from his first version of Answer Garden to the second (AG2), which is based on a collection of modules that can be assembled in a variety of ways using Tcl/Tk.

A similar expansion of horizons is happening in many other areas of computer science, as people come more and more to realize that systems exist and must function within a social context, and that they can draw on that context to improve their operation in various ways. Ackerman's reconceptualization of a help system as a collective memory system illustrates the kind of rethinking that is going on in many areas. This trend is highly consistent with recent trends in software engineering, including generic architectures, modularization, libraries of reusable code, plug-ins, software patterns, etc., with the evolution of HCI, and with the evolution of computer science as a whole. For a commercial example, Amazon.com makes clever use of several forms of collaborative filtering.

Some specific techniques used in AG2 include its answer escalation hierarchy, anonymization, an engine to find experts, a statistics collection service, and support for collaborative authoring. All these have potential applications in many other areas, though they are far from exhausting the range of useful modules; together they constitute a toolkit for "socializing" databases.

A dramatic example of a large distributed database system that raised significant social issues was Napster, which shared MP3 files over the internet. Over its short lifetime, this system caused severe disruption of several campus computer networks, drew huge lawsuits from the music industry, and was absorbed by a large media industry conglomerate. However, this is far from the end of the story, since many informal peer-to-peer databases are still operating, and the music industry is being very aggressive and paranoid.

Maintained by Joseph Goguen
© 2000 - 2003 Joseph Goguen, all rights reserved.
Last modified: Wed Jun 4 09:34:53 PDT 2003