CSE 171 Class Notes 7

CSE 171: User Interface Design: Social and Technical Issues

7. Data Interfaces, Social Groups, and Semiotic Morphisms

This section of the class notes considers the display of data collections, such as in databases, in relation to the social groups that use them, and the application of semiotic morphisms to improve them.

7.1 Data and Community

The old view of databases envisions a single user with a well-formed query about a well-understood and well-structured collection of data. Just as HCI has evolved from a technical ergonomic level through psychology to a social collaborative level, so databases are evolving towards taking better account of the communities in which they are embedded, including their shared goals and their potential conflicts. The new view emphasizes helping users to help each other in various ways, and more generally considers the social side of data collection, dissemination, and use; this implies that database system design today is far from being a purely technical activity. A particular topic is sometimes called collaborative filtering, in which prior use of data helps to determine how it will be presented to current users. Also, increasing competition means that systems can much more easily fail from a lack of understanding of the user community's structure and needs. Two good commercial examples of systems that make clever use of several forms of collaborative filtering are Amazon.com and Google, the latter doing so much less visibly.
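To make the idea of collaborative filtering concrete, here is a minimal Python sketch. It assumes a hypothetical usage log that records which items each user has consulted; the names `usage_log` and `recommend`, and the scoring rule, are invented for illustration and say nothing about how Amazon.com or Google actually work. Items are ranked for a user by how often they appear in the histories of other users who share some prior use with that user.

```python
from collections import Counter

# Hypothetical usage log: which items each user has consulted before.
usage_log = {
    "ann": {"hci-text", "semiotics", "databases"},
    "bob": {"hci-text", "semiotics"},
    "cat": {"databases", "networks"},
    "dan": {"hci-text"},
}

def recommend(user, log, limit=3):
    """Rank items the user has not seen by how often they co-occur
    with the user's own items in other users' histories."""
    seen = log[user]
    scores = Counter()
    for other, items in log.items():
        if other == user:
            continue
        overlap = len(seen & items)      # amount of shared prior use
        if overlap == 0:
            continue                     # no common ground with this user
        for item in items - seen:
            scores[item] += overlap      # weight by similarity of use
    # Sort by score (descending), then by name, so results are deterministic.
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:limit]]

print(recommend("dan", usage_log))   # ['semiotics', 'databases']
```

The point of the sketch is only that the presentation for one user is determined by the prior behavior of the community, not by any property of the items themselves.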

A similar expansion of horizons is happening in many other areas of computer science, as people come more and more to realize that systems exist and must function within a social context, and that they can draw on that context to improve system operation in various ways. Ackerman's reconceptualization of a help system as a collective memory system (described in Section 7.3) illustrates the kind of rethinking that is going on in many areas, e.g., in software engineering, there are generic architectures, modularization, libraries of reusable code, plug-ins, software patterns, etc.; it is also consistent with the evolution of HCI.

A rather dramatic example of a large distributed database system that raised significant social issues is Napster, which shared MP3 files over the internet. Over its short lifetime, this system caused severe disruption of several campus computer networks, drew huge lawsuits from the music industry, and was absorbed by a large media industry conglomerate. However, this is far from the end of the story, since many informal peer-to-peer databases are still operating, and the music industry is being very paranoid and aggressive about them. Meanwhile, Apple's iPod and other non-free services are gaining ground, and a new business model appears to be emerging for the entire music industry, though it is not yet clear what that will turn out to be.

It should also not be forgotten that any successful system must evolve, because its users' needs (and many other things) will evolve; therefore it should be designed from the beginning to support evolution. And of course, iterative prototyping, user scenarios (more generally "use cases"), usability testing, interviews, etc. should be employed.

7.2 Data Interfaces

One instructive and important class of applications for semiotic morphisms is the construction of overviews, summaries, or visualizations for (possibly large) collections of data. Here the source sign system consists of data structured in some particular way; examples include books, source code for programs, digital libraries, websites, scientific data, and databases of all kinds.

Such interfaces often have a direct manipulation flavor. The kind of visualization done in scientific and engineering applications, such as aerodynamic flow over a wing, is a hot topic today; indeed, scientific visualization tools are part of a revolution in how science is being done, to such an extent that the very notion of scientific model is changing (e.g., see the recent book by Stephen Wolfram). Communication among and federation of databases is also becoming important, e.g., with the semantic web. It may sound a bit far out now, but virtual reality interfaces to large databases could well become important in the future.

On p.523 of his popular text, Shneiderman gives the following "mantra":

Overview first, zoom and filter, then details on demand.

This might seem obvious, but as Shneiderman emphasizes, it is easy to forget; in fact, he repeats it 12 times in his book, once for each time he forgot it when he should have used it in some project. Although there are many situations where such a design can be used, there are also many situations where it does not apply. Please note that an overview is the image of a semiotic morphism from a source space of data, and that zooming, filtering, and selecting details are each manipulations of the semiotic morphism, modifying it to better approximate what the user wants; i.e., the slogan calls for designing not just a semiotic morphism, but a tool for defining semiotic morphisms; what this tool does is sometimes called filtering. Note that collaborative filtering can be considered the use of social processes to improve semiotic morphisms.

For the designer of a tool to support this kind of interactive construction of a visualization, the source space should be a theory of the semiotic morphisms that the tool supports, and morphisms from that source space will produce the sliders, menus, etc. with which users construct the visualization that that particular tool allows; thus there are two kinds of display, one for controlling the morphism, and one for displaying the result of that morphism on a particular dataset.
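The two kinds of display described above can be sketched in Python as follows. This is only an illustration under assumptions of our own: the sample records, the `Controls` class, and the function `make_morphism` are hypothetical names, not part of any existing tool. The control state plays the role of the sliders and menus, and each setting of the controls determines a different morphism from the dataset to what is shown.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical records; "body" stands for the detailed content.
RECORDS = [
    {"id": 1, "topic": "music", "year": 1999, "body": "long text A"},
    {"id": 2, "topic": "music", "year": 2003, "body": "long text B"},
    {"id": 3, "topic": "books", "year": 2001, "body": "long text C"},
]

@dataclass
class Controls:
    """State of the control display (what sliders and menus would set)."""
    topic: Optional[str] = None       # filter: menu choice, None means all
    min_year: int = 0                 # zoom: lower end of a range slider
    max_year: int = 9999              # zoom: upper end of a range slider
    detail_id: Optional[int] = None   # details on demand: selected record

def make_morphism(c):
    """Return the display function determined by the control settings."""
    def display(records):
        chosen = [r for r in records
                  if (c.topic is None or r["topic"] == c.topic)
                  and c.min_year <= r["year"] <= c.max_year]
        # Overview: keep only a count per topic, dropping all content.
        overview = {}
        for r in chosen:
            overview[r["topic"]] = overview.get(r["topic"], 0) + 1
        # Details are shown only for the one record the user asked for.
        detail = next((r["body"] for r in chosen if r["id"] == c.detail_id), None)
        return {"overview": overview, "detail": detail}
    return display

# Default controls give the overview; changing them zooms, filters, selects.
print(make_morphism(Controls())(RECORDS))
print(make_morphism(Controls(topic="music", min_year=2000, detail_id=2))(RECORDS))
```

Note that `make_morphism` is a function from control settings to display functions, which is one way of reading the claim that the tool defines morphisms rather than being a single morphism.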

Some further related discussion is given in Information Visualization and Semiotic Morphisms, by Joseph Goguen and D. Fox Harrell, an informal introduction to semiotic morphisms applied to both analysis and design of information visualization, and see also The Ethics of Databases, a naturalistic study of the values embedded in web search engines.

7.3 Notes on Answer Garden 2: Merging Organizational Memory with Collaborative Help by Mark Ackerman

An important point about this paper is signified by the numeral "2" in its title: the system described here is the result of an iterative design process, in which an earlier system, named Answer Garden, was subjected to a careful evaluation, based on the experience of actual users, and then the results of this evaluation were carefully analyzed, pinpointing certain weaknesses in some underlying assumptions, such as a sharp distinction between experts and ordinary users, leading to new assumptions and a new design based upon them. The result is a typical example of what ethnographers call situatedness, and what designers call site specificity: one learns important and interesting things by evaluating a system in the context of a particular community, but those things may not generalize to other communities. For example, distinguishing experts from users might be valid in a different context, and the particular escalation hierarchy that Ackerman designed for this community might well not work for a different community. On the other hand, software tools like those developed by Ackerman are valuable for building systems for other communities, and several of the more abstract ideas are more generally valid, especially that of collaborative filtering.

The paper addresses the important problem of integrating user communities with their computer systems. Such tasks are especially important for huge databases of badly understood, poorly structured data. Specific techniques used in AG2 include its answer escalation hierarchy, anonymization, an engine to find experts, a statistics collection service, and support for collaborative authoring. All these have potential applications in many other areas, though they are far from exhausting the range of useful modules; together they constitute a toolkit for "socializing" databases. These features were added by Ackerman in moving from his first version of Answer Garden to the second (AG2), which is based on a collection of modules that can be assembled in a variety of ways using Tcl/Tk.

7.4 Algebraic Semiotic Design and Analysis

The following semiotic method for synthesis is recommended:

1. describe the source as a semiotic theory, including constructors and level and priority orderings;
2. determine the highest priority constructors and selectors for the top level sort;
3. design a target sign system to display this information in the required medium;
4. design a morphism to preserve the selected structure and omit the rest;
5. if more information is desired, then consider other levels with their highest priority constructors and selectors.

Quite likely you will want to iterate these steps, interleaved with some user testing (initially perhaps only with yourself or other design team members). Keep things as simple as you can, and remember that the semiotic structure of the source space may have a quite different appearance from its implementation, and that the point of an overview of a large dataset is to delete most of the information while preserving enough to survey the key points; principle F/C is very relevant here (see below for an explanation of this principle). It is very likely that some experimentation will be required to achieve good results; user trials, interviews, etc. are recommended. It may also be a good idea to create more than one initial design, and then to iterate through these steps for all designs simultaneously, using the quality orderings to compare them, and feed the most successful ideas back into the next iteration. Eventually of course you should eliminate all but one design.

Let's first consider books. If you look at the physical structure of a book, you will see that it has the most important information printed on its front cover and spine, and that inside it has pages; basic information on the cover and spine includes title, author, edition number, and publisher; this information also appears on the title page inside, along with the date of publication (or possibly this is on the back of the title page; publishers may want to make the date harder to find, in the hope that users will not realize that a book has become old). Looking at the contents, you will see that chapters are (usually) the main structuring device, and that their main selectors give a chapter number, a title, and a page number; chapters are (usually) divided into sections, and possibly subsections, each of which also has a number, title, and page. I think these are the things that anyone would eventually discover, even if they were not already familiar with books. Taking them as constructors and selectors, with target medium a small number of printed pages, yields exactly the very familiar form known as an outline. Notice that the entire content of the book has been lost (unless you count titles as content).
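The book example can be sketched in Python to show the morphism at work. The classes `Book` and `Section`, the function `outline`, and the sample book are all hypothetical and chosen only for illustration; a real semiotic theory of books would also include edition, publisher, and so on. The constructors (chapters containing sections) and the selectors (number, title, page) are preserved, while every `body` is omitted.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Section:
    number: str
    title: str
    page: int
    body: str = ""                        # content, dropped by the morphism
    subsections: List["Section"] = field(default_factory=list)

@dataclass
class Book:
    title: str
    author: str
    chapters: List[Section] = field(default_factory=list)

def outline(book, depth=2):
    """Map a Book to outline lines, keeping number, title, and page
    down to `depth` levels and omitting every body."""
    lines = [f"{book.title}, by {book.author}"]

    def walk(sections, level):
        if level > depth:
            return                        # lower levels are not displayed
        for s in sections:
            lines.append(f"{'  ' * level}{s.number} {s.title} ... {s.page}")
            walk(s.subsections, level + 1)

    walk(book.chapters, 1)
    return lines

# A made-up book, used only to exercise the sketch.
book = Book("A Sample Book", "A. N. Author", [
    Section("1", "Signs", 1, "many pages of text", [
        Section("1.1", "Icons", 3, "more text"),
    ]),
    Section("2", "Morphisms", 20, "still more text"),
])

print("\n".join(outline(book)))
```

The `depth` parameter corresponds to the last step of the method: if more information is desired, lower levels are added with their own highest priority selectors.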

Another example is source code. Surely the most important structuring device is the division into files (if there is more than one file). To see what attributes (i.e., selectors) are important, we can check what the Unix ls command displays under various options. For example, ls -lat will display the files in a directory, with their name, owner, size, and date of last modification. If we preserve these and display them in a natural way in a color graphics window, taking account of human perceptual capabilities, we can get something much like the code browser built at Bell Labs, which displays file structure in blocks, file age using color, and file size using block size. Notice that the entire content of the program, i.e., all of its code, has been lost (however, it can be viewed with the zoom feature that this system also provides).
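A rough Python sketch of such an overview follows. It does not reproduce the Bell Labs browser: text bars stand in for graphical blocks, a fill character stands in for color, and the age thresholds of 30 and 365 days are arbitrary assumptions. The function names `file_overview` and `render` are likewise invented for this illustration. Only name, size, and age are preserved; the code itself is never read.

```python
import os
import time

def file_overview(directory, now=None):
    """Collect name, size, and age in days for each regular file."""
    now = time.time() if now is None else now
    rows = []
    for entry in os.scandir(directory):
        if entry.is_file():
            info = entry.stat()
            rows.append({"name": entry.name,
                         "size": info.st_size,
                         "age_days": (now - info.st_mtime) / 86400})
    return rows

def render(rows, width=40):
    """Draw one text bar per file: bar length encodes size, and the
    fill character stands in for the color that would encode age."""
    if not rows:
        return []
    biggest = max(r["size"] for r in rows) or 1   # avoid dividing by zero
    lines = []
    for r in sorted(rows, key=lambda r: r["name"]):
        length = max(1, round(width * r["size"] / biggest))
        # Assumed thresholds: under a month, under a year, older.
        shade = "#" if r["age_days"] < 30 else "+" if r["age_days"] < 365 else "."
        lines.append(f"{r['name']:<12} {shade * length}")
    return lines

if __name__ == "__main__":
    for line in render(file_overview(".")):
        print(line)
```

Separating `file_overview` from `render` keeps the choice of preserved selectors apart from the choice of target sign system, so either can be changed without touching the other.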

It is easy to find many other examples, such as the detailed analysis of scrollbars given in Semiotic Morphisms, Representations, and Blending for User Interface Design and briefly summarized in section 6 of the class notes. The conclusion is that algebraic semiotics is a powerful tool for designing and evaluating user interfaces in general, and interfaces to collections of data in particular (this phrasing is intended to include not just databases, but also file systems, digital libraries, etc.).

The following four principles summarize some significant contributions that algebraic semiotics can make to the design process:

1. Sort preservation: The most important sorts should be preserved, where their importance is given by the level ordering.
2. Constructor preservation: The most important constructors should be preserved, where their importance is given by the priority orderings.
3. Axiom preservation: The most important axioms should be preserved.
4. F/C: When something must be sacrificed, it is preferable to preserve form at the expense of content.

Note that the first three principles correspond exactly to three main elements of semiotic spaces. These principles can be seen as refining step 3. of the semiotic method for design described above. Determining which axioms are most important can be difficult, because it is possible that some consequences of the given axioms are more important than the axioms themselves; moreover, it is all too easy to confuse unimportant properties of the way data in the source space happens to be presented with important intrinsic properties of what the data means. (The ordering in principle 1. can be constructed as the lexicographic product of the "preserves at least as much as" orderings on sorts, indexed by levels; the ordering for principle 2. is similar, but more complex; see the definition on page 8 of Semiotic Morphisms, Representations, and Blending for User Interface Design.)
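One possible reading of the lexicographic ordering for sort preservation is sketched below in Python. This is an assumption-laden illustration, not the definition from the cited paper: each morphism is summarized simply as a mapping from levels to the set of sorts it preserves at that level, "preserves at least as much as" is taken to be set inclusion, and the function name `compare_sort_preservation` is invented here. Levels are examined from most to least important, and the first level at which the two morphisms differ decides the comparison.

```python
def compare_sort_preservation(a, b, levels):
    """Compare two morphisms, each summarized as {level: set of preserved sorts}.

    `levels` lists the levels from most to least important.
    Returns "a", "b", "equal", or "incomparable".
    """
    for level in levels:
        kept_a = a.get(level, set())
        kept_b = b.get(level, set())
        if kept_a == kept_b:
            continue                  # tie at this level: look at the next one
        if kept_a > kept_b:           # proper superset: a preserves more here
            return "a"
        if kept_a < kept_b:           # proper subset: b preserves more here
            return "b"
        return "incomparable"         # neither set contains the other
    return "equal"

# Hypothetical summaries of two displays of a book.
outline_view = {1: {"book"}, 2: {"chapter"}}
fragment_view = {1: {"book"}, 3: {"section", "paragraph"}}

print(compare_sort_preservation(outline_view, fragment_view, [1, 2, 3]))  # a
```

In the example, the outline wins because it preserves more at level 2, even though the other display preserves more sorts at level 3; this is exactly the effect of taking the product lexicographically rather than componentwise.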

Although Principle F/C is probably the most important, and at this time is certainly the most thoroughly studied and supported, there are three other principles that deserve attention, although the range of their applicability has not yet been carefully examined: Principle HL/LL says it is more important to preserve higher levels than lower levels; Principle HL/C says it is more important to preserve high levels than content; Principle P/C says it is more important to preserve priorities than content.

The following outlines a recommended semiotic method for analysis of an interface; it resembles what is done in contemporary semiotic analyses in the humanities, but is both more limited and much more precise. Note that essentially anything can be regarded as an "interface" for the purposes of this method.

identify as much as is relevant of the social context of the interface and its system, including the goal of the interface, and the nature of the user community;
identify key properties of the source and target sign systems, e.g. the major affordances of the source objects;
identify the semiotic morphism involved;
think about what is preserved by the morphism;
consider whether elements of the display are symbolic, indexical, iconic, or diagrammatic iconic, and/or
involve significant sensory-motor (image) schemas, and/or
involve some blending.

It is fun to go around looking at the world using these tools as a lens; if you do this well, it is quite likely you will notice interesting things that no one else has ever thought about. This can be very exciting! Perhaps it could even change your life, if you take it very seriously ....
