Structure and Values in Music

This webpage is intended as an evolving rough draft of ideas for a project to be carried out in collaboration with Prof. David Borgo, with help from Prof. George Lewis. The project has several goals, one of which is to develop computational methods to identify formal structure in complex systems, using ideas from complexity and information theories. We are especially interested in applications to temporal systems, such as music, with some focus on improvised music. Another goal of the project is to take account of cognitive and even cultural factors beyond formal structure, and in particular, to use structural analysis and other methods to study the cultural values that are implicit in traditional frameworks for the analysis of music (see the Appendix to this note).

We believe that recent advances in mathematics, computer science, cognitive science, and sociology supply some previously missing ingredients, making it possible to combine theory and practice in new ways that should produce interesting new insights. For example, rapid advances in available computation power make it possible to do experiments that would have seemed ridiculous only a few years ago. We have in mind experiments in both the analysis and the synthesis of music, eventually leading up to real-time improvised music, which has been little studied. We also wish to develop more formal and computational approaches to culture, emotion, meaning and metaphor in music. We will draw upon modern ideas about complex systems, including measures of complexity, and perhaps chaos theory, as well as ideas from modern cognitive linguistics, such as embodied metaphor and blending.

We are working on an algorithm to compute minimum complexity descriptions of temporal sequences, based on dynamic programming, and we are exploring the mathematical properties of the resulting complexity measure, denoted H(S), where S is the system under consideration. These properties are very pleasing, in that all the major equations and inequations of classical Shannon information theory seem to be satisfied, even though this measure goes well beyond Shannon information, and even beyond Kolmogorov complexity, in that it measures hierarchical structural complexity. It assigns (under various metaphors) an "entropy" or "measure of perceptual difficulty" or "temperature" to each segment of a sequence (or more generally, to each subsystem of a complex system), by analyzing it as a composition of transformations of simpler subsystems. This can reveal not only small grain, but also large grain structures, and how they interact. Some mathematical details can be found in [1], though research has now advanced beyond what is described there, and some presuppositions of that paper no longer seem acceptable.

Once the algorithm is implemented and debugged, we will use it in some simple experiments, and then build on that experience, gradually moving to more and more complex experiments. We plan to start with simple melodies (e.g., from nursery rhymes), and work our way up, with (for example) Charlie Parker solos as an interesting intermediate step, towards a desired end point of contemporary group improvised music. We hypothesize that descriptions of a piece of music built from psychologically and culturally appropriate components constitute "understandings" of that piece, and that a minimum complexity description gives the "best" understanding, which necessarily includes a precise structural analysis of the piece. For example, a repeat of a (not too unwieldy) segment should have a very low additional complexity, and a transposition of it by a fifth or a fourth should have very little more, whereas more perceptually difficult transformations, such as retrograde inversion, would involve significantly more cognitive effort. These differences are reflected by assigning different "weights" to the transformations to which they correspond.
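To make the idea of weighted transformations concrete, here is a minimal sketch in Python. It is not the algorithm under development, only a toy dynamic program in the same spirit. It assumes that a melody is a list of MIDI pitch numbers, that each literal note costs 1, and that a segment may instead be described as a transformation of earlier material at a fixed weight. The names `WEIGHTS`, `transforms` and `min_description`, and all the numerical weights, are hypothetical choices made for illustration.

```python
from typing import Iterator, List, Sequence, Tuple

# Hypothetical costs: illustrative numbers only, not values from the project.
LITERAL_COST = 1.0
WEIGHTS = {
    "repeat": 0.2,
    "transpose": 0.4,             # up or down a fourth (5) or fifth (7 semitones)
    "retrograde_inversion": 1.5,  # perceptually harder, so weighted more heavily
}


def transforms(src: Sequence[int]) -> Iterator[Tuple[str, List[int]]]:
    """Yield (name, image) for each allowed transformation of `src`."""
    yield "repeat", list(src)
    for k in (5, 7, -5, -7):
        yield "transpose", [p + k for p in src]
    axis = src[0]  # assumption: invert about the first pitch, then reverse
    yield "retrograde_inversion", [2 * axis - p for p in reversed(src)]


def min_description(seq: Sequence[int], min_len: int = 2):
    """Return (cost, ops) for a cheapest description of `seq`.

    cost[i] is the cheapest cost found for the prefix seq[:i]. A segment
    seq[j:i] may be written note by note, or as a transformation of an
    equally long segment lying entirely inside the prefix seq[:j].
    """
    n = len(seq)
    cost = [0.0] + [float("inf")] * n
    back = [None] * (n + 1)
    for i in range(1, n + 1):
        cost[i] = cost[i - 1] + LITERAL_COST
        back[i] = (i - 1, "literal")
        for j in range(0, i - min_len + 1):
            seg = list(seq[j:i])
            length = i - j
            for s in range(0, j - length + 1):
                for name, image in transforms(seq[s:s + length]):
                    candidate = cost[j] + WEIGHTS[name]
                    if image == seg and candidate < cost[i]:
                        cost[i] = candidate
                        back[i] = (j, name)
    # Walk the back-pointers to recover the list of operations.
    ops = []
    i = n
    while i > 0:
        j, name = back[i]
        ops.append((name, j, i))
        i = j
    ops.reverse()
    return cost[n], ops
```

In this toy setting, a three-note motif followed by its exact repeat costs three literal notes plus one "repeat", while the same motif followed by its transposition up a fifth costs slightly more, which mirrors the ordering of difficulty described above. The cubic search over segments is adequate only for short melodies.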

It will be important to study the conditional complexity function H(S'|S), which intuitively measures the additional effort needed to understand S' given that S is already understood, because this will enable us to detect the boundaries of important structural units, by detecting local minima of this function. For example, in an AABA form, there will be a sharp drop in complexity at the boundary between the first and the second A unit, and a sharp rise at the boundary between the second A and the B unit. It is significant that the behavior of the information function H(S'|S) varies as S' and S vary in size and location over a given piece. It is hoped that by adjusting these and other parameters, a reasonable correspondence can be constructed to the experience of real listeners; one such thought is to include some effects of long term memory in S. Each such choice will give rise to a complexity profile for a piece, which measures the cognitive effort required at each moment, to fully understand the structure that has been uncovered by the analysis. It is important to notice that the results of analysis also crucially depend on how source material is encoded for computer analysis, for example, whether micro-tonal and micro-timbral inflections are included (again, see the Appendix for more discussion).
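The following Python sketch illustrates how such a complexity profile might be computed. It swaps in a much cruder stand-in for H(S'|S) than the hierarchical measure discussed here: a window S' is described either note by note (cost 1 per note) or by copying runs that already occur in the context S (a fixed cost per copy). The function names, the copy cost, the window size and the `memory` parameter are all assumptions introduced for illustration.

```python
from typing import List, Optional, Sequence


def occurs(segment: Sequence[int], context: Sequence[int]) -> bool:
    """True if `segment` appears as a contiguous run inside `context`."""
    k = len(segment)
    return any(list(context[s:s + k]) == list(segment)
               for s in range(len(context) - k + 1))


def conditional_cost(target: Sequence[int], context: Sequence[int],
                     copy_cost: float = 0.2, min_len: int = 2) -> float:
    """Toy stand-in for H(S'|S): cheapest way to spell `target` using
    literal notes (cost 1 each) and copies of runs found in `context`."""
    n = len(target)
    best = [0.0] + [float("inf")] * n
    for i in range(1, n + 1):
        best[i] = best[i - 1] + 1.0
        for j in range(0, i - min_len + 1):
            if occurs(target[j:i], context) and best[j] + copy_cost < best[i]:
                best[i] = best[j] + copy_cost
    return best[n]


def complexity_profile(seq: Sequence[int], window: int = 4,
                       memory: Optional[int] = None) -> List[float]:
    """Cost of each window seq[t:t+window] given what came before it.

    memory=None means everything heard so far counts as context (a crude
    long term memory); an integer keeps only that many preceding notes.
    """
    profile = []
    for t in range(len(seq) - window + 1):
        start = 0 if memory is None else max(0, t - memory)
        profile.append(conditional_cost(seq[t:t + window], seq[start:t]))
    return profile


def local_minima(profile: Sequence[float]) -> List[int]:
    """Positions whose value is strictly below both existing neighbours."""
    minima = []
    for t, value in enumerate(profile):
        left = profile[t - 1] if t > 0 else float("inf")
        right = profile[t + 1] if t + 1 < len(profile) else float("inf")
        if value < left and value < right:
            minima.append(t)
    return minima


if __name__ == "__main__":
    A = [60, 62, 64, 65]
    B = [67, 69, 71, 72]
    piece = A + A + B + A  # a toy AABA form
    print(complexity_profile(piece))
    print(local_minima(complexity_profile(piece)))
```

For the toy AABA piece, the profile falls to its lowest value where the second A begins and returns to its maximum where B begins. Restricting `memory` to four notes makes the final A look as costly as new material, since the earlier A units are no longer in the context, which is one way to see how strongly the profile depends on the choice of S.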

Among many other concepts that can be defined within this theory, a measure of how much one structure S resembles another S', given by the ratio H(S|S')/H(S'), seems particularly interesting for thinking about improvisation, where S, S' are segments over the same temporal interval, produced by two different improvisers, since it measures the extent to which one is following, or is "influenced by," the other. Notice that this measure is not symmetric, and hence is closer to our intuition for this application than the usual statistical concept of correlation, which is symmetric.
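The asymmetry of this ratio can be seen even with a very crude stand-in for H. In the Python sketch below, which is only an illustration and not the measure proposed here, H(S) is taken to be the number of distinct note-to-note steps in S, and H(S|S') the number of those steps that do not already occur in S'. The function names and this bigram-counting proxy are assumptions.

```python
from typing import Sequence, Set, Tuple


def steps(seq: Sequence[int]) -> Set[Tuple[int, int]]:
    """Distinct consecutive pitch pairs in `seq`."""
    return set(zip(seq, seq[1:]))


def toy_h(s: Sequence[int]) -> int:
    """Crude proxy for H(S): how many distinct steps S contains."""
    return len(steps(s))


def toy_h_given(s: Sequence[int], s_prime: Sequence[int]) -> int:
    """Crude proxy for H(S|S'): steps of S not already present in S'."""
    return len(steps(s) - steps(s_prime))


def influence_ratio(s: Sequence[int], s_prime: Sequence[int]) -> float:
    """The ratio H(S|S')/H(S') from the text, under the toy proxies."""
    denominator = toy_h(s_prime)
    if denominator == 0:
        raise ValueError("s_prime must contain at least one step")
    return toy_h_given(s, s_prime) / denominator


if __name__ == "__main__":
    player_a = [60, 62, 64, 65, 67, 69]
    player_b = [60, 62, 64, 60, 62, 64]  # reuses the opening of player_a
    print(influence_ratio(player_b, player_a))
    print(influence_ratio(player_a, player_b))
```

Under these proxies the ratio for player_b given player_a is small, because nearly everything player_b does is already present in player_a, while the ratio in the other direction is much larger. A small value thus indicates that S adds little beyond S' in this toy reading; how the direction of the measure should be interpreted for the real H is left open here.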

The cognitive science component of this project includes recently developed theories of metaphor and conceptual blending, as well as more conventional approaches to musical perception and psychophysics. This in turn suggests many possibilities in experimental psychology, and even neuroscience. For example, it would be interesting to do experiments to determine the relative complexities of various transpositions, and it should even be possible to relate these to brain states by using fMRI techniques, since it is reasonable to conjecture that resonance phenomena among related neural groups play an important role in the perception of relative pitch. Similar experiments could be done with chord changes, and other musical phenomena, such as rhythm.

This connects with work of Fauconnier and Turner on metaphor and conceptual blending, by viewing the transformation of musical material as a kind of metaphor, and viewing the composition of musical elements as a kind of blending. Moreover, it seems possible to connect music with extra-musical elements through appropriate mappings of conceptual spaces (which are a generalized form of metaphor), and it seems that these could include cultural and even emotional elements. One important application of blending would certainly be to study combinations of music with text. It seems that a recent formalization of metaphor and blending based on ideas from category theory could be used to develop appropriate algorithms (see [2]). Indeed, the added generality of [2] over the pioneering work of Fauconnier and Turner seems to be needed, because the "conceptual spaces" that are involved require more complex relationships than are supported in their approach. We note that "colimits" in the sense of category theory are involved in both the hierarchical complexity of systems, and in the blending of conceptual spaces, or more generally, of semiotic spaces [2].

An interesting extension of this would be to "reverse engineer" various analytic schemes that are traditionally used in music, in order to determine the value systems upon which they are based. For example, traditional classical notation does not provide for changes in timbre that are important in contemporary jazz improvisation. This is related to our previous research on uncovering the value systems that are implicit in artificial systems, such as mathematical proofs [3] and database interfaces [4]. Some foundations for this in sociology, particularly ethnomethodology, are given in [5].

While theoretical developments in sociology, cognitive science, computer science, and mathematics can supply a general theoretical framework and practical algorithms, it is still necessary to ground these in musical practice to instantiate the general concepts in a meaningful way. Careful experimentation is needed to identify the correct basic components, transformations, and weights for building music in various styles, and an informed aesthetics will have to play a dominant role in evaluating the results of experiments. The same will apply to attempts to connect with the cultural context of music. It should also be expected that applications of the theory to musical practice will lead to many further modifications and developments, including a clearer understanding of the limitations of the theory, which will be important for avoiding exaggerated claims. For example, further work will be required to determine whether the theory can be directly applied to polyphonic music, or would have to be extended in some way.

While concerns such as the psychologically meaningful components of music, and the cultural values implicit in various styles, are not novel, the approach outlined above promises to bring a new level of precision and integration to such analyses. On the other hand, there seems to have been little or no formal work on combinations of text and music, or on metaphor in music; it could be very exciting to explore these areas, since it seems likely that interesting results could be achieved without huge effort.

A final remark is that we seek to avoid the common mistake of identifying a formal analysis with the thing itself. No transcription can ever capture all the nuances of an actual performance; not even spectral analysis captures everything. Similarly, our structural analysis method is just another tool for exploring the nature of music. The philosophical error that underlies such mistakes as identifying a symphony with its score, ignoring all the cultural and historical factors that go into interpreting it (including the mediation by actual musicians and musical instruments), is akin to that in mathematics, when a transcendental ontological status is claimed for all mathematical objects, however abstract. We have found that our desire to develop improved tools for structural analysis is often confused with a desire to reduce music to some form of mathematics, whereas on the contrary, we are interested in exploring the limitations of all such tools, including standard musical notation, by relating them more closely to their cultural and historical contexts. In particular, our use of an information measure is not an attempt to reduce the subtleties of music to a single number, but rather to provide a new tool for uncovering structure in music.


[1] Joseph Goguen. Complexity of Hierarchically Organized Systems and the Structure of Musical Experiences, International Journal of General Systems Theory, Volume 3, Number 4, 1977, pages 237-251. Originally in UCLA Computer Science Department Quarterly, October 1975, pages 51-88. This paper develops a mathematical information theory based on minimum complexity hierarchical system descriptions, with applications to musical systems, including electronic music compositions. (Unfortunately, this paper is not available on line.)

[2] Joseph Goguen. An Introduction to Algebraic Semiotics, with Application to User Interface Design, in Computation for Metaphors, Analogy and Agents, edited by Chrystopher Nehaniv, in Springer Lecture Notes in Artificial Intelligence, Volume 1562, 1999, pages 242-291. This is the basic paper on algebraic semiotics, including the mathematical theory based on 3/2-categories and 3/2-colimits, with many examples, especially from user interface design. There is also a pdf version.

[3] Joseph Goguen. Reality and Human Values in Mathematics, submitted to Social Studies of Science. Applies discourse analysis (in the sense of sociolinguistics), cognitive linguistics and ethnomethodology to mathematical discourse, showing how the reality of mathematical objects is achieved, and the role of values in this process; a pdf version is also available, as are slides for an early lecture version, entitled The Reality of Mathematical Objects, at the UCSD Science Studies Colloquium, 20 November 2000. Warning: You may have to change the orientation of the pages from landscape to seascape.

[4] Joseph Goguen. The Ethics of Databases, a sketch of a paper for an invited presentation at the 1999 Annual Meeting, 29 October 1999, of the Society for Social Studies of Science; a separate abstract is also available. Lecture also given 6 December 1999 at the Annenberg Center of the University of Southern California, in the Confronting Convergence Seminar series, and to appear in a book of the same title. This is a naturalistic study of the values embedded in the user interfaces to web search engines.

[5] Joseph Goguen. Towards a Social, Ethical Theory of Information, in Social Science Research, Technical Systems and Cooperative Work, edited by Geoffrey Bowker, Les Gasser, Leigh Star and William Turner (Erlbaum, 1997) pages 27-56. A pdf version is also available. Presents a theory of information based on social interaction, especially ethnomethodology, and shows how values arise naturally in such a theory.

An Appendix on Musical Notation

These are some notes towards a fuller discussion.

  1. Western classical music notation has an interesting evolution, with a clear trend towards exercising progressively more control over the performer. Early music notation did not even specify duration, let alone amplitude. Some recent notation even attempts to specify intonation to various extents, although this can be very difficult, as musicians like Miles Davis make painfully (and beautifully) clear.
  2. Notation is never value neutral: it involves making choices of what to notate, that is, of what is important; even the choice to notate involves the value of exercising certain forms of control over performers. The situation is similar for transcription.
  3. There is no way for analysis to achieve complete value neutrality, but there are various ways to get closer, which can then expose values in less neutral forms of analysis. The approach is to expose more of the details that are ignored by other analyses, but that are important for "alternative" forms of music (note the highly value laden term "alternative"!).
  4. One idea is to use closely timed sequences of spectral analyses instead of transcription as a basis for discussion. Of course any discussion of such data is also a form of transcription, which necessarily involves value laden choices. Even spectral analysis involves value laden choices, for example, to leave out interactions among performers on stage.
  5. These ideas apply not only to sound quality, but also to structure. Structure encodes values; what counts as structure is a value. For example, the choice between symmetry and asymmetry is value-laden; even what counts as symmetry is culturally determined.
  6. A working hypothesis is that good jazz improvisation involves forms of large grain structure that are quite different from those of classical music; hence we hope that our minimum complexity algorithm can make these structures more explicit. In any case, it seems clear that there are interesting small grain structures that we can make both explicit and precise. Of course, the very idea of looking for such structures involves values, which to some extent are alien to improvised music.
  7. Another research topic would be to identify patterns of interaction among improvising musicians. This might make dual use of video tapes and correlational structural analyses.

Maintained by Joseph Goguen
Last modified: Wed Dec 3 09:45:53 PST 2003