The Sight-Reading Tutor

UCSD, 2007 - 2008
Diane Hu, Jie Cheng, Lawrence Saul


Chieh-Chieh Cheng, Diane J. Hu, and Lawrence K. Saul, "Nonnegative Matrix Factorization for Real Time Musical Analysis and Sight-Reading Evaluation," Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP-08), Las Vegas, NV, 2008.


Sight-reading is the ability to read and perform music from a written score with little or no preparation. Though an integral part of musicianship, it is rarely or minimally addressed in traditional music lessons. In this paper, we describe a real-time system for sight-reading evaluation of solo instrumental music. The system is trained to recognize monophonic and polyphonic music from acoustic instruments without digital pickups. The pattern-matching in the back-end is achieved by nonnegative matrix factorization, an algorithm that represents notes as combinations of learned templates and chords as combinations of single notes. As part of the user interface, an animated musical score provides beginning musicians with instant visual feedback as they practice to improve their sight-reading.


This project was initially inspired by the somewhat tedious chore of practicing sight-reading with a young child. When practicing to sight-read, a beginning musician generally requires a teacher to spend long periods of time with him or her, giving feedback and correction. The idea behind this program is to provide a new tool to facilitate sight-reading practice when human teachers are not present.

The front-end graphical interface is designed specifically to facilitate sight-reading practice in a way that agrees with commonly accepted pedagogical methods. The machine displays an animated score, "listens" to the player's instrument, and provides instant visual feedback distinguishing correctly versus incorrectly played notes. While many existing systems have related goals, our application distinguishes itself by (1) requiring no digital pickups or electronic instruments, (2) targeting the needs of beginning musicians, and (3) providing immediate, visual feedback.

The back-end of this system must operate in real time to determine which notes the user has played correctly. We use nonnegative matrix factorization (NMF) to learn nonnegative basis templates for each note and to evaluate whether the sound from the user's instrument matches the notes on a given musical score.
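The matching step can be illustrated with a minimal sketch. It assumes the note templates have already been learned and are held fixed, and it estimates nonnegative activations for one spectral frame with the standard multiplicative update for squared error. The update rule, the threshold, the toy spectra, and all function names are assumptions made for illustration; the text above does not specify them.

```python
def estimate_activations(templates, frame, iterations=200, eps=1e-9):
    """Find nonnegative h with frame ~= sum_k h[k] * templates[k].

    templates: list of K nonnegative spectra (one per note), each of length F.
    frame: nonnegative magnitude spectrum of length F.
    Uses multiplicative updates with the templates held fixed (an assumption).
    """
    k = len(templates)
    h = [1.0] * k
    # W^T v and W^T W do not change between iterations, so compute them once.
    wtv = [sum(a * b for a, b in zip(t, frame)) for t in templates]
    gram = [[sum(a * b for a, b in zip(ti, tj)) for tj in templates]
            for ti in templates]
    for _ in range(iterations):
        wtwh = [sum(gram[i][j] * h[j] for j in range(k)) for i in range(k)]
        # The ratio of nonnegative terms keeps every activation nonnegative.
        h = [h[i] * wtv[i] / (wtwh[i] + eps) for i in range(k)]
    return h


def detected_notes(note_names, templates, frame, threshold=0.5):
    """Names of notes whose activation reaches a hypothetical threshold."""
    h = estimate_activations(templates, frame)
    return {name for name, a in zip(note_names, h) if a >= threshold}


def matches_score(expected_notes, detected):
    """True when the detected notes are exactly those the score calls for."""
    return set(expected_notes) == detected


# Toy 6-bin "spectra"; real templates would be learned from recordings.
NOTE_NAMES = ["C4", "E4", "G4"]
TEMPLATES = [
    [1.0, 0.0, 0.5, 0.0, 0.0, 0.0],  # C4
    [0.0, 1.0, 0.0, 0.5, 0.0, 0.0],  # E4
    [0.0, 0.0, 1.0, 0.0, 0.0, 0.5],  # G4
]

# A frame built as C4 + E4, i.e. a two-note chord.
chord_frame = [c + e for c, e in zip(TEMPLATES[0], TEMPLATES[1])]
played = detected_notes(NOTE_NAMES, TEMPLATES, chord_frame)
print(matches_score({"C4", "E4"}, played))
```

Because a chord is modeled as a nonnegative combination of single-note templates, the same routine handles monophonic and polyphonic frames without a separate chord dictionary.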

I. Importance of sight-reading

Sight-reading is the ability to perform music from a score with no preparation or previous acquaintance with that score. This skill is most often needed when collaborating with other musicians or accompanying choral ensembles, where the sight-reader is asked to perform a piece with very little notice. Here, the goal of the sight-reader is to play the piece from beginning to end with reasonable accuracy, while keeping tempo with the rest of the musicians. Sight-reading is also paramount to both student and teacher musicians, as it enables one to play through a large amount of musical literature quickly. This allows one to hear what a piece sounds like (especially when no recording is readily available) and to determine whether it would be appropriate for long-term study. Student musicians are often required to sight-read in music competitions or festivals as a gauge of their musical abilities.

In 1983, Lowder [6] surveyed college faculty and teachers in music departments to determine which component of piano playing they deemed most important. Sight-reading ranked second only to "cadences," with score-reading, harmonization, and accompanying following behind. Recent studies reveal similar results: Kostka [2] reports sight-reading as the second most desirable skill, surpassed only by "musicality," while Hardy's [1] poll of 221 nationally certified piano teachers from the MTNA (Music Teachers National Association) showed that 86 percent of those who responded rated sight-reading as either "highly important" or "most important" of all pianistic skills.

Research has found that sight-reading skill is mostly the result of spending many hours sight-reading [3]. Practicing to sight-read, however, is very different from practicing for performance. Lehmann and McArthur [5] illustrate these differences concisely. When practicing for performance: (a) mistakes should be corrected, (b) attention to details of musicality, notation, and correct fingering is essential, and (c) incorrect and omitted notes are cardinal sins. When practicing sight-reading, however: (a) mistakes should never be corrected, since maintaining rhythm and meter is essential, (b) the "big picture" is more important than details (getting to the notes in time matters more than correct fingering), and (c) incorrect and omitted notes are inevitable, so the player should just move on.

II. Guidelines for sight-reading practice

Some guidelines for effective sight-reading practice are as follows [5]: (1) use and maintain a tempo slow enough that a reasonable percentage of notes are played correctly, (2) never go backward to correct mistakes, and (3) pay attention to rhythm; use a metronome and count out loud. We show how each of these guidelines is incorporated into our design as we describe the interface below.

III. Program usage

To begin, the user chooses a sight-reading exercise of the appropriate level. Once an exercise is selected, the musical score is displayed, and the user must play from the score at the specified tempo. Points are awarded for each note played correctly. However, if too many incorrect notes are played, the exercise restarts at a slower tempo. This restart mechanism enforces a consistent tempo appropriate to the user's sight-reading level. Once the notes to be played reach the middle of the screen, the score scrolls to the left, and notes that have just been played disappear off screen. This discourages the instinct to backtrack and correct a previous mistake. Further, to improve rhythmic literacy, "progress bars" are drawn above each note. The length of each progress bar is proportional to the note's duration, and the bar "fills up" (with color) at the rate at which that duration elapses.
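The restart rule and the progress bars described above could be sketched as follows. The error limit, the slowdown factor, the minimum tempo, and the function names are all hypothetical; the description states only that the exercise restarts at a slower tempo after too many wrong notes and that each bar fills at the rate the note's duration passes.

```python
def progress_fraction(elapsed_beats, note_start, note_beats):
    """Fraction (0.0 to 1.0) of a note's progress bar that should be filled.

    elapsed_beats: beats since the exercise started.
    note_start: beat on which the note begins.
    note_beats: duration of the note in beats (must be positive).
    """
    if note_beats <= 0:
        raise ValueError("note_beats must be positive")
    raw = (elapsed_beats - note_start) / note_beats
    # Clamp: empty before the note starts, full once it has ended.
    return min(1.0, max(0.0, raw))


def next_tempo(tempo_bpm, wrong_notes, max_wrong=3, slowdown=0.9, min_bpm=40):
    """Return (tempo, restart) after checking the running error count.

    max_wrong, slowdown, and min_bpm are illustrative values, not taken
    from the system described in the text.
    """
    if wrong_notes > max_wrong:
        return max(min_bpm, tempo_bpm * slowdown), True
    return tempo_bpm, False


# A half note (2 beats) starting on beat 1, queried at beat 1.5.
print(progress_fraction(1.5, 1.0, 2.0))
# Four wrong notes at 80 bpm exceed the hypothetical limit of three.
print(next_tempo(80, 4))
```

Measuring time in beats rather than seconds keeps the bar logic independent of tempo, so a restart at a slower tempo changes only how fast beats elapse.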

At the conclusion of the piece, the player receives a performance score based on the difficulty of the selected piece, the tempo at which the piece was played, and the number of notes played correctly. As a quantitative metric for self-evaluation, the performance score provides a well-defined target for further improvement, as well as an incentive for continued playing.


[1] Hardy, D. Teaching sight-reading at the piano: Methodology and significance. Master's thesis, Southwestern Oklahoma State University, Weatherford, OK, 1992.

[2] Kostka, M. Effects of self-assessment and successive approximations on "knowing" and "valuing" selected keyboard skills. Journal of Research in Music Education, 45, 273-281, 1997.

[3] Lehmann, A. and Ericsson, K. Sight-reading ability of expert pianists in the context of piano accompanying. Psychomusicology, 12 (2), 142-161, 1993.

[4] Lehmann, A. and Ericsson, K. Performance without preparation: Structure and acquisition of expert accompanying and sight-reading performance. Psychomusicology, 15, 1-29, 1996.

[5] Lehmann, A. and McArthur, V. Sight-reading: Developing the skill of reconstructing a musical score. In Parncutt, R. and McPherson, G. (Eds.), Science and Psychology of Music Performance, pp. 135-150. Oxford University Press, 2002.

[6] Lowder, J. Evaluation of keyboard skills required in college class piano programs. Contributions to Music Education, 10, 33-38, 1983.