Perceptually-Based Evaluation of the Errors Usually Made When Automatically Transcribing Music

This paper investigates the perceptual importance of the typical errors that occur when polyphonic music excerpts are transcribed into symbolic form. Automatic transcription of piano music is taken as the target application, and two subjective tests are designed. The main test aims to understand how human subjects rank typical transcription errors such as note insertion, deletion, or replacement, note doubling, and incorrect note onset or duration. The results, analyzed within the Bradley-Terry-Luce (BTL) framework, show that pitch errors are more clearly perceived than incorrect loudness estimates or temporal deviations from the original recording. The second test is a first attempt to incorporate these findings into more perceptually motivated measures for evaluating transcription systems.
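
To make the BTL analysis mentioned above more concrete, the following is a minimal sketch of how a BTL model can be fitted to pairwise judgments. It is not the authors' implementation. It assumes the listening test yields counts of how often one error type was judged more disturbing than another, and it uses the standard minorization-maximization update for the model P(i beats j) = p_i / (p_i + p_j). The function name `fit_btl`, the three error labels, and all counts are hypothetical and chosen only for illustration.

```python
def fit_btl(wins, iterations=200):
    """Estimate BTL scores from a square matrix of pairwise win counts.

    wins[i][j] is the number of times item i was preferred over item j.
    Returns scores that sum to 1; a higher score means preferred more often.
    Assumes every item has at least one win and one loss.
    """
    n = len(wins)
    p = [1.0 / n] * n  # start from equal scores
    for _ in range(iterations):
        updated = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            # Each pair contributes (number of comparisons) / (p_i + p_j).
            denom = sum(
                (wins[i][j] + wins[j][i]) / (p[i] + p[j])
                for j in range(n)
                if j != i
            )
            updated.append(total_wins / denom)
        norm = sum(updated)
        p = [value / norm for value in updated]  # keep scores on a fixed scale
    return p


# Hypothetical counts, NOT data from the paper.
# Entry [i][j]: how often error type i was judged more disturbing than type j.
labels = ["pitch", "loudness", "timing"]
wins = [
    [0, 8, 7],
    [2, 0, 6],
    [3, 4, 0],
]

scores = fit_btl(wins)
for label, score in sorted(zip(labels, scores), key=lambda pair: -pair[1]):
    print(f"{label}: {score:.3f}")
```

With these made-up counts, the pitch error receives the largest score, which mirrors the qualitative finding reported in the abstract without reproducing any of its actual numbers.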
