Signal-to-Score Music Transcription using Graphical Models

We present a transcription system that takes a music signal as input and returns its musical score. Processing proceeds in two stages. The first stage uses a fundamental frequency detector and an onset detector to transform the input signal into a sequence of sound events. This paper focuses on the second stage, which converts the sound events into a notated score using a family of graphical models. Because onset detection is inherently noisy, we do not treat its output as fixed; instead, the second stage searches over possible segmentations of the sound events. We evaluate the system on a large corpus of monophonic vocal music. The results show that the approach is well suited to music transcription: the initial onset detection reduces the number of observations and makes the system less instrument-specific, and the search over segmentations corrects errors in the onset detection. Without this search, these errors are magnified in the subsequent rhythm transcription.
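To make the idea of searching over segmentations concrete, the following is a minimal sketch, not the graphical models used in the paper. It assumes a hypothetical setup in which each candidate onset carries a detector confidence, every keep/drop combination is enumerated by brute force, and a segmentation is scored by a simple rhythm-grid cost plus the negative log probability of the keep/drop decisions. The function names, the note-value grid, and the scoring terms are all illustrative assumptions.

```python
import math
from itertools import product


def quantization_cost(durations, beat, grid=(0.25, 0.5, 1.0, 2.0)):
    """Squared log-ratio distance from each duration to its nearest note value.

    `grid` lists note values in beats (an assumed, illustrative set).
    """
    return sum(
        min(math.log(d / (g * beat)) ** 2 for g in grid)
        for d in durations
    )


def best_segmentation(onsets, probs, end_time, beat=0.5):
    """Pick the subset of candidate onsets with the lowest total cost.

    onsets   -- sorted candidate onset times in seconds
    probs    -- assumed detector confidence (strictly between 0 and 1)
                that each candidate is a real onset
    end_time -- time at which the last note ends
    The first onset is always kept; the rest are kept or dropped.
    """
    best_score, best_kept = None, None
    for keep in product([True, False], repeat=len(onsets) - 1):
        mask = (True,) + keep
        kept = [t for t, k in zip(onsets, mask) if k]
        # Durations between consecutive kept onsets, closing at end_time.
        durations = [b - a for a, b in zip(kept, kept[1:] + [end_time])]
        # Negative log probability of the keep/drop decisions.
        prior = -sum(
            math.log(p if k else 1.0 - p)
            for p, k in zip(probs[1:], keep)
        )
        score = quantization_cost(durations, beat) + prior
        if best_score is None or score < best_score:
            best_score, best_kept = score, kept
    return best_kept


# A spurious low-confidence onset at 0.56 s sits between two real ones.
onsets = [0.0, 0.5, 0.56, 1.0]
probs = [0.9, 0.9, 0.3, 0.9]
print(best_segmentation(onsets, probs, end_time=2.0))
```

In this toy input the spurious onset at 0.56 s would split one note into two durations that fit the rhythm grid poorly, so the segmentation that drops it scores best. Exhaustive enumeration grows exponentially with the number of candidates and is used here only for illustration.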
