Polyphonic music modeling with random fields

Interest in music information retrieval and related technologies has grown rapidly in recent years. However, very few existing techniques take advantage of recent developments in statistical modeling. In this paper we discuss an application of random fields to the problem of creating accurate yet flexible statistical models of polyphonic music. With such models in hand, the challenges of developing effective techniques for searching, browsing, and organizing growing music collections may be successfully met. We evaluate these models in terms of perplexity and prediction accuracy, and show that random fields not only outperform Markov chains but are also much more robust to overfitting.
