Bayesian analysis of polyphonic Western tonal music

This paper deals with the computational analysis of music from recorded audio waveforms. This general problem includes, as subtasks, music transcription, extraction of musical pitch, dynamics, timbre, and instrument identity, and source separation. Analysis of real musical signals is a highly ill-posed task, further complicated by transient sounds, background interference, and the complex structure of musical pitches in the time-frequency domain. This paper focuses on models and algorithms for computer transcription of multiple simultaneous musical pitches in audio, building on previous work by two of the authors. The audio data are assumed to be presegmented into regions of fixed pitch content, such as individual chords. The models presented apply to pitched (tonal) music and are formulated via a Gabor representation of nonstationary signals. A Bayesian probabilistic structure is employed to represent prior information about the parameters of the notes. This paper introduces a numerical Bayesian inference strategy for estimating the pitches and other parameters of the waveform. The improved algorithm is much faster than the earlier approach, which makes the method feasible in realistic situations. Results are presented for estimation of a known number of notes in randomly generated note clusters drawn from a database of real musical instrument recordings.
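To make the idea of Bayesian pitch estimation under a harmonic model concrete, the following is a minimal sketch and not the algorithm of the paper. It makes several simplifying assumptions that are not in the abstract: a single note, stationary cosine and sine partials in place of Gabor atoms, a flat prior on the fundamental frequency, a known noise variance, amplitudes profiled out by least squares, and a grid search in place of the paper's numerical inference strategy. All function names and parameter values are hypothetical.

```python
import numpy as np

def harmonic_basis(f0, n_partials, t):
    """Cosine and sine columns for partials m * f0, m = 1..n_partials."""
    m = np.arange(1, n_partials + 1)
    phase = 2.0 * np.pi * f0 * np.outer(t, m)
    return np.hstack([np.cos(phase), np.sin(phase)])

def log_posterior(f0, y, t, n_partials=4, noise_var=0.01,
                  f_range=(100.0, 1000.0)):
    """Unnormalised log posterior of f0 (assumed flat prior on f_range).

    Partial amplitudes are profiled out by least squares, which is a
    simplification of a full Bayesian treatment.
    """
    if not (f_range[0] <= f0 <= f_range[1]):
        return -np.inf
    basis = harmonic_basis(f0, n_partials, t)
    amps, *_ = np.linalg.lstsq(basis, y, rcond=None)
    resid = y - basis @ amps
    # Gaussian log likelihood up to an additive constant.
    return -0.5 * float(resid @ resid) / noise_var

def estimate_f0(y, t, grid):
    """Return the grid frequency with the highest log posterior."""
    scores = np.array([log_posterior(f, y, t) for f in grid])
    return float(grid[int(np.argmax(scores))])

# Synthetic test note: four partials of 220 Hz plus white noise.
rng = np.random.default_rng(0)
fs = 8000.0
t = np.arange(1024) / fs
true_f0 = 220.0
y = sum((1.0 / m) * np.cos(2.0 * np.pi * m * true_f0 * t + 0.3 * m)
        for m in range(1, 5))
y = y + 0.1 * rng.standard_normal(t.size)

grid = np.arange(100.0, 1000.0, 1.0)
f0_hat = estimate_f0(y, t, grid)
print(f"estimated fundamental: {f0_hat:.1f} Hz")
```

An exhaustive grid over one fundamental is workable here, but the search space grows rapidly with the number of simultaneous notes and with the additional parameters of each note, which is one reason a faster numerical inference strategy matters for polyphonic signals.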
