Monte Carlo Methods for Tempo Tracking and Rhythm Quantization

We present a probabilistic generative model for timing deviations in expressive music performance. The structure of the proposed model is equivalent to a switching state space model: the switch variables correspond to discrete note locations, as in a musical score, and the continuous hidden variables denote the tempo. We formulate two well-known music recognition problems, tempo tracking and automatic transcription (rhythm quantization), as filtering and maximum a posteriori (MAP) state estimation tasks, respectively. Exact computation of posterior features such as the MAP state is intractable in this model class, so we introduce Monte Carlo methods for integration and optimization. We compare Markov chain Monte Carlo (MCMC) methods (such as Gibbs sampling, simulated annealing and iterative improvement) with sequential Monte Carlo methods (particle filters). In our simulations, the sequential methods give better results. The methods can be applied in both online and batch scenarios, such as tempo tracking and transcription respectively, and are thus potentially useful in a number of music applications such as adaptive automatic accompaniment, score typesetting and music information retrieval.
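To make the filtering formulation concrete, the following is a minimal sketch of a plain bootstrap particle filter on a toy switching model. It is not the model or algorithm of the paper: the set of candidate note durations, the random-walk tempo dynamics, the Gaussian observation noise, and every name and parameter value (`DURATIONS`, `track_tempo`, `drift_std`, `obs_std`, and so on) are assumptions chosen for illustration. Each particle carries a continuous beat period (the tempo variable) and samples a discrete note duration (the switch variable); the observed inter-onset interval is assumed to equal duration times period plus noise.

```python
import math
import random

# Assumed candidate note durations in beats (the discrete switch values).
DURATIONS = [0.5, 1.0, 2.0]


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def track_tempo(iois, n_particles=500, init_period=0.6, init_std=0.1,
                drift_std=0.02, obs_std=0.03, seed=0):
    """Bootstrap particle filter over (beat period, note duration).

    iois: observed inter-onset intervals in seconds.
    Returns one (period_estimate, most_probable_duration) pair per interval.
    """
    rng = random.Random(seed)
    # Initial particles: beat periods drawn from an assumed Gaussian prior.
    periods = [max(0.05, rng.gauss(init_period, init_std))
               for _ in range(n_particles)]
    estimates = []
    for y in iois:
        proposals, weights = [], []
        for p in periods:
            # Propose from the prior: uniform duration, random-walk period.
            c = rng.choice(DURATIONS)
            new_p = max(0.05, p + rng.gauss(0.0, drift_std))
            proposals.append((new_p, c))
            # Weight by the likelihood of the observed interval.
            weights.append(gaussian_pdf(y, c * new_p, obs_std))
        total = sum(weights)
        if total == 0.0:
            # All likelihoods underflowed; fall back to uniform weights.
            weights = [1.0] * n_particles
            total = float(n_particles)
        weights = [w / total for w in weights]

        # Filtered posterior mean of the period.
        period_est = sum(w * p for w, (p, _) in zip(weights, proposals))
        # Posterior mass of each duration; report the most probable one.
        mass = {d: 0.0 for d in DURATIONS}
        for w, (_, c) in zip(weights, proposals):
            mass[c] += w
        estimates.append((period_est, max(mass, key=mass.get)))

        # Multinomial resampling to counter weight degeneracy.
        resampled = rng.choices(proposals, weights=weights, k=n_particles)
        periods = [p for p, _ in resampled]
    return estimates


if __name__ == "__main__":
    score = [1.0, 1.0, 0.5, 0.5, 2.0, 1.0, 0.5, 0.5, 1.0, 2.0]
    iois = [0.5 * c for c in score]  # steady performance, 0.5 s per beat
    for period, duration in track_tempo(iois):
        print(f"period ~ {period:.3f} s, duration = {duration} beats")
```

The per-step duration reported here is a filtered estimate, not the MAP state sequence that the transcription task asks for. Because the tempo dynamics in this toy model are linear and Gaussian given the durations, the period could also be integrated out analytically for each particle with a Kalman filter instead of being sampled, which is the idea behind Rao-Blackwellised particle filtering.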
