Event-based Multitrack Alignment using a Probabilistic Framework

This paper presents a Bayesian probabilistic framework for real-time alignment of a recording or score with a live performance using an event-based approach. Multitrack audio files are processed with existing onset detection and harmonic analysis algorithms to represent a musical performance as a sequence of time-stamped events. We propose maintaining distributions over score position and relative speed that are sequentially updated in real time according to Bayes’ theorem. We develop the methodology by first describing its application to matching a single MIDI track and then extending it to multitrack recordings. An evaluation is presented that contrasts our multitrack alignment method with state-of-the-art alignment techniques.
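The abstract does not give implementation details, but the core idea it describes, a joint distribution over score position and relative speed that is sequentially updated with Bayes’ theorem each time an event is detected, can be illustrated with a minimal grid-based Bayes filter. The sketch below is an assumption for illustration only, not the authors’ actual model: the grids, the Gaussian event likelihood, and the tempo-diffusion kernel are hypothetical choices.

```python
import numpy as np

# Hypothetical discretisation of the joint state:
# score position (beats) and relative speed (performed tempo / score tempo).
positions = np.linspace(0.0, 32.0, 321)   # 0.1-beat grid
speeds = np.linspace(0.5, 2.0, 61)        # relative-speed grid
pos_step = positions[1] - positions[0]

# Prior: performance starts near beat 0 at roughly nominal speed.
prior = (np.exp(-0.5 * (positions[:, None] / 0.5) ** 2)
         * np.exp(-0.5 * ((speeds[None, :] - 1.0) / 0.2) ** 2))
prior /= prior.sum()


def predict(belief, dt, speed_sigma=0.05):
    """Advance the belief by dt seconds: each (position, speed) hypothesis
    moves forward by speed * dt beats, and the speed diffuses slightly."""
    new_belief = np.zeros_like(belief)
    for j, s in enumerate(speeds):
        shifted = positions + s * dt
        idx = np.clip(np.round((shifted - positions[0]) / pos_step).astype(int),
                      0, len(positions) - 1)
        np.add.at(new_belief[:, j], idx, belief[:, j])
    # Gaussian diffusion along the speed axis to allow tempo drift.
    offsets = np.arange(-3, 4) * (speeds[1] - speeds[0])
    kernel = np.exp(-0.5 * (offsets / speed_sigma) ** 2)
    kernel /= kernel.sum()
    new_belief = np.apply_along_axis(
        lambda row: np.convolve(row, kernel, mode="same"), 1, new_belief)
    return new_belief / new_belief.sum()


def update(belief, event_score_positions, obs_sigma=0.25):
    """Bayes update for one detected onset: the likelihood favours position
    hypotheses close to any score event that matches the observation."""
    likelihood = np.zeros(len(positions))
    for p_event in event_score_positions:
        likelihood += np.exp(-0.5 * ((positions - p_event) / obs_sigma) ** 2)
    posterior = belief * likelihood[:, None]
    return posterior / posterior.sum()


# Toy run: two detected onsets that a hypothetical score places at beats 4 and 8.
belief = prior
belief = update(predict(belief, dt=2.0), event_score_positions=[4.0])
belief = update(predict(belief, dt=2.0), event_score_positions=[8.0])

i, j = np.unravel_index(np.argmax(belief), belief.shape)
print(f"MAP estimate: position {positions[i]:.1f} beats, relative speed {speeds[j]:.2f}")
```

In this sketch the predict step plays the role of the motion model between events, and the update step applies Bayes’ theorem whenever an onset arrives; the paper’s method additionally handles harmonic observations and multiple tracks, which are omitted here.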
