A Multi-pass Algorithm for Accurate Audio-to-Score Alignment

Most current audio-to-score alignment algorithms work at the level of score time frames; i.e., they cannot differentiate between several notes that occur at the same discrete time in the score. This level of accuracy is sufficient for a variety of applications. However, for applications such as the analysis of musical expression, the microtimings of individual notes may also be of interest. We therefore propose a method that estimates the onset times of individual notes in a post-processing step. Based on the initial alignment and a feature obtained by matrix factorization, the notes for which confidence in the alignment is high are chosen as anchor notes. The remaining notes between the anchors are then revised, taking into account the additional information provided by the anchors and the temporal relations given by the score. We show that this method clearly outperforms a reference method that uses the same features but does not differentiate between anchor and non-anchor notes.
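The anchor idea can be illustrated with a minimal sketch. It is not the method of the paper: the confidence values are simply given as input rather than derived from a matrix-factorization feature, and the revision step is replaced by a plain clamp around a linearly interpolated expected time. The function name `refine_onsets` and the parameters `threshold` and `tolerance` are hypothetical, and notes are assumed to be sorted by score time.

```python
from bisect import bisect_left


def refine_onsets(score_times, initial_onsets, confidences,
                  threshold=0.8, tolerance=0.2):
    """Keep high-confidence notes as anchors and constrain the rest.

    Hypothetical illustration only: a non-anchor note is clamped to a
    window of +/- tolerance seconds around the time expected from
    linear interpolation between its neighbouring anchors.
    Assumes score_times is sorted in ascending order.
    """
    anchors = [i for i, c in enumerate(confidences) if c >= threshold]
    refined = list(initial_onsets)
    if len(anchors) < 2:
        return refined  # not enough anchors to constrain anything

    anchor_scores = [score_times[i] for i in anchors]
    for i, s in enumerate(score_times):
        if confidences[i] >= threshold:
            continue  # anchors keep their initial onset estimate
        k = bisect_left(anchor_scores, s)
        if k == 0 or k == len(anchors):
            continue  # outside the anchored range: leave unchanged
        left, right = anchors[k - 1], anchors[k]
        span = score_times[right] - score_times[left]
        if span == 0:
            continue
        # Expected onset from the temporal relations given by the score.
        frac = (s - score_times[left]) / span
        expected = initial_onsets[left] + frac * (
            initial_onsets[right] - initial_onsets[left])
        # Keep the note's own microtiming, but only within the window.
        refined[i] = min(max(initial_onsets[i], expected - tolerance),
                         expected + tolerance)
    return refined


if __name__ == "__main__":
    print(refine_onsets([0, 1, 2, 3],
                        [0.0, 5.0, 2.1, 3.0],
                        [0.9, 0.2, 0.3, 0.95]))
```

In this toy example the first and last notes act as anchors; the implausible estimate of 5.0 s for the second note is pulled back into the window around its expected time, while the third note, already within its window, keeps its own timing.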
