Multiple-Feature Fusion Based Onset Detection for Solo Singing Voice

Onset detection is a challenging problem in automatic singing transcription. In this paper, we address singing onset detection with three main contributions. First, we outline the nature of the singing voice and present a new singing onset detection approach based on supervised machine learning. In this approach, two Gaussian Mixture Models (GMMs), one trained on onset frames and one on non-onset frames, are used to classify the audio features of each frame. Second, existing audio features are thoroughly evaluated for this approach to singing onset detection. Third, feature-level and decision-level fusion are employed to combine different features and achieve higher performance. Evaluated on a recorded singing database, the proposed approach significantly outperforms state-of-the-art onset detection algorithms.
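The two-GMM classification and the two fusion strategies can be illustrated with a minimal sketch. The abstract does not state the decision rule, the features, or the fusion rule, so everything below is an assumption made for illustration: frames are scored by the log-likelihood ratio between an onset GMM and a non-onset GMM, feature-level fusion is plain concatenation of feature vectors, and decision-level fusion is a weighted average of per-feature scores. The GMM parameters are hypothetical toy values, not trained models; in practice they would be estimated from labelled frames (for example with the EM algorithm).

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as [(weight, mean, var), ...]."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    top = max(logs)
    # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(l - top) for l in logs))

def onset_score(x, onset_gmm, non_onset_gmm):
    """Assumed decision score: log-likelihood ratio, onset vs. non-onset."""
    return gmm_log_likelihood(x, onset_gmm) - gmm_log_likelihood(x, non_onset_gmm)

def is_onset(x, onset_gmm, non_onset_gmm, threshold=0.0):
    """Label a frame as an onset when the score exceeds the threshold."""
    return onset_score(x, onset_gmm, non_onset_gmm) > threshold

def fuse_features(*feature_vectors):
    """Feature-level fusion (assumed): concatenate per-frame feature vectors."""
    return [value for vector in feature_vectors for value in vector]

def fuse_decisions(scores, weights=None):
    """Decision-level fusion (assumed): weighted average of per-feature scores."""
    if weights is None:
        weights = [1.0] * len(scores)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Hypothetical single-component models for a one-dimensional feature.
ONSET_GMM = [(1.0, [2.0], [1.0])]
NON_ONSET_GMM = [(1.0, [0.0], [1.0])]

if __name__ == "__main__":
    print(is_onset([2.0], ONSET_GMM, NON_ONSET_GMM))   # frame near the onset mean
    print(is_onset([0.0], ONSET_GMM, NON_ONSET_GMM))   # frame near the non-onset mean
    print(fuse_decisions([2.0, -1.0]))                 # two feature streams, equal weights
```

With feature-level fusion, a single pair of GMMs is trained on the concatenated vectors; with decision-level fusion, each feature keeps its own pair of GMMs and only the resulting scores are combined before thresholding.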
