Singer verification: Singer model vs. song model

This paper proposes a method to verify the singer identity of a given song. The query song is modeled as a Gaussian mixture model (GMM) learned on features extracted from the sustained sung notes of the song. Each note is described by the shape of its spectral envelope and by the temporal variations in frequency and amplitude of its fundamental frequency. The singer identity is verified with two approaches: the model of the query song is compared either to a singer-based GMM or to the GMM of another song performed by the same singer. The comparison uses a dissimilarity measure given by the Kullback-Leibler divergence. When the two types of features are combined, the proposed approach verifies the singer identity of a given a cappella song with an error rate lower than 8% when the whole song is considered, and lower than 10% when only a short excerpt of the song (i.e., 15 consecutive sustained notes) is considered.
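The comparison step can be illustrated with a minimal sketch. The Kullback-Leibler divergence between two GMMs has no closed form, so this example estimates it by Monte Carlo sampling and symmetrizes it; both choices are assumptions, since the abstract does not say how the divergence is computed. The diagonal covariances, the toy parameters, the `threshold` value, and all function names are hypothetical and only serve the illustration. Each GMM is assumed to be already fitted and is passed as a `(weights, means, variances)` tuple.

```python
import math
import random


def gmm_logpdf(x, weights, means, variances):
    """Log-density of a diagonal-covariance GMM at point x (list of floats)."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_norm = -0.5 * sum(math.log(2 * math.pi * v) for v in var)
        log_exp = -0.5 * sum((xi - m) ** 2 / v for xi, m, v in zip(x, mu, var))
        comp_logs.append(math.log(w) + log_norm + log_exp)
    top = max(comp_logs)  # log-sum-exp trick for numerical stability
    return top + math.log(sum(math.exp(c - top) for c in comp_logs))


def gmm_sample(rng, weights, means, variances):
    """Draw one point: pick a component by weight, then sample each dimension."""
    k = rng.choices(range(len(weights)), weights=weights)[0]
    return [rng.gauss(m, math.sqrt(v)) for m, v in zip(means[k], variances[k])]


def kl_monte_carlo(p, q, n_samples=2000, seed=0):
    """Estimate KL(p || q) as the mean of log p(x) - log q(x) over x drawn from p."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        x = gmm_sample(rng, *p)
        total += gmm_logpdf(x, *p) - gmm_logpdf(x, *q)
    return total / n_samples


def symmetric_kl(p, q, n_samples=2000, seed=0):
    """Symmetrized dissimilarity: KL(p || q) + KL(q || p) (an assumed choice)."""
    return (kl_monte_carlo(p, q, n_samples, seed)
            + kl_monte_carlo(q, p, n_samples, seed + 1))


def verify(query_model, reference_model, threshold):
    """Accept the claimed identity when the dissimilarity is below a threshold."""
    return symmetric_kl(query_model, reference_model) < threshold


# Toy 2-D models standing in for GMMs learned on note features.
query_song = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
claimed_singer = ([0.5, 0.5], [[0.1, 0.0], [1.0, 1.1]], [[1.0, 1.0], [1.0, 1.0]])
other_singer = ([0.5, 0.5], [[4.0, 4.0], [6.0, 6.0]], [[1.0, 1.0], [1.0, 1.0]])

accepted = verify(query_song, claimed_singer, threshold=1.0)
rejected = verify(query_song, other_singer, threshold=1.0)
```

The same `verify` call covers both approaches described above: `reference_model` can be a singer-based GMM or the GMM of another song by the claimed singer. In practice the threshold would be tuned on held-out data, for instance to balance false acceptances against false rejections.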
