Exploring Perceptual Based Timbre Feature for Singer Identification

Timbre can be defined as the feature of an auditory stimulus that allows us to distinguish sounds that have the same pitch and loudness. In this paper, we explore a timbre-based perceptual feature for singer identification. We start with a vocal detection process that extracts the vocal segments from a song. Cepstral coefficients, which reflect timbre characteristics, are then computed from the vocal segments. These timbre cepstral coefficients are formulated by combining harmonic information with the dynamic characteristics of the sound, such as vibrato and the attack-decay envelope. Bandpass filters spaced according to the octave frequency scale are used to extract the vibrato and harmonic information. The experiments are conducted on a database of 84 popular songs. The results show that the proposed timbre-based perceptual feature is robust and effective: we achieve an average error rate of 12.2% in segment-level singer identification.
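
As a rough illustration of cepstral coefficients computed over octave-spaced bands, the following Python sketch sums the power spectrum of one frame in bands whose edges double at each step, takes the logarithm of each band energy, and applies a DCT-II. It is not the filter design used in this paper: the rectangular band shape, the 62.5 Hz to 8 kHz range, the number of coefficients, and all function names are assumptions chosen for the example, and the vibrato and attack-decay components are not modeled.

```python
import numpy as np

def octave_band_edges(f_low, f_high):
    """Band edges that double at each step (one octave per band)."""
    edges = [f_low]
    while edges[-1] * 2 <= f_high:
        edges.append(edges[-1] * 2)
    return np.array(edges)

def octave_band_energies(frame, sample_rate, f_low=62.5, f_high=8000.0):
    """Sum the power spectrum of one frame inside each octave band.

    Rectangular bands are an assumption made for brevity.
    """
    windowed = frame * np.hanning(len(frame))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    edges = octave_band_edges(f_low, f_high)
    return np.array([
        power[(freqs >= lo) & (freqs < hi)].sum()
        for lo, hi in zip(edges[:-1], edges[1:])
    ])

def octave_cepstrum(frame, sample_rate, n_coeffs=6):
    """DCT-II of the log octave-band energies (hypothetical feature)."""
    log_e = np.log(octave_band_energies(frame, sample_rate) + 1e-12)
    n_bands = len(log_e)
    k = np.arange(n_coeffs)[:, None]   # coefficient index
    m = np.arange(n_bands)[None, :]    # band index
    basis = np.cos(np.pi * k * (m + 0.5) / n_bands)
    return basis @ log_e

# Usage: one 1024-sample frame of a 440 Hz tone sampled at 16 kHz.
sr = 16000
t = np.arange(1024) / sr
frame = np.sin(2 * np.pi * 440.0 * t)
coeffs = octave_cepstrum(frame, sr)
print(coeffs.shape)
```

In a full system, such a vector would be computed for every frame of each detected vocal segment and passed to a classifier; that stage is outside the scope of this sketch.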
