Combining Evidences from Mel Cepstral Features and Cepstral Mean Subtracted Features for Singer Identification

One of the challenging problems in Music Information Retrieval (MIR) is identifying the singer of a given song under the strong influence of instrumental sounds. The performance of a Singer Identification (SID) system is also severely affected by the quality of recording devices, transmission channels, and the singing voice(s) of other singer(s). We prepared a database of 500 songs drawn from Hindi Bollywood songs. State-of-the-art Mel Frequency Cepstral Coefficients (MFCC) are used as feature vectors, and a 2nd-order polynomial classifier is employed as the pattern classifier. We also used Cepstral Mean Subtraction (CMS) based MFCC (CMSMFCC) feature vectors for SID, which were found to give better results than MFCC on the proposed database. The SID accuracy for MFCC and CMSMFCC was found to be 75.75% and 84.5%, respectively, and the Equal Error Rate (EER) for MFCC and CMSMFCC was found to be 9.48% and 8.45%, respectively. Score-level fusion of the two improved SID accuracy and EER by 10.25% and 2.08%, respectively, over MFCC alone.
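As a minimal sketch of the two operations named in the abstract, the Python below applies cepstral mean subtraction to a matrix of MFCC frames and fuses two per-singer score vectors at the score level. The function names, the min-max normalization, the equal fusion weight, and the toy data are all assumptions made for illustration; the abstract does not specify how scores were normalized or weighted, and the polynomial classifier that would produce the scores is not shown.

```python
import numpy as np

def cepstral_mean_subtraction(mfcc):
    """Subtract the per-coefficient mean over time from an MFCC matrix.

    mfcc: array of shape (num_frames, num_coeffs).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    return mfcc - mfcc.mean(axis=0, keepdims=True)

def fuse_scores(scores_a, scores_b, weight=0.5):
    """Weighted sum of two min-max normalized score vectors.

    Each vector holds one score per candidate singer. The normalization
    and the default weight are illustrative assumptions, not values
    taken from the paper.
    """
    def minmax(s):
        s = np.asarray(s, dtype=float)
        span = s.max() - s.min()
        return (s - s.min()) / span if span > 0 else np.zeros_like(s)
    return weight * minmax(scores_a) + (1.0 - weight) * minmax(scores_b)

# Toy MFCC frames: the constant offset stands in for a stationary channel effect.
rng = np.random.default_rng(0)
mfcc = rng.normal(size=(200, 13)) + 5.0
cms = cepstral_mean_subtraction(mfcc)

# Hypothetical classifier scores for three candidate singers.
mfcc_scores = np.array([0.2, 0.9, 0.4])
cms_scores = np.array([0.1, 0.6, 0.8])
fused = fuse_scores(mfcc_scores, cms_scores, weight=0.5)
predicted_singer = int(np.argmax(fused))
```

Subtracting the time average removes any component that is constant across frames in the cepstral domain, which is why CMS is commonly used to reduce stationary channel and recording-device effects.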
