Towards efficient automated singer identification in large music databases

Automated singer identification is important for organising, browsing and retrieving data in large music databases. In this paper, we propose a novel scheme, called Hybrid Singer Identifier (HSI), for automated singer recognition. HSI uses a hybrid architecture that combines multiple low-level features extracted from both vocal and non-vocal music segments to enhance identification, and it builds profiles of individual singer characteristics based on statistical mixture models. Extensive experiments on a large music database demonstrate the superiority of our method over state-of-the-art approaches.
