Popular singer identification based on cepstrum transformation

A prerequisite for identifying the singers in popular music recordings is to reduce the interference of the background accompaniment when characterizing the singer's voice. This study proposes a background music removal approach for singer identification (SID) that exploits the underlying relationships between solo voices and their accompanied versions in the cepstral domain. The relationships are characterized by a transformation estimated from a large set of accompanied singing, generated by manually mixing solo singing with accompaniments extracted from Karaoke VCDs. This transformation reflects how the cepstrum of a singing voice changes when accompaniment is mixed in. When an unknown accompanied voice is to be identified, the system converts its cepstrum into a solo-like one using the pre-trained transformation. Our experiments show that this background music removal approach noticeably improves SID accuracy.
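The idea of learning a cepstral mapping from paired data can be illustrated with a minimal sketch. The abstract does not state the form of the transformation, so the example below assumes a plain affine least-squares map as a stand-in; the function names `fit_affine_transform` and `to_solo_like`, the 13-coefficient frames, and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np

def fit_affine_transform(accompanied, solo):
    """Fit an affine map from accompanied cepstra to solo cepstra.

    Both inputs are frame-aligned arrays of shape (n_frames, n_coeffs).
    Assumption: an affine least-squares map, not the paper's actual model.
    """
    # Append a constant column so the fit includes a bias term.
    X = np.hstack([accompanied, np.ones((accompanied.shape[0], 1))])
    W, *_ = np.linalg.lstsq(X, solo, rcond=None)
    return W  # shape (n_coeffs + 1, n_coeffs)

def to_solo_like(accompanied, W):
    """Convert accompanied cepstra into solo-like cepstra using W."""
    X = np.hstack([accompanied, np.ones((accompanied.shape[0], 1))])
    return X @ W

# Toy paired data: "accompanied" frames are an affine distortion of the
# solo frames. Real data would come from mixing solo singing with
# accompaniment tracks and extracting cepstral features from both.
rng = np.random.default_rng(0)
solo = rng.normal(size=(500, 13))
A = np.eye(13) + 0.1 * rng.normal(size=(13, 13))
b = rng.normal(size=13)
accompanied = solo @ A + b

W = fit_affine_transform(accompanied, solo)
recovered = to_solo_like(accompanied, W)
```

In this toy setting the distortion is exactly affine, so the fitted map recovers the solo frames almost perfectly; with real accompaniment the relationship is far less regular, which is why a richer transformation trained on a large paired set would be needed.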
