An Effective Vocal/Non-vocal Segmentation Approach for Embedded Music Retrieve System on Mobile Phone

With the growing body of MP3 songs on the Internet, content-based analysis plays an important role in their retrieval and management. Because most of the useful information is carried by the vocal portions, it is necessary to separate the vocal segments from the rest of the music. This paper presents a method for vocal/non-vocal segmentation that uses a new feature extracted directly from the MPEG-encoded bitstream, avoiding the computational cost of a complete decoding process. In contrast to conventional classification methods based on statistical models, our method first uses a similarity matrix to partition the input into a series of portions; an SVM (Support Vector Machine) classifier is then employed to classify each portion as vocal or non-vocal; finally, a smoothing method is applied to correct the misclassification errors introduced by the classifier. Experiments show that the proposed method not only has lower computational complexity, but also achieves better vocal/non-vocal classification accuracy than the conventional methods over a broad range of signal-to-noise ratios.
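The partition-then-smooth structure of this pipeline can be illustrated with a minimal sketch. It is not the paper's implementation: the feature vectors, the cosine similarity measure, the adjacent-frame threshold rule, and the majority-vote window are all assumptions chosen for illustration, and the SVM classification step is left out (the labels passed to `smooth_labels` stand in for its output).

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(frames):
    """Pairwise similarity between all frames (hypothetical feature vectors)."""
    return [[cosine_similarity(u, v) for v in frames] for u in frames]


def partition(frames, threshold=0.5):
    """Split the input where adjacent frames are dissimilar.

    Assumption: a boundary is placed wherever the similarity between
    consecutive frames drops below `threshold`. The paper's actual
    boundary rule is not given in the abstract.
    """
    sim = similarity_matrix(frames)
    boundaries = [0]
    for i in range(1, len(frames)):
        if sim[i - 1][i] < threshold:
            boundaries.append(i)
    boundaries.append(len(frames))
    return list(zip(boundaries[:-1], boundaries[1:]))


def smooth_labels(labels, window=3):
    """Majority-vote smoothing of per-portion labels (1 = vocal, 0 = non-vocal).

    A label is replaced only when another label holds a strict majority
    in the surrounding window; ties keep the original label.
    """
    half = window // 2
    smoothed = []
    for i, label in enumerate(labels):
        neighbours = labels[max(0, i - half): i + half + 1]
        winner, count = Counter(neighbours).most_common(1)[0]
        smoothed.append(winner if count * 2 > len(neighbours) else label)
    return smoothed
```

For example, `partition([[1, 0], [1, 0.1], [0, 1], [0.1, 1]])` splits the four frames into two portions at the point where the feature direction changes, and `smooth_labels([1, 1, 0, 1, 1, 0, 0, 0])` flips the isolated non-vocal label in the middle of the vocal run.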
