Content-based audio retrieval based on Gabor wavelet filtering

The rapid increase in the amount of audio data, and of music collections in particular, demands an efficient method for automatically retrieving audio objects based on their content. In this paper, we propose a method based on Gabor wavelet features for content-based retrieval of perceptually similar music pieces in audio documents. It allows the user to select a reference passage within an audio file and retrieve perceptually similar passages, such as repeating phrases within a music piece, similar music clips in a database, or one song sung by different persons or in different languages. The proposed method first divides an audio stream into clips, each containing one second of audio. Then, frame-based features are extracted from each clip using Gabor wavelet filters. Finally, a similarity measure is applied to perform pattern matching on the resulting sequences of feature vectors. Experimental results show that the proposed method achieves an accuracy of over 96% for audio retrieval.
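The three stages described above (one-second clips, frame-based Gabor features, similarity matching) can be illustrated with a minimal Python sketch. The abstract does not specify the filter bank, the frame length, or the similarity measure, so everything concrete here is an assumption made for illustration: a bank of 1-D complex Gabor filters, a per-frame mean response magnitude as the feature, and a mean Euclidean distance between feature sequences. The function names (`split_into_clips`, `gabor_kernel`, `clip_features`, `rank_clips`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def split_into_clips(signal, sample_rate):
    """Cut a 1-D signal into non-overlapping one-second clips.
    Any trailing partial second is dropped."""
    n_clips = len(signal) // sample_rate
    return signal[: n_clips * sample_rate].reshape(n_clips, sample_rate)

def gabor_kernel(center_freq, sample_rate, sigma):
    """Complex 1-D Gabor kernel: Gaussian envelope times a complex sinusoid.
    sigma is the envelope width in seconds (an assumed parameterization)."""
    half = int(3 * sigma * sample_rate)
    t = np.arange(-half, half + 1) / sample_rate
    return np.exp(-t**2 / (2 * sigma**2)) * np.exp(2j * np.pi * center_freq * t)

def clip_features(clip, sample_rate, center_freqs, frame_len):
    """Return an (n_frames, n_filters) array: for each frame, the mean
    magnitude of each Gabor filter response (assumed feature definition)."""
    n_frames = len(clip) // frame_len
    feats = np.empty((n_frames, len(center_freqs)))
    for k, freq in enumerate(center_freqs):
        sigma = 2.0 / freq  # assumption: envelope spans a few cycles
        kernel = gabor_kernel(freq, sample_rate, sigma)
        magnitude = np.abs(np.convolve(clip, kernel, mode="same"))
        frames = magnitude[: n_frames * frame_len].reshape(n_frames, frame_len)
        feats[:, k] = frames.mean(axis=1)
    return feats

def rank_clips(query_feats, candidate_feats):
    """Order candidate clips from most to least similar to the query,
    using mean frame-wise Euclidean distance (assumed similarity measure)."""
    dists = [np.linalg.norm(query_feats - c, axis=1).mean() for c in candidate_feats]
    return np.argsort(dists)

if __name__ == "__main__":
    sr = 8000
    t = np.arange(2 * sr) / sr
    # Two seconds of a 400 Hz tone followed by one second of a 1600 Hz tone.
    audio = np.concatenate([np.sin(2 * np.pi * 400 * t),
                            np.sin(2 * np.pi * 1600 * t[:sr])])
    clips = split_into_clips(audio, sr)
    feats = [clip_features(c, sr, [200, 400, 800, 1600], 200) for c in clips]
    print(rank_clips(feats[0], feats))
```

In this toy example the query is the first clip, so the two 400 Hz clips should rank ahead of the 1600 Hz clip. A real system would need the filter bank and distance actually used in the paper, which this sketch does not attempt to reproduce.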
