Audio Retrieval based on Cepstral Feature

Interest in music is growing rapidly in everyday life, so an efficient system is needed to retrieve music that is relevant to the user. An audio retrieval system depends mainly on the feature extraction stage, because only meaningful features yield good retrieval performance. In this work, audio information retrieval is performed on the GTZAN dataset using weighted Mel-Frequency Cepstral Coefficients (WMFCC), a type of cepstral feature. Results for the various stages of WMFCC feature extraction and the retrieval performance plot are presented. The mean precision obtained for the audio files from the GTZAN database is 96.40%.

General Terms: Segmentation, Query, Audio, Filters.
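The abstract does not give implementation details for the WMFCC feature or for the matching step, so the sketch below only illustrates the general pipeline it describes: frame-level MFCC extraction, a per-coefficient weighting, a clip-level descriptor, and query-by-example retrieval. It assumes librosa for MFCC computation, an illustrative linear weighting (the paper's exact WMFCC weighting is not reproduced here), and cosine-similarity ranking; the function names `weighted_mfcc` and `retrieve` are hypothetical.

```python
import numpy as np
import librosa


def weighted_mfcc(path, n_mfcc=13, sr=22050):
    """Clip-level weighted-MFCC descriptor for one audio file."""
    # Load (up to) a 30-second excerpt, matching the GTZAN clip length.
    y, sr = librosa.load(path, sr=sr, duration=30.0)
    # Frame-level MFCCs, shape (n_mfcc, n_frames).
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    # Illustrative weighting (assumption): emphasise lower-order
    # coefficients, which carry most of the spectral-envelope information.
    weights = np.linspace(1.0, 0.5, n_mfcc)[:, np.newaxis]
    wmfcc = weights * mfcc
    # Summarise the clip by per-coefficient mean and standard deviation.
    return np.concatenate([wmfcc.mean(axis=1), wmfcc.std(axis=1)])


def retrieve(query_vec, db_vecs, top_k=10):
    """Rank database clips by cosine similarity to the query descriptor."""
    q = query_vec / np.linalg.norm(query_vec)
    d = db_vecs / np.linalg.norm(db_vecs, axis=1, keepdims=True)
    sims = d @ q
    return np.argsort(sims)[::-1][:top_k]
```

Under this sketch, precision for a single query would be the fraction of the top-k returned clips that share the query's genre label, and averaging over all queries gives a mean precision figure of the kind reported in the abstract.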
