Audio retrieval using timbral feature

The growing availability of music information demands the development of tools for audio retrieval. Audio information retrieval involves retrieving similar audio files based on features extracted from the signal. Feature extraction is a critical task on which the entire retrieval system relies. In this work, audio information retrieval is performed on the GTZAN dataset using the Delta Mel-Frequency Cepstral Coefficients (MFCC) feature, which is a kind of timbral feature. The results obtained for the various stages of feature extraction are presented, together with a retrieval performance plot. The average precision and recall values obtained are 78.67% and 58.02%, respectively.
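To make the two quantities named above concrete, the following is a minimal sketch, not the authors' implementation. It assumes the common regression-based definition of delta coefficients, d_t = sum_{n=1..N} n (c_{t+n} - c_{t-n}) / (2 sum_{n=1..N} n^2), with edge frames repeated, and the usual set-based definitions of precision and recall. The function names `delta` and `precision_recall` and the `width` parameter are hypothetical, and the MFCC frames are assumed to have been computed already.

```python
def delta(frames, width=2):
    """Regression-based delta of per-frame coefficients.

    frames: list of frames, each a list of cepstral coefficients.
    width:  number of frames on each side used in the regression
            (an assumed setting; the abstract does not state one).
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                # Repeat the first/last frame when the window runs off the edge.
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def precision_recall(retrieved, relevant):
    """Precision and recall of one retrieved list against the relevant set."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall
```

In a genre-based evaluation on a dataset such as GTZAN, one plausible choice is to treat clips of the same genre as the query as the relevant set; that choice is an assumption here and is not stated in the abstract.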
