GMM Based Indexing and Retrieval of Music Using MFCC and MPEG-7 Features

Audio, which includes voice, music, and various kinds of environmental sounds, is an important type of media and a significant part of video. With large digital music databases now in place, people have begun to realize the importance of managing them effectively through music content analysis. The goal of a music indexing and retrieval system is to give the user the ability to index and retrieve music data efficiently, and efficient retrieval in turn calls for some measure of music similarity. In this paper, we propose a method for indexing and retrieval of classified music using Mel-Frequency Cepstral Coefficients (MFCC) and MPEG-7 features. Music clip extraction, feature extraction, creation of an index, and retrieval of the query clip are the major issues in automatic audio indexing and retrieval. Indexing is done for all the music audio clips using Gaussian mixture models (GMMs) built on the extracted features. For retrieval, the probability that the query feature vector belongs to each of the Gaussians is computed. The average probability density function is then computed for each of the Gaussians, and retrieval is based on the highest probability.
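The indexing and retrieval steps described above can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: one diagonal-covariance GMM is fitted per indexed clip with plain EM, the query is scored by its average log-likelihood (used in place of the raw average density for numerical stability), and random vectors stand in for real MFCC or MPEG-7 feature frames. The function names `fit_diag_gmm`, `avg_log_likelihood`, and `retrieve` are hypothetical.

```python
import numpy as np


def component_log_pdf(X, means, variances):
    """Log density of every row of X under each diagonal Gaussian, shape (n, k)."""
    diff = X[:, None, :] - means[None, :, :]
    mahalanobis = np.sum(diff ** 2 / variances, axis=2)
    log_norm = np.sum(np.log(2.0 * np.pi * variances), axis=1)
    return -0.5 * (mahalanobis + log_norm)


def fit_diag_gmm(X, n_components=2, n_iter=50, seed=0, reg=1e-6):
    """Fit a diagonal-covariance GMM to feature frames X (n, d) with plain EM."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Initialise means from randomly chosen frames, variances from the data.
    means = X[rng.choice(n, size=n_components, replace=False)].copy()
    variances = np.tile(X.var(axis=0) + reg, (n_components, 1))
    weights = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E-step: responsibility of each Gaussian for each frame.
        log_p = component_log_pdf(X, means, variances) + np.log(weights)
        log_p -= log_p.max(axis=1, keepdims=True)
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        sq = (X[:, None, :] - means[None, :, :]) ** 2
        variances = np.einsum("nk,nkd->kd", resp, sq) / nk[:, None] + reg
    return weights, means, variances


def avg_log_likelihood(X, model):
    """Average per-frame log-likelihood of X under a fitted GMM."""
    weights, means, variances = model
    log_p = component_log_pdf(X, means, variances) + np.log(weights)
    m = log_p.max(axis=1, keepdims=True)
    frame_ll = m[:, 0] + np.log(np.exp(log_p - m).sum(axis=1))
    return float(frame_ll.mean())


def retrieve(query, index):
    """Rank indexed clips by the query's average log-likelihood, best first."""
    scores = {name: avg_log_likelihood(query, model) for name, model in index.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Synthetic stand-ins for 13-dimensional MFCC frames (assumption, not real audio).
rng = np.random.default_rng(0)
clips = {
    "clip_a": rng.normal(0.0, 1.0, size=(200, 13)),
    "clip_b": rng.normal(3.0, 1.0, size=(200, 13)),
}

# Indexing: one GMM per clip.
index = {name: fit_diag_gmm(frames) for name, frames in clips.items()}

# Retrieval: the query is drawn from the same distribution as clip_b.
query = rng.normal(3.0, 1.0, size=(50, 13))
ranked = retrieve(query, index)
print(ranked[0][0])
```

Because the synthetic query shares its distribution with `clip_b`, that clip should rank first. In a real system the random arrays would be replaced by features extracted from audio, and the number of mixture components would need tuning.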
