Combining multi-probe histogram and order-statistics based LSH for scalable audio content retrieval

To improve the reliability and scalability of content-based retrieval of variant audio tracks from large music databases, we propose a new multi-stage LSH scheme that (i) extracts compact but accurate representations of audio tracks by applying the LSH idea to summarize them, and (ii) organizes the resulting representations in LSH tables while retaining almost the same accuracy as exact kNN retrieval. In the first stage, we use the major bins of successive chroma features to compute a multi-probe histogram (MPH) that is concise yet retains information about local temporal correlations. In the second stage, we propose a new LSH scheme, OS-LSH, which uses the order statistics (OS) of the MPH to organize and probe the histograms. The resulting representation and organization of the audio tracks are storage-efficient and support robust, scalable retrieval. Extensive experiments on a dataset of 30,000 real audio tracks confirm the effectiveness and efficiency of the proposed scheme.
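The two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how major bins are selected or how the order statistics form a hash key, so the sketch assumes 12-bin chroma frames, takes the k largest bins of each frame as its "major bins", counts pairs of major bins across successive frames in a 12 x 12 histogram, and uses the positions of the m largest histogram bins as the bucket key. All names (`major_bins`, `multi_probe_histogram`, `os_key`, `OSLSHTable`) and the parameters `k` and `m` are hypothetical.

```python
from collections import defaultdict
from itertools import product

N_CHROMA = 12  # assumed number of pitch classes per chroma frame


def major_bins(frame, k=2):
    """Indices of the k largest bins of one chroma frame (assumed definition)."""
    return sorted(range(len(frame)), key=lambda i: -frame[i])[:k]


def multi_probe_histogram(frames, k=2):
    """Count pairs of major bins in successive frames (12 x 12 = 144 bins)."""
    hist = [0] * (N_CHROMA * N_CHROMA)
    for prev, cur in zip(frames, frames[1:]):
        for a, b in product(major_bins(prev, k), major_bins(cur, k)):
            hist[a * N_CHROMA + b] += 1
    return hist


def os_key(hist, m=2):
    """Hash key: positions of the m largest histogram bins, in rank order.

    Ties are broken by the lower bin index because sorted() is stable.
    """
    return tuple(sorted(range(len(hist)), key=lambda i: -hist[i])[:m])


class OSLSHTable:
    """Single hash table keyed by the order-statistics key (illustrative only)."""

    def __init__(self, m=2):
        self.m = m
        self.buckets = defaultdict(list)

    def add(self, track_id, hist):
        self.buckets[os_key(hist, self.m)].append(track_id)

    def query(self, hist):
        # Return only the tracks that share the query's bucket.
        return list(self.buckets.get(os_key(hist, self.m), []))
```

In this sketch, two variants of a track land in the same bucket whenever their histograms share the same top-m bin positions, so small perturbations of the chroma energies that leave the ranking of the dominant bins unchanged do not affect the lookup. A real system would additionally probe neighbouring keys and rank the retrieved candidates by histogram distance.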
