Fast intra-collection audio matching

The general goal of audio matching is to identify all audio excerpts in a music collection that are similar to a given query snippet. In recent years, several approaches to this task have been presented. However, owing to the complexity of audio matching, the proposed approaches usually either yield excellent matches at a high runtime cost or respond quickly but deliver less satisfying retrieval results. In this paper, we present a novel procedure that combines the strengths of both and efficiently computes good retrieval results. Our idea is to exploit the fact that in some practical applications queries are not arbitrary audio snippets but excerpts from the music collection itself (intra-collection queries). This allows us to split the audio collection into equal-sized overlapping segments and to precompute their retrieval results using dynamic time warping (DTW). Storing these matches in appropriate index structures enables us to recombine them efficiently at runtime. Our experiments indicate a significant speedup compared to classical DTW-based audio retrieval while achieving nearly the same retrieval quality.
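The precomputation step described above can be illustrated with a minimal Python sketch. It is not the implementation from the paper: the function names, the one-dimensional toy features, the absolute-difference local cost, and the dictionary used as the index are all assumptions made for illustration, and the runtime recombination of stored matches is left out because the abstract does not specify it.

```python
def split_into_segments(num_frames, seg_len, hop):
    """Return (start, end) frame ranges of equal-sized overlapping segments."""
    if seg_len <= 0 or hop <= 0:
        raise ValueError("seg_len and hop must be positive")
    return [(s, s + seg_len) for s in range(0, num_frames - seg_len + 1, hop)]


def subsequence_dtw(query, target, dist=lambda x, y: abs(x - y)):
    """Find the subsequence target[start:end] that best matches query.

    Textbook subsequence DTW with steps (1,0), (0,1), (1,1).
    Returns (cost, start, end) with end exclusive.
    """
    m = len(target)
    # row[j] = (accumulated cost, start index) for alignments ending at target[j].
    # The first query frame may start anywhere in the target at no extra cost.
    row = [(dist(query[0], target[j]), j) for j in range(m)]
    for i in range(1, len(query)):
        new = []
        for j in range(m):
            candidates = [row[j]]                # only the query advances
            if j > 0:
                candidates.append(new[j - 1])    # only the target advances
                candidates.append(row[j - 1])    # both advance
            best_cost, best_start = min(candidates)
            new.append((best_cost + dist(query[i], target[j]), best_start))
        row = new
    end = min(range(m), key=lambda j: row[j][0])
    cost, start = row[end]
    return cost, start, end + 1


def precompute_index(collection, seg_len, hop):
    """Map (doc_id, segment_start) to that segment's matches in all other documents.

    Hypothetical index layout: a plain dict of sorted (cost, doc_id, start, end) tuples.
    """
    index = {}
    for doc_id, feats in collection.items():
        for start, end in split_into_segments(len(feats), seg_len, hop):
            segment = feats[start:end]
            matches = []
            for other_id, other in collection.items():
                if other_id == doc_id:
                    continue  # skip the trivial self-match
                cost, s, e = subsequence_dtw(segment, other)
                matches.append((cost, other_id, s, e))
            index[(doc_id, start)] = sorted(matches)
    return index


# Toy collection: scalar "features" standing in for real audio feature vectors.
collection = {
    "a": [0, 1, 2, 3, 4, 5, 6, 7],
    "b": [9, 9, 2, 3, 4, 9, 9, 9],
}
index = precompute_index(collection, seg_len=3, hop=1)
print(index[("a", 2)][0])
```

Because a query is itself an excerpt of the collection, the segments it covers can be looked up by key at runtime instead of running DTW against the whole collection; how the per-segment matches are then merged into matches for the full query depends on the recombination scheme, which this sketch does not attempt.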
