Learning Binary Codes For Efficient Large-Scale Music Similarity Search

Content-based music similarity estimation provides a way to find songs in the unpopular “long tail” of commercial catalogs. However, state-of-the-art music similarity measures are too slow to apply to large databases, as they are based on finding nearest neighbors among very high-dimensional or non-vector song representations that are difficult to index. In this work, we adopt recent machine learning methods to map such song representations to binary codes. A linear scan over the codes quickly finds a small set of likely neighbors for a query, which is then refined with the original expensive similarity measure. Although search costs grow linearly with the collection size, we show that for commercial-scale databases and two state-of-the-art similarity measures, this approach outperforms five previous attempts at approximate nearest neighbor search. When required to return 90% of true nearest neighbors, our method is expected to answer 4.2 1-NN queries or 1.3 50-NN queries per second on a collection of 30 million songs using a single CPU core, a speedup of up to 260-fold over a full scan of 90% of the database.
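The filter-and-refine search described above can be illustrated with a minimal sketch. It is not the method evaluated in this work: the encoder here uses random hyperplane signs as a stand-in for a learned mapping, Euclidean distance stands in for the expensive similarity measure, and the names `sign_code`, `filter_and_refine`, and `n_candidates` are hypothetical, chosen only for illustration.

```python
import heapq
import random


def hamming(a, b):
    """Number of differing bits between two codes stored as Python ints."""
    return bin(a ^ b).count("1")


def sign_code(vec, planes):
    """Stand-in encoder (assumption): one bit per random hyperplane, not a learned mapping."""
    code = 0
    for plane in planes:
        dot = sum(v * p for v, p in zip(vec, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def euclidean(a, b):
    """Placeholder for the original, expensive similarity measure (assumption)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def filter_and_refine(query_idx, vectors, codes, expensive_dist, k, n_candidates):
    """Return indices of the k best candidates for the query under expensive_dist."""
    q_code = codes[query_idx]
    # Filter: linear scan over all binary codes, keeping the n_candidates closest
    # in Hamming distance. The cost grows linearly with the collection size.
    others = (i for i in range(len(codes)) if i != query_idx)
    candidates = heapq.nsmallest(
        n_candidates, others, key=lambda i: hamming(q_code, codes[i])
    )
    # Refine: re-rank only the surviving candidates with the expensive measure.
    q_vec = vectors[query_idx]
    return heapq.nsmallest(
        k, candidates, key=lambda i: expensive_dist(q_vec, vectors[i])
    )


# Toy data: 500 random "songs" with 16 dimensions, encoded as 32-bit codes.
rng = random.Random(0)
dim, n_songs, n_bits = 16, 500, 32
vectors = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_songs)]
planes = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]
codes = [sign_code(v, planes) for v in vectors]

result = filter_and_refine(0, vectors, codes, euclidean, k=5, n_candidates=50)
```

In this sketch, `n_candidates` controls the trade-off between speed and recall: a larger candidate set recovers more true nearest neighbors but requires more evaluations of the expensive measure.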
