Efficient Cover Song Identification using approximate nearest neighbors

Automatically detecting cover songs requires robustness to several kinds of musical modulations. Timbral variance can be accounted for at the feature level, but key and, most importantly, tempo variations have to be dealt with at the retrieval stage. For that purpose, most state-of-the-art approaches rely on exhaustive search with song-to-song matching methods, which do not scale. In this paper, we introduce a hybrid technique. It first retrieves the approximate nearest neighbors of each query chroma descriptor. In a second stage, temporal consistency is exploited to filter out spurious matches, thereby discarding irrelevant songs. Our method performs a search in a dataset comprising 80 songs in about 1 s, while achieving satisfactory accuracy compared to the best-performing state-of-the-art techniques.
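The two-stage idea can be illustrated with a minimal Python sketch. It is not the authors' implementation, and everything in it is an assumption made for illustration: the function names (`rotate`, `nearest_neighbors`, `rank_songs`) are hypothetical, an exact brute-force search stands in for the approximate nearest-neighbor index, key changes are handled by trying all 12 circular shifts of the query chroma vector, and temporal consistency is approximated by voting on a constant time offset between query and database frames. The constant-offset vote in particular ignores tempo changes, which the abstract identifies as the most important variation.

```python
from collections import Counter, defaultdict


def rotate(chroma, shift):
    """Circularly shift a chroma vector by `shift` bins (semitones)."""
    chroma = list(chroma)
    shift %= len(chroma)
    return chroma[-shift:] + chroma[:-shift] if shift else chroma


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbors(vector, database, k):
    """Stage 1 (assumption): exact k-NN by brute force.

    A real system would replace this with an approximate index.
    """
    order = sorted(range(len(database)),
                   key=lambda i: sq_dist(vector, database[i]))
    return order[:k]


def rank_songs(query, database, song_ids, frame_times, k=3):
    """Stage 2 (assumption): keep only temporally consistent matches.

    Each match votes for (song, key shift, time offset). A song's score
    is the size of its largest group of matches sharing one offset.
    """
    votes = defaultdict(Counter)
    for shift in range(12):  # try every key transposition of the query
        for t_query, frame in enumerate(query):
            shifted = rotate(frame, shift)
            for idx in nearest_neighbors(shifted, database, k):
                offset = frame_times[idx] - t_query
                votes[(song_ids[idx], shift)][offset] += 1

    scores = Counter()
    for (song, _shift), histogram in votes.items():
        scores[song] = max(scores[song], max(histogram.values()))
    return scores.most_common()  # best-scoring songs first
```

Here `database` is a flat list of chroma vectors from all songs, while `song_ids[i]` and `frame_times[i]` give the song and frame position of `database[i]`. Isolated matches scatter across many offsets and contribute little to any single bin, so only songs with a coherent run of matches score highly.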