Fusing similarity functions for cover song identification

Cover Song Identification (CSI) refers to the process of identifying an alternative version, performance, rendition, or recording of a previously recorded musical composition by quantitatively and objectively measuring and modeling the musical similarity between them. However, a single similarity function cannot describe the similarity between tracks comprehensively and reliably. In this paper, the Similarity Network Fusion (SNF) technique, which was originally proposed for combining different kernels for predicting drug-target interactions, is adopted to fuse similarities that are computed from the same descriptor with different similarity functions. First, the Harmonic Pitch Class Profile (HPCP) is extracted from each track. Next, the similarity between the HPCP descriptors of every pair of tracks is calculated with the Qmax measure and, separately, with the Dmax measure. Then, two track-by-track similarity networks, one based on Qmax similarity and one based on Dmax similarity, are constructed and fused into a single network by SNF. Finally, the fused similarities obtained from the fused network are used to train a classifier, which can then decide whether two input tracks form a reference/cover pair or a reference/non-cover pair. Experimental results on Covers80 (http://labrosa.ee.columbia.edu/projects/coversongs/covers80/), a subset of SecondHandSongs (SHS) (http://labrosa.ee.columbia.edu/millionsong/secondhand), and the Mixed Collection and Mazurka Cover Collection provided by MIREX (http://www.music-ir.org/mirex/wiki/2016:Audio_Cover_Song_Identification) demonstrate that the proposed scheme performs comparably with, or even better than, state-of-the-art CSI schemes.
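The fusion step can be illustrated with a minimal sketch of two-view SNF in Python with NumPy. It is a simplified illustration, not the authors' implementation. It assumes the Qmax-based and Dmax-based track-by-track similarity matrices have already been computed and are symmetric and non-negative. The function names (full_kernel, local_kernel, snf_two_views) and the parameters k and iterations are hypothetical choices made for this example.

```python
import numpy as np


def full_kernel(W):
    """Normalise W so that each row has 1/2 on the diagonal and
    off-diagonal entries summing to 1/2."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    row_sums = W.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # guard against isolated tracks
    P = W / (2.0 * row_sums)
    np.fill_diagonal(P, 0.5)
    return P


def local_kernel(W, k):
    """Keep only the k most similar other tracks in each row,
    then normalise each row to sum to 1."""
    W = np.asarray(W, dtype=float).copy()
    np.fill_diagonal(W, 0.0)
    n = W.shape[0]
    top = np.argsort(-W, axis=1)[:, :k]  # indices of the k largest entries per row
    rows = np.arange(n)[:, None]
    S = np.zeros_like(W)
    S[rows, top] = W[rows, top]
    sums = S.sum(axis=1, keepdims=True)
    sums[sums == 0] = 1.0
    return S / sums


def snf_two_views(W_a, W_b, k=3, iterations=20):
    """Fuse two similarity matrices by cross-diffusion (simplified SNF)."""
    P_a, P_b = full_kernel(W_a), full_kernel(W_b)
    S_a, S_b = local_kernel(W_a, k), local_kernel(W_b, k)
    for _ in range(iterations):
        # Each view is updated by diffusing the other view's current
        # state through its own local neighbourhood structure.
        P_a_next = S_a @ P_b @ S_a.T
        P_b_next = S_b @ P_a @ S_b.T
        # Re-normalise after every step to keep the values bounded.
        P_a, P_b = full_kernel(P_a_next), full_kernel(P_b_next)
    fused = (P_a + P_b) / 2.0
    return (fused + fused.T) / 2.0  # enforce symmetry
```

Under these assumptions, calling snf_two_views(W_qmax, W_dmax) on the two precomputed matrices returns one fused track-by-track matrix whose entries could serve as the classifier input described above. Each view diffuses through the other's state, so pairs that are close under both measures reinforce each other, while similarity supported by only one measure is weakened.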
