Audio-Based Cover Song Retrieval Using Approximate Chord Sequences: Testing Shifts, Gaps, Swaps and Beats

This paper presents a variation on the theme of using string alignment for music information retrieval (MIR), in the context of cover song identification in audio collections. Here, the strings are derived from audio by means of HMM-based chord estimation. The characteristics of the cover-song identification problem and the nature of common chord estimation errors are carefully considered. As a result, strategies are proposed and systematically evaluated for key shifting, for the cost of gap insertions and character swaps in string alignment, and for the use of a beat-synchronous feature set. Results support the view that string alignment, as a mechanism for audio-based retrieval, cannot be oblivious to the problems of robustly estimating musically meaningful data from audio.
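To make the ingredients named above concrete, the following is a minimal sketch of aligning two chord strings under key shifts, gap costs and swap costs. It is an illustration under stated assumptions, not the method evaluated in the paper: the chord encoding, the use of Smith-Waterman-style local alignment, the function names and all score values are hypothetical choices that the abstract does not specify.

```python
from typing import List, Tuple

# Assumed encoding: a chord is (root pitch class 0-11, quality), e.g. (0, "maj") for C major.
Chord = Tuple[int, str]


def transpose(seq: List[Chord], shift: int) -> List[Chord]:
    """Shift every chord root by `shift` semitones (key shifting)."""
    return [((root + shift) % 12, quality) for root, quality in seq]


def swap_score(a: Chord, b: Chord, match: float = 2.0, mismatch: float = -1.0) -> float:
    """Score for aligning chord a with chord b. The values are illustrative only;
    a swap cost informed by typical chord estimation errors could replace them."""
    return match if a == b else mismatch


def local_alignment_score(x: List[Chord], y: List[Chord], gap: float = -1.0) -> float:
    """Best local alignment score between x and y (Smith-Waterman recurrence,
    linear gap cost), computed one row at a time."""
    prev = [0.0] * (len(y) + 1)
    best = 0.0
    for a in x:
        curr = [0.0]
        for j, b in enumerate(y, start=1):
            score = max(
                0.0,                            # start a new local alignment
                prev[j - 1] + swap_score(a, b),  # align a with b (match or swap)
                prev[j] + gap,                   # gap in y
                curr[j - 1] + gap,               # gap in x
            )
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


def best_shifted_score(query: List[Chord], reference: List[Chord]) -> Tuple[float, int]:
    """Try all 12 transpositions of the query; return (best score, shift in semitones)."""
    return max(
        (local_alignment_score(transpose(query, s), reference), s) for s in range(12)
    )


if __name__ == "__main__":
    query = [(0, "maj"), (5, "maj"), (7, "maj"), (0, "maj")]  # C F G C
    cover = transpose(query, 2)                               # same progression in D
    print(best_shifted_score(query, cover))
```

In this sketch the swap score treats all mismatches alike; replacing `swap_score` with a table that penalizes musically close confusions less heavily is one way the alignment could account for the chord estimation errors the abstract refers to.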
