A Mid-Level Representation for Melody-Based Retrieval in Audio Collections

Searching audio collections by high-level musical descriptors is difficult because reliable methods for extracting melody, harmony, rhythm, and similar descriptors from unstructured audio signals are lacking. In this paper, we present a novel approach to melody-based retrieval in audio collections. Our approach supports both audio and symbolic queries and ranks results by their melodic similarity to the query. We introduce a beat-synchronous melodic representation consisting of salient melodic lines extracted from the analyzed audio signal. We propose using a 2D shift-invariant transform to extract shift-invariant melodic fragments from this representation and demonstrate how such fragments can be indexed and stored in a song database. An efficient search algorithm based on locality-sensitive hashing performs retrieval according to the similarity of melodic fragments. On the cover song detection task, the system achieves good results for both audio and symbolic queries, and its fast retrieval performance makes it suitable for large databases.
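The idea of hashing shift-invariant fragments can be illustrated with a minimal sketch. It is not the transform or index described in the paper: as a stand-in, it assumes the magnitude of a 2D discrete Fourier transform (invariant only to circular shifts) as the shift-invariant feature, and random-hyperplane signatures as the locality-sensitive hash. The fragment layout (rows as pitch bins, columns as beats) and all function names are hypothetical.

```python
import cmath
import random


def dft2_magnitude(patch):
    """Flattened magnitude of the 2D DFT of a small pitch-by-beat patch.

    Stand-in feature (an assumption, not the paper's transform): the
    magnitude is unchanged when the patch is circularly shifted along
    either axis, i.e. transposed in pitch or offset in time.
    """
    rows, cols = len(patch), len(patch[0])
    mags = []
    for u in range(rows):
        for v in range(cols):
            acc = 0j
            for r in range(rows):
                for c in range(cols):
                    angle = -2 * cmath.pi * (u * r / rows + v * c / cols)
                    acc += patch[r][c] * cmath.exp(1j * angle)
            mags.append(abs(acc))
    return mags


def circular_shift(patch, dr, dc):
    """Return the patch shifted by dr rows and dc columns, wrapping around."""
    rows, cols = len(patch), len(patch[0])
    return [[patch[(r - dr) % rows][(c - dc) % cols] for c in range(cols)]
            for r in range(rows)]


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes of dimension dim."""
    rng = random.Random(seed)
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(vector, hyperplanes):
    """Bit signature: one bit per hyperplane, set by the side the vector falls on."""
    return tuple(int(sum(w * x for w, x in zip(plane, vector)) >= 0)
                 for plane in hyperplanes)


if __name__ == "__main__":
    # Hypothetical 4 (pitch) x 8 (beat) binary melodic fragment.
    fragment = [[0, 0, 1, 0, 0, 0, 0, 0],
                [0, 1, 0, 0, 0, 0, 1, 0],
                [1, 0, 0, 1, 0, 0, 0, 0],
                [0, 0, 0, 0, 1, 0, 0, 0]]
    shifted = circular_shift(fragment, 1, 3)
    planes = make_hyperplanes(dim=32, n_bits=16, seed=1)
    key_a = lsh_key(dft2_magnitude(fragment), planes)
    key_b = lsh_key(dft2_magnitude(shifted), planes)
    print(key_a == key_b)
```

In a database built along these lines, each signature would serve as a bucket key mapping to the songs that contain a fragment with that signature, so a query fragment only needs to be compared against the songs in its own bucket. Real melodic fragments shift without wrapping around, so a practical system would need a transform that handles non-circular shifts.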
