Content-Based Music Information Retrieval: Current Directions and Future Challenges

The steep rise in music downloading over CD sales has created a major shift in the music industry away from physical media formats and towards online products and services. Music is one of the most popular types of online information and there are now hundreds of music streaming and download services operating on the World-Wide Web. Some of the music collections available are approaching the scale of ten million tracks and this has posed a major challenge for searching, retrieving, and organizing music content. Research efforts in music information retrieval have involved experts from music perception, cognition, musicology, engineering, and computer science engaged in truly interdisciplinary activity that has resulted in many proposed algorithmic and methodological solutions to music search using content-based methods. This paper outlines the problems of content-based music information retrieval and explores the state-of-the-art methods using audio cues (e.g., query by humming, audio fingerprinting, content-based music retrieval) and other cues (e.g., music notation and symbolic representation), and identifies some of the major challenges for the coming years.

Marc Leman, et al. Content-Based Music Information Retrieval: Current Directions and Future Challenges. Proceedings of the IEEE, 2008.
