The Scientific Evaluation of Music Information Retrieval Systems: Foundations and Future

Computer Music Journal, 28:2, pp. 12–23, Summer 2004. © 2004 Massachusetts Institute of Technology.

Music Information Retrieval (MIR) is a multidisciplinary research endeavor that strives to develop innovative content-based searching schemes, novel interfaces, and evolving networked delivery mechanisms in an effort to make the world’s vast store of music accessible to all. Some teams are developing “Query-by-Singing” and “Query-by-Humming” systems that allow users to interact with their respective music search engines via queries that are sung or hummed into a microphone (e.g., Birmingham et al. 2001; Haus and Pollastri 2001). “Query-by-Note” systems are also being developed wherein searchers construct queries consisting of pitch and/or rhythm information (e.g., Pickens 2000; Doraisamy and Rüger 2002). Input methods for Query-by-Note systems include symbolic interfaces as well as both physical (MIDI) and virtual (Java-based) keyboards. Some teams are working on “Query-by-Example” systems that take prerecorded music in the form of CD or MP3 tracks as their query input (e.g., Haitsma and Kalker 2002; Harb and Chen 2003). The development of comprehensive music recommendation and distribution systems is a growing research area (e.g., Logan 2002; Pauws and Eggen 2002). The automatic generation of playlists for use in personal music systems, based on a wide variety of user-defined criteria, is the goal of this branch of MIR research. Other groups are investigating the creation of music analysis systems to assist those in the musicology and music theory communities (e.g., Barthélemy and Bonardi 2001; Kornstädt 2001). Overviews of MIR’s interdisciplinary research areas can be found in Downie (2003), Byrd and Crawford (2002), and Futrelle and Downie (2002).

This article begins with an overview of the current scientific problem facing MIR research.
Entitled “Current Scientific Problem,” the opening section also provides a brief explication of the Text Retrieval Conference (TREC) evaluation paradigm that has come to play an important role in the community’s thinking about the testing and evaluation of MIR systems. The sections that follow, entitled “Data Collection Method” and “Emergent Themes and Commentary,” report on the findings of the Music Information Retrieval (MIR)/Music Digital Library (MDL) Evaluation Frameworks Project, with issues surrounding the creation of a TREC-like evaluation paradigm for MIR as the central focus. “Building a TREC-Like Test Collection” follows and highlights the progress being made toward establishing the necessary test collections. The “Summary and Future Research” section concludes the article and highlights some of the key challenges uncovered that require further investigation.

[1]  Jonathan Foote, et al. Content-based retrieval of music and audio, 1997, Other Conferences.

[2]  J. Crisp, et al. The Delphi method?, 1997, Nursing research.

[3]  Beth Logan, et al. Content-Based Playlist Generation: Exploratory Experiments, 2002, ISMIR.

[4]  Masataka Goto, et al. RWC Music Database: Popular, Classical and Jazz Music Databases, 2002, ISMIR.

[5]  Michael J. Hanson. The Reference Interview, 2005.

[6]  E. Voorhees. Whither Music IR Evaluation Infrastructure: Lessons to be Learned from TREC, 2002.

[7]  Gael Richard. Towards large databases for Music Information Retrieval systems development and evaluation, 2002.

[8]  J. Stephen Downie, et al. Evaluating a simple approach to music information retrieval: conceiving melodic n-grams as text, 1999.

[9]  J. Zobel, et al. Matching Techniques for Large Music Databases, 1999.

[10]  H. A. Linstone, et al. The Delphi Method: Techniques and Applications, 1976.

[11]  Berry Eggen, et al. Realization and User Evaluation of an Automatic Playlist Generator, 2003, ISMIR.

[12]  Daniel P. W. Ellis, et al. Toward Evaluation Techniques for Music Similarity, 2003, SIGIR 2003.

[13]  Ton Kalker, et al. A Highly Robust Audio Fingerprinting System, 2002, ISMIR.

[14]  Keiichiro Hoashi, et al. Comparison of User Ratings of Music in Copyright-free Databases and On-the-market CDs, 2003.

[15]  Andreas Kornstädt, et al. The JRing System for Computer-Assisted Musicological Analysis, 2001, ISMIR.

[16]  Liming Chen, et al. A Query by Example Music Retrieval Algorithm, 2003.

[17]  Perfecto Herrera-Boyer. Setting Up an Audio Database for Music Information Retrieval Benchmarking, 2002.

[18]  Patricia Dewdney, et al. Asking "Why" Questions in the Reference Interview: A Theoretical Justification, 1997, The Library Quarterly.

[19]  Donald Byrd, et al. Problems of music information retrieval in the real world, 2002, Inf. Process. Manag.

[20]  Joe Futrelle. Three Criteria for the Evaluation of Music Information Retrieval Techniques Against Collections of Musical Material, 2002.

[21]  J. S. Downie. The MIR/MDL Evaluation Project White Paper Collection, 2002.

[22]  Eric J. Isaacson. Music IR for Music Theory, 2002.

[23]  Emanuele Pollastri. An Audio Front End for Query-by-Humming Systems, 2001, ISMIR.

[24]  J. Reiss, et al. Beyond Recall and Precision: A Full Framework for MIR System Evaluation, 2002.

[25]  Jérôme Barthélemy, et al. Figured Bass and Tonality Recognition, 2001, ISMIR.

[26]  Geraldine B. King, et al. The Reference Interview, 1972.

[27]  Alan F. Smeaton, et al. Evaluating a Music Information Retrieval System - TREC Style, 2002.

[28]  Abby A. Goodrum, et al. If It Sounds As Good As It Looks: Lessons Learned From Video Retrieval Evaluation, 2003.

[29]  Jeremy Pickens. A Comparison of Language Modeling and Probabilistic Text Information Retrieval Approaches to Monophonic Music Retrieval, 2000, ISMIR.

[30]  Richard P. Smiraglia. Musical Works as Information Retrieval Entities: Epistemological Perspectives, 2001, ISMIR.

[31]  J. Stephen Downie, et al. Report on the panels and workshops of the music information retrieval (MIR) and music digital library (MDL) evaluation frameworks project, 2003, SIGF.

[32]  William P. Birmingham, et al. MUSART: Music Retrieval Via Aural Queries, 2001, ISMIR.

[33]  Nicholas J. Belkin, et al. Categories of Music Description and Search Terms and Phrases Used by Non-Music Experts, 2002, ISMIR.

[34]  Justin Zobel, et al. Manipulation of music for melody matching, 1998, MULTIMEDIA '98.

[35]  Cyril W. Cleverdon, et al. Factors determining the performance of indexing systems, 1966.

[36]  E. Rasmussen. Evaluation in Information Retrieval, 2002.

[37]  Linda C. Smith, et al. Reference and information services: an introduction, 1995.

[38]  Jeremy Pickens, et al. Tracks and Topics: Ideas for Structuring Music Retrieval Test Collections and Avoiding Balkanization, 2003.

[39]  Shyamala Doraisamy, et al. Emphasizing the Need for TREC-like Collaboration Towards MIR Evaluation, 2003.

[40]  J. Stephen Downie, et al. Music information retrieval, 2005, Annu. Rev. Inf. Sci. Technol.

[41]  D. Bainbridge. Towards a Workbench for Symbolic Music Information Retrieval, 2002.

[42]  Stefan M. Rüger, et al. A Comparative and Fault-tolerance Study of the Use of N-grams with Polyphonic Music, 2002, ISMIR.

[43]  William P. Birmingham, et al. Query by Humming: How good can it get?, 2003, SIGIR 2003.

[44]  Mari Itoh. Subject Search for Music: Quantitative Analysis of Access Point Selection, 2000, ISMIR.

[45]  J. Stephen Downie, et al. Interdisciplinary Communities and Research Issues in Music Information Retrieval, 2002, ISMIR.

[46]  J. Reiss, et al. Benchmarking Music Information Retrieval Systems, 2002.

[47]  Josh Reiss. MIR Benchmarking: Lessons Learned from the Multimedia Community, 2002.

[48]  J. Stephen Downie, et al. Toward a Theory of Music Information Retrieval Queries: System Design Implications, 2002, ISMIR.

[49]  Bill Schottstaedt, et al. Common music notation, 1997.

[50]  Nicola Orio, et al. A Task-Oriented Approach for the Development of a Test Collection for Music Information Retrieval, 2002.

[51]  Linda Schamber. Relevance and Information Behavior, 1994.

[52]  Sally Jo Cunningham. User Studies: A First Step in Designing an MIR Testbed, 2002.

[53]  Enric Guaus, et al. Open Position Multilingual Orchestra Conductor. Lifetime Opportunity, 2003, SIGIR 2003.

[54]  Jon W. Dunn, et al. Indiana university digital music library project, 2001, JCDL '01.

[55]  William P. Birmingham, et al. Comparing Aural Music-Information Retrieval Systems, 2002.