Computational Models of Music Similarity and their Application in Music Information Retrieval

This thesis aims to develop techniques that support users in accessing and discovering music. The main part consists of two chapters. Chapter 2 gives an introduction to computational models of music similarity. The combination of different approaches is optimized, and the largest evaluation of music similarity measures published to date is presented. The best combination performs significantly better than the baseline approach in most of the evaluation categories. A particular effort is made to avoid overfitting. To cross-check the results of the evaluation based on genre classification, a listening test is conducted. The test confirms that genre-based evaluations are suitable for efficiently evaluating large parameter spaces. Chapter 2 ends with recommendations on the use of similarity measures. Chapter 3 describes three applications of such similarity measures. The first application demonstrates how music collections can be organized and visualized so that users can control the aspect of similarity they are interested in. The second application demonstrates how music collections can be organized hierarchically into overlapping groups at the artist level. These groups are summarized using words from web pages associated with the respective artists. The third application demonstrates how playlists can be generated with minimal user input.
