Learning Latent Semantic Models for Music from Social Tags

Abstract: In this paper we describe how to build a variety of information retrieval models for music collections based on social tags. We discuss the particular nature of social tags for music and apply latent semantic dimension reduction methods to co-occurrence counts of words in the tags given to individual tracks. We evaluate the performance of various latent semantic models against both previous work and a simple full-rank vector space model based on tags. We investigate the extent to which our low-dimensional semantic spaces respect traditional catalogue organization by artist and genre and how well they generalize to unseen tracks, and we illustrate some of the concepts expressed by the learned dimensions.
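To make the baseline mentioned above concrete, the following is a minimal sketch of a full-rank vector space model built from tag words. It is not the implementation evaluated in the paper: the toy track data, the function names, the use of raw word counts without any weighting, and the choice of cosine similarity for ranking are all assumptions made for illustration. The latent semantic models would replace these full-rank count vectors with low-dimensional projections, which this sketch does not attempt.

```python
import math
from collections import Counter

# Hypothetical toy data: track id -> social tags (free-text strings).
TRACK_TAGS = {
    "track_a": ["indie rock", "rock", "guitar"],
    "track_b": ["classic rock", "guitar solo"],
    "track_c": ["ambient", "chill electronic"],
}


def word_counts(tags):
    """Split each tag into lowercase words and count them."""
    return Counter(word for tag in tags for word in tag.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[w] * v[w] for w in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_tracks(query, vectors):
    """Rank tracks by similarity between the query words and each track's tag words."""
    q = word_counts([query])
    scores = {track: cosine(q, vec) for track, vec in vectors.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# One sparse word-count vector per track (assumed unweighted).
VECTORS = {track: word_counts(tags) for track, tags in TRACK_TAGS.items()}
RANKING = rank_tracks("rock guitar", VECTORS)

for track, score in RANKING:
    print(f"{track}: {score:.3f}")
```

With this toy data, the query "rock guitar" ranks the two guitar tracks ahead of the ambient one, which shares no words with the query and therefore scores zero. A track tagged only with a near-synonym of a query word would also score zero here; that is the kind of gap a dimension-reduced space is meant to narrow.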
