Using Regression to Combine Data Sources for Semantic Music Discovery

Automatically annotating songs with descriptive labels can draw on several types of input information, including keyword appearances in web documents, acoustic features of the song’s audio content, and similarity to other tagged songs. Given these individual data sources, we explore how best to aggregate them. We find that fixed-combination approaches such as sum and max perform well, but that trained linear regression models perform better. Retrieval performance improves as more data sources are added. When many training songs are available, however, Bayesian hierarchical models that aim to share information across the individual tag regressions offer no advantage.
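To make the comparison concrete, the following is a minimal sketch of the three combination strategies for a single tag, not the authors' implementation. It assumes each data source has already produced one relevance score per song. The function names, the synthetic scores, and the use of ordinary least squares via NumPy are all illustrative assumptions.

```python
import numpy as np

def combine_sum(scores):
    """Fixed combiner: add the per-source scores for each song."""
    return scores.sum(axis=1)

def combine_max(scores):
    """Fixed combiner: take the highest per-source score for each song."""
    return scores.max(axis=1)

def fit_linear_combiner(scores, labels):
    """Trained combiner: least-squares weights (intercept first) for one tag."""
    design = np.column_stack([np.ones(len(scores)), scores])
    weights, *_ = np.linalg.lstsq(design, labels, rcond=None)
    return weights

def combine_regression(scores, weights):
    """Apply the fitted intercept and per-source weights to new songs."""
    return weights[0] + scores @ weights[1:]

# Synthetic toy data (an assumption for illustration, not real results):
# 200 songs, one tag, three hypothetical sources of differing reliability.
rng = np.random.default_rng(0)
n_songs = 200
labels = (rng.random(n_songs) < 0.3).astype(float)
noise_levels = np.array([0.5, 1.0, 2.0])
scores = labels[:, None] + rng.normal(size=(n_songs, 3)) * noise_levels

# Fit on the first 150 songs, then rank the remaining 50 for retrieval.
train, test = slice(0, 150), slice(150, None)
weights = fit_linear_combiner(scores[train], labels[train])
ranking = np.argsort(-combine_regression(scores[test], weights))
```

The fixed combiners treat every source equally, whereas the regression can learn to down-weight a noisy source. Fitting one such regression per tag corresponds to the independent tag regressions that the hierarchical models attempt to tie together.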
