Autotagging music with conditional restricted Boltzmann machines

This paper describes two applications of conditional restricted Boltzmann machines (CRBMs) to the task of autotagging music. The first consists of training a CRBM to predict tags that a user would apply to a clip of a song based on tags already applied by other users. By learning the relationships between tags, this model is able to pre-process training data to significantly improve the performance of a support vector machine (SVM) autotagger. The second is the use of a discriminative RBM, a type of CRBM, to autotag music. By simultaneously exploiting the relationships among tags and between tags and audio-based features, this model is able to significantly outperform SVMs, logistic regression, and multi-layer perceptrons. To be applied to this problem, the discriminative RBM was generalized to the multi-label setting, and four different learning algorithms for it were evaluated, the first such in-depth analysis of which we are aware.

Razvan Pascanu | Yoshua Bengio | Michael I. Mandel | Hugo Larochelle
