Paper Information - Learning to Tag using Noisy Labels

To organize and retrieve the ever-growing collection of multimedia objects on the Web, many algorithms have been developed to automatically tag images, music and videos. One source of labeled data for training these algorithms is the set of tags collected from the Web, via collaborative tagging websites (e.g., Flickr, Last.FM and YouTube) or crowdsourcing applications (e.g., human computation games and Mechanical Turk). A common approach is to use the tags directly as labels and train the algorithms in a supervised way. This approach is problematic: synonyms and misspellings among the tags create an overly fragmented label space with a huge number of classes, many of which are sparse and semantically equivalent to one another. In this work, we investigate a method for training tagging algorithms using a reduced set of labels corresponding to topics derived from the tags. We show that our proposed method is comparable, in terms of annotation and retrieval performance, to the method of using tags directly as labels, while being more efficient to train (as there are fewer classes) and less wasteful (as it eliminates the need to discard tags that are associated with too few examples). We demonstrate our results using a dataset collected by a human computation game called TagATune.
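To make the contrast in the abstract concrete, the following toy sketch compares the two labeling schemes on a handful of invented clips. It is an illustration only, not the paper's method: the clip names, the tags, the `min_count` threshold, and the function names are all made up, and the hard `tag_to_topic` mapping is a hypothetical stand-in for whatever a topic model fit on the tags would produce (a real topic model would typically give soft, probabilistic assignments).

```python
from collections import Counter

# Invented toy data: free-form tags with synonyms and a misspelling.
clip_tags = {
    "clip1": ["female vocals", "woman singing", "quiet"],
    "clip2": ["femal vocal", "soft"],
    "clip3": ["guitar", "acoustic guitar", "calm"],
    "clip4": ["guitars", "quiet"],
}

# Hypothetical hard tag-to-topic assignment (an assumption for illustration;
# in practice this would be derived from the tags by a topic model).
tag_to_topic = {
    "female vocals": "vocal", "woman singing": "vocal", "femal vocal": "vocal",
    "guitar": "guitar", "acoustic guitar": "guitar", "guitars": "guitar",
    "quiet": "mellow", "soft": "mellow", "calm": "mellow",
}


def direct_tag_labels(clip_tags, min_count):
    """Use tags as labels, discarding tags seen on fewer than min_count clips."""
    counts = Counter(t for tags in clip_tags.values() for t in set(tags))
    kept = {t for t, c in counts.items() if c >= min_count}
    labels = {clip: sorted(set(tags) & kept) for clip, tags in clip_tags.items()}
    return labels, kept


def topic_labels(clip_tags, tag_to_topic):
    """Replace every tag by its topic, so no tag has to be discarded."""
    return {
        clip: sorted({tag_to_topic[t] for t in tags})
        for clip, tags in clip_tags.items()
    }


tag_based, kept_tags = direct_tag_labels(clip_tags, min_count=2)
topic_based = topic_labels(clip_tags, tag_to_topic)

all_tags = {t for tags in clip_tags.values() for t in tags}
print(len(all_tags), "distinct tags,", len(kept_tags), "survive the threshold")
print(len(set(tag_to_topic.values())), "topic classes")
print(tag_based)
print(topic_based)
```

In this toy data, only one of the nine distinct tags survives the sparsity threshold, leaving two clips with no label at all, whereas the three topic classes cover every clip. This mirrors the "fewer classes" and "less wasteful" points of the abstract, though it says nothing about the accuracy comparison reported there.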
