On the Use of Anti-Word Models for Audio Music Annotation and Retrieval

Query-by-semantic-description (QBSD) is a natural way to search and annotate music in a large database. To improve QBSD, we propose modeling a set of anti-words for each annotation word, building on the supervised multiclass labeling (SML) framework. More specifically, the words that are highly associated with the opposite semantic meaning of a given word constitute its anti-word set. By modeling both a word and its anti-word set, our annotation system achieves an equal mean per-word precision and recall of 31.1%, compared with 27.8% for the original SML model. Moreover, explicitly constructing the anti-word models also significantly improves the performance of the retrieval system, especially when the query keyword is the antonym of an existing annotation word.
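The abstract does not say how "highly associated with the opposite semantic meaning" is measured. As a purely illustrative sketch, one could assume that anti-words are the vocabulary words whose binary song-level labels are most negatively correlated with those of the target word. The function names, the Pearson-correlation criterion, the parameter k, and the toy labels below are all assumptions made for illustration, not the method described in the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def anti_word_sets(labels, vocab, k=1):
    """Hypothetical selection rule: for each word, keep up to k words whose
    labels are most negatively correlated with it across songs."""
    columns = {w: [row[j] for row in labels] for j, w in enumerate(vocab)}
    result = {}
    for w in vocab:
        scored = [(pearson(columns[w], columns[v]), v) for v in vocab if v != w]
        # Keep only strictly negative correlations, most negative first.
        negative = sorted((c, v) for c, v in scored if c < 0)
        result[w] = [v for _, v in negative[:k]]
    return result


# Toy annotation matrix (rows = songs, columns = words); invented data.
vocab = ["happy", "sad", "guitar"]
labels = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 1],
    [0, 1, 0],
]
anti = anti_word_sets(labels, vocab, k=1)
```

In this toy example "happy" and "sad" never co-occur, so each ends up in the other's anti-word set, while "guitar" is uncorrelated with both and receives an empty set. How the word model and the anti-word model are then combined at annotation time is not stated in the abstract and is left out of the sketch.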
