A tag-level factor graph model for semantic music discovery

This paper proposes a semantic music discovery system based on a tag-level factor graph (TFG) model that combines tag probabilities and content similarities in a unified framework. Content similarities are computed from extracted pitch features, while tag probabilities are obtained from our previous auto-tagging system. The TFG model consists of a set of node and edge feature functions, which define how tag probabilities and content similarities affect the representative degrees of songs. A representative degree indicates the extent to which a song is representative of the query tag. The loopy max-product inference algorithm is applied to obtain the representative degrees that maximize the joint probability distribution of the TFG model. Experimental results show that the TFG model improves precision by 5.6% at the top 3 retrieved songs and by 3.5% at both the top 5 and top 10.
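
The inference step described above can be illustrated with a minimal sketch of loopy max-product on a pairwise model. The abstract does not give the form of the feature functions, so everything specific here is an assumption made for illustration: each song carries a binary "representative" variable, the node potential is taken directly from the tag probability, the edge potential rewards similar songs for taking the same state, and the names `loopy_max_product`, `weight`, and `n_iters` are hypothetical. It is not the paper's TFG model.

```python
import numpy as np

def loopy_max_product(tag_prob, similarity, weight=1.0, n_iters=50):
    """Toy loopy max-product over binary 'representative' variables.

    Assumed (not from the paper):
      node potential  phi_i(y) = [1 - p_i, p_i]
      edge potential  psi_ij(y_i, y_j) = exp(weight * s_ij) if y_i == y_j else 1
    Returns the normalized max-marginal of y_i = 1 for each song.
    """
    tag_prob = np.asarray(tag_prob, dtype=float)
    similarity = np.asarray(similarity, dtype=float)
    n = len(tag_prob)
    node = np.stack([1.0 - tag_prob, tag_prob], axis=1)  # shape (n, 2)

    # Directed edges between songs with positive similarity.
    edges = [(i, j) for i in range(n) for j in range(n)
             if i != j and similarity[i, j] > 0]
    msgs = {e: np.full(2, 0.5) for e in edges}

    for _ in range(n_iters):
        new_msgs = {}
        for (i, j) in edges:
            agree = np.exp(weight * similarity[i, j])
            pair = np.array([[agree, 1.0], [1.0, agree]])  # pair[y_i, y_j]
            # Node potential times messages into i from all neighbors except j.
            incoming = node[i].copy()
            for (k, t) in edges:
                if t == i and k != j:
                    incoming *= msgs[(k, i)]
            # Maximize over y_i for each value of y_j, then normalize.
            m = (incoming[:, None] * pair).max(axis=0)
            new_msgs[(i, j)] = m / m.sum()
        msgs = new_msgs

    # Beliefs: node potential times all incoming messages.
    belief = node.copy()
    for (k, t) in edges:
        belief[t] *= msgs[(k, t)]
    belief /= belief.sum(axis=1, keepdims=True)
    return belief[:, 1]

# Three songs: song 0 is strongly tagged, song 1 is ambiguous but similar
# to song 0, and song 2 is weakly tagged and unconnected.
tag_prob = [0.9, 0.5, 0.1]
similarity = [[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 0.0]]
degrees = loopy_max_product(tag_prob, similarity)
ranking = np.argsort(-degrees)  # songs ordered by representative degree
```

In this toy setup the ambiguous song is pulled toward its strongly tagged neighbor, which is the kind of interaction between tag probability and content similarity that the abstract attributes to the node and edge feature functions.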
