Efficient and effective prediction of social tags to enhance web search

As the web has grown into an integral part of daily life, social annotation has become a popular way for web users to manage resources. This form of management has many potential applications, but its applicability is limited by the cold-start problem, especially for new resources on the web. In this article, we comprehensively study automatic tag prediction for web pages and use the predicted tags to improve search performance. First, we explore how tag usage stabilizes over time in a social bookmarking system. Then, we propose a two-stage tag prediction approach that is efficient and makes effective use of early annotations from users. In the first stage, content-based ranking, candidate tags are selected and ranked to generate an initial tag list. In the second stage, random-walk re-ranking, we adopt a random-walk model that uses tag co-occurrence information to re-rank the initial list. The experimental results show that our algorithm effectively proposes appropriate tags for target web pages. In addition, we present a framework for incorporating tag prediction into general web search. The web search experiments support the hypothesis that the proposed framework significantly improves on a typical retrieval model. © 2011 Wiley Periodicals, Inc.
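
The second stage described above can be illustrated with a minimal sketch. The abstract does not give the model's formulas, so the code below is a generic random walk with restart over a tag co-occurrence graph, not the authors' exact method. The function names, the `alpha` parameter, the iteration count, and the toy data are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tagged_pages):
    """Count how often each pair of tags annotates the same page."""
    counts = {}
    for tags in tagged_pages:
        for a, b in combinations(sorted(set(tags)), 2):
            counts.setdefault(a, Counter())[b] += 1
            counts.setdefault(b, Counter())[a] += 1
    return counts


def random_walk_rerank(initial_scores, counts, alpha=0.5, iterations=50):
    """Re-rank candidate tags with a random walk with restart (assumed model).

    initial_scores: tag -> non-negative stage-one score (at least one > 0)
    counts:         co-occurrence counts as returned by cooccurrence()
    alpha:          probability of following a co-occurrence edge; with
                    probability 1 - alpha the walk restarts from the
                    stage-one distribution (hypothetical parameter)
    """
    tags = list(initial_scores)
    total = sum(initial_scores.values())
    restart = {t: initial_scores[t] / total for t in tags}

    # Row-normalised transition probabilities, restricted to candidate tags.
    transition = {}
    for t in tags:
        row = {u: counts.get(t, {}).get(u, 0) for u in tags if u != t}
        row_sum = sum(row.values())
        if row_sum:
            transition[t] = {u: c / row_sum for u, c in row.items()}
        else:
            # A tag with no co-occurring candidates jumps back to the
            # restart distribution, so no probability mass is lost.
            transition[t] = dict(restart)

    scores = dict(restart)
    for _ in range(iterations):
        scores = {
            t: alpha * sum(scores[u] * transition[u].get(t, 0.0) for u in tags)
            + (1 - alpha) * restart[t]
            for t in tags
        }
    ranked = sorted(tags, key=lambda t: scores[t], reverse=True)
    return ranked, scores


if __name__ == "__main__":
    # Toy data, invented for this example only.
    pages = [
        ["python", "programming"],
        ["python", "programming", "tutorial"],
        ["python", "snake"],
        ["programming", "tutorial"],
    ]
    stage_one = {"snake": 0.4, "python": 0.3, "programming": 0.2, "tutorial": 0.1}
    ranked, scores = random_walk_rerank(stage_one, cooccurrence(pages))
    print(ranked)
```

In this toy run the content-based stage ranks "snake" first, but the walk moves "python" to the top because it co-occurs with every other candidate. Setting `alpha=0` disables the walk and returns the stage-one order unchanged, which is a convenient sanity check.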
