Social Propagation: Boosting Social Annotations for Web Mining

This paper addresses the problem of boosting social annotations through propagation, which we call social propagation. In particular, we focus on propagating social annotations of web pages (e.g., annotations in Del.icio.us). Social annotations are a novel resource and are valuable in many web applications, including web search and browsing. Although they are growing fast, social annotations of web pages cover only a small proportion (<0.1%) of the World Wide Web. To alleviate this low coverage, we propose a general propagation model based on the Random Surfer model. The model comprises four steps: basic propagation, multiple-annotation propagation, multiple-link-type propagation, and constraint-guided propagation. The model is evaluated on a dataset of 40,422 web pages randomly sampled from the 100 most popular English sites and ten famous academic sites. Each page’s annotations are obtained by querying the history interface of Del.icio.us. Experimental results show that the proposed model is very effective in increasing the coverage of annotations while still preserving the novel properties of social annotations. Applications of the propagated annotations to web search and classification further verify the effectiveness of the model.
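The abstract does not give the model's equations, so the following is only an illustrative sketch of what a random-surfer-style basic propagation step could look like, not the authors' formulation. It assumes a surfer who, at each step, stays on the current page with probability `alpha` and otherwise follows one of the page's out-links uniformly at random, carrying the page's tag weights along. The function name `propagate`, the parameter `alpha`, the step count, and the toy link graph are all hypothetical.

```python
from collections import defaultdict


def propagate(links, seed_tags, alpha=0.5, steps=1):
    """Spread tag weights along hyperlinks (illustrative sketch only).

    links:     dict mapping a page to the list of pages it links to.
    seed_tags: dict mapping an annotated page to {tag: weight}.
    alpha:     assumed probability that the surfer stays on the page.
    steps:     assumed number of propagation steps.
    """
    current = {page: dict(tags) for page, tags in seed_tags.items()}
    for _ in range(steps):
        nxt = defaultdict(lambda: defaultdict(float))
        for page, tags in current.items():
            targets = links.get(page, [])
            # A page without out-links keeps all of its weight.
            stay = alpha if targets else 1.0
            for tag, weight in tags.items():
                nxt[page][tag] += stay * weight
                for target in targets:
                    # The remaining weight is split evenly over out-links.
                    nxt[target][tag] += (1.0 - alpha) * weight / len(targets)
        current = {page: dict(tags) for page, tags in nxt.items()}
    return current


# Toy graph: only page "a" is annotated; "b" and "c" are not.
links = {"a": ["b", "c"], "b": [], "c": []}
seed = {"a": {"python": 1.0}}
result = propagate(links, seed, alpha=0.5, steps=1)
print(result)
```

In this toy run, the unannotated pages "b" and "c" each receive a share of the tag weight from "a", which is the coverage-increasing effect the abstract describes. The other three steps (multiple annotations, multiple link types, constraints) are not modeled here.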
