Exploiting social bookmarking services to build clustered user interest profile for personalized search

Search engine users tend to write short queries, generally comprising two or three words. Because these queries are often ambiguous or incomplete, search engines tend to return results whose rankings reflect a community of intent. Moreover, search engines are designed to satisfy the needs of the general populace, not those of a specific searcher. To address these issues, we propose two methods that use Singular Value Decomposition (SVD) to build a Clustered User Interest Profile (CUIP) for each user from the tags that a community of users has assigned to web resources of interest. A CUIP consists of clusters of semantically or syntactically related tags, each cluster identifying a topic of the user's interest. The cluster that matches a given user's query helps disambiguate the user's search needs and assists the search engine in generating a set of personalized search results. A series of experiments was run on two data sets to assess the clustering tendency of the CUIP cluster structure and to evaluate the quality of personalized search. The experimental results indicate that CUIP-based personalized search outperforms the baseline search and performs better than other approaches that build a user profile from social bookmarking services and use it for personalized search.
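To make the idea concrete, the following is a minimal sketch of SVD-based tag clustering and cluster-guided re-ranking. It is not the method proposed in the paper, whose two variants are not specified in this abstract. The toy tag-resource matrix, the rule that assigns each tag to its dominant latent dimension, and the helper names `cluster_tags`, `matching_cluster`, and `rerank` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: rows are tags, columns are bookmarked resources.
tags = ["jaguar", "car", "engine", "recipe", "baking", "dessert"]
tag_resource = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
], dtype=float)


def cluster_tags(matrix, k):
    """Project tags onto the top-k singular directions and assign each tag
    to the latent dimension on which it loads most strongly (an assumed,
    deliberately simple clustering rule)."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    coords = u[:, :k] * s[:k]
    return np.argmax(np.abs(coords), axis=1)


def matching_cluster(query_terms, tags, labels):
    """Return the tags of the cluster that contains the most query terms."""
    votes = {}
    for tag, label in zip(tags, labels):
        if tag in query_terms:
            votes[label] = votes.get(label, 0) + 1
    if not votes:
        return set()
    best = max(votes, key=votes.get)
    return {t for t, l in zip(tags, labels) if l == best}


def rerank(results, cluster):
    """Order results by how many of their tags fall in the matched cluster."""
    return sorted(results, key=lambda r: len(r[1] & cluster), reverse=True)


labels = cluster_tags(tag_resource, k=2)
cluster = matching_cluster({"jaguar", "speed"}, tags, labels)

# Hypothetical baseline results: (title, tags attached to the result).
results = [
    ("Jaguar habitat", {"animal", "wildlife"}),
    ("Jaguar F-Type review", {"car", "engine"}),
]
ranked = rerank(results, cluster)
print(sorted(cluster))
print([title for title, _ in ranked])
```

In this sketch the ambiguous query term "jaguar" falls in the user's automotive cluster, so the result tagged with terms from that cluster moves ahead of the wildlife page. A real profile would need a principled choice of `k` and a more robust clustering step than the dominant-dimension rule used here.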
