Personalized ranking for digital libraries based on log analysis

Given the exponential increase of indexable content on the Web, ranking is an increasingly difficult problem in information retrieval systems. Recent research shows that implicit feedback about user preferences can be extracted from web access logs to improve ranking performance. We analyze the implicit user feedback in access logs from the CiteSeer academic search engine and show how site structure can better inform the analysis of clickthrough feedback, providing accurate personalized ranking services tailored to individual information retrieval systems. Experiments and analysis show that, when user preferences are stable over time, our proposed method predicts user preferences more accurately than any of the non-personalized ranking methods we tested. We compare our method with several non-personalized ranking methods, including ranking SVMlight and several ranking functions specific to the academic document domain. The results show that our ranking algorithm reaches 63.59% accuracy, compared with 50.02% for ranking SVMlight and below 43% for all other single-feature ranking methods. We also show how the derived personalized ranking vectors can be employed for other ranking-related purposes such as recommendation systems.
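
To make the notion of "accuracy in predicting user preferences" concrete, the following is a minimal sketch, not the method of this paper. It assumes the common "a clicked result is preferred over unclicked results ranked above it" heuristic for turning clickthrough logs into preference pairs, and it assumes accuracy is the fraction of such pairs that a scoring function orders correctly. The function names, document identifiers, and scores are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def click_skip_pairs(ranking: List[str], clicked: Set[str]) -> List[Tuple[str, str]]:
    """Derive (preferred, less_preferred) pairs from one result list.

    Assumed heuristic: a clicked document is preferred over every
    unclicked document that was shown above it.
    """
    pairs = []
    for pos, doc in enumerate(ranking):
        if doc in clicked:
            for above in ranking[:pos]:
                if above not in clicked:
                    pairs.append((doc, above))
    return pairs


def pairwise_accuracy(pairs: List[Tuple[str, str]], scores: Dict[str, float]) -> float:
    """Fraction of preference pairs that the scores order correctly."""
    if not pairs:
        return 0.0
    correct = sum(1 for preferred, other in pairs if scores[preferred] > scores[other])
    return correct / len(pairs)


# Hypothetical session: four results were shown, the user clicked d2 and d4.
shown = ["d1", "d2", "d3", "d4"]
pairs = click_skip_pairs(shown, {"d2", "d4"})

# Hypothetical scores from some ranking function under evaluation.
scores = {"d1": 0.1, "d2": 0.9, "d3": 0.5, "d4": 0.3}
accuracy = pairwise_accuracy(pairs, scores)
```

Under this assumed definition, a ranking function that orders pairs at random would score about 50%, which gives a rough reference point for reading percentages of this kind.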