An iterative method for personalized results adaptation in cross-language search

Abstract: On today's Web, people often want search results that are not only relevant to their query but also relevant to them as individuals. Most personalized search systems linearly combine the scores obtained from different rankers to produce the personalized ranked list. In addition, compared with personalization research in monolingual web search, relatively few studies address the cross-language domain. In this paper we investigate the personalized results adaptation problem in the context of cross-language web search. The main contribution of this research is a novel iterative ranking method based on document associations obtained from an initial ranker. Rather than combining scores linearly, the method assumes that the results retrieved by non-personalized and personalized rankers mutually reinforce each other. The method is applied in a personalized cross-language search scenario on a semi-automatically constructed test collection and on a real-world dataset. The experimental results suggest that the proposed personalized result adaptation method can produce better results than previous approaches for cross-language web search. The results also indicate that the semi-automatically constructed test collection can serve as an alternative dataset for evaluation when no real-world dataset is available.
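To make the contrast in the abstract concrete, the following Python sketch sets a linear score combination beside a simple iterative mutual-reinforcement update. It is an illustration only and not the method proposed in the paper: the abstract does not give the update rule, so the functions `normalize`, `linear_combination`, `propagate` and `mutual_reinforcement`, the parameters `alpha`, `damping` and `iterations`, and the document affinity matrix are all hypothetical choices made here.

```python
def normalize(scores):
    """Min-max normalize a list of scores to the range [0, 1]."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


def linear_combination(base, personal, alpha=0.5):
    """Baseline: weighted sum of normalized scores from two rankers."""
    b, p = normalize(base), normalize(personal)
    return [alpha * x + (1 - alpha) * y for x, y in zip(b, p)]


def propagate(weights, scores):
    """Spread scores across documents through a weight matrix."""
    return [sum(w * s for w, s in zip(row, scores)) for row in weights]


def mutual_reinforcement(base, personal, affinity, damping=0.5, iterations=50):
    """Hypothetical iterative scheme (an assumption, not the paper's rule).

    Each score vector is repeatedly updated from the other one, propagated
    through a row-normalized document affinity matrix, and anchored to its
    own initial scores. Rows of `affinity` must have a positive sum.
    """
    weights = [[a / sum(row) for a in row] for row in affinity]
    b0, p0 = normalize(base), normalize(personal)
    b, p = list(b0), list(p0)
    for _ in range(iterations):
        spread_p = propagate(weights, p)
        spread_b = propagate(weights, b)
        b = [damping * x + (1 - damping) * y for x, y in zip(spread_p, b0)]
        p = [damping * x + (1 - damping) * y for x, y in zip(spread_b, p0)]
    return p  # adapted personalized scores


# Toy example: three documents, made-up scores and affinities.
base = [1.0, 2.0, 3.0]
personal = [3.0, 2.0, 1.0]
affinity = [[1.0, 0.5, 0.0], [0.5, 1.0, 0.5], [0.0, 0.5, 1.0]]
print(linear_combination(base, personal))
print(mutual_reinforcement(base, personal, affinity))
```

In this sketch the linear baseline mixes the two score lists once, whereas the iterative variant lets each ranker's scores flow to associated documents and back over several rounds. With `damping` below 1 the update is a contraction, so the scores settle to a fixed point.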
