Information Retrieval and Graph Analysis Approaches for Book Recommendation

We propose a combination of several information retrieval approaches for book recommendation, where recommendations are produced in response to complex user queries. We used two theoretical retrieval models, the probabilistic InL2 model (a Divergence from Randomness model) and a language model, and tested their interpolated combination. Graph analysis algorithms such as PageRank have been successful in Web environments. We apply this algorithm in a new retrieval approach to a network of related documents connected by social links. We call this network, constructed from the documents and the social information each of them provides, a Directed Graph of Documents (DGD). Specifically, this work tackles the problem of book recommendation in the context of the INEX (Initiative for the Evaluation of XML Retrieval) Social Book Search track. A series of reranking experiments demonstrates that combining retrieval models yields significant improvements in terms of standard ranked retrieval metrics. These results extend the applicability of link analysis algorithms to environments other than the Web.
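The two ideas in the abstract, interpolating the scores of two retrieval models and reranking with PageRank over a directed document graph, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The min-max normalization, the weights `alpha` and `beta`, the damping factor of 0.85, the handling of dangling nodes, the way PageRank is mixed with retrieval scores, and all function names and toy scores are assumptions made for illustration only.

```python
from typing import Dict, List


def min_max_normalize(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so that different models are comparable (assumed scheme)."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 0.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}


def interpolate(a: Dict[str, float], b: Dict[str, float], alpha: float) -> Dict[str, float]:
    """Linear interpolation: alpha * a + (1 - alpha) * b on normalized scores."""
    na, nb = min_max_normalize(a), min_max_normalize(b)
    docs = set(na) | set(nb)
    return {d: alpha * na.get(d, 0.0) + (1 - alpha) * nb.get(d, 0.0) for d in docs}


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Power-iteration PageRank on a directed graph given as adjacency lists."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-links is spread uniformly over all nodes.
        dangling = sum(rank[v] for v in nodes if not links.get(v))
        new_rank = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for src, targets in links.items():
            if targets:
                share = damping * rank[src] / len(targets)
                for t in targets:
                    new_rank[t] += share
        rank = new_rank
    return rank


def rerank(retrieval: Dict[str, float], links: Dict[str, List[str]],
           beta: float) -> List[str]:
    """Reorder retrieved documents by mixing retrieval score and PageRank (assumed mix)."""
    rs = min_max_normalize(retrieval)
    pr = min_max_normalize(pagerank(links))
    combined = {d: beta * rs[d] + (1 - beta) * pr.get(d, 0.0) for d in rs}
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical scores for three books from two models, and hypothetical social links.
inl2_scores = {"book1": 12.0, "book2": 9.0, "book3": 3.0}
lm_scores = {"book1": -5.0, "book2": -4.0, "book3": -7.0}
dgd_links = {"book1": ["book2"], "book2": ["book3"], "book3": ["book2"]}

combined_scores = interpolate(inl2_scores, lm_scores, alpha=0.5)
print(rerank(combined_scores, dgd_links, beta=0.7))
```

In this sketch, `alpha` controls the balance between the two retrieval models and `beta` controls how much the graph-based score can reorder the initial ranking; in practice both would be tuned on held-out topics.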
