On the Effectiveness of Bayesian Network-based Models for Document Ranking

The theoretical soundness and technical feasibility of treating document ranking in IR as an inference problem in Bayesian networks were studied recently, and a pilot framework was proposed in that work. In this paper, we provide two implementations of the framework: BNBM25, which is based on BM25, and BNMATF, which is based on MATF, a recently proposed innovative ranking function. We empirically verify the effectiveness of these two implementations on several standard test collections and obtain positive, significant results. Beyond this verified effectiveness, we also discuss the further potential of the BN-based framework. Based on this study, we believe that the technique is promising and worthy of further analysis and application.
