Understanding the role of human-inspired heuristics for retrieval models

Relevance estimation is one of the core concerns of information retrieval (IR) research. Although existing retrieval models have achieved considerable success, both in deepening our understanding of information-seeking behavior and in building effective retrieval systems, they work quite differently from how humans make relevance judgments. Users’ information-seeking behavior involves complex cognitive processes, yet most of these behavior patterns are not considered in existing retrieval models. To bridge the gap between actual user behavior and retrieval models, it is essential to systematically investigate users’ cognitive behavior during relevance judgment and to incorporate the resulting heuristics into retrieval models. In this paper, we formally define a set of basic user reading heuristics during relevance judgment and investigate corresponding strategies for modeling them in retrieval models. We then conduct experiments to evaluate how effectively the different reading heuristics improve ranking performance. Based on a large-scale Web search dataset, we find that most reading heuristics can improve the performance of retrieval models, and we establish guidelines for improving the design of retrieval models with human-inspired heuristics. Our study sheds light on building retrieval models from the perspective of cognitive behavior.

This article is an extension of Li et al. [1]. Compared with the previous conference version, it systematically introduces the reading heuristics for retrieval models. It also includes an extensive study of modeling strategies and experimental results that evaluate the different reading heuristics.

E-mail: yiqunliu@tsinghua.edu.cn

[1] Jun Xu, et al. Modeling Diverse Relevance Patterns in Ad-hoc Retrieval, 2018, SIGIR.

[2] Chang Zhou, et al. Cognitive Graph for Multi-Hop Reading Comprehension at Scale, 2019, ACL.

[3] Yiqun Liu, et al. Sogou-QCL: A New Dataset with Click Relevance Label, 2018, SIGIR.

[4] Kam-Fai Wong, et al. A retrospective study of a hybrid document-context based retrieval model, 2007, Inf. Process. Manag.

[5] Tao Tao, et al. Diagnostic Evaluation of Information Retrieval Models, 2011, TOIS.

[6] Gerard de Melo, et al. PACRR: A Position-Aware Neural IR Model for Relevance Matching, 2017, EMNLP.

[7] Jian-Yun Nie, et al. Empirical Study of Multi-level Convolution Models for IR Based on Representations and Interactions, 2018, ICTIR.

[8] Tao Tao, et al. A formal study of information retrieval heuristics, 2004, SIGIR '04.

[9] Xueqi Cheng, et al. DeepRank: A New Deep Architecture for Relevance Ranking in Information Retrieval, 2017, CIKM.

[10] Zhiyuan Liu, et al. End-to-End Neural Ad-hoc Ranking with Kernel Pooling, 2017, SIGIR.

[11] Lukasz Kaiser, et al. Attention is All you Need, 2017, NIPS.

[12] T. Munich, et al. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks, 2008, NIPS.

[13] Larry P. Heck, et al. Learning deep structured semantic models for web search using clickthrough data, 2013, CIKM.

[14] Stephen E. Robertson, et al. Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval, 1994, SIGIR '94.

[15] Yishay Mansour, et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.

[16] Xueqi Cheng, et al. Text Matching as Image Recognition, 2016, AAAI.

[17] Frank Keller, et al. Modeling Human Reading with Neural Attention, 2016, EMNLP.

[18] Yiqun Liu, et al. Understanding Reading Attention Distribution during Relevance Judgement, 2018, CIKM.

[19] Hang Li, et al. Convolutional Neural Network Architectures for Matching Natural Language Sentences, 2014, NIPS.

[20] Yiqun Liu, et al. Incorporating Non-sequential Behavior into Click Models, 2015, SIGIR.

[21] P. C. Wason, et al. Dual processes in reasoning?, 1975, Cognition.

[22] Tao Tao, et al. An exploration of proximity measures in information retrieval, 2007, SIGIR.

[23] Lili Mou, et al. Jumper: Learning When to Make Classification Decision in Reading, 2018, IJCAI.

[24] Yang Liu, et al. Fast and Accurate Text Classification: Skimming, Rereading and Early Stopping, 2018, ICLR.

[25] Joachim Bingel, et al. Sequence Classification with Human Attention, 2018, CoNLL.

[26] Yann Dauphin, et al. Convolutional Sequence to Sequence Learning, 2017, ICML.

[27] W. Bruce Croft, et al. Neural Ranking Models with Weak Supervision, 2017, SIGIR.

[28] W. Bruce Croft, et al. A Deep Relevance Matching Model for Ad-hoc Retrieval, 2016, CIKM.

[29] Wei-Yun Ma, et al. Speed Reading: Learning to Read ForBackward via Shuttle, 2018, EMNLP.