What Should We Teach in Information Retrieval?

Modern Information Retrieval (IR) systems, such as search engines, recommender systems, and conversational agents, are best thought of as interactive systems. Their development is best thought of as a two-stage process: offline development, followed by continued online adaptation and development based on interactions with users. In this opinion paper, we take a closer look at existing IR textbooks and teaching materials and examine the degree to which they cover the offline and online stages of the IR system development process. We observe that current teaching materials in IR focus mostly on search and on the offline development phase. Other scenarios of interacting with information are largely absent from current IR teaching materials, as is the (interactive) online development phase. We identify a list of scenarios and a list of topics that we believe are essential to any modern set of IR teaching materials that claims to fully cover IR system development. In particular, we argue that basic IR teaching materials should pay more attention to scenarios such as recommender systems, and to topics such as query and interaction mining and understanding, online evaluation, and online learning to rank.
