Predicting Search Satisfaction Metrics with Interleaved Comparisons