Continuous Evaluation of Large-Scale Information Access Systems: A Case for Living Labs
Martha Larson | Frank Hopfgartner | Krisztian Balog | Benjamin Kille | Anne Schuth | Liadh Kelly | Andreas Lommatzsch
[1] Filip Radlinski, et al. Large-scale validation and analysis of interleaved search evaluation, 2012, TOIS.
[2] Filip Radlinski, et al. Evaluating the accuracy of implicit feedback from clicks and query reformulations in Web search, 2007, TOIS.
[3] Ron Kohavi. Online Controlled Experiments: Lessons from Running A/B/n Tests for 12 Years, 2015, KDD.
[4] W. Bruce Croft, et al. A Probabilistic Retrieval Model for Semistructured Data, 2009, ECIR.
[5] Fernando Diaz, et al. Robust models of mouse movement on dynamic web search results pages, 2013, CIKM.
[6] Andreas Lommatzsch, et al. Development and Evaluation of a Highly Scalable News Recommender System, 2015, CLEF.
[7] Nick Craswell, et al. Beyond clicks: query reformulation as a predictor of search satisfaction, 2013, CIKM.
[8] Krisztian Balog, et al. Towards a Living Lab for Information Retrieval Research and Development - A Proposal for a Living Lab for Product Search Tasks, 2011, CLEF.
[9] Guy Shani, et al. A Survey of Accuracy Evaluation Metrics of Recommendation Tasks, 2009, J. Mach. Learn. Res.
[10] Filip Radlinski, et al. Relevance and Effort: An Analysis of Document Utility, 2014, CIKM.
[11] Sahin Albayrak, et al. Real-time recommendations for user-item streams, 2015, SAC.
[12] Juliana Freire, et al. Reproducibility of Data-Oriented Experiments in e-Science (Dagstuhl Seminar 16041), 2016, Dagstuhl Reports.
[13] Katja Hofmann, et al. A probabilistic method for inferring preferences from clicks, 2011, CIKM.
[14] Maarten de Rijke, et al. Probabilistic Multileave for Online Retrieval Evaluation, 2015, SIGIR.
[15] Martha Larson, et al. Stream-Based Recommendations: Online and Offline Evaluation as a Service, 2015, CLEF.
[16] David Hawking, et al. If SIGIR had an Academic Track, What Would Be In It?, 2015, SIGIR.
[17] Andrei Broder, et al. A taxonomy of web search, 2002, SIGIR Forum.
[18] Maarten de Rijke, et al. OpenSearch: Lessons Learned from an Online Evaluation Campaign, 2018, ACM J. Data Inf. Qual.
[19] Krisztian Balog, et al. Head First: Living Labs for Ad-hoc Search Evaluation, 2014, CIKM.
[20] Carol Peters, et al. Report on the SIGIR 2009 workshop on the future of IR evaluation, 2009, SIGIR Forum.
[21] Kevin C. Almeroth, et al. Workshop and challenge on news recommender systems, 2013, RecSys.
[22] Krisztian Balog, et al. Towards a living lab for information retrieval research and development: a proposal for a living lab for product search tasks, 2011.
[23] Frank Hopfgartner, et al. An experimental evaluation of ontology-based user profiles, 2014, Multimedia Tools and Applications.
[24] Martha Larson, et al. Recommender Systems Evaluation: A 3D Benchmark, 2012, RUE@RecSys.
[25] Martha Larson, et al. Idomaar: A Framework for Multi-dimensional Benchmarking of Recommender Algorithms, 2016, RecSys Posters.
[26] Frank Hopfgartner, et al. Benchmarking News Recommendations in a Living Lab, 2014, CLEF.
[27] Frank Hopfgartner, et al. Join the Living Lab: Evaluating News Recommendations in Real-Time, 2015, ECIR.
[28] Filip Radlinski, et al. Predicting Search Satisfaction Metrics with Interleaved Comparisons, 2015, SIGIR.
[29] Martha Larson, et al. Benchmarking News Recommendations: The CLEF NewsREEL Use Case, 2016, SIGIR Forum.
[30] Tao Qin, et al. LETOR: Benchmark Dataset for Research on Learning to Rank for Information Retrieval, 2007.
[31] Frank Hopfgartner, et al. The Potentials of Recommender Systems Challenges for Student Learning, 2016, NIPS.
[32] Stefano Mizzaro, et al. Reproduce and Improve, 2018, ACM J. Data Inf. Qual.
[33] Xiaolong Li, et al. Inferring search behaviors using partially observable Markov (POM) model, 2010, WSDM.
[34] James Allan, et al. Frontiers, challenges, and opportunities for information retrieval: Report from SWIRL 2012, the second strategic workshop on information retrieval in Lorne, 2012, SIGIR Forum.
[35] Frank Hopfgartner, et al. Real-time Recommendation of Streamed Data, 2015, RecSys.
[36] Susan T. Dumais, et al. Evaluation Challenges and Directions for Information-Seeking Support Systems, 2009, Computer.
[37] Gareth J. F. Jones, et al. Evaluating Personal Information Retrieval, 2012, ECIR.
[38] Frank Hopfgartner, et al. CLEF 2017 NewsREEL Overview: A Stream-Based Recommender Task for Evaluation and Education, 2017, CLEF.
[39] Filip Radlinski, et al. How does clickthrough data reflect retrieval quality?, 2008, CIKM.
[40] Filip Radlinski, et al. Optimized interleaving for online retrieval evaluation, 2013, WSDM.
[41] Ryen W. White, et al. Modeling dwell time to predict click-level satisfaction, 2014, WSDM.
[42] Falk Scholer, et al. User performance versus precision measures for simple search tasks, 2006, SIGIR.
[43] Jimmy J. Lin, et al. Evaluation-as-a-Service for the Computational Sciences, 2018, ACM J. Data Inf. Qual.
[44] Tie-Yan Liu. Learning to Rank for Information Retrieval, 2009, Found. Trends Inf. Retr.
[45] Mark D. Smucker, et al. Report on the CIKM workshop on living labs for information retrieval evaluation, 2014, SIGIR Forum.
[46] Frank Hopfgartner, et al. Shedding light on a living lab: the CLEF NEWSREEL open recommendation platform, 2014, IIiX.
[47] Maarten de Rijke, et al. Multileaved Comparisons for Fast Online Evaluation, 2014, CIKM.
[48] Krisztian Balog, et al. Extended Overview of the Living Labs for Information Retrieval Evaluation (LL4IR) CLEF Lab 2015, 2015, CLEF.
[49] Jane Li, et al. Good abandonment in mobile and PC internet search, 2009, SIGIR.
[50] Jimmy J. Lin, et al. Evaluation-as-a-Service: Overview and Outlook, 2015, arXiv.