Repeatable evaluation of information retrieval effectiveness in dynamic environments
[1] Peter Bailey,et al. Measuring Search Engine Quality , 2001, Information Retrieval.
[2] Shengli Wu,et al. Methods for ranking information retrieval systems without relevance judgments , 2003, SAC '03.
[3] Lawrence L. Kupper,et al. How Appropriate are Popular Sample Size Formulas , 1989 .
[4] Gary Marchionini,et al. A Comparative Study of Web Search Service Performance , 1996 .
[5] Jacques Savoy,et al. Retrieval effectiveness on the web , 2001, Inf. Process. Manag..
[6] Justin Zobel,et al. How reliable are the results of large-scale information retrieval experiments? , 1998, SIGIR '98.
[7] Thorsten Joachims,et al. Accurately interpreting clickthrough data as implicit feedback , 2005, SIGIR '05.
[8] Peter Bacchetti,et al. Peer review of statistics in medical research: the other problem , 2002, BMJ : British Medical Journal.
[9] Nazli Goharian,et al. Extracting unstructured data from template generated web documents , 2003, CIKM '03.
[10] L. Hothorn,et al. A Unified Approach to Simultaneous Rank Test Procedures in the Unbalanced One-way Layout , 2001 .
[11] Charles L. A. Clarke,et al. Overview of the TREC 2004 Terabyte Track , 2004, TREC.
[12] Howard Greisdorf. Relevance: An Interdisciplinary and Information Science Perspective , 2000, Informing Sci. Int. J. an Emerg. Transdiscipl..
[13] David Carmel,et al. Scaling IR-system evaluation using term relevance sets , 2004, SIGIR '04.
[14] Howard C. Blue,et al. Chapter 7. , 2007 .
[15] J. MacKinnon,et al. Bootstrap tests: how many bootstraps? , 2000 .
[16] Amit Singhal,et al. A case study in web search using TREC algorithms , 2001, WWW '01.
[17] Ophir Frieder,et al. Hourly analysis of a very large topically categorized web query log , 2004, SIGIR '04.
[18] Susan A. Murphy,et al. Monographs on statistics and applied probability , 1990 .
[19] Amanda Spink,et al. How are we searching the world wide web , 2006 .
[20] Amanda Spink,et al. From E-Sex to E-Commerce: Web Search Changes , 2002, Computer.
[21] Peter Bruza,et al. Interactive Internet search: keyword, directory and query reformulation mechanisms compared , 2000, SIGIR '00.
[22] Abdur Chowdhury,et al. A picture of search , 2006, InfoScale '06.
[23] Ophir Frieder,et al. Surrogate scoring for improved metasearch precision , 2005, SIGIR '05.
[24] Peter Bailey,et al. Is it fair to evaluate Web systems using TREC ad hoc methods , 1999, SIGIR 1999.
[25] Ophir Frieder,et al. Predicting query difficulty on the web by learning visual clues , 2005, SIGIR '05.
[26] Ophir Frieder,et al. Using manually-built web directories for automatic evaluation of known-item retrieval , 2003, SIGIR.
[27] Jaideep Srivastava,et al. First 20 precision among World Wide Web search services (search engines) , 1999 .
[28] Abdur Chowdhury,et al. Automatic evaluation of world wide web search services , 2002, SIGIR '02.
[29] Monika Henzinger,et al. Analysis of a very large web search engine query log , 1999, SIGF.
[30] Mark Sanderson,et al. Forming test collections with no system pooling , 2004, SIGIR '04.
[31] J. MacKinnon. Applications of the Fast Double Bootstrap , 2006 .
[32] Ellen M. Voorhees,et al. Evaluation by highly relevant documents , 2001, SIGIR '01.
[33] Harold Borko,et al. Automatic indexing , 1981, ACM '81.
[34] Amanda Spink,et al. A temporal comparison of AltaVista Web searching , 2005, J. Assoc. Inf. Sci. Technol..
[35] Filippo Menczer,et al. A General Evaluation Framework for Topical Crawlers , 2005, Information Retrieval.
[36] Dan Klein,et al. Evaluating strategies for similarity search on the web , 2002, WWW '02.
[37] Ophir Frieder,et al. Temporal analysis of a very large topically categorized Web query log , 2007, J. Assoc. Inf. Sci. Technol..
[38] Pia Borlund,et al. The concept of relevance in IR , 2003, J. Assoc. Inf. Sci. Technol..
[39] Ellen M. Voorhees,et al. The effect of topic set size on retrieval experiment error , 2002, SIGIR '02.
[40] Amanda Spink,et al. Real life, real users, and real needs: a study and analysis of user queries on the web , 2000, Inf. Process. Manag..
[41] Abdur Chowdhury,et al. Using titles and category names from editor-driven taxonomies for automatic evaluation , 2003, CIKM '03.
[42] James Allan,et al. A critical examination of TDT's cost function , 2002, SIGIR '02.
[43] Ian Soboroff,et al. Ranking retrieval systems without relevance judgments , 2001, SIGIR '01.
[44] David Hawking,et al. Overview of TREC-7 Very Large Collection Track , 1997, TREC.
[45] Rabia Nuray-Turan,et al. Automatic performance evaluation of Web search engines , 2004, Inf. Process. Manag..
[46] Monika Henzinger,et al. Analysis of a very large web search engine query log , 1999 .
[47] Abdur Chowdhury. Automatic Evaluation of Web Search Services , 2005, Adv. Comput..
[48] Ophir Frieder,et al. A framework for determining necessary query set sizes to evaluate web search effectiveness , 2005, WWW '05.
[49] Bart Selman,et al. The Hidden Web , 1997, AI Mag..
[50] Ellen M. Voorhees,et al. Retrieval evaluation with incomplete information , 2004, SIGIR '04.
[51] Amanda Spink,et al. An analysis of Web searching by European AlltheWeb.com users , 2005, Inf. Process. Manag..
[52] J. Berger. Statistical Decision Theory and Bayesian Analysis , 1988 .
[53] Mark Sanderson,et al. Information retrieval system evaluation: effort, sensitivity, and reliability , 2005, SIGIR '05.
[54] Christopher Olston,et al. What's new on the web?: the evolution of the web from a search engine perspective , 2004, WWW '04.
[55] James Allan,et al. Incremental test collections , 2005, CIKM '05.
[56] Ellen M. Voorhees,et al. Evaluating evaluation measure stability , 2000, SIGIR '00.
[57] J. Troendle. Approximating the power of Wilcoxon's rank-sum test against shift alternatives. , 1999, Statistics in medicine.
[58] J. Hoenig,et al. The Abuse of Power: The Pervasive Fallacy of Power Calculations for Data Analysis , 2001 .
[59] Sriram Raghavan,et al. Crawling the Hidden Web , 2001, VLDB.
[60] David Hawking,et al. Which Search Engine is Best at Finding Online Services? , 2001, WWW Posters.
[61] Jacques Savoy,et al. Statistical inference in retrieval effectiveness evaluation , 1997, Inf. Process. Manag..
[62] B. J. Collings,et al. Estimating the power of the two-sample Wilcoxon test for location shift. , 1988, Biometrics.
[63] Donna K. Harman,et al. Results and Challenges in Web Search Evaluation , 1999, Comput. Networks.
[64] Rabia Nuray-Turan,et al. Automatic ranking of information retrieval systems using data fusion , 2006, Inf. Process. Manag..
[65] Ellen M. Voorhees,et al. Variations in relevance judgments and the measurement of retrieval effectiveness , 1998, SIGIR '98.
[66] David A. Hull. Using statistical testing in the evaluation of retrieval experiments , 1993, SIGIR.
[67] Gerard Salton,et al. Automatic indexing , 1980, ACM '80.
[68] Hector Garcia-Molina,et al. Efficient Crawling Through URL Ordering , 1998, Comput. Networks.
[69] Wei-Hao Lin,et al. Revisiting the effect of topic set size on retrieval error , 2005, SIGIR '05.
[70] C. Lee Giles,et al. Searching the world wide Web , 1998, Science.
[71] Javed A. Aslam,et al. A unified model for metasearch, pooling, and system evaluation , 2003, CIKM '03.
[72] M. Kenward,et al. An Introduction to the Bootstrap , 2007 .
[73] Longzhuang Li,et al. Precision Evaluation of Search Engines , 2004, World Wide Web.
[74] Dayne Freitag,et al. A Machine Learning Architecture for Optimizing Web Search Engines , 1999 .
[75] Andrei Broder,et al. A taxonomy of web search , 2002, SIGF.
[76] James Blustein,et al. A Statistical Analysis of the TREC-3 Data , 1995, TREC.
[77] Longzhuang Li,et al. A new method for automatic performance comparison of search engines , 2004, World Wide Web.
[78] Ophir Frieder,et al. Evaluation of filtering current news search results , 2004, SIGIR '04.
[79] Andrew Turpin,et al. Why batch and user evaluations do not give the same results , 2001, SIGIR '01.
[80] Michael D. Gordon,et al. Finding Information on the World Wide Web: The Retrieval Effectiveness of Search Engines , 1999, Inf. Process. Manag..
[81] S. Goodman,et al. A comment on replication, p-values and evidence. , 1992, Statistics in medicine.
[82] J. MacKinnon,et al. The power of bootstrap and asymptotic tests , 2006 .
[83] Peter Bailey,et al. Overview of the TREC-8 Web Track , 2000, TREC.