Deciding on an adjustment for multiplicity in IR experiments
[1] J. Sunklodas,et al. Approximation of distributions of sums of weakly dependent random variables by the normal distribution , 1987 .
[2] Alistair Moffat,et al. Principles for robust evaluation infrastructure , 2011, DESIRE '11.
[3] E. Pitman. Significance Tests Which May be Applied to Samples from Any Populations , 1937 .
[4] Alistair Moffat,et al. Statistical power in retrieval experimentation , 2008, CIKM '08.
[5] Joseph P. Romano,et al. Generalizations of the familywise error rate , 2005, math/0507420.
[6] K. Gabriel,et al. On closed testing procedures with special reference to ordered analysis of variance , 1976 .
[7] W. John Wilbur,et al. Non-parametric significance tests of retrieval performance comparisons , 1994, J. Inf. Sci..
[8] J. Shaffer. Multiple Hypothesis Testing , 1995 .
[9] S. S. Young,et al. Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment , 1993 .
[10] Charles L. A. Clarke,et al. Overview of the TREC 2010 Web Track , 2010, TREC.
[11] S. Lange,et al. Adjusting for multiple testing--when and how? , 2001, Journal of clinical epidemiology.
[12] Yifan Huang,et al. To permute or not to permute , 2006, Bioinform..
[13] Anand Swaminathan,et al. Information Retrieval System Evaluation , 2012 .
[14] Ben Carterette,et al. Multiple testing in statistical analysis of systems-based information retrieval experiments , 2012, TOIS.
[15] M. Kenward,et al. An Introduction to the Bootstrap , 2007 .
[16] Mark Sanderson,et al. Quantifying test collection quality based on the consistency of relevance judgements , 2011, SIGIR.
[17] S. Holm. A Simple Sequentially Rejective Multiple Test Procedure , 1979 .
[18] Jacques Savoy,et al. Statistical inference in retrieval effectiveness evaluation , 1997, Inf. Process. Manag..
[19] Y. Benjamini,et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing , 1995 .
[20] Jing Zhou,et al. Streamwise Feature Selection , 2006, J. Mach. Learn. Res..
[21] John D. Lafferty,et al. A study of smoothing methods for language models applied to Ad Hoc information retrieval , 2001, SIGIR '01.
[22] J. Stephen Downie,et al. How Significant is Statistically Significant? The case of Audio Music Similarity and Retrieval , 2012, ISMIR.
[23] S. Dudoit,et al. Multiple Hypothesis Testing in Microarray Experiments , 2003 .
[24] J. Hsu,et al. Applying the Generalized Partitioning Principle to Control the Generalized Familywise Error Rate , 2007, Biometrical journal. Biometrische Zeitschrift.
[25] Olivier Chapelle,et al. Expected reciprocal rank for graded relevance , 2009, CIKM.
[26] Jon Brumbaugh,et al. Department of Health and Human Services Food and Drug Administration , 2000 .
[27] Yogendra P. Chaubey. Resampling-Based Multiple Testing: Examples and Methods for p-Value Adjustment , 1993 .
[28] Stephen E. Robertson,et al. Understanding inverse document frequency: on theoretical arguments for IDF , 2004, J. Documentation.
[29] James F Troendle,et al. Multiple Testing with Minimal Assumptions , 2008, Biometrical journal. Biometrische Zeitschrift.
[30] Pedro Larrañaga,et al. A review of feature selection techniques in bioinformatics , 2007, Bioinform..
[31] Mark Sanderson,et al. Information retrieval system evaluation: effort, sensitivity, and reliability , 2005, SIGIR '05.
[32] Pedro Larrañaga,et al. A review of feature selection techniques in bioinformatics , 2007 .
[33] James Allan,et al. A comparison of statistical significance tests for information retrieval evaluation , 2007, CIKM '07.
[34] E. Pitman. Significance Tests Which May be Applied to Samples from Any Populations III. The Analysis of Variance Test , 1938 .
[35] Ellen M. Voorhees,et al. Bias and the limits of pooling for large collections , 2007, Information Retrieval.
[36] Susan A. Murphy,et al. Monographs on statistics and applied probability , 1990 .
[37] H. Scheffé. A Method for Judging All Contrasts in the Analysis of Variance , 1953 .
[38] James Blustein,et al. A Statistical Analysis of the TREC-3 Data , 1995, TREC.
[39] Takuji Nishimura,et al. Mersenne twister: a 623-dimensionally equidistributed uniform pseudo-random number generator , 1998, TOMC.
[40] Gordon V. Cormack,et al. Validity and power of t-test for comparing MAP and GMAP , 2007, SIGIR.