A model for quantitative evaluation of an end-to-end question-answering system
Nina Wacholder | Paul B. Kantor | Ying Sun | Bing Bai | Tomek Strzalkowski | Sharon G. Small | Boris Yamrom | Robert Rittman | Diane Kelly
[1] Inderjeet Mani, et al. How to Evaluate Your Question Answering System Every Day ... and Still Get Real Work Done, 2000, LREC.
[2] Paul B. Kantor, et al. A study of information seeking and retrieving. II. Users, questions, and effectiveness, 1988, J. Am. Soc. Inf. Sci.
[3] Stephen P. Harter, et al. Variations in Relevance Assessments and the Measurement of Retrieval Effectiveness, 1996, J. Am. Soc. Inf. Sci.
[4] Stephen P. Harter, et al. Evaluation of information retrieval systems: Approaches, issues, and methods, 1997.
[5] Pertti Vakkari, et al. Task-based information searching, 2005, Annu. Rev. Inf. Sci. Technol.
[6] Ellen M. Voorhees, et al. The Evaluation of Question Answering Systems: Lessons Learned from the TREC QA Track, 2002.
[7] Cyril Cleverdon, et al. The Cranfield tests on index language devices, 1997.
[8] Nicholas J. Belkin, et al. Characteristics of Texts Affecting Relevance Judgments, 1993.
[9] Pia Borlund, et al. The concept of relevance in IR, 2003, J. Assoc. Inf. Sci. Technol.
[10] Nina Wacholder, et al. HITIQA: Towards Analytical Question Answering, 2004, COLING.
[11] Amanda Spink, et al. Multiple Search Sessions Model of End-User Behavior: An Exploratory Study, 1996, J. Am. Soc. Inf. Sci.
[12] Ellen M. Voorhees, et al. The Philosophy of Information Retrieval Evaluation, 2001, CLEF.
[13] Jean Tague-Sutcliffe, et al. Some Perspectives on the Evaluation of Information Retrieval Systems, 1996, J. Am. Soc. Inf. Sci.
[14] Stephen E. Robertson, et al. Evaluating Interactive Systems in TREC, 1996, J. Am. Soc. Inf. Sci.
[15] Karen Spärck Jones. Automatic language and information processing: rethinking evaluation, 2001, Natural Language Engineering.
[16] Arne Jönsson, et al. Wizard of Oz studies: why and how, 1993, IUI '93.
[17] Tomek Strzalkowski, et al. HITIQA: An Interactive Question Answering System: A Preliminary Report, 2003, ACL 2003.
[18] Nina Wacholder, et al. Cross evaluation - A pilot application of a new evaluation mechanism, 2004, ASIST.
[19] Nina Wacholder, et al. Using interview data to identify evaluation criteria for interactive, analytical question-answering systems, 2007, J. Assoc. Inf. Sci. Technol.
[20] Paul B. Kantor, et al. The Information Quest: A Dynamic Model of User's Information Needs, 1999.
[21] Pertti Vakkari, et al. Changes in relevance criteria and problem stages in task performance, 2000, J. Documentation.
[22] Andrew Turpin, et al. Why batch and user evaluations do not give the same results, 2001, SIGIR '01.
[23] Karen Spärck Jones. Is question answering a rational task, 2003.
[24] Nina Wacholder, et al. HITIQA: A Question Answering Analytical Tool.
[25] Mark T. Maybury. Toward a Question Answering Roadmap, 2003, New Directions in Question Answering.
[26] Lynette Hirschman, et al. Deep Read: A Reading Comprehension System, 1999, ACL.
[27] Paul Over, et al. The TREC interactive track: an annotated bibliography, 2001, Inf. Process. Manag.
[28] Donna K. Harman, et al. Overview of the First Text REtrieval Conference (TREC-1), 1992, TREC.
[29] Paul B. Kantor, et al. Cross-Evaluation: A new model for information system evaluation, 2006, J. Assoc. Inf. Sci. Technol.
[30] Peter Ingwersen, et al. The development of a method for the evaluation of interactive information retrieval systems, 1997, J. Documentation.
[31] Amanda Spink, et al. Real life, real users, and real needs: a study and analysis of user queries on the web, 2000, Inf. Process. Manag.
[32] Cyril W. Cleverdon, et al. The significance of the Cranfield tests on index languages, 1991, SIGIR '91.
[33] Lisa Krizan, et al. Intelligence Essentials for Everyone, 1999.
[34] Ellen M. Voorhees, et al. Implementing a Question Answering Evaluation, 2007.
[35] Ellen M. Voorhees, et al. The TREC-8 Question Answering Track, 2001, LREC.
[36] Tefko Saracevic, et al. RELEVANCE: A review of and a framework for the thinking on the notion in information science, 1997, J. Am. Soc. Inf. Sci.
[37] Martin Chodorow, et al. Automated Essay Scoring for Nonnative English Speakers, 1999.
[38] Andrew Turpin, et al. Do batch and user evaluations give the same results?, 2000, SIGIR '00.
[39] Ellen M. Voorhees, et al. Evaluating the Evaluation: A Case Study Using the TREC 2002 Question Answering Track, 2003, NAACL.
[40] Ellen M. Voorhees, et al. Building a question answering test collection, 2000, SIGIR '00.
[41] Paul B. Kantor, et al. A study of information seeking and retrieving. III. Searchers, searches, and overlap, 1988, J. Am. Soc. Inf. Sci.
[42] Paul B. Kantor, et al. A study of information seeking and retrieving. I. Background and methodology, 1988.