The binomial cumulative distribution function, or, is my system better than yours?

In human language technology, it is becoming more and more common to run systematic evaluations in which two or more systems, or two or more versions of the same system, are pitted one against the other. We propose the binomial cumulative distribution function as a way to assess the cumulative effect of the measures collected in such evaluations. We present an application of this measure to the evaluation of the NL interface to an Intelligent Tutoring System. We conclude by discussing a few issues pertaining to this statistical measure.
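The abstract does not spell out the computation, so the following is only a minimal sketch under an assumed sign-test style reading: each of n measures is treated as an independent trial that favors either system with probability 0.5 when the systems are equivalent, and the binomial cumulative distribution function gives the chance of seeing a split at least as lopsided as the observed one. The function names, the value p = 0.5, and the counts in the usage example are illustrative assumptions, not taken from the paper.

```python
from math import comb


def binomial_cdf(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    if not 0 <= k <= n:
        raise ValueError("k must satisfy 0 <= k <= n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    # Sum the binomial probability mass function from 0 up to k.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k), the upper tail, via the complement of the CDF."""
    if k <= 0:
        return 1.0
    return 1.0 - binomial_cdf(k - 1, n, p)


if __name__ == "__main__":
    # Hypothetical counts: 9 of 10 measures favor system A.
    # Under the assumed null (p = 0.5) this is 11/1024, roughly 0.0107.
    print(prob_at_least(9, 10))
```

A small tail probability suggests that the observed advantage across measures would be unlikely if the two systems were equivalent. The independence assumption is the fragile part of this sketch, since measures collected from the same evaluation are often correlated.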

Barbara Di Eugenio | Michael Glass | Michael J. Scott

[1] S. Siegel, et al. Nonparametric Statistics for the Behavioral Sciences, 2022, The SAGE Encyclopedia of Research Design.

[2] Ehud Reiter, et al. Using a Randomised Controlled Clinical Trial to Evaluate an NLG System, 2001, ACL.

[3] Raymond H. Myers, et al. Probability and Statistics for Engineers and Scientists, 1973.

[4] James Shaw, et al. Segregatory Coordination and Ellipsis in Text Generation, 1998, ACL.

[5] Johanna D. Moore, et al. An Empirical Study of the Influence of Argument Conciseness on Argument Effectiveness, 2000, ACL.

[6] Xiaorong Huang, et al. Paraphrasing and Aggregating Argumentative Texts Using Text Structure, 1996, INLG.

[7] Michael White, et al. EXEMPLARS: A Practical, Extensible Framework For Dynamic Text Generation, 1998, INLG.

[8] Johanna D. Moore, et al. Generating descriptions of complex activities, 1997.

[9] Douglas M. Towne. Approximate Reasoning Techniques for Intelligent Diagnostic Instruction, 1997.