Evaluating Web-based Question Answering Systems

The official evaluation of TREC-style Q&A systems is done manually, which is expensive and does not scale to web-based Q&A systems. Dynamic Q&A systems therefore need an automatic evaluation technique. This paper presents a set of metrics implemented in our web-based Q&A system, NSIR, and reports the correlations between these metrics.