Relative evaluation of informativeness in machine-generated summaries

This paper is concerned with the relative evaluation of the information content of summaries. We study the effect of crossing summary-question pairs in a comprehension-test-based summary evaluation. Using this scheme, machine-generated and human-authored summaries of broadcast news stories are evaluated. The approach does not use absolute scores; instead, it relies on relative comparisons, effectively alleviating the subjectivity of individual summary authors. The evaluation indicates that less than half (44%) of the information is shared between human-authored summaries of roughly 15 words, whereas 27% of the information in machine-generated summaries is shared with the human-authored summaries.
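As a minimal sketch of the crossing idea (not the paper's exact protocol), the code below averages, over all ordered pairs of summaries, the fraction of comprehension questions authored from one summary that can be answered from another. The function names, the data layout, and the keyword-overlap answer check are assumptions made for illustration; in the paper the answerability judgement is made by human subjects.

```python
from itertools import permutations

# Each summary is associated with comprehension questions written from it;
# a question counts as "shared" if its answer can be recovered from the other
# summary. The keyword-overlap check is a crude stand-in for a human judgement.

def answerable(reference_answer: str, other_summary: str) -> bool:
    """Proxy check: all content words of the reference answer appear in the other summary."""
    answer_words = {w.lower() for w in reference_answer.split()}
    summary_words = {w.lower() for w in other_summary.split()}
    return bool(answer_words) and answer_words <= summary_words

def shared_information(summaries: dict[str, str], answers: dict[str, list[str]]) -> float:
    """Average fraction of crossed summary-question pairs that remain answerable."""
    scores = []
    for src, tgt in permutations(summaries, 2):
        refs = answers[src]  # reference answers to questions authored from summary `src`
        if not refs:
            continue
        answered = sum(answerable(a, summaries[tgt]) for a in refs)
        scores.append(answered / len(refs))
    return sum(scores) / len(scores) if scores else 0.0

# Hypothetical example: two short human-authored summaries of one news story.
summaries = {
    "A": "The mayor announced a new transit plan funded by a city bond.",
    "B": "A transit plan was announced by the mayor on Tuesday.",
}
answers = {
    "A": ["transit plan", "mayor"],
    "B": ["mayor", "Tuesday"],
}
print(f"shared information: {shared_information(summaries, answers):.0%}")
```

Because every summary is scored only against the questions derived from the other summaries, the measure is relative by construction: no single author's summary serves as an absolute gold standard.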