Toward indicative discussion fora summarization

Summarization of electronic discussion fora is a unique challenge: techniques that work startlingly well on monolithic documents tend to fare poorly in this informal setting. Additionally, conventional techniques ignore much of the structure that could supply valuable features for the summarization task. We present several novel examples of such features, including the catalyst score, which is effective at identifying salient messages without looking at their content. We also describe and evaluate NewsSum, a prototype summarization system that efficiently generates variable-length summaries of Usenet threads.