Multi-document summarization by cluster/profile relevance and redundancy removal

We describe a sentence extraction system that produces two sorts of multi-document summaries: the first is a general-purpose summary of a cluster of related documents, while the second is an entity-based summary of documents related to a particular person. The general-purpose summary is generated by a process that ranks sentences based on their document and cluster "worthiness". The personality-based summary is constructed by a process that ranks sentences according to a metric that uses coreference and lexical information in a person profile. In both cases, a process of redundancy removal is applied to exclude repeated information.
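
The rank-then-filter pipeline outlined above can be illustrated with a minimal sketch. It is not the system's actual implementation: the relevance scores are assumed to be supplied by some upstream metric (cluster-based or profile-based), and the redundancy test used here, Jaccard word overlap against already selected sentences with a fixed threshold, is a stand-in chosen for illustration. The names `tokenize`, `jaccard`, `extract_summary`, `max_sentences`, and `redundancy_threshold` are all hypothetical.

```python
import re


def tokenize(sentence):
    """Lowercase a sentence and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def jaccard(a, b):
    """Word-overlap similarity between two token sets, from 0.0 to 1.0."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def extract_summary(scored_sentences, max_sentences=3, redundancy_threshold=0.5):
    """Pick top-ranked sentences, skipping those too similar to earlier picks.

    scored_sentences: list of (sentence, score) pairs. The score is a
    placeholder for whichever relevance metric is in use (assumption).
    The overlap measure and threshold are illustrative assumptions too.
    """
    # Highest-scoring sentences are considered first.
    ranked = sorted(scored_sentences, key=lambda pair: pair[1], reverse=True)
    selected = []
    selected_tokens = []
    for sentence, _score in ranked:
        if len(selected) >= max_sentences:
            break
        tokens = tokenize(sentence)
        # Redundancy removal: drop a candidate that repeats an earlier pick.
        if any(jaccard(tokens, seen) >= redundancy_threshold
               for seen in selected_tokens):
            continue
        selected.append(sentence)
        selected_tokens.append(tokens)
    return selected


if __name__ == "__main__":
    candidates = [
        ("The storm hit the coast on Monday.", 0.9),
        ("On Monday the storm hit the coast.", 0.8),
        ("Thousands of residents were evacuated.", 0.7),
        ("Power was restored by Friday.", 0.2),
    ]
    for line in extract_summary(candidates, max_sentences=2):
        print(line)
```

In this toy input, the second candidate uses exactly the same words as the first, so it is skipped even though it ranks second, and the third candidate fills the remaining slot. A greedy filter of this kind keeps ranking and redundancy removal as separate steps, which matches the two-stage description in the abstract.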