Incremental multi-document summarization: An incremental clustering based approach

Documents published online and offline are a primary source of information. The rapid growth of documentation and communication systems floods these pools of information with an enormous number of documents. In such a scenario, it is critical to devise algorithms and methodologies that can condense these huge collections of documents into the best possible summaries. Such summaries serve information seekers who need only abstract summaries in a fully digestible form. Methods that perform this task can be compared on the quality of the summaries they generate and the amount of processing power they demand. Existing methods are capable of generating summaries incrementally, that is, updating summaries as and when new documents are added to the pool. However, two limitations hold them back from their potential applications: their summaries are affected by the order in which new documents are added to the pool, and they must process the whole set of documents (including those already summarized) each time the summary needs to be updated. We propose a mechanism that overcomes these difficulties and generates update summaries in an affordable way.