Incorporating prior knowledge into a transductive ranking algorithm for multi-document summarization

This paper presents a transductive approach to learn ranking functions for extractive multi-document summarization. At the first stage, the proposed approach identifies topic themes within a document collection, which help to identify two sets of relevant and irrelevant sentences to a question. It then iteratively trains a ranking function over these two sets of sentences by optimizing a ranking loss and fitting a prior model built on keywords. The output of the function is used to find further relevant and irrelevant sentences. This process is repeated until a desired stopping criterion is met.