Multi-document Biased Summarization Based on a Topic-oriented Characteristic Database of Term-pair Co-occurrence

This paper proposes to exploit the latent semantic relations implied by co-occurring terms in sample documents: the co-occurrence rate of term pairs is calculated and used to build a topic-oriented word co-occurrence database for biased summarization. The database is a semantic repository that can be expanded and updated within a particular topic field. An automatic extraction method for multi-document biased summarization is then designed, based on the similarity between the sentences of the target document and the clustered groups of characteristic term chains. The characteristic terms themselves are extracted from the database. In a sense, the method controls the co-occurrence window size by limiting it to one paragraph. Experimental results show that the extraction method is effective for articles written in traditional text structures.
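As a rough illustration of the idea, the following minimal sketch counts term-pair co-occurrence within a one-paragraph window and scores a target sentence against the resulting table. The abstract does not give the co-occurrence rate formula, so the Jaccard-style ratio used here is an assumption, as are the function names `cooccurrence_rates` and `score_sentence` and the simple summed score; the clustering of characteristic term chains is not modeled.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_rates(paragraphs):
    """Build a term-pair co-occurrence table from tokenized paragraphs.

    paragraphs: list of lists of terms; each paragraph is one window.
    Returns {(a, b): rate} with a < b in sort order.
    Assumed rate (not from the paper): paragraphs containing both terms
    divided by paragraphs containing at least one of them.
    """
    term_count = Counter()
    pair_count = Counter()
    for terms in paragraphs:
        unique = sorted(set(terms))  # count each term once per paragraph
        term_count.update(unique)
        pair_count.update(combinations(unique, 2))
    return {
        (a, b): n / (term_count[a] + term_count[b] - n)
        for (a, b), n in pair_count.items()
    }


def score_sentence(sentence_terms, rates):
    """Hypothetical sentence score: sum of the rates of all term pairs
    in the sentence that appear in the co-occurrence table."""
    unique = sorted(set(sentence_terms))
    return sum(rates.get(pair, 0.0) for pair in combinations(unique, 2))
```

Under these assumptions, sentences of the target document would be ranked by `score_sentence`, and the highest-scoring ones selected for the topic-biased summary.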