Summarization Based on Hidden Topic Markov Model with Multi-features

Building on the hidden topic Markov model (HTMM), the authors relax a limiting assumption of LDA (latent Dirichlet allocation) so that document structure can be exploited when generating a summary, and they use multiple features derived from document content to improve summary quality. They also propose a method for extending single-document summarization to multi-document summarization without breaking document structure, as a step toward a more complete automatic summarization system. Experimental results on the standard DUC2007 dataset show the advantage of HTMM and of the multiple features: compared with LDA, HTMM with multi-features yields higher ROUGE values.
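For readers unfamiliar with the evaluation metric, ROUGE-N recall measures the fraction of reference-summary n-grams that also appear in the candidate summary. The following is a minimal sketch of that computation only. It is not the official ROUGE toolkit used for DUC evaluation and not the authors' code; the function names `ngrams` and `rouge_n_recall`, the lowercasing, and the whitespace tokenization are assumptions made for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list (empty if the list is shorter than n)."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """ROUGE-N recall: clipped n-gram matches divided by reference n-grams.

    Assumption: simple lowercase whitespace tokenization, with no stemming
    or stopword removal.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs
    # in the candidate (clipped counting).
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
```

With several reference summaries, as in DUC, scores are typically aggregated across references; this sketch handles a single reference only.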