The Similarity Measure Based on LDA for Automatic Summarization

Abstract This paper proposes a novel similarity measure for automatic text summarization. A topic space model is built using Latent Dirichlet Allocation (LDA), and words, sentences, documents, and the corpus are all represented as vectors in the same topic space. Two algorithms, LMMR and LSD, are introduced to generate the summary. Experiments on DUC data indicate that the proposed measure and algorithms are effective and perform well.
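
To make the idea of a shared topic space concrete, the following is a minimal sketch rather than the measure defined in this paper. It assumes that each word already has a topic vector from a trained LDA model, that a sentence or document vector is the average of its word vectors, and that similarity is measured by cosine. The names `WORD_TOPICS`, `text_vector`, and `cosine`, as well as all numeric values, are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical topic vectors over 3 topics for a tiny vocabulary.
# In practice these would be derived from a trained LDA model;
# the numbers here are made up for illustration.
WORD_TOPICS = {
    "market": [0.8, 0.1, 0.1],
    "stock":  [0.7, 0.2, 0.1],
    "storm":  [0.1, 0.8, 0.1],
    "rain":   [0.1, 0.7, 0.2],
}
NUM_TOPICS = 3


def text_vector(words):
    """Average the topic vectors of the known words (an assumed aggregation rule)."""
    known = [WORD_TOPICS[w] for w in words if w in WORD_TOPICS]
    if not known:
        return [0.0] * NUM_TOPICS
    return [sum(column) / len(known) for column in zip(*known)]


def cosine(u, v):
    """Cosine similarity between two topic-space vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# A sentence and a document live in the same space, so they can be compared directly.
sentence_a = text_vector(["market", "stock"])
sentence_b = text_vector(["storm", "rain"])
document = text_vector(["market", "stock", "storm", "rain"])

print(cosine(sentence_a, sentence_b))
print(cosine(sentence_a, document))
```

Because every unit is a vector of the same length, a sentence can be scored against another sentence, a document, or the whole corpus with one function, which is what a summarizer needs when ranking candidate sentences.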