A Novel Application of Fuzzy Set Theory and Topic Model in Sentence Extraction for Vietnamese Text

Summary: Vietnamese shares characteristics with several other Asian languages, such as Chinese, Japanese, and Korean: word boundaries are not defined by spaces. In this article, we present a method that applies fuzzy set theory and a topic model to extract sentences from Vietnamese texts that have been categorized by topic. The method identifies important features, such as sentence length, the weight of terms in a sentence, and sentence position, and then extracts important sentences according to a given ratio, which determines which sentences of the original text are extracted. We also built a system based on this method, and our experiments obtained good results that satisfy the given requirements.
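As a rough illustration of the feature-based, ratio-driven extraction outlined in the summary, the following minimal sketch scores each sentence on length, term weight, and position, and keeps the top-scoring fraction. Everything specific in it is an assumption made for illustration and is not taken from the method itself: the function names `triangular` and `extract_sentences`, the triangular membership shape for sentence length, the use of raw term frequency as the term weight, the linear position score, and the plain averaging of the three membership degrees. The topic model is not represented, and sentences are assumed to be already word-segmented into token lists.

```python
import math
from collections import Counter


def triangular(x, a, b, c):
    """Triangular fuzzy membership: 0 outside (a, c), rising to 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def extract_sentences(sentences, ratio):
    """Return indices of extracted sentences, in original order.

    sentences: list of token lists (assumed already word-segmented).
    ratio: fraction of sentences to keep, e.g. 0.5 keeps half of them.
    """
    n = len(sentences)
    if n == 0:
        return []

    # Term weight (assumption): average corpus frequency of a sentence's terms.
    tf = Counter(tok for s in sentences for tok in s)
    raw = [sum(tf[t] for t in s) / len(s) if s else 0.0 for s in sentences]
    max_raw = max(raw) or 1.0
    max_len = max(len(s) for s in sentences)

    scores = []
    for i, s in enumerate(sentences):
        # Length (assumption): medium-length sentences get the highest degree.
        length_mu = triangular(len(s), 0, 0.6 * max_len, 1.4 * max_len)
        weight_mu = raw[i] / max_raw
        # Position (assumption): earlier sentences are more important.
        position_mu = 1.0 - i / n
        # Aggregation (assumption): plain mean of the three degrees.
        scores.append((length_mu + weight_mu + position_mu) / 3)

    k = max(1, math.ceil(ratio * n))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return sorted(top)


# Toy input: single letters stand in for segmented Vietnamese words.
doc = [["a", "b", "c"], ["a"], ["d", "e", "f", "g", "h"], ["a", "b"]]
selected = extract_sentences(doc, 0.5)
```

With a ratio of 0.5 on this four-sentence toy input, two sentence indices are returned in their original order. A real system would replace the averaging step with proper fuzzy rules and would take the topic of the text into account when weighting terms.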