An Improved Extractive Summarization Technique for Bengali Text(s)

Text summarization has become an important tool for retrieving required information quickly. Many extractive summarization techniques have been developed for English text, but few works address Bengali text summarization. This paper proposes an improved extractive Bengali text summarization technique that enhances the word scoring process, the position value heuristic, and the summary procedure of an existing summarizer. In the word scoring step, each word is preprocessed using noise removal, tokenization, stop word removal, and stemming. A heuristic then computes each word's score by checking the word across all the input documents. A modified heuristic is also proposed for sentence scoring: the middle sentence receives the highest priority, and the sentences above and below it receive less emphasis. Finally, the top k sentences are extracted from each cluster of sentences and sorted by their order of appearance in the original document(s), so the final summary is synchronized with the original document(s). Compared with the preceding method, the experimental results show that the proposed technique produces better summaries that satisfy users.
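The position heuristic and the final selection step can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names `position_weights` and `summarize` are hypothetical, the linear falloff from the middle sentence is an assumption (the abstract only states that the middle sentence ranks highest and the others are emphasized less), and the sentence scores and clusters are taken as given inputs.

```python
def position_weights(n):
    """Return one weight per sentence position, peaking at the middle.

    Assumption: a linear falloff on both sides of the middle sentence.
    The abstract does not give the exact formula.
    """
    mid = (n - 1) / 2
    return [1.0 - abs(i - mid) / n for i in range(n)]


def summarize(scores, clusters, k):
    """Pick the top-k sentences from each cluster, then restore document order.

    scores   -- list of sentence scores, indexed by position in the document
    clusters -- list of clusters, each a list of sentence indices
    k        -- number of sentences to keep per cluster
    """
    chosen = []
    for cluster in clusters:
        # Rank the sentences of this cluster by score, highest first.
        ranked = sorted(cluster, key=lambda i: scores[i], reverse=True)
        chosen.extend(ranked[:k])
    # Sorting the indices reproduces the order of appearance in the source.
    return sorted(chosen)
```

For example, calling `summarize` with two clusters of three sentences each and `k = 2` returns four sentence indices in ascending order, so the summary reads in the same sequence as the original text.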
