Topic-Based Bengali Opinion Summarization

In this paper the development of an opinion summarization system that works on Bengali News corpus has been described. The system identifies the sentiment information in each document, aggregates them and represents the summary information in text. The present system follows a topic-sentiment model for sentiment identification and aggregation. Topic-sentiment model is designed as discourse level theme identification and the topic-sentiment aggregation is achieved by theme clustering (k-means) and Document level Theme Relational Graph representation. The Document Level Theme Relational Graph is finally used for candidate summary sentence selection by standard page rank algorithms used in Information Retrieval (IR). As Bengali is a resource constrained language, the building of annotated gold standard corpus and acquisition of linguistics tools for lexico-syntactic, syntactic and discourse level features extraction are described in this paper. The reported accuracy of the Theme detection technique is 83.60% (precision), 76.44% (recall) and 79.85% (F-measure). The summarization system has been evaluated with Precision of 72.15%, Recall of 67.32% and F-measure of 69.65%.

[1]  Rajeev Motwani,et al.  The PageRank Citation Ranking : Bringing Order to the Web , 1999, WWW 1999.

[2]  Janyce Wiebe,et al.  Effects of Adjective Orientation and Gradability on Sentence Subjectivity , 2000, COLING.

[3]  Paul Resnick,et al.  Reputation systems , 2000, CACM.

[4]  Peter Willett,et al.  Recent trends in hierarchic document clustering: A critical review , 1988, Inf. Process. Manag..

[5]  Hsin-Hsi Chen,et al.  Major topic detection and its application to opinion summarization , 2005, SIGIR '05.

[6]  Bing Liu,et al.  Mining and summarizing customer reviews , 2004, KDD.

[7]  Edward M. Reingold,et al.  Graph drawing by force‐directed placement , 1991, Softw. Pract. Exp..

[8]  Sivaji Bandyopadhyay,et al.  Subjectivity Detection in English and Bengali: A CRF-based Approach , 2009 .

[9]  C. J. van Rijsbergen,et al.  The use of hierarchic clustering in information retrieval , 1971, Inf. Storage Retr..

[10]  Jackie Chi Kit Cheung,et al.  Multi-Document Summarization of Evaluative Text , 2013, EACL.

[11]  Sivaji Bandyopadhyay,et al.  Theme detection an exploration of opinion subjectivity , 2009, 2009 3rd International Conference on Affective Computing and Intelligent Interaction and Workshops.

[12]  Ben Shneiderman,et al.  Analyzing Social Media Networks with NodeXL: Insights from a Connected World , 2010 .

[13]  Sivaji Bandyopadhyay,et al.  Dependency Parser for Bengali: the JU System at ICON 2009 , 2009 .

[14]  Liang Zhou,et al.  On the Summarization of Dynamically Introduced Information: Online Discussions and Blogs , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[15]  Rohini K. Srihari,et al.  Using Verbs and Adjectives to Automatically Classify Blog Sentiment , 2006, AAAI Spring Symposium: Computational Approaches to Analyzing Weblogs.

[16]  Katsumi Tanaka,et al.  Fair News Reader: Recommending News Articles with Different Sentiments Based on User Preference , 2007, KES.

[17]  Jeonghee Yi,et al.  Sentiment analysis: capturing favorability using natural language processing , 2003, K-CAP '03.