NoteSum: An integrated note summarization system by using text mining algorithms

Abstract This study implemented an integrated note summarization system (NoteSum) that merges multiple users’ notes and searches the Internet, slides, and textbooks for relevant information to create a summary that helps students learn effectively. The system's framework consists of four modules: the Topic Identification Module, the Supporting Material Finding Module, the Content Mapping Module, and the Learning Material Integrating Module. Five experiments were conducted, with the following findings. First, translating notes with the assistance of topic terms could enhance translation quality. Second, when mapping contents, NoteSum performed better in a discussion-based course than in a technical course. Third, when the generated summaries were assessed with the Jensen-Shannon (JS) divergence, the summary for the discussion-based course performed better. Fourth, the three attributes (presence of topic terms, number of non-topic words, and ratio of words with important parts of speech) had different effects on different subjects. Finally, we compared NoteSum with existing summarization systems. The results indicated that the NoteSum-generated summary was closer to students’ original notes and thus performed better in readability, informativeness, and completeness. Taken together, the results indicate that NoteSum is an effective note summarization system for student learning.
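
The abstract does not say how the JS divergence was computed, so the following is only a minimal sketch of the general idea: compare the word distribution of a summary with that of its source, where a lower value means the summary is closer to the source. Whitespace tokenization, unigram distributions, base-2 logarithms, the absence of smoothing, and all function names are assumptions made for illustration, not details taken from NoteSum.

```python
import math
from collections import Counter


def word_distribution(text, vocabulary):
    """Relative frequency of each vocabulary word in text (assumed: whitespace tokens)."""
    counts = Counter(text.lower().split())
    total = sum(counts[word] for word in vocabulary)
    if total == 0:
        raise ValueError("text contains no words from the vocabulary")
    return [counts[word] / total for word in vocabulary]


def kl_divergence(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL divergence of p and q from their average."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def summary_divergence(source, summary):
    """JS divergence between the unigram distributions of a source and a summary."""
    vocabulary = sorted(set(source.lower().split()) | set(summary.lower().split()))
    return js_divergence(
        word_distribution(source, vocabulary),
        word_distribution(summary, vocabulary),
    )
```

With base-2 logarithms the value lies between 0 (identical word distributions) and 1 (no shared words), which makes scores comparable across courses of different lengths under these assumptions.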
