An Intelligent Automatic Text Summarizer

This paper describes an intelligent text summarizer that produces three summaries of a given text, each generated by a different algorithm. The summarizer relies on statistical features of the text, such as word frequency and the presence of rare words. It then generates a short, meaningful title for the main text and finally selects the best of the candidate summaries. It also assigns the writer a competence level (in written English) based on features of the text such as the number of rare words used. Experimental results show that the writer's competence level can indeed be determined from the text, and that the proximity of sentences plays a vital role in selecting the best summary.
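As a rough illustration of the kind of frequency-based statistical summarization mentioned above, the following minimal sketch scores each sentence by the average frequency of its content words and keeps the top-scoring sentences in their original order. It is not one of the paper's three algorithms, which the abstract does not specify. The names `summarize`, `tokenize`, and `split_sentences`, the small `STOPWORDS` set, and the averaging rule are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical, deliberately small stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "is", "it", "that", "this"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]


def summarize(text, n=1):
    """Return the n sentences with the highest average word frequency, in document order."""
    sentences = split_sentences(text)
    # Word frequencies are counted over the whole document.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, keep the best n, then restore document order.
    best = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)[:n]
    return [sentences[i] for i in sorted(best)]


if __name__ == "__main__":
    print(summarize("Cats chase cats. Dogs bark. Cats sleep.", n=2))
```

Averaging rather than summing the frequencies keeps long sentences from winning purely on length. A rare-word or sentence-proximity feature like those the abstract mentions would need an external word-frequency resource or a position model, which this sketch does not attempt.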
