Mining Discourse Markers for Chinese Textual Summarization

Discourse markers foreshadow the main thrust of a text and signal its rhetorical structure, both of which are important for content filtering and text abstraction. This paper reports on efforts to automatically identify and classify discourse markers in Chinese texts using heuristic-based and corpus-based data-mining methods, as an integral part of automatic text summarization via rhetorical structure and discourse markers. Encouraging results are reported.
