Automatic Identification of Rhetorical Questions

A question may be asked not only to elicit information, but also to make a statement. Questions serving the latter purpose, called rhetorical questions, are often lexically and syntactically indistinguishable from other types of questions. Still, identifying rhetorical questions is desirable because it is relevant to many NLP tasks, including information extraction and text summarization. In this paper, we explore the largely understudied problem of rhetorical question identification. Specifically, we present a simple n-gram-based language model to classify rhetorical questions in the Switchboard Dialogue Act Corpus. We find that a special treatment of rhetorical questions that incorporates contextual information achieves the highest performance.
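As a rough illustration of how n-gram features might be combined with contextual information, the following Python sketch counts unigrams and bigrams from a question and from its neighbouring utterances. It is not the implementation described in the paper. The function names, the `q`/`prev`/`next` prefixes, the whitespace tokenization, and the choice of one preceding and one following utterance as context are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Optional


def ngrams(tokens: List[str], n: int) -> List[str]:
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def question_features(question: str,
                      previous: Optional[str] = None,
                      following: Optional[str] = None,
                      max_n: int = 2) -> Counter:
    """Count n-grams from a question and, optionally, its neighbouring utterances.

    Hypothetical feature scheme: each n-gram is prefixed with its source
    ("q", "prev", or "next") so that a downstream classifier can weight
    context n-grams separately from n-grams in the question itself.
    """
    features: Counter = Counter()
    sources = {"q": question, "prev": previous, "next": following}
    for prefix, text in sources.items():
        if not text:
            continue  # no context available for this position
        tokens = text.lower().split()  # naive whitespace tokenization (assumption)
        for n in range(1, max_n + 1):
            for gram in ngrams(tokens, n):
                features[f"{prefix}:{gram}"] += 1
    return features


if __name__ == "__main__":
    feats = question_features("Who cares ?", previous="It rained", following="Yeah")
    for name, count in sorted(feats.items()):
        print(name, count)
```

The resulting counts could be passed to any standard classifier. Keeping question and context features in separate namespaces is one simple way to let the model learn that the same n-gram carries different evidence depending on where it occurs.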
