Rhetorical Figure Detection: the Case of Chiasmus

We propose an approach to detecting the rhetorical figure called chiasmus, which involves the repetition of a pair of words in reverse order, as in “all for one, one for all”. Although repetitions of words are common in natural language, true instances of chiasmus are rare. The question is therefore whether a computer can effectively distinguish a chiasmus from a random criss-cross pattern. We argue that chiasmus should be treated as a graded phenomenon, which leads to the design of an engine that extracts all criss-cross patterns and ranks them on a scale from prototypical chiasmi to less and less likely instances. Using an evaluation inspired by information retrieval, we demonstrate that our system achieves an average precision of 61%. As a by-product of the evaluation, we also construct the first annotated corpus of chiasmi.
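The extract-then-rank idea and the retrieval-style metric can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the system described in the abstract: the tokenizer, the 30-token window, the function names, and especially the compactness-based ranking are hypothetical stand-ins, since the abstract does not specify the features used for ranking. The average precision shown is the usual information-retrieval definition, normalized here by the number of true instances present in the ranked list.

```python
import re
from typing import List, Sequence, Tuple

Candidate = Tuple[int, int, int, int]  # token positions of A, B, B, A


def tokenize(text: str) -> List[str]:
    """Lowercase the text and keep alphabetic tokens (a deliberately crude tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def criss_cross_candidates(tokens: Sequence[str], window: int = 30) -> List[Candidate]:
    """Return every (i, j, k, l) with i < j < k < l, tokens[i] == tokens[l],
    tokens[j] == tokens[k], and two distinct words, all within `window` tokens.
    The window size is an assumed parameter, not taken from the source."""
    found: List[Candidate] = []
    n = len(tokens)
    for i in range(n):
        end = min(n, i + window)
        for j in range(i + 1, end):
            if tokens[j] == tokens[i]:
                continue  # the pair must consist of two different words
            for k in range(j + 1, end):
                if tokens[k] != tokens[j]:
                    continue
                for l in range(k + 1, end):
                    if tokens[l] == tokens[i]:
                        found.append((i, j, k, l))
    return found


def rank_by_compactness(cands: Sequence[Candidate]) -> List[Candidate]:
    """Toy ranking: shorter spans first. A placeholder for a real scoring model."""
    return sorted(cands, key=lambda c: c[3] - c[0])


def average_precision(ranked_labels: Sequence[bool]) -> float:
    """Mean of precision@rank taken at each rank that holds a true instance."""
    hits = 0
    total = 0.0
    for rank, is_true in enumerate(ranked_labels, start=1):
        if is_true:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


tokens = tokenize("all for one, one for all")
ranked = rank_by_compactness(criss_cross_candidates(tokens))
```

Even this six-word example produces three criss-cross candidates, which hints at why raw pattern matching over-generates and why a ranking step, evaluated with a measure such as average precision, is needed.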
