Hybrid text simplification using synchronous dependency grammars with hand-written and automatically harvested rules

We present an approach to text simplification based on synchronous dependency grammars. The higher level of abstraction afforded by dependency representations allows for a linguistically sound treatment of complex constructs requiring reordering and morphological change, such as conversion of passive voice to active. We describe a synchronous grammar formalism in which rules are easy both to write by hand and to acquire automatically from dependency parses of aligned English and Simple English sentences. The formalism is optimised for monolingual translation in that it reuses ordering information from the source sentence where appropriate. We demonstrate the superiority of our approach over a leading contemporary system based on quasi-synchronous tree substitution grammars, in terms of both expressivity and performance.
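As an illustration only, the following minimal sketch shows what a hand-written passive-to-active transfer rule over typed dependencies might look like, and how word order could be reused from the source wherever the rule does not specify it. It is not the authors' formalism or implementation. The relation names (nsubjpass, auxpass, agent, nsubj, dobj, det) assume Stanford-style collapsed dependencies, and the triple representation, the toy PAST_TENSE lexicon, and the function names are all hypothetical.

```python
# Typed dependencies as (relation, head, dependent) triples over word strings.
# Assumption: words are unique within the sentence, so strings serve as node ids.

# Hypothetical toy lexicon for the morphological change (participle -> past tense).
PAST_TENSE = {"eaten": "ate", "written": "wrote", "chased": "chased"}


def passive_to_active(deps):
    """Apply one passive-to-active rule; return (new_deps, order constraints)."""
    deps = set(deps)
    for rel, verb, patient in sorted(deps):
        if rel != "nsubjpass":
            continue
        aux = next((d for d in deps if d[0] == "auxpass" and d[1] == verb), None)
        agent = next((d for d in deps if d[0] == "agent" and d[1] == verb), None)
        if aux is None or agent is None:
            continue  # agentless passives are left untouched in this sketch
        active = PAST_TENSE.get(verb, verb)
        # Delete the passive relations, rename the verb, insert active relations.
        deps -= {(rel, verb, patient), aux, agent}
        deps = {(r, active if h == verb else h, active if d == verb else d)
                for r, h, d in deps}
        deps |= {("nsubj", active, agent[2]), ("dobj", active, patient)}
        # The rule fixes the order only at the node it rewrote.
        return deps, {active: ["nsubj", "HEAD", "dobj"]}
    return deps, {}


def linearise(node, deps, order, source_pos):
    """Generate words for the subtree rooted at node."""
    children = [(r, d) for r, h, d in deps if h == node]
    out = []
    if node in order:
        # Rule-specified order. Children whose relation is not listed
        # are omitted in this toy version.
        for slot in order[node]:
            if slot == "HEAD":
                out.append(node)
            else:
                for r, d in children:
                    if r == slot:
                        out.extend(linearise(d, deps, order, source_pos))
        return out
    # No rule applies here: reuse the word order of the source sentence.
    items = sorted([node] + [d for _, d in children], key=source_pos.__getitem__)
    for item in items:
        if item == node:
            out.append(item)
        else:
            out.extend(linearise(item, deps, order, source_pos))
    return out


def simplify(tokens, deps):
    source_pos = {w: i for i, w in enumerate(tokens)}
    new_deps, order = passive_to_active(deps)
    heads = {h for _, h, _ in new_deps}
    dependents = {d for _, _, d in new_deps}
    root = (heads - dependents).pop()  # the one head that is nobody's dependent
    return " ".join(linearise(root, new_deps, order, source_pos))


TOKENS = "the mouse was eaten by a cat".split()
DEPS = [
    ("det", "mouse", "the"),
    ("nsubjpass", "eaten", "mouse"),
    ("auxpass", "eaten", "was"),
    ("agent", "eaten", "cat"),
    ("det", "cat", "a"),
]

print(simplify(TOKENS, DEPS))  # a cat ate the mouse
```

In this sketch the rule states an order only for the verb it rewrites, while the noun phrases "a cat" and "the mouse" are generated from their source positions. That division is one possible reading of reusing source ordering "where appropriate"; a real system would also need to handle tense, agreement, and repeated words.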
