Acquisition of Syntactic Simplification Rules for French

Text simplification is the process of reducing the lexical and syntactic complexity of a text while attempting to preserve (most of) its information content. It has recently emerged as an important research area, which holds promise for enhancing text readability for the benefit of a broader audience as well as for increasing the performance of other applications. Our work focuses on syntactic complexity reduction and deals with the task of corpus-based acquisition of syntactic simplification rules for the French language. We show that the data-driven manual acquisition of simplification rules can be complemented by the semi-automatic detection of syntactic constructions requiring simplification. We provide the first comprehensive set of syntactic simplification rules for French, whose size is comparable to that of similar resources that exist for English and Brazilian Portuguese. Unlike these manually built resources, our resource integrates larger lists of lexical cues signaling simplifiable constructions, which are useful for informing practical systems.
