Automatic Text Simplification in Spanish: A Comparative Evaluation of Complementing Modules

In this paper we present two components of an automatic text simplification system for Spanish, aimed at making news articles more accessible to readers with cognitive disabilities. In its current state, the system consists of a rule-based lexical transformation component and a syntactic simplification module. We evaluate the two components separately and in combination, to determine the degree of simplification achieved and how well meaning and grammaticality are preserved. To measure readability before and after simplification, we apply seven readability measures for Spanish to four sets of randomly chosen news articles: the original texts, the output of the lexical transformations, the output of syntactic simplification, and the output of both components combined. To test whether the simplified output is grammatically correct and semantically adequate, we ask human annotators to grade pairs of original and simplified sentences on these two criteria. Our results suggest that both components produce simpler output than the original, and that annotators rate grammaticality and meaning preservation positively.
