Paper information - A Quantitative Insight into the Impact of Translation on Readability

In this paper we investigate the impact of translation on readability. We propose a quantitative analysis of several shallow, lexical and morpho-syntactic features that have been traditionally used for assessing readability and have proven relevant for this task. We conduct our experiments on a parallel corpus of transcribed parliamentary sessions and we investigate readability metrics for the original segments of text, written in the language of the speaker, and their translations.

Liviu P. Dinu | Alina Maria Ciobanu
