Análise Automática de Coerência Usando o Modelo Grade de Entidades para o Português (Automatic Coherence Analysis Using the Entity-grid Model for Portuguese) [in Portuguese]

In this paper we investigate whether Barzilay and Lapata's (2008) entity-grid model can be used to evaluate coherence in scientific abstracts written in Portuguese. More specifically, we assess whether the model can serve as the basis for a classifier that detects linearity breaks affecting the coherence of an abstract. Our experimental results are close to those of the original entity-grid model for English and similar to those reported in related work for other languages. They are also close to the results obtained by human judges, which suggests that the entity-grid model has the potential to be applied in the investigated context.
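For readers unfamiliar with the entity-grid representation, the following is a minimal sketch of its general idea, not the implementation evaluated in this paper. It assumes each sentence has already been reduced to a mapping from entity to grammatical role (`S` subject, `O` object, `X` other), which in practice would come from a parser. The function names `build_grid` and `transition_features` and the toy sentences are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import product

# Role symbols: subject, object, other, absent from the sentence.
ROLES = ("S", "O", "X", "-")


def build_grid(sentences):
    """Build an entity grid: one column of roles per entity, one row per sentence.

    `sentences` is a list of dicts mapping entity -> role ('S', 'O' or 'X').
    Entities missing from a sentence receive the role '-'.
    """
    entities = sorted({entity for sent in sentences for entity in sent})
    return {e: [sent.get(e, "-") for sent in sentences] for e in entities}


def transition_features(grid, n=2):
    """Return the relative frequency of every length-n role transition."""
    counts = Counter()
    for column in grid.values():
        for i in range(len(column) - n + 1):
            counts[tuple(column[i:i + n])] += 1
    total = sum(counts.values())
    return {
        t: (counts[t] / total if total else 0.0)
        for t in product(ROLES, repeat=n)
    }


# Toy input (assumed, hand-annotated): three sentences of an abstract.
sentences = [
    {"model": "S", "coherence": "O"},
    {"model": "S", "abstracts": "X"},
    {"abstracts": "S"},
]

grid = build_grid(sentences)
features = transition_features(grid, n=2)
```

The resulting dictionary is a fixed-length feature vector (16 entries for transitions of length 2) that could be passed to any standard classifier. Which learner and which transition lengths are appropriate for Portuguese abstracts is left open in this sketch.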
