Paper Information - Automatic Metrics for Genre-specific Text Quality

Automatic Metrics for Genre-specific Text Quality

To date, researchers have proposed different ways to compute the readability and coherence of a text using a variety of lexical, syntactic, entity, and discourse properties. However, these metrics have not been tailored to any particular genre; they have instead been proposed as general indicators of writing quality. In this thesis, we propose and evaluate novel text quality metrics that exploit the unique properties of different genres. We focus on three genres: academic publications, news articles about science, and machine-generated text, in particular the output of automatic text summarization systems.

Annie Louis

[1] Lijun Feng, et al. Cognitively Motivated Features for Readability Assessment, 2009, EACL.

[2] R. Gunning. The Technique of Clear Writing, 1968.

[3] Regina Barzilay, et al. Catching the Drift: Probabilistic Content Models, with Applications to Generation and Summarization, 2004, NAACL.

[4] R. Flesch. A New Readability Yardstick, 1948, The Journal of Applied Psychology.

[5] Joel R. Tetreault, et al. Using Entity-Based Features to Model Coherence in Student Essays, 2010, HLT-NAACL.

[6] Ani Nenkova, et al. Automatic Identification of General and Specific Sentences by Leveraging Discourse Annotations, 2011, IJCNLP.

[7] Ani Nenkova, et al. Revisiting Readability: A Unified Framework for Predicting Text Quality, 2008, EMNLP.

[8] Ani Nenkova, et al. A Coherence Model Based on Syntactic Patterns, 2012, EMNLP.

[9] Ani Nenkova, et al. Automatic Evaluation of Linguistic Quality in Multi-Document Summarization, 2010, ACL.

[10] Mirella Lapata, et al. Modeling Local Coherence: An Entity-Based Approach, 2005, ACL.

[11] J. Chall, et al. A Formula for Predicting Readability, 1948.