Detecting Article Errors Based on the Mass Count Distinction

This paper proposes a method for detecting errors in article usage and singular/plural usage based on the mass/count distinction. Although this distinction is particularly important for detecting such errors, it has been pointed out that heuristic rules for distinguishing mass nouns from count nouns are hard to write. To address this problem, the proposed method first automatically collects instances of mass and count nouns from a corpus by exploiting surface information. It then weights the words surrounding the mass (count) instances based on their frequencies. Finally, it uses the weighted words to distinguish mass nouns from count nouns. Once a noun has been classified as mass or count, the errors above can be detected with heuristic rules. Experiments show that the proposed method distinguishes mass and count nouns in the writing of Japanese learners of English with an accuracy of 93%, and that it detects 65% of article errors with a precision of 70%.
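
The three steps described above (collect instances, weight surrounding words, classify) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the cue word lists, the whitespace tokenization, the smoothed log-ratio weight, and the rule of deciding by the single strongest context word. The function names `collect_instances`, `weight_words`, and `classify` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical surface cues; the paper's actual cues are not given in the abstract.
MASS_CUES = {"much", "little"}
COUNT_CUES = {"a", "an", "many", "several"}


def collect_instances(sentences, noun, plural):
    """Label occurrences of a noun as 'mass' or 'count' from surface cues only."""
    instances = []
    for sent in sentences:
        tokens = sent.lower().split()
        for i, tok in enumerate(tokens):
            if tok not in (noun, plural):
                continue
            prev = tokens[i - 1] if i > 0 else ""
            if tok == plural or prev in COUNT_CUES:
                label = "count"
            elif prev in MASS_CUES:
                label = "mass"
            else:
                continue  # no reliable cue, so skip this occurrence
            # Context = all other words, excluding the noun and the word before it.
            context = [t for j, t in enumerate(tokens) if j not in (i, i - 1)]
            instances.append((label, context))
    return instances


def weight_words(instances, alpha=1.0):
    """Weight each context word by a smoothed log-ratio of mass vs. count frequency."""
    mass, count = Counter(), Counter()
    for label, context in instances:
        (mass if label == "mass" else count).update(set(context))
    vocab = set(mass) | set(count)
    return {w: math.log((mass[w] + alpha) / (count[w] + alpha)) for w in vocab}


def classify(context, weights, default="count"):
    """Decide by the context word with the largest absolute weight."""
    best = max((weights.get(w, 0.0) for w in context), key=abs, default=0.0)
    if best > 0:
        return "mass"
    if best < 0:
        return "count"
    return default  # no informative context word


corpus = [
    "she ate a chicken in the yard",
    "chickens run in the yard",
    "he ate much chicken with rice",
    "we cooked little chicken with rice",
]
instances = collect_instances(corpus, "chicken", "chickens")
weights = weight_words(instances)
```

With this toy corpus, a context such as `["with", "rice"]` leans toward the mass reading and `["in", "the", "yard"]` toward the count reading. A detection rule could then flag, for example, an indefinite article in front of a noun classified as mass; that rule is likewise only an illustration of the kind of heuristic the abstract mentions.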
