A Statistical Model Based on the Three Head Words for Detecting Article Errors

In this paper, we propose a statistical model for detecting article errors, which Japanese learners of English often make in English writing. It is based on the three head words --- the verb head, the preposition, and the noun head. To overcome the data sparseness problem, we apply the backed-off estimate to it. Experiments show that its performance (F-measure=0.70) is better than that of other methods. Apart from the performance, it has two advantages: (i) Rules for detecting article errors are automatically generated as conditional probabilities once a corpus is given; (ii) Its recall and precision rates are adjustable.