Detection of Incorrect Case Assignments in Automatically Generated Paraphrases

This paper addresses the problem of correcting transfer errors in paraphrasing. Our previous investigation of transfer errors in lexical and structural paraphrasing of Japanese sentences revealed that case assignment tends to be incorrect regardless of the type of transfer (Fujita and Inui, 2003). Motivated by this observation, we propose an empirical method for detecting incorrect case assignments. Our error detection model combines two models that are trained separately on a large collection of positive examples and a small collection of manually labeled negative examples. Experimental results show that the combined model significantly outperforms the baseline model, which is trained only on positive examples. We also propose a selective sampling scheme to reduce the cost of collecting negative examples, and confirm its effectiveness in the error detection task.
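
The abstract does not state how the two models are combined or how examples are chosen for labeling. As a purely illustrative sketch, the following assumes a weighted linear combination of two scores in [0, 1] and an uncertainty-based selection rule; the function names, the weight, and the threshold are all hypothetical and are not taken from the paper.

```python
from typing import List, Tuple


def combined_score(pos_score: float, neg_score: float, weight: float = 0.5) -> float:
    """Combine two detector outputs into one correctness score (assumed scheme).

    pos_score: score from the model trained on positive examples,
               higher means the case assignment looks more like correct usage.
    neg_score: score from the model trained on negative examples,
               higher means the case assignment looks more like a known error.
    """
    return weight * pos_score + (1.0 - weight) * (1.0 - neg_score)


def is_incorrect(pos_score: float, neg_score: float,
                 threshold: float = 0.5, weight: float = 0.5) -> bool:
    """Flag a case assignment as incorrect when the combined score is low."""
    return combined_score(pos_score, neg_score, weight) < threshold


def select_for_labeling(candidates: List[Tuple[str, float, float]], k: int,
                        threshold: float = 0.5, weight: float = 0.5) -> List[str]:
    """Pick the k candidates whose combined score is closest to the threshold.

    Each candidate is (identifier, pos_score, neg_score). Labeling the most
    uncertain candidates first is one common form of selective sampling,
    assumed here for illustration.
    """
    ranked = sorted(
        candidates,
        key=lambda c: abs(combined_score(c[1], c[2], weight) - threshold),
    )
    return [c[0] for c in ranked[:k]]
```

Under these assumptions, a candidate that both models agree on lies far from the threshold and is left unlabeled, while a borderline candidate is sent to a human annotator first.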