Paper Information - Detection of Incorrect Case Assignments in Automatically Generated Paraphrases

This paper addresses the problem of correcting transfer errors in paraphrasing. Our previous investigation of transfer errors in lexical and structural paraphrasing of Japanese sentences revealed that case assignment tends to be incorrect regardless of the type of transfer (Fujita and Inui, 2003). Motivated by this observation, we propose an empirical method for detecting incorrect case assignments. Our error detection model combines two component models that are trained separately: one on a large collection of positive examples and the other on a small collection of manually labeled negative examples. Experimental results show that the combined model significantly outperforms a baseline model trained only on positive examples. We also propose a selective sampling scheme that reduces the cost of collecting negative examples, and confirm its effectiveness for the error detection task.
