Japanese Zero Pronoun Resolution based on Ranking Rules and Machine Learning

Anaphora resolution is one of the most important research topics in Natural Language Processing. In English, overt pronouns such as "she" and definite noun phrases such as "the company" are anaphors that refer to preceding entities (antecedents). In Japanese, anaphors are often omitted, and these omissions are called zero pronouns. There are two major approaches to zero pronoun resolution: the heuristic approach and the machine learning approach. Because many factors must be taken into consideration, it is difficult to find a good combination of heuristic rules. The machine learning approach is therefore attractive, but it requires a large amount of training data. In this paper, we propose a method that combines ranking rules and machine learning. The ranking rules are simple and effective, while machine learning can take more factors into account. In our experiments, this combination gives better performance than either of the two previous approaches.
