A Twin-Candidate Based Approach for Event Pronoun Resolution using Composite Kernel

Event anaphora resolution is an important task for cascaded event template extraction and other NLP research. In this paper, we provide a first systematic study of resolving pronouns to their event verb antecedents for general purposes. First, we explore various positional, lexical and syntactic features useful for event pronoun resolution. We further explore a tree kernel to model the structural information embedded in syntactic parses. A composite kernel is then used to combine this diverse information. In addition, we employ a twin-candidate based preference learning model to capture pairwise preference knowledge between candidates. We also investigate incorporating negative training instances built from anaphoric pronouns whose antecedents are not verbs. Although such negative training instances were not used in previous studies on anaphora resolution, our study shows that, when selected through a random sampling strategy, they are very useful for the final resolution. Our experiments also demonstrate that it is worthwhile to hold out part of the training data as development data to help the SVM select a more accurate hyperplane, which provides a significant improvement over the default setting that uses all training data.
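
To make the idea of a composite kernel concrete, the following is a minimal sketch, not the implementation used in the paper. It assumes the composite kernel is a weighted sum of a normalized flat-feature kernel and a normalized tree kernel; the abstract does not state the actual combination. The tree kernel here is a simplified stand-in that only counts shared productions, not a full convolution tree kernel. All names (`productions`, `tree_kernel`, `composite_kernel`, `alpha`) and the nested-tuple tree encoding are hypothetical and chosen for illustration.

```python
import math
from collections import Counter


def productions(tree):
    """Count productions (parent label -> child labels) in a nested-tuple tree.

    A tree is (label, child, child, ...); a child is a sub-tree tuple or a word string.
    """
    label, *children = tree
    if not children:
        return Counter()
    rule = (label, tuple(c[0] if isinstance(c, tuple) else c for c in children))
    counts = Counter([rule])
    for child in children:
        if isinstance(child, tuple):
            counts += productions(child)
    return counts


def tree_kernel(t1, t2):
    """Simplified structural kernel: number of matching production pairs (assumption)."""
    p1, p2 = productions(t1), productions(t2)
    return sum(count * p2[rule] for rule, count in p1.items())


def flat_kernel(x, y):
    """Linear kernel over flat (positional, lexical, syntactic) feature vectors."""
    return sum(a * b for a, b in zip(x, y))


def normalized(kernel, a, b):
    """Scale a kernel value into [0, 1] so both kernels contribute comparably."""
    denom = math.sqrt(kernel(a, a) * kernel(b, b))
    return kernel(a, b) / denom if denom else 0.0


def composite_kernel(inst1, inst2, alpha=0.5):
    """Weighted sum of the two normalized kernels; alpha is an assumed mixing weight."""
    (feats1, tree1), (feats2, tree2) = inst1, inst2
    return (alpha * normalized(flat_kernel, feats1, feats2)
            + (1 - alpha) * normalized(tree_kernel, tree1, tree2))


# Each instance pairs a flat feature vector with a parse fragment.
inst_a = ([1.0, 0.0, 2.0], ("S", ("NP", "it"), ("VP", ("V", "happened"))))
inst_b = ([1.0, 1.0, 0.0], ("S", ("NP", "this"), ("VP", ("V", "happened"))))
similarity = composite_kernel(inst_a, inst_b)
```

A weighted sum of valid kernels is itself a valid kernel, so a function of this shape could be passed to an SVM implementation that accepts custom or precomputed kernels. The normalization step is a design choice to stop the raw tree-kernel counts from dominating the flat-feature term.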
