A Convolutional Deep Neural Network for Coreference Resolution via Modeling Hierarchical Features

Coreference resolution is a major task in natural language processing (NLP): identifying which noun phrases, or mentions, refer to the same real-world entity or concept. State-of-the-art methods for coreference resolution are mainly based on statistical machine learning, and their performance depends strongly on the quality of the extracted features. These features are usually shallow and hand-selected, which loses potentially useful deep semantic information and becomes an obstacle to improving system performance. We explored a convolutional deep neural network (CDNN) to extract discourse-level features automatically. Our method took all of the word tokens as input without complicated pre-processing. First, the word tokens were transformed into vectors by looking up word embeddings. Second, mention-pair-level features were extracted according to the given mentions. At the same time, distance features were computed. Third, discourse-level features were learned using a convolutional approach. Finally, these features were fed into a softmax classifier to predict whether the two marked mentions are coreferent. The experimental results demonstrate that our approach obtains a competitive average F1 score over MUC, B3, and CEAF, which places it above the mean score of other systems on the CoNLL-2012 Shared Task dataset.
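The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`init_params`, `conv_max_pool`, `predict_coreferent`), the averaging of mention embeddings, the normalized token distance, the tanh activation, the max-over-time pooling, and the randomly initialized weights are all assumptions chosen only to make the data flow concrete. A real system would learn these parameters from annotated data.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[k] for v in vectors) / len(vectors) for k in range(dim)]


def conv_max_pool(vectors, filters, window):
    """Slide each filter over windows of word vectors, then max-pool over positions."""
    dim = len(vectors[0])
    pad = [[0.0] * dim for _ in range(window // 2)]
    padded = pad + vectors + pad
    pooled = []
    for f in filters:  # each filter is a flat list of length window * dim
        best = -math.inf
        for i in range(len(padded) - window + 1):
            flat = [x for vec in padded[i:i + window] for x in vec]
            best = max(best, math.tanh(sum(w * x for w, x in zip(f, flat))))
        pooled.append(best)
    return pooled


def init_params(vocab, dim, n_filters, window, seed=0):
    """Hypothetical random parameters; a trained model would learn these."""
    rng = random.Random(seed)
    rand = lambda n: [rng.uniform(-0.5, 0.5) for _ in range(n)]
    n_features = 2 * dim + 1 + n_filters  # mention pair + distance + discourse
    return {
        "embeddings": {w: rand(dim) for w in sorted(set(vocab))},
        "unknown": [0.0] * dim,
        "filters": [rand(window * dim) for _ in range(n_filters)],
        "window": window,
        "weights": [rand(n_features) for _ in range(2)],  # two classes
        "bias": [0.0, 0.0],
    }


def predict_coreferent(tokens, mention_a, mention_b, params):
    """Return [P(not coreferent), P(coreferent)] for two (start, end) token spans."""
    # Step 1: look up a word embedding for every token.
    vectors = [params["embeddings"].get(t, params["unknown"]) for t in tokens]
    # Step 2: mention-pair features (assumed here: averaged mention embeddings).
    pair = (mean_vector(vectors[mention_a[0]:mention_a[1]])
            + mean_vector(vectors[mention_b[0]:mention_b[1]]))
    # Step 3: distance feature (assumed here: normalized token gap).
    distance = [(mention_b[0] - mention_a[1]) / len(tokens)]
    # Step 4: discourse-level features from convolution over all tokens.
    discourse = conv_max_pool(vectors, params["filters"], params["window"])
    # Step 5: softmax classifier over the concatenated features.
    features = pair + distance + discourse
    scores = [sum(w * x for w, x in zip(row, features)) + b
              for row, b in zip(params["weights"], params["bias"])]
    return softmax(scores)


tokens = "Alice said she was tired".split()
params = init_params(tokens, dim=4, n_filters=3, window=3)
probs = predict_coreferent(tokens, (0, 1), (2, 3), params)
```

Here `probs` holds two class probabilities for the mention pair "Alice" and "she". Because the weights are random, the values carry no meaning; the sketch only shows how the embedding, mention-pair, distance, and convolutional features are concatenated before classification.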
