Neural Segmental Hypergraphs for Overlapping Mention Recognition

In this work, we propose a novel segmental hypergraph representation to model overlapping entity mentions that are prevalent in many practical datasets. We show that our model built on top of such a new representation is able to capture features and interactions that cannot be captured by previous models while maintaining a low time complexity for inference. We also present a theoretical analysis to formally assess how our representation is better than alternative representations reported in the literature in terms of representational power. Coupled with neural networks for feature learning, our model achieves the state-of-the-art performance in three benchmark datasets annotated with overlapping mentions.

[1]  Eric Nichols,et al.  Named Entity Recognition with Bidirectional LSTM-CNNs , 2015, TACL.

[2]  Christopher D. Manning,et al.  Nested Named Entity Recognition , 2009, EMNLP.

[3]  Guillaume Lample,et al.  Neural Architectures for Named Entity Recognition , 2016, NAACL.

[4]  Jian Su,et al.  Recognizing Names in Biomedical Texts: a Machine Learning Approach , 2004 .

[5]  Dan Roth,et al.  Joint Mention Extraction and Classification with Mention Hypergraphs , 2015, EMNLP.

[6]  Mark A. Przybocki,et al.  The Automatic Content Extraction (ACE) Program – Tasks, Data, and Evaluation , 2004, LREC.

[7]  Hwee Tou Ng,et al.  A Machine Learning Approach to Coreference Resolution of Noun Phrases , 2001, CL.

[8]  Jian Su,et al.  Enhancing HMM-based biomedical named entity recognition by studying special phenomena , 2004, J. Biomed. Informatics.

[9]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[10]  Sophia Ananiadou,et al.  A Neural Layered Model for Nested Named Entity Recognition , 2018, NAACL.

[11]  Andrew McCallum,et al.  Fast and Robust Joint Models for Biomedical Event Extraction , 2011, EMNLP.

[12]  Xiaoqiang Luo,et al.  A Statistical Model for Multilingual Entity Detection and Tracking , 2004, NAACL.

[13]  Heng Ji,et al.  Joint Event Extraction via Structured Prediction with Global Features , 2013, ACL.

[14]  Andrew McCallum,et al.  Fast and Accurate Entity Recognition with Iterated Dilated Convolutions , 2017, EMNLP.

[15]  Noah A. Smith,et al.  Segmental Recurrent Neural Networks , 2015, ICLR.

[16]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[17]  Dan Roth,et al.  Design Challenges and Misconceptions in Named Entity Recognition , 2009, CoNLL.

[18]  Claire Gardent,et al.  Improving Machine Learning Approaches to Coreference Resolution , 2002, ACL.

[19]  Hongxia Jin,et al.  A Neural Transition-based Model for Nested Mention Recognition , 2018, EMNLP.

[20]  William W. Cohen,et al.  Semi-Markov Conditional Random Fields for Information Extraction , 2004, NIPS.

[21]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[22]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[23]  Hui Jiang,et al.  A Local Detection Approach for Named Entity Recognition and Mention Detection , 2017, ACL.

[24]  G. D. Zhou,et al.  Recognizing names in biomedical texts using mutual information independence model and SVM plus sigmoid , 2006, Int. J. Medical Informatics.

[25]  Noah A. Smith,et al.  Softmax-Margin CRFs: Training Log-Linear Models with Cost Functions , 2010, NAACL.

[26]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[27]  Jun'ichi Tsujii,et al.  Part-of-Speech Annotation of Biology Research Abstracts , 2004, LREC.

[28]  Koby Crammer,et al.  Flexible Text Segmentation with Structured Multilabel Classification , 2005, HLT.

[29]  Beatrice Alex,et al.  Recognising Nested Named Entities in Biomedical Text , 2007, BioNLP@ACL.

[30]  Dan Roth,et al.  A Constrained Latent Variable Model for Coreference Resolution , 2013, EMNLP.

[31]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[32]  J. Baker Trainable grammars for speech recognition , 1979 .

[33]  Heng Ji,et al.  Heterogeneous Supervision for Relation Extraction: A Representation Learning Approach , 2017, EMNLP.

[34]  Claire Cardie,et al.  Nested Named Entity Recognition Revisited , 2018, NAACL.

[35]  Michael Collins,et al.  Answer Extraction , 2000, ANLP.

[36]  Wei Lu,et al.  Labeling Gaps Between Words: Recognizing Overlapping Mentions with Mention Separators , 2017, EMNLP.

[37]  Wei Lu,et al.  Learning to Recognize Discontiguous Entities , 2016, EMNLP.

[38]  Daniel Jurafsky,et al.  Distant supervision for relation extraction without labeled data , 2009, ACL.

[39]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[40]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[41]  Nitish Srivastava,et al.  Dropout: a simple way to prevent neural networks from overfitting , 2014, J. Mach. Learn. Res..

[42]  Feodor F. Dragan,et al.  Dually Chordal Graphs , 1993, SIAM J. Discret. Math..

[43]  Jun'ichi Tsujii,et al.  GENIA corpus - a semantically annotated corpus for bio-textmining , 2003, ISMB.

[44]  Giorgio Gallo,et al.  Directed Hypergraphs and Applications , 1993, Discret. Appl. Math..

[45]  Brendan J. Frey,et al.  Factor graphs and the sum-product algorithm , 2001, IEEE Trans. Inf. Theory.

[46]  Ralph Grishman,et al.  Information Extraction: Techniques and Challenges , 1997, SCIE.

[47]  Kemal Oflazer,et al.  Recall-Oriented Learning of Named Entities in Arabic Wikipedia , 2012, EACL.