Instance-Based Learning of Span Representations: A Case Study through Named Entity Recognition

Interpretable rationales for model predictions play a critical role in practical applications. In this study, we develop models with an interpretable inference process for structured prediction. Specifically, we present an instance-based learning method that learns similarities between spans. At inference time, each span is assigned a class label based on its similar spans in the training set, so it is easy to see how much each training instance contributes to a prediction. Through empirical analysis of named entity recognition, we demonstrate that our method builds models with high interpretability without sacrificing performance.

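As a rough illustration of the inference step described in the abstract (not the paper's actual implementation), the sketch below classifies a query span by retrieving its most similar training spans in a learned embedding space and taking a majority vote over their labels; the retrieved neighbors also serve as the rationale for the prediction. The vector dimensionality, dot-product similarity, and the value of k are placeholder assumptions, and the random vectors merely stand in for learned span representations.

```python
import numpy as np

def classify_span(query_vec, train_vecs, train_labels, k=5):
    """Assign a label to a query span by majority vote over its k most
    similar training spans (similarity = dot product of span vectors)."""
    sims = train_vecs @ query_vec            # similarity to every training span
    topk = np.argsort(-sims)[:k]             # indices of the k most similar spans
    votes = {}
    for i in topk:
        votes[train_labels[i]] = votes.get(train_labels[i], 0) + 1
    # the retrieved indices `topk` double as the rationale for the prediction
    return max(votes, key=votes.get), topk

# toy usage: random vectors stand in for learned span representations
rng = np.random.default_rng(0)
train_vecs = rng.normal(size=(100, 16))
train_labels = rng.choice(["PER", "ORG", "LOC", "O"], size=100).tolist()
query_vec = rng.normal(size=16)
label, neighbors = classify_span(query_vec, train_vecs, train_labels)
print(label, neighbors)
```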