A Joint Named-Entity Recognizer for Heterogeneous Tag-setsUsing a Tag Hierarchy

We study a variant of domain adaptation for named-entity recognition where multiple, heterogeneously tagged training sets are available. Furthermore, the test tag-set is not identical to any individual training tag-set. Yet, the relations between all tags are provided in a tag hierarchy, covering the test tags as a combination of training tags. This setting occurs when various datasets are created using different annotation schemes. This is also the case of extending a tag-set with a new tag by annotating only the new tag in a new dataset. We propose to use the given tag hierarchy to jointly learn a neural network that shares its tagging layer among all tag-sets. We compare this model to combining independent models and to a model based on the multitasking approach. Our experiments show the benefit of the tag-hierarchy model, especially when facing non-trivial consolidation of tag-sets.

[1]  Wang Ling,et al.  Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation , 2015, EMNLP.

[2]  Parisa Rashidi,et al.  Deep EHR: A Survey of Recent Advances in Deep Learning Techniques for Electronic Health Record (EHR) Analysis , 2017, IEEE Journal of Biomedical and Health Informatics.

[3]  Geoffrey E. Hinton,et al.  Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.

[4]  Jeffrey M. Hausdorff,et al.  Physionet: Components of a New Research Resource for Complex Physiologic Signals". Circu-lation Vol , 2000 .

[5]  Yue Zhang,et al.  Neural Network for Heterogeneous Annotations , 2016, EMNLP.

[6]  Young-Bum Kim,et al.  New Transfer Learning Techniques for Disparate Label Sets , 2015, ACL.

[7]  Erik F. Tjong Kim Sang,et al.  Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition , 2003, CoNLL.

[8]  Jiajun Zhang,et al.  Multichannel LSTM-CRF for Named Entity Recognition in Chinese Social Media , 2017, CCL.

[9]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[10]  Andrew McCallum,et al.  Marginal Likelihood Training of BiLSTM-CRF for Biomedical Named Entity Recognition from Disjoint Label Sets , 2018, EMNLP.

[11]  Xiaolong Wang,et al.  De-identification of clinical notes via recurrent neural network and conditional random field. , 2017, Journal of biomedical informatics.

[12]  Franck Dernoncourt,et al.  De-identification of patient notes with recurrent neural networks , 2016, J. Am. Medical Informatics Assoc..

[13]  Guillaume Lample,et al.  Neural Architectures for Named Entity Recognition , 2016, NAACL.

[14]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[15]  Alex A. Freitas,et al.  A survey of hierarchical classification across different application domains , 2010, Data Mining and Knowledge Discovery.

[16]  Min Zhang,et al.  Coupled POS Tagging on Heterogeneous Annotations , 2017, IEEE/ACM Transactions on Audio, Speech, and Language Processing.

[17]  Andrew McCallum,et al.  Fast and Accurate Entity Recognition with Iterated Dilated Convolutions , 2017, EMNLP.

[18]  Qiang Yang,et al.  An Overview of Multi-task Learning , 2018 .

[19]  Xu Sun,et al.  A Unified Model for Cross-Domain and Semi-Supervised Named Entity Recognition in Chinese Social Media , 2017, AAAI.

[20]  Barbara Plank,et al.  When is multitask learning effective? Semantic sequence prediction under varying data conditions , 2016, EACL.

[21]  Min-Ling Zhang,et al.  A Review on Multi-Label Learning Algorithms , 2014, IEEE Transactions on Knowledge and Data Engineering.

[22]  Wei Xu,et al.  Bidirectional LSTM-CRF Models for Sequence Tagging , 2015, ArXiv.

[23]  Franck Dernoncourt,et al.  Transfer Learning for Named-Entity Recognition with Neural Networks , 2017, LREC.

[24]  Kuldip K. Paliwal,et al.  Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..

[25]  Jason Weston,et al.  Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..

[26]  Eric Nichols,et al.  Named Entity Recognition with Bidirectional LSTM-CNNs , 2015, TACL.

[27]  Qiang Yang,et al.  A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.

[28]  Christian N. S. Pedersen,et al.  The consensus string problem and the complexity of comparing hidden Markov models , 2002, J. Comput. Syst. Sci..

[29]  Timothy Baldwin,et al.  Named Entity Recognition for Novel Types by Transfer Learning , 2016, EMNLP.

[30]  Özlem Uzuner,et al.  Annotating longitudinal clinical narratives for de-identification: The 2014 i2b2/UTHealth corpus , 2015, J. Biomed. Informatics.

[31]  Stephen Pulman,et al.  Evaluating the State of the Art , 1995 .

[32]  Anders Søgaard,et al.  Deep multi-task learning with low level tasks supervised at lower layers , 2016, ACL.

[33]  Sebastian Ruder,et al.  An Overview of Multi-Task Learning in Deep Neural Networks , 2017, ArXiv.

[34]  Eduard H. Hovy,et al.  End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF , 2016, ACL.

[35]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[36]  F. Wilcoxon Individual Comparisons by Ranking Methods , 1945 .

[37]  Johannes Bjerva,et al.  Will my auxiliary tagging task help? Estimating Auxiliary Tasks Effectivity in Multi-Task Learning , 2017, NODALIDA.

[38]  Jason Weston,et al.  A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.

[39]  Isabelle Augenstein,et al.  Multi-Task Learning of Pairwise Sequence Classification Tasks over Disparate Label Spaces , 2018, NAACL.

[40]  Joachim Bingel,et al.  Identifying beneficial task relations for multi-task learning in deep neural networks , 2017, EACL.

[41]  Yoshimasa Tsuruoka,et al.  A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks , 2016, EMNLP.

[42]  Nanyun Peng,et al.  Multi-task Domain Adaptation for Sequence Tagging , 2016, Rep4NLP@ACL.