Learning Tag Dependencies for Sequence Tagging

Sequence tagging is the basis for multiple applications in natural language processing. Despite successes in learning long-term token sequence dependencies with neural networks, tag dependencies have rarely been considered. Yet sequence tagging involves complex dependencies and interactions among the input tokens and the output tags. We propose a novel multi-channel model that handles different ranges of token-tag dependencies and their interactions simultaneously. The model is augmented with a tag LSTM that manages the output tag dependencies and word-tag interactions, and three mechanisms are presented to efficiently incorporate the token context representation and the tag dependency. Extensive experiments on part-of-speech tagging and named entity recognition show that the proposed model outperforms the BiLSTM-CRF baseline by effectively incorporating the tag dependency feature.
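
To make the idea of a tag dependency concrete, the following is a minimal sketch, not the model proposed here. It assumes that a recurrent state summarizing the previously predicted tags is concatenated with each token's context vector before scoring the next tag. A plain tanh recurrent cell stands in for the tag LSTM, the weights are random and untrained, and every name (`greedy_tag`, `token_ctx`, `tag_state`) is hypothetical. The abstract does not specify the three incorporation mechanisms, so concatenation is only one possible choice.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def greedy_tag(token_ctx, num_tags, tag_dim=4, seed=0):
    """Tag left to right, conditioning each decision on a tag-history state.

    token_ctx: one context vector per token (assumed to come from a token
    encoder such as a BiLSTM; here they are just lists of floats).
    """
    rng = random.Random(seed)
    ctx_dim = len(token_ctx[0])
    tag_emb = rand_matrix(num_tags, tag_dim, rng)   # one embedding per tag
    W_in = rand_matrix(tag_dim, tag_dim, rng)       # tag embedding -> state
    W_rec = rand_matrix(tag_dim, tag_dim, rng)      # previous state -> state
    W_out = rand_matrix(num_tags, ctx_dim + tag_dim, rng)

    tag_state = [0.0] * tag_dim  # summary of tags predicted so far
    tags = []
    for ctx in token_ctx:
        # Word-tag interaction (assumed form): score tags from the
        # concatenation of token context and tag-history state.
        scores = matvec(W_out, list(ctx) + tag_state)
        tag = max(range(num_tags), key=scores.__getitem__)
        tags.append(tag)
        # Tag dependency: fold the chosen tag into the recurrent state
        # (simple tanh cell standing in for an LSTM).
        a = matvec(W_in, tag_emb[tag])
        b = matvec(W_rec, tag_state)
        tag_state = [math.tanh(x + y) for x, y in zip(a, b)]
    return tags


if __name__ == "__main__":
    contexts = [[0.1, -0.3, 0.5], [0.7, 0.2, -0.1], [-0.4, 0.0, 0.9]]
    print(greedy_tag(contexts, num_tags=3))
```

Because `tag_state` depends on every earlier prediction, the score at each position can in principle reflect tag history of any length, whereas a first-order CRF models only adjacent tag pairs.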
