Segment-Level Sequence Modeling using Gated Recursive Semi-Markov Conditional Random Fields

Most sequence tagging tasks in natural language processing require recognizing segments that have a certain syntactic role or semantic meaning in a sentence. They are usually tackled with Conditional Random Fields (CRFs), which model segments only indirectly, through word-level labels over word-level features, and thus cannot make full use of segment-level information. Semi-Markov Conditional Random Fields (Semi-CRFs) model segments directly, but extracting segment-level features for Semi-CRFs remains a very challenging problem. This paper presents Gated Recursive Semi-CRFs (grSemi-CRFs), which model segments directly and automatically learn segment-level features through a gated recursive convolutional neural network. Our experiments on text chunking and named entity recognition (NER) demonstrate that grSemi-CRFs generally outperform other neural models.
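
To make the contrast between word-level and segment-level modeling concrete, the following is a minimal sketch of segment-level (semi-Markov) Viterbi decoding, in which every candidate span is scored as a whole rather than word by word. It is an illustration under stated assumptions, not the paper's method: the names `semi_markov_viterbi` and `toy_score`, the hand-written lexicon, and the score values are all hypothetical, label-transition scores are omitted for brevity, and in grSemi-CRFs the segment score would instead come from the learned gated recursive network.

```python
from typing import Callable, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_viterbi(
    n: int,
    labels: List[str],
    max_len: int,
    score: Callable[[int, int, str], float],
) -> Tuple[float, List[Segment]]:
    """Best segmentation of positions 0..n-1 under a segment-level score.

    Simplifying assumption: the total score is the sum of per-segment
    scores, with no transition term between adjacent segment labels.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n + 1)  # best[i]: best score of a segmentation of words[:i]
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for length in range(1, min(max_len, end) + 1):
            start = end - length
            for label in labels:
                cand = best[start] + score(start, end, label)
                if cand > best[end]:
                    best[end] = cand
                    back[end] = (start, label)
    # Follow back-pointers from the end of the sentence to recover segments.
    segments: List[Segment] = []
    end = n
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return best[n], segments


# Toy data: a hypothetical lexicon stands in for learned segment features.
words = ["Jun", "Zhu", "visited", "New", "York"]
lexicon = {("Jun", "Zhu"): "PER", ("New", "York"): "LOC"}


def toy_score(start: int, end: int, label: str) -> float:
    span = tuple(words[start:end])
    if lexicon.get(span) == label:
        return 2.0 * (end - start)  # reward whole-span lexicon matches
    if label == "O" and end - start == 1:
        return 0.5                  # mild reward for single non-entity words
    return -1.0                     # penalize everything else


total, segs = semi_markov_viterbi(len(words), ["O", "PER", "LOC"], 3, toy_score)
print(total, segs)
# -> 8.5 [(0, 2, 'PER'), (2, 3, 'O'), (3, 5, 'LOC')]
```

Because the score function sees the full span, "New York" can be rewarded as a single unit, which a word-level CRF can only approximate through per-word labels such as B-LOC and I-LOC.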

Bo Zhang | Jun Zhu | Yong Cao | Zaiqing Nie | Jingwei Zhuo
