Joint Segmentation and Tagging with Coupled Sequences Labeling

Segmentation and tagging is a fundamental problem in natural language processing (NLP). Traditional methods solve this problem in either a pipeline or a joint cross-label fashion, which suffer from error propagation and a large number of labels, respectively. In this paper, we present a novel joint model for segmentation and tagging that integrates two dependent Markov chains: one for segmentation and the other for tagging. The parameters of both chains can be estimated simultaneously. In addition, the whole model can be optimized by improving a single chain. Experiments show that our model achieves higher performance than traditional models on English shallow parsing and on Chinese word segmentation and POS tagging.
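The label-count problem of the cross-label approach can be illustrated with a minimal sketch. It does not implement the model proposed in the paper. The label inventories and the helper names `cross_labels` and `split_cross_label` are assumptions made for illustration only. The sketch shows that a single chain over cross-labels needs roughly the product of the two label sets, whereas two separate chains keep only their sum.

```python
from itertools import product

# Hypothetical label inventories, chosen only for illustration.
SEG_LABELS = ["B", "I", "E", "S"]            # position of a character within a word
TAG_LABELS = ["NN", "VV", "AD", "P", "PU"]   # a tiny POS tag set

def cross_labels(seg_labels, tag_labels):
    """Build the cross-label set (e.g. 'B-NN') that a single-chain joint model would use."""
    return [f"{s}-{t}" for s, t in product(seg_labels, tag_labels)]

def split_cross_label(label):
    """Recover the (segmentation, tag) pair from a cross-label such as 'B-NN'."""
    seg, tag = label.split("-", 1)
    return seg, tag

joint = cross_labels(SEG_LABELS, TAG_LABELS)
print(len(joint))                            # size of the cross-label space
print(len(SEG_LABELS) + len(TAG_LABELS))     # labels kept by two separate chains
```

With these toy inventories the cross-label space has 20 entries against 9 labels for two chains; the gap widens quickly with a realistic tag set.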
