Paper Information - Sequential Labeling Using Deep-Structured Conditional Random Fields

Sequential Labeling Using Deep-Structured Conditional Random Fields

We develop and present the deep-structured conditional random field (CRF), a multi-layer CRF model in which each higher layer's input observation sequence consists of the previous layer's observation sequence and the resulting frame-level marginal probabilities. Such a structure can closely approximate long-range state dependencies using only linear-chain or zeroth-order CRFs, by constructing features on the previous layer's output (belief). Although the final layer is trained to maximize the log-likelihood of the state (label) sequence, each lower layer is optimized by maximizing the frame-level marginal probabilities. In this deep-structured CRF, both parameter estimation and state sequence inference are carried out efficiently layer by layer, from bottom to top. We evaluate the deep-structured CRF on two natural language processing tasks: search query tagging and advertisement field segmentation. The experimental results demonstrate that the deep-structured CRF achieves word labeling accuracies that are significantly higher than the best results reported on these tasks using the same labeled training set.
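The layer-by-layer construction described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it makes several simplifying assumptions: every layer is a zeroth-order CRF (equivalent to a per-frame softmax classifier), training is plain gradient ascent on the frame-level log-marginals, the data is a single sequence, and the marginals passed upward include those of the immediate neighboring frames. The function names (`train_zeroth_order_crf`, `add_context`, `train_deep_crf`, `predict_deep_crf`) and all hyperparameters are hypothetical.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_zeroth_order_crf(X, y, n_labels, lr=0.5, epochs=200):
    """Assumption: a zeroth-order CRF, i.e. an independent softmax per frame,
    fitted by gradient ascent on the mean frame-level log-marginal."""
    W = np.zeros((X.shape[1], n_labels))
    Y = np.eye(n_labels)[y]                      # one-hot labels, shape (T, K)
    for _ in range(epochs):
        P = softmax(X @ W)                       # frame-level marginals
        W += lr * (X.T @ (Y - P)) / len(X)       # gradient of the log-likelihood
    return W

def add_context(P, window=1):
    """Stack each frame's marginals with those of its neighbors
    (zero-padded at the sequence ends), so the next layer sees nearby beliefs."""
    T, K = P.shape
    pad = np.zeros((window, K))
    padded = np.vstack([pad, P, pad])
    return np.hstack([padded[i:i + T] for i in range(2 * window + 1)])

def train_deep_crf(X, y, n_labels, n_layers=2):
    """Train bottom to top: each layer's input is the previous layer's
    input plus that layer's (context-expanded) marginals."""
    weights, inp = [], X
    for _ in range(n_layers):
        W = train_zeroth_order_crf(inp, y, n_labels)
        weights.append(W)
        inp = np.hstack([inp, add_context(softmax(inp @ W))])
    return weights

def predict_deep_crf(X, weights):
    """Run inference bottom to top and decode from the top layer's marginals."""
    inp = X
    for W in weights:
        P = softmax(inp @ W)
        inp = np.hstack([inp, add_context(P)])
    return P.argmax(axis=1), P

# Toy usage: one sequence of 6 frames with 3 made-up features and 2 labels.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(6, 3))
y_demo = np.array([0, 0, 1, 1, 0, 1])
demo_weights = train_deep_crf(X_demo, y_demo, n_labels=2)
demo_labels, demo_marginals = predict_deep_crf(X_demo, demo_weights)
```

With zeroth-order layers the sequence log-likelihood factorizes over frames, so the final layer here uses the same objective as the lower layers. A linear-chain top layer, as the abstract describes, would instead need forward-backward for training and Viterbi for decoding.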

[1] J. Darroch,et al. Generalized Iterative Scaling for Log-Linear Models , 1972 .

[2] J. Nocedal. Updating Quasi-Newton Matrices With Limited Storage , 1980 .

[3] Martin A. Riedmiller,et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm , 1993, IEEE International Conference on Neural Networks.

[4] Daniel P. W. Ellis,et al. Tandem connectionist feature extraction for conventional HMM systems , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[5] Andrew McCallum,et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[6] W. Bruce Croft,et al. Table extraction using conditional random fields , 2003, DG.O.

[7] Hector Garcia-Molina,et al. Extracting structured data from Web pages , 2003, SIGMOD '03.

[8] Andrew McCallum,et al. Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data , 2004, J. Mach. Learn. Res..

[9] Dan Klein,et al. Unsupervised Learning of Field Segmentation Models for Information Extraction , 2005, ACL.

[10] Dong Yu,et al. Evaluation of a long-contextual-Span hidden trajectory model and phonetic recognizer using a* lattice search , 2005, INTERSPEECH.

[11] Henry A. Kautz,et al. Hierarchical Conditional Random Fields for GPS-Based Activity Recognition , 2005, ISRR.

[12] William W. Cohen,et al. Stacked Sequential Learning , 2005, IJCAI.

[13] Paul A. Viola,et al. Learning to extract information from semi-structured text using a discriminative context free grammar , 2005, SIGIR '05.

[14] Dong Yu,et al. A bidirectional target-filtering model of speech coarticulation and reduction: two-stage implementation for phonetic recognition , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[15] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[16] Dong Yu,et al. Structured speech modeling , 2006, IEEE Transactions on Audio, Speech, and Language Processing.

[17] Dong Yu,et al. A lattice search technique for a long-contextual-span hidden trajectory model of speech , 2006, Speech Commun..

[18] Bo Zhang,et al. Webpage understanding: an integrated approach , 2007, KDD '07.

[19] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.

[20] Ming-Wei Chang,et al. Guiding Semi-Supervision with Constraint-Driven Learning , 2007, ACL.

[21] Gideon S. Mann,et al. Generalized Expectation Criteria for Semi-Supervised Learning of Conditional Random Fields , 2008, ACL.

[22] I. V. Ramakrishnan,et al. Exploiting Structured Reference Data for Unsupervised Text Segmentation with Conditional Random Fields , 2008, SDM.

[23] Tran The Truyen. On Conditional Random Fields : Applications , Feature Selection , Parameter Estimation and Hierarchical Modelling , 2008 .

[24] Xiao Li,et al. Learning query intent from regularized click graphs , 2008, SIGIR '08.

[25] Rosie Jones,et al. The Linguistic Structure of English Web-Search Queries , 2008, EMNLP.

[26] Dong Yu,et al. Solving nonlinear estimation problems using splines [Lecture Notes] , 2009, IEEE Signal Processing Magazine.

[27] Xiao Li,et al. Extracting structured information from user queries with semi-supervised conditional random fields , 2009, SIGIR.

[28] Li Deng,et al. Learning in the Deep-Structured Conditional Random Fields , 2009 .

[29] Yifan Gong,et al. A Novel Framework and Training Algorithm for Variable-Parameter Hidden Markov Models , 2009, IEEE Transactions on Audio, Speech, and Language Processing.

[30] Pushmeet Kohli,et al. Associative hierarchical CRFs for object class image segmentation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[31] Dong Yu,et al. Using continuous features in the maximum entropy model , 2009, Pattern Recognit. Lett..

[32] Xiao Li. On the Use of Virtual Evidence in Conditional Random Fields , 2009, EMNLP.

[33] Dong Yu,et al. Solving Nonlinear Estimation Problems Using Splines , 2009 .

[34] Dong Yu,et al. Language recognition using deep-structured conditional random fields , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.