Inference Algorithms for Pattern-Based CRFs on Sequence Data

We consider conditional random fields (CRFs) with pattern-based potentials defined on a chain. In this model the energy of a string (labeling) $$x_1\ldots x_n$$ is the sum of terms over intervals [i, j], where each term is non-zero only if the substring $$x_i\ldots x_j$$ equals a prespecified pattern w. Such CRFs can be naturally applied to many sequence tagging problems. We present efficient algorithms for the three standard inference tasks in a CRF, namely computing (i) the partition function, (ii) marginals, and (iii) the MAP. Their complexities are respectively $$O(nL)$$, $$O(nL\,\ell_{\max})$$ and $$O(nL \min\{|D|, \log(\ell_{\max}+1)\})$$, where L is the combined length of input patterns, $$\ell_{\max}$$ is the maximum length of a pattern, and D is the input alphabet. This improves on the previous algorithms of Ye et al. (NIPS, 2009), whose complexities are respectively $$O(nL|D|)$$, $$O\left(n|\varGamma|L^2\ell_{\max}^2\right)$$ and $$O(nL|D|)$$, where $$|\varGamma|$$ is the number of input patterns. In addition, we give an efficient algorithm for sampling, and revisit the case of MAP with non-positive weights.
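To make the model concrete, the following is a minimal brute-force sketch of the energy, the partition function, and the MAP. It is not one of the algorithms announced above: it enumerates all $$|D|^n$$ labelings and serves only as a reference for tiny inputs. The function names, the `pattern_cost` dictionary, the position-independent pattern weights, and the convention that the probability of x is proportional to exp(-E(x)) (so that MAP means minimum energy) are all assumptions made for illustration.

```python
import itertools
import math


def energy(x, pattern_cost):
    """Sum the cost of every occurrence of every pattern in the string x.

    pattern_cost maps a pattern w to a cost that is paid once for each
    interval [i, j] with x[i..j] == w (assumed independent of position).
    """
    total = 0.0
    for w, cost in pattern_cost.items():
        for i in range(len(x) - len(w) + 1):
            if x[i:i + len(w)] == w:
                total += cost
    return total


def brute_force_inference(n, alphabet, pattern_cost):
    """Enumerate all labelings of length n; return (Z, MAP labeling).

    Z is the partition function, the sum of exp(-E(x)) over all x.
    The MAP labeling is taken to be a labeling of minimum energy.
    """
    z = 0.0
    best, best_energy = None, math.inf
    for chars in itertools.product(alphabet, repeat=n):
        x = "".join(chars)
        e = energy(x, pattern_cost)
        z += math.exp(-e)
        if e < best_energy:
            best, best_energy = x, e
    return z, best
```

For example, with alphabet `"ab"`, length 2, and the single pattern `"ab"` of cost -1, the labelings `aa`, `ba`, `bb` have energy 0 and `ab` has energy -1, so the sketch returns Z = 3 + e and the MAP labeling `ab`.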

[1] Thomas G. Dietterich, et al. Training conditional random fields via gradient tree boosting, 2004, ICML.

[2] V. Thorsson, et al. HMMSTR: a hidden Markov model for local sequence-structure correlations in proteins, 2000, Journal of Molecular Biology.

[3] Wee Sun Lee, et al. Semi-Markov Conditional Random Field with High-Order Features, 2011.

[4] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[5] Pushmeet Kohli, et al. Minimizing sparse higher order energy functions of discrete variables, 2009, IEEE Conference on Computer Vision and Pattern Recognition.

[6] Thomas Hofmann, et al. Support vector machine learning for interdependent and structured output spaces, 2004, ICML.

[7] Uzi Vishkin, et al. Recursive Star-Tree Parallel Data Structure, 1993, SIAM J. Comput.

[8] Michael D. Vose, et al. A Linear Algorithm For Generating Random Numbers With a Given Distribution, 1991, IEEE Trans. Software Eng.

[9] Nan Ye, et al. Conditional random field with high-order dependencies for sequence labeling and segmentation, 2014, J. Mach. Learn. Res.

[10] Xuanjing Huang, et al. Sparse higher order conditional random fields for improved sequence labeling, 2009, ICML.

[11] Nikos Komodakis, et al. Beyond pairwise energies: Efficient optimization for higher-order MRFs, 2009, IEEE Conference on Computer Vision and Pattern Recognition.

[12] Dan Wu, et al. Conditional Random Fields with High-Order Features for Sequence Labeling, 2009, NIPS.

[13] William W. Cohen, et al. Semi-Markov Conditional Random Fields for Information Extraction, 2004, NIPS.