Iterative Viterbi A* Algorithm for K-Best Sequential Decoding

Sequential modeling has been widely used in a variety of important applications, including named entity recognition and shallow parsing. However, as more and more real-time, large-scale tagging applications arise, decoding speed has become a bottleneck for existing sequential tagging algorithms. In this paper we propose 1-best A*, 1-best iterative A*, k-best A* and k-best iterative Viterbi A* algorithms for sequential decoding. We show the efficiency of these proposed algorithms on five NLP tagging tasks. In particular, we show that iterative Viterbi A* decoding can be several times to orders of magnitude faster than the state-of-the-art algorithm for tagging tasks with a large number of labels. This algorithm makes real-time, large-scale tagging applications with thousands of labels feasible.

Yi Chang | Zhiheng Huang | Anlei Dong | Bo Long | Jean-François Crespo | Su-Lin Wu | Sathiya Keerthi
