Paper information - Torch-Struct: Deep Structured Prediction Library

Torch-Struct: Deep Structured Prediction Library

The literature on structured prediction for NLP describes a rich collection of distributions and algorithms over sequences, segmentations, alignments, and trees; however, these algorithms are difficult to utilize in deep learning frameworks. We introduce Torch-Struct, a library for structured prediction designed to take advantage of and integrate with vectorized, auto-differentiation based frameworks. Torch-Struct includes a broad collection of probabilistic structures accessed through a simple and flexible distribution-based API that connects to any deep learning model. The library utilizes batched, vectorized operations and exploits auto-differentiation to produce readable, fast, and testable code. Internally, we also include a number of general-purpose optimizations to provide cross-algorithm efficiency. Experiments show significant performance gains over fast baselines and case-studies demonstrate the benefits of the library. Torch-Struct is available at this https URL.
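To make the auto-differentiation idea concrete, here is a minimal sketch in plain Python. It is not Torch-Struct's API or implementation: every function name is hypothetical, the model is a small unbatched linear chain, and central finite differences stand in for a framework's automatic differentiation. It shows the forward algorithm computing a log-partition value in log space, and label marginals recovered as the gradient of that value with respect to the per-position scores.

```python
import itertools
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_partition(unary, trans):
    """Forward algorithm in log space for a linear chain (illustrative only).

    unary[t][c]: score of label c at position t
    trans[p][c]: score of moving from label p to label c
    """
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            unary[t][c]
            + logsumexp([alpha[p] + trans[p][c] for p in range(len(alpha))])
            for c in range(len(unary[t]))
        ]
    return logsumexp(alpha)


def brute_force_log_partition(unary, trans):
    """Reference value: enumerate every label sequence explicitly."""
    n, k = len(unary), len(unary[0])
    scores = []
    for labels in itertools.product(range(k), repeat=n):
        s = sum(unary[t][labels[t]] for t in range(n))
        s += sum(trans[labels[t - 1]][labels[t]] for t in range(1, n))
        scores.append(s)
    return logsumexp(scores)


def marginals_by_gradient(unary, trans, eps=1e-5):
    """Approximate d log Z / d unary[t][c] with central differences.

    An auto-differentiation framework would return this gradient exactly;
    finite differences are used here only to keep the sketch stdlib-only.
    """
    result = []
    for t in range(len(unary)):
        row = []
        for c in range(len(unary[t])):
            up = [list(r) for r in unary]
            down = [list(r) for r in unary]
            up[t][c] += eps
            down[t][c] -= eps
            diff = log_partition(up, trans) - log_partition(down, trans)
            row.append(diff / (2 * eps))
        result.append(row)
    return result


# Made-up scores: 4 positions, 3 labels.
unary = [[0.5, -0.2, 0.1], [0.0, 1.0, -1.0], [0.3, 0.3, -0.4], [-0.6, 0.2, 0.9]]
trans = [[0.2, -0.5, 0.0], [0.1, 0.4, -0.3], [-0.2, 0.0, 0.6]]

log_z = log_partition(unary, trans)
marginals = marginals_by_gradient(unary, trans)
```

Each row of `marginals` is a distribution over the labels at one position, so it sums to one up to the finite-difference error. The forward pass here costs time linear in the sequence length, while the brute-force check is exponential and is included only to validate the result.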

Alexander M. Rush

[1] Dustin Tran, et al. TensorFlow Distributions, 2017, ArXiv.

[2] Alexander M. Rush, et al. Learning Neural Templates for Text Generation, 2018, EMNLP.

[3] Noah D. Goodman, et al. Pyro: Deep Universal Probabilistic Programming, 2018, J. Mach. Learn. Res.

[4] Zhifei Li, et al. First- and Second-Order Expectation Semirings with Applications to Minimum-Risk Training on Translation Forests, 2009, EMNLP.

[5] Samuel R. Bowman, et al. ListOps: A Diagnostic Dataset for Latent Tree Learning, 2018, NAACL.

[6] L. Baum, et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains, 1966.

[7] James H. Martin, et al. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition, 2nd Edition, 2000, Prentice Hall series in artificial intelligence.

[8] Xavier Carreras, et al. Structured Prediction Models via the Matrix-Tree Theorem, 2007, EMNLP.

[9] Joshua Goodman, et al. Semiring Parsing, 1999, CL.

[10] Jason Weston, et al. Natural Language Processing (Almost) from Scratch, 2011, J. Mach. Learn. Res.

[11] Noah A. Smith, et al. Dyna: a declarative language for implementing dynamic programs, 2004, ACL 2004.

[12] Eric P. Xing, et al. Turbo Parsers: Dependency Parsing by Approximate Variational Inference, 2010, EMNLP.

[13] Dan Klein, et al. Neural CRF Parsing, 2015, ACL.

[14] Arthur Mensch, et al. Differentiable Dynamic Programming for Structured Prediction and Attention, 2018, ICML.

[15] Yue Zhang, et al. NCRF++: An Open-source Neural Sequence Labeling Toolkit, 2018, ACL.

[16] Hermann Ney, et al. HMM-Based Word Alignment in Statistical Translation, 1996, COLING.

[17] Simo Särkkä, et al. Temporal Parallelization of Bayesian Filters and Smoothers, 2019, ArXiv.

[18] Alexander M. Rush, et al. Structured Attention Networks, 2017, ICLR.

[19] Sven Behnke, et al. PyStruct: learning structured prediction in python, 2014, J. Mach. Learn. Res.

[20] Tadao Kasami, et al. An Efficient Recognition and Syntax-Analysis Algorithm for Context-Free Languages, 1965.

[21] Jason Eisner, et al. Bilexical Grammars and their Cubic-Time Parsing Algorithms, 2000.

[22] Andrew McCallum, et al. Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data, 2001, ICML.

[23] Simo Särkkä, et al. Temporal Parallelization of Bayesian Smoothers, 2021, IEEE Transactions on Automatic Control.

[24] Armand Joulin, et al. Cooperative Learning of Disjoint Syntax and Semantics, 2019, NAACL.

[25] Michael I. Jordan, et al. Factorial Hidden Markov Models, 1995, Machine Learning.

[26] Vaibhava Goel, et al. Self-Critical Sequence Training for Image Captioning, 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[27] Jason Eisner, et al. Inside-Outside and Forward-Backward Algorithms Are Just Backprop (tutorial paper), 2016, SPNLP@EMNLP.

[28] Thomas L. Griffiths, et al. A fully Bayesian approach to unsupervised part-of-speech tagging, 2007, ACL.

[29] Saul B. Needleman, et al. A General Method Applicable to the Search for Similarities in the Amino Acid Sequence of Two Proteins, 1970.

[30] Christopher D. Manning, et al. Efficient, Feature-based, Conditional Random Field Parsing, 2008, ACL.

[31] Wang Ling, et al. Learning to Compose Words into Sentences with Reinforcement Learning, 2016, ICLR.

[32] William W. Cohen, et al. Semi-Markov Conditional Random Fields for Information Extraction, 2004, NIPS.

[33] Fernando Pereira, et al. Non-Projective Dependency Parsing using Spanning Tree Algorithms, 2005, HLT.

[34] Ryan P. Adams, et al. Composing graphical models with neural networks for structured representations and fast inference, 2016, NIPS.

[35] Haichen Shen, et al. TVM: An Automated End-to-End Optimizing Compiler for Deep Learning, 2018.

[36] Hermann Ney, et al. Word Reordering and a Dynamic Programming Beam Search Algorithm for Statistical Machine Translation, 2003, CL.