An Algebra for Feature Extraction

Feature extraction is a necessary first step in statistical NLP, yet it is often treated as mere preprocessing. It can nonetheless dominate computation time, both during training and, especially, at deployment. In this paper, we formalize feature extraction from an algebraic perspective. Our formalization allows us to define a message passing algorithm that restructures feature templates to make them more computationally efficient. We show via experiments on text chunking and relation extraction that this restructuring speeds up feature extraction in practice by reducing redundant computation.
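To make "redundant computation" concrete, here is a minimal Python sketch. It is not the paper's message passing algorithm; it only illustrates the kind of distributive-law rewrite an algebraic view of templates permits. All names (`word`, `pos`, `suffix`, `conj`, `union`, `run`) and the toy sentence are hypothetical and chosen for illustration.

```python
from collections import Counter

calls = Counter()  # counts how often each base extractor runs

# Hypothetical base extractors: each maps (sentence, position) to a set of features.
def word(sent, i):
    calls["word"] += 1
    return {"w=" + sent[i][0].lower()}

def pos(sent, i):
    calls["pos"] += 1
    return {"p=" + sent[i][1]}

def suffix(sent, i):
    calls["suffix"] += 1
    return {"s=" + sent[i][0][-2:]}

def conj(f, g):
    """Conjoin two extractors: pair every feature of f with every feature of g."""
    def extract(sent, i):
        left, right = f(sent, i), g(sent, i)
        return {a + "&" + b for a in left for b in right}
    return extract

def union(f, g):
    """Add two extractors: emit the features of both."""
    return lambda sent, i: f(sent, i) | g(sent, i)

def run(template, sent, i):
    """Evaluate a template and report how many times each base extractor ran."""
    calls.clear()
    return template(sent, i), dict(calls)

sent = [("The", "DT"), ("cats", "NNS"), ("sleep", "VBP")]

# Two templates that share the `word` extractor, written naively ...
naive = union(conj(word, pos), conj(word, suffix))
# ... and the same feature set after factoring out the shared part.
factored = conj(word, union(pos, suffix))

naive_feats, naive_calls = run(naive, sent, 1)
fact_feats, fact_calls = run(factored, sent, 1)

print(naive_feats == fact_feats)                # both forms yield identical features
print(naive_calls["word"], fact_calls["word"])  # the shared extractor runs less often when factored
```

Both forms produce the same features, but the factored template evaluates the shared extractor once instead of twice. With many templates sharing expensive extractors, savings of this kind accumulate.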
