Realization of common statistical methods in computational linguistics with functional automata

In this paper we present functional automata as a general framework for representing, training, and exploring various statistical models such as LLMs, HMMs, and CRFs. Our contribution is a new construction that allows the derivatives of a function given by a functional automaton to be represented. It preserves the natural representation of the functions and the standard product and sum operations on real numbers. At the same time, it adds no overhead to the standard dynamic programming techniques used to compute a functional value.
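
To make the idea of carrying derivatives through dynamic programming concrete, the following is a minimal sketch and not the construction proposed in the paper. It assumes a generic technique: each transition weight is a pair (value, derivative), combined with the usual sum and product rules, and a forward pass sums the path products over all accepting paths. The names `Dual`, `forward`, and `toy_automaton`, and the two-state automaton itself, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A pair (value, derivative) combined with the sum and product rules."""
    val: float
    der: float = 0.0

    def __add__(self, other):
        return Dual(self.val + other.val, self.der + other.der)

    def __mul__(self, other):
        # Product rule: (f * g)' = f * g' + f' * g
        return Dual(self.val * other.val,
                    self.val * other.der + self.der * other.val)


def forward(transitions, start, finals, word):
    """Sum, over all accepting paths for `word`, of the product of weights."""
    # current[state] = total weight of all paths that read the prefix so far
    current = {start: Dual(1.0)}
    for symbol in word:
        nxt = {}
        for state, weight in current.items():
            for target, w in transitions.get((state, symbol), []):
                nxt[target] = nxt.get(target, Dual(0.0)) + weight * w
        current = nxt
    total = Dual(0.0)
    for state, weight in current.items():
        if state in finals:
            total = total + weight
    return total


def toy_automaton(theta):
    """Hypothetical two-state automaton with weights depending on theta."""
    t = Dual(theta, 1.0)  # the parameter itself: d(theta)/d(theta) = 1
    c = Dual(0.5)         # a constant weight: derivative 0
    return {
        (0, "a"): [(0, t), (1, c)],
        (1, "a"): [(1, t)],
    }


result = forward(toy_automaton(2.0), start=0, finals={1}, word="aaa")
print(result.val, result.der)
```

For the word "aaa" this toy automaton has three accepting paths, each with weight 0.5·θ², so the function is 1.5·θ² and its derivative is 3·θ. At θ = 2 the single forward pass yields both numbers, 6.0 and 6.0, without a separate pass for the derivative.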
