Probabilistic finite-state machines - part II

Probabilistic finite-state machines are used today in a variety of areas in pattern recognition and in fields to which pattern recognition is linked. In part I of this paper, we surveyed these objects and studied their properties. In this part, we study the relations between probabilistic finite-state automata and other well-known devices that generate strings, such as hidden Markov models and n-grams, and we provide theorems, algorithms, and properties that represent the current state of the art for these objects.
