Learning Classes of Probabilistic Automata

Probabilistic finite automata (PFA) model stochastic languages, i.e. probability distributions over strings. Inferring PFA from stochastic data is an open field of research. We show that PFA are identifiable in the limit with probability one. Multiplicity automata (MA) are another device for representing stochastic languages. We show that an MA may generate a stochastic language that cannot be generated by any PFA, but also that it is undecidable whether a given MA generates a stochastic language. Finally, we propose a learning algorithm for a subclass of PFA, called PRFA.
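
To make the notion of a stochastic language concrete, the following is a minimal sketch of how a PFA assigns a probability to a string. The two-state automaton, its parameter values, and the names `INITIAL`, `FINAL`, `TRANSITIONS`, `string_probability` and `total_mass` are all hypothetical and chosen only for illustration; they are not taken from the paper. The sketch assumes the usual normalization, in which each state's termination probability plus its outgoing transition probabilities sum to one.

```python
from itertools import product

# Hypothetical two-state PFA over the alphabet {"a", "b"} (illustrative values).
INITIAL = [1.0, 0.0]   # probability of starting in each state
FINAL = [0.2, 0.6]     # probability of stopping in each state
# TRANSITIONS[symbol][i][j]: probability of emitting `symbol` and moving i -> j.
# For each state i, FINAL[i] plus all outgoing transition weights sums to 1.
TRANSITIONS = {
    "a": [[0.5, 0.0], [0.0, 0.4]],
    "b": [[0.0, 0.3], [0.0, 0.0]],
}


def string_probability(word):
    """Return the probability the PFA assigns to `word` (forward computation)."""
    forward = list(INITIAL)
    for symbol in word:
        matrix = TRANSITIONS[symbol]
        # Propagate the probability mass through the transitions for `symbol`.
        forward = [
            sum(forward[i] * matrix[i][j] for i in range(len(forward)))
            for j in range(len(forward))
        ]
    # Weight the mass in each state by the probability of stopping there.
    return sum(f * t for f, t in zip(forward, FINAL))


def total_mass(max_length):
    """Sum the probabilities of all strings of length at most `max_length`."""
    return sum(
        string_probability("".join(letters))
        for n in range(max_length + 1)
        for letters in product("ab", repeat=n)
    )
```

With these illustrative values, `string_probability("ba")` is the product 0.3 × 0.4 × 0.6, and `total_mass(n)` approaches 1 as `n` grows, which is what it means for the automaton to define a probability distribution over strings.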
