Learning Probabilistic Residual Finite State Automata

We introduce a new class of probabilistic automata: Probabilistic Residual Finite State Automata (PRFA). We show that this class is characterized by a simple intrinsic property of the stochastic languages it generates (the set of residual languages is finitely generated by residuals) and that it admits canonical minimal forms. We prove that PRFA generate strictly more languages than Probabilistic Deterministic Finite Automata (PDFA). We present a first inference algorithm using this representation, and we show that stochastic languages represented by PRFA can be identified from a characteristic sample if words are provided with their probabilities of appearance in the target language.