Lp Distance and Equivalence of Probabilistic Automata

This paper presents an exhaustive analysis of the problem of computing the Lp distance between two probabilistic automata. It gives efficient exact and approximate algorithms for computing this distance when p is even, and proves that the problem is NP-hard for all odd values of p, thereby completing previously known hardness results. It further proves that approximating the Lp distance between two probabilistic automata is hard for odd values of p. Techniques similar to those used for the Lp distance also yield efficient algorithms for computing the Hellinger distance between two unambiguous probabilistic automata, both exactly and approximately. A problem closely related to computing a distance between probabilistic automata is that of testing their equivalence. This paper also describes an efficient algorithm for testing the equivalence of two arbitrary probabilistic automata A1 and A2 in time O(|Σ|(|A1| + |A2|)^3), a significant improvement over the previously best reported algorithm for this problem.
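
To illustrate why even values of p are tractable, consider p = 2: the squared distance expands as Σ_x (A1(x) − A2(x))^2 = Σ_x A1(x)^2 − 2 Σ_x A1(x)A2(x) + Σ_x A2(x)^2, and each sum can be evaluated on a product automaton. The Python sketch below is an illustration under stated assumptions, not the paper's algorithm: it assumes automata without ε-transitions given as hypothetical `(init, trans, final)` triples of an initial vector, one transition matrix per symbol, and a final vector, and it replaces a shortest-distance computation with a dense exact linear solve. The names `solve`, `inner_product`, and `l2_squared` are introduced here for illustration only.

```python
from fractions import Fraction
from itertools import product


def solve(matrix, rhs):
    """Solve matrix * x = rhs exactly by Gauss-Jordan elimination."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(rhs[i])]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Assumption: the matrix is invertible; otherwise next() raises StopIteration.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]


def inner_product(A, B):
    """Return the sum over all strings x of A(x) * B(x).

    Builds the product automaton on state pairs (i, j) and solves
    (I - M) v = f, where M sums the per-symbol products of transition
    weights and f holds the products of final weights.
    """
    init_a, trans_a, final_a = A
    init_b, trans_b, final_b = B
    pairs = list(product(range(len(init_a)), range(len(init_b))))
    symbols = set(trans_a) & set(trans_b)  # other symbols contribute zero
    system = []
    for (i, j) in pairs:
        row = []
        for (k, l) in pairs:
            weight = sum((trans_a[s][i][k] * trans_b[s][j][l] for s in symbols),
                         Fraction(0))
            row.append(Fraction(int((i, j) == (k, l))) - weight)
        system.append(row)
    finals = [final_a[i] * final_b[j] for (i, j) in pairs]
    v = solve(system, finals)
    return sum((init_a[i] * init_b[j] * v[idx]
                for idx, (i, j) in enumerate(pairs)), Fraction(0))


def l2_squared(A, B):
    """Squared L2 distance: <A, A> - 2 <A, B> + <B, B>."""
    return inner_product(A, A) - 2 * inner_product(A, B) + inner_product(B, B)


half, third = Fraction(1, 2), Fraction(1, 3)
zero, one = Fraction(0), Fraction(1)

# A1(a^n) = (1/2)^(n+1): one state, loop with weight 1/2, final weight 1/2.
A1 = ([one], {"a": [[half]]}, [half])
# A2(a^n) = (2/3)^n * (1/3): one state, loop with weight 2/3, final weight 1/3.
A2 = ([one], {"a": [[2 * third]]}, [third])
# B defines the same distribution as A1 using two alternating states.
B = ([one, zero], {"a": [[zero, half], [half, zero]]}, [half, half])

print(l2_squared(A1, A2))  # 1/30
print(l2_squared(A1, B))   # 0
```

The second result reflects the link between distance and equivalence: two automata define the same distribution exactly when their L2 distance is zero. The product system here has |A1|·|A2| unknowns, so this dense solve is only practical for small automata.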
