3-Way Composition of Weighted Finite-State Transducers

Composition of weighted transducers is a fundamental algorithm used in many applications: computing complex edit-distances between automata, computing string kernels in machine learning, and combining the components of a speech recognition, speech synthesis, or information extraction system. We present a generalization of the composition of weighted transducers, 3-way composition, which is dramatically faster in practice than the standard composition algorithm when combining more than two transducers. The worst-case complexity of our algorithm for composing three transducers $T_1$, $T_2$, and $T_3$ into a result $T$ is $O(|T|_Q \min(d(T_1)\, d(T_3),\, d(T_2)) + |T|_E)$, where $|\cdot|_Q$ denotes the number of states, $|\cdot|_E$ the number of transitions, and $d(\cdot)$ the maximum out-degree. As in regular composition, the use of perfect hashing requires a pre-processing step whose expected complexity is linear in the size of the input transducers. In many cases, this approach significantly improves on the complexity of standard composition. Furthermore, standard composition can be obtained as a special case of our algorithm. We report the results of several experiments demonstrating the practical speed-up. These theoretical and empirical improvements significantly enhance performance in the applications mentioned above.
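To make the min(d(T_1) d(T_3), d(T_2)) factor in the bound concrete, here is a minimal Python sketch of one way a 3-way composition could be organized. It is an illustration under stated assumptions, not the authors' implementation: transducers are assumed epsilon-free, weights are combined by addition (tropical semiring), ordinary Python dictionaries stand in for the perfect hashing, and the dictionary layout and the name `compose3` are hypothetical. At each state triple the sketch either pairs transitions of T_1 and T_3 and looks up a matching T_2 transition, or scans the T_2 transitions and looks up matching T_1 and T_3 transitions, whichever requires fewer iterations.

```python
from collections import defaultdict, deque

# A transition is a tuple (src, ilabel, olabel, weight, dst).
# A transducer is a dict with keys "start", "finals" ({state: weight})
# and "transitions". This layout is an assumption made for the sketch.

def compose3(t1, t2, t3):
    """Epsilon-free 3-way composition over the tropical semiring (sketch)."""
    out1, out2, out3 = defaultdict(list), defaultdict(list), defaultdict(list)
    by_out1 = defaultdict(list)  # (state, output label) -> T1 transitions
    by_io2 = defaultdict(list)   # (state, input, output) -> T2 transitions
    by_in3 = defaultdict(list)   # (state, input label) -> T3 transitions
    for e in t1["transitions"]:
        out1[e[0]].append(e)
        by_out1[(e[0], e[2])].append(e)
    for e in t2["transitions"]:
        out2[e[0]].append(e)
        by_io2[(e[0], e[1], e[2])].append(e)
    for e in t3["transitions"]:
        out3[e[0]].append(e)
        by_in3[(e[0], e[1])].append(e)

    start = (t1["start"], t2["start"], t3["start"])
    seen, queue = {start}, deque([start])
    transitions, finals = [], {}
    while queue:
        q = queue.popleft()
        q1, q2, q3 = q
        if q1 in t1["finals"] and q2 in t2["finals"] and q3 in t3["finals"]:
            finals[q] = t1["finals"][q1] + t2["finals"][q2] + t3["finals"][q3]
        if len(out1[q1]) * len(out3[q3]) <= len(out2[q2]):
            # Pair T1 and T3 transitions, then hash into T2.
            matches = [(e1, e2, e3)
                       for e1 in out1[q1]
                       for e3 in out3[q3]
                       for e2 in by_io2[(q2, e1[2], e3[1])]]
        else:
            # Scan T2 transitions, then hash into T1 and T3.
            matches = [(e1, e2, e3)
                       for e2 in out2[q2]
                       for e1 in by_out1[(q1, e2[1])]
                       for e3 in by_in3[(q3, e2[2])]]
        for e1, e2, e3 in matches:
            dst = (e1[4], e2[4], e3[4])
            transitions.append((q, e1[1], e3[2], e1[3] + e2[3] + e3[3], dst))
            if dst not in seen:
                seen.add(dst)
                queue.append(dst)
    return {"start": start, "finals": finals, "transitions": transitions}

# Toy input: T1 maps a to b, T2 maps b to c, T3 maps c to d.
t1 = {"start": 0, "finals": {1: 0}, "transitions": [(0, "a", "b", 1, 1)]}
t2 = {"start": 0, "finals": {1: 0}, "transitions": [(0, "b", "c", 2, 1)]}
t3 = {"start": 0, "finals": {1: 0.5}, "transitions": [(0, "c", "d", 3, 1)]}
result = compose3(t1, t2, t3)
print(result["transitions"])
```

Only state triples reachable from the start triple are ever built, which is why the bound above is stated in terms of the size of the result T rather than the product of the input sizes.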
