Inference of finite-state transducers from regular languages

Finite-state transducers are models used in several areas of pattern recognition and computational linguistics. One of these areas is machine translation, where approaches that build models automatically from training examples are becoming increasingly attractive. Finite-state transducers are well suited to constrained tasks for which training samples of sentence pairs are available. This work proposes a technique for inferring finite-state transducers that is based on formal relations between finite-state transducers and finite-state grammars. Given a training corpus of input-output sentence pairs, the proposed approach uses statistical alignment methods to produce a set of conventional strings, from which a stochastic finite-state grammar is inferred. This grammar is finally transformed into a finite-state transducer. The proposed methods are assessed through a series of machine translation experiments within the framework of the EUTRANS project.
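
The pipeline described above (aligned sentence pairs, then conventional strings, then a stochastic finite-state grammar, then a transducer) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several things in it are assumptions made only for illustration: the alignments are supplied by hand rather than produced by a statistical alignment method, each target word is attached to one source word in non-decreasing order, the grammar is a plain unsmoothed bigram model over the resulting compound symbols, and the function names (`to_extended_strings`, `infer_bigram_transducer`, `translate`) and the toy sentences are hypothetical.

```python
from collections import defaultdict

START, END = "<s>", "</s>"

def to_extended_strings(pairs):
    """Turn aligned sentence pairs into strings of compound symbols.

    Each pair is (source_words, target_words, alignment), where
    alignment[j] is the index of the source word that target word j is
    attached to. Assumption: alignment is non-decreasing, so the target
    word order is preserved when reading the compound symbols in order.
    """
    strings = []
    for source, target, alignment in pairs:
        assert list(alignment) == sorted(alignment), "alignment must be monotone"
        attached = defaultdict(list)
        for j, i in enumerate(alignment):
            attached[i].append(target[j])
        # One compound symbol per source word: (source word, attached target words).
        strings.append([(s, tuple(attached[i])) for i, s in enumerate(source)])
    return strings

def infer_bigram_transducer(strings):
    """Estimate an unsmoothed bigram model over compound symbols.

    The state is the previous compound symbol. Reading an edge labelled
    (source word, target words) as input/output makes the model a transducer.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for string in strings:
        state = START
        for symbol in string:
            counts[state][symbol] += 1
            state = symbol
        counts[state][END] += 1
    return {
        state: {sym: n / sum(nxt.values()) for sym, n in nxt.items()}
        for state, nxt in counts.items()
    }

def translate(edges, source):
    """Viterbi-style search for the most probable path accepting `source`."""
    best = {START: (1.0, [])}  # state -> (path probability, output so far)
    for word in source:
        reached = {}
        for state, (prob, output) in best.items():
            for symbol, p in edges.get(state, {}).items():
                if symbol != END and symbol[0] == word:
                    candidate = (prob * p, output + list(symbol[1]))
                    if symbol not in reached or candidate[0] > reached[symbol][0]:
                        reached[symbol] = candidate
        best = reached
    finals = [
        (prob * edges.get(state, {}).get(END, 0.0), output)
        for state, (prob, output) in best.items()
    ]
    finals = [f for f in finals if f[0] > 0.0]
    return max(finals, key=lambda f: f[0])[1] if finals else None

# Toy, made-up training data (not taken from the EUTRANS corpus).
toy_pairs = [
    (["una", "habitación", "doble"], ["a", "double", "room"], [0, 2, 2]),
    (["una", "habitación", "individual"], ["a", "single", "room"], [0, 2, 2]),
]
model = infer_bigram_transducer(to_extended_strings(toy_pairs))
print(translate(model, ["una", "habitación", "individual"]))
```

In this sketch the reordered words "double room" are both attached to the last source word, so "habitación" emits nothing and the reordering is absorbed into a single compound symbol. A sentence containing a word sequence never seen in training has no accepting path and yields `None`; a practical system would need smoothing to handle that case.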
