New language models using phrase structures extracted from parse trees

This paper proposes a new speech recognition scheme that applies three linguistic constraints. Multi-class composite bigram models [1] are used in the first and second passes to capture word-neighboring characteristics, extending conventional word n-gram models. In the third pass, trigram models with constituent-boundary markers and word-pattern models are used together to exploit phrasal constraints and headword co-occurrences, respectively. Both models are trained on a text corpus whose phrase structures are assigned by an example-based Transfer-Driven Machine Translation (TDMT) parser [2]. Speech recognition experiments show that the new scheme reduces word errors by 9.50% relative to the conventional scheme that uses only word-neighboring characteristics, i.e., the multi-class composite bigram models alone.
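As a rough illustration of the trigram-with-constituent-boundary-markers idea, the sketch below inserts a marker token at phrase boundaries and trains an add-one-smoothed trigram model over the augmented token stream, so the model can condition on phrase edges as well as words. This is not the authors' implementation; the marker symbol, token names, and smoothing choice are hypothetical simplifications.

```python
from collections import defaultdict

BOUNDARY = "<b>"  # hypothetical constituent-boundary marker token

def insert_markers(words, boundaries):
    """Insert a boundary marker after each word index listed in `boundaries`."""
    out = []
    for i, w in enumerate(words):
        out.append(w)
        if i in boundaries:
            out.append(BOUNDARY)
    return out

class TrigramLM:
    """Add-one-smoothed trigram model over the marker-augmented stream."""
    def __init__(self):
        self.tri = defaultdict(int)   # counts of (w1, w2, w3)
        self.bi = defaultdict(int)    # counts of the (w1, w2) history
        self.vocab = set()

    def train(self, tokens):
        toks = ["<s>", "<s>"] + tokens + ["</s>"]
        self.vocab.update(toks)
        for a, b, c in zip(toks, toks[1:], toks[2:]):
            self.tri[(a, b, c)] += 1
            self.bi[(a, b)] += 1

    def prob(self, a, b, c):
        # P(c | a, b) with add-one smoothing over the vocabulary
        return (self.tri[(a, b, c)] + 1) / (self.bi[(a, b)] + len(self.vocab))

lm = TrigramLM()
# Toy parse: boundaries after "i" and "like" -> "i <b> would like <b> a room"
lm.train(insert_markers(["i", "would", "like", "a", "room"], {0, 2}))
print(lm.prob("like", BOUNDARY, "a"))
```

Because the markers become ordinary tokens, the trigram "like `<b>` a" is scored differently from "like a" inside a phrase, which is how the boundary information enters the language model.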

[1] Yoshinori Sagisaka et al. Multi-class composite N-gram based on connection direction. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1999.

[2] Mari Ostendorf et al. HMM topology design using maximum likelihood successive state splitting. Computer Speech and Language, 1997.

[3] T. Takezawa. Speech and Language Database for Speech Translation Research in ATR. 1998.

[4] Frederick Jelinek et al. Exploiting Syntactic Structure for Language Modeling. ACL, 1998.

[5] Hitoshi Iida et al. Constituent Boundary Parsing for Example-Based Machine Translation. COLING, 1994.

[6] Ronald Rosenfeld et al. Trigger-based language models: a maximum entropy approach. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 1993.