Preprocessing for Unification Parsing of Spoken Language

Wordgraphs are structures that may be output by speech recognisers. We discuss various methods for turning wordgraphs into smaller structures. One of these methods is novel: it relies on a new kind of determinization of acyclic weighted finite automata that is language-preserving but not fully weight-preserving, and it yields smaller automata than traditional determinization of weighted finite automata. We present empirical data comparing the respective methods. The methods are relevant for systems in which wordgraphs form the input to kinds of syntactic analysis that are very time-consuming, such as unification parsing.
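To make the phrase "language-preserving but not fully weight-preserving" concrete, the following is a minimal sketch, not the construction proposed in the paper, whose details are not given in this abstract. It assumes a wordgraph is a list of `(src, word, weight, dst)` arcs, that lower weights are better, and that merged arcs simply keep the minimum weight. The function names `determinize_min_weight` and `language` are hypothetical.

```python
from collections import defaultdict, deque


def determinize_min_weight(arcs, start, finals):
    """Subset construction over an acyclic weighted word graph.

    Assumption (not from the paper): when several arcs with the same word
    leave a subset of states, they are merged into one arc that keeps the
    minimum weight. The set of accepted word strings is unchanged, but
    individual path weights may not be.
    """
    out = defaultdict(list)
    for src, word, weight, dst in arcs:
        out[src].append((word, weight, dst))

    det_start = frozenset([start])
    det_arcs, det_finals = [], set()
    seen = {det_start}
    queue = deque([det_start])
    while queue:
        subset = queue.popleft()
        if subset & finals:
            det_finals.add(subset)
        # Group outgoing arcs of all member states by word.
        by_word = {}
        for state in subset:
            for word, weight, dst in out[state]:
                targets, best = by_word.get(word, (set(), float("inf")))
                targets.add(dst)
                by_word[word] = (targets, min(best, weight))
        for word, (targets, best) in sorted(by_word.items()):
            target = frozenset(targets)
            det_arcs.append((subset, word, best, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return det_arcs, det_start, det_finals


def language(arcs, start, finals):
    """Enumerate all accepted word sequences (terminates on acyclic graphs)."""
    out = defaultdict(list)
    for src, word, _weight, dst in arcs:
        out[src].append((word, dst))
    results, stack = set(), [(start, ())]
    while stack:
        state, words = stack.pop()
        if state in finals:
            results.add(words)
        for word, dst in out[state]:
            stack.append((dst, words + (word,)))
    return results


# Toy wordgraph: two competing arcs for "a" with different weights.
arcs = [(0, "a", 1.0, 1), (0, "a", 2.0, 2), (1, "b", 1.0, 3), (2, "c", 1.0, 3)]
det_arcs, det_start, det_finals = determinize_min_weight(arcs, 0, {3})
```

In this toy graph the string "a c" has total weight 3.0 in the input but 2.0 after merging, while the set of accepted strings stays the same. That is the kind of trade-off the abstract describes, under the min-weight assumption made here.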
