Detecting acoustic morphemes in lattices for spoken language understanding

Current methods for training statistical language models for recognition and understanding require large annotated corpora. The collection, transcription and labeling of such corpora are a major bottleneck in creating new applications and refining existing ones. It is therefore of great interest to develop methods for automatically learning vocabulary, grammar and semantics from a speech corpus without transcriptions. In this paper we report on an experiment in which acoustic morphemes are automatically acquired from the output of a task-independent phone recognizer. The utility of these units is evaluated experimentally for call-type classification in the ‘How may I help you?’ task. Detected occurrences of the acoustic morphemes in the lattice output provide the basis for classifying the test sentences. Using lattices, we achieve a reduction in the false rejection rate relative to using best paths, albeit with a reduction in correct classification performance from that baseline.
