MULTIMODAL LEARNING INTERFACES

Although speech recognition performance has improved significantly in recent years, speech recognition systems have still not found broad acceptance in everyday life. In seeking to eliminate their shortcomings, we have begun to focus our efforts on producing a sensible and useful user interface, rather than a better recognizer alone. Such useful speech interfaces should not only recognize speech but also
