HUMAN-COMPUTER AUDIOVISUAL INTERFACE

The concept of an audiovisual interface between a human user and software for stochastic process modeling and analysis is investigated. Examples showing the advantages of an audiovisual interface over an audio-only interface are given.
