Paper information - Integrated multimodal human-computer interface and augmented reality for interactive display applications

Integrated multimodal human-computer interface and augmented reality for interactive display applications

We describe new systems for improved integrated multimodal human-computer interaction and augmented reality for a diverse array of applications, including future advanced cockpits, tactical operations centers, and others. We have developed an integrated display system featuring: speech recognition of multiple concurrent users equipped with both standard air-coupled microphones and novel throat-coupled sensors (developed at Army Research Labs for increased noise immunity); lip reading for improving speech recognition accuracy in noisy environments; three-dimensional spatialized audio for improved display of warnings, alerts, and other information; wireless, coordinated handheld-PC control of a large display; real-time display of data and inferences from wireless integrated networked sensors with on-board signal processing and discrimination; gesture control with disambiguated point-and-speak capability; head- and eye-tracking coupled with speech recognition for 'look-and-speak' interaction; and integrated tetherless augmented reality on a wearable computer. The various interaction modalities (speech recognition, 3D audio, eye tracking, etc.) are implemented as 'modality servers' in an Internet-based client-server architecture. Each modality server encapsulates and exposes commercial and research software packages, presenting a socket network interface that is abstracted to a high-level interface, minimizing both vendor dependencies and required changes on the client side as the server's technology improves.
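The modality-server idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names (`ModalityServer`, `ModalityClient`), the newline-delimited JSON wire format, and the `fake_speech_backend` stand-in are all assumptions made for illustration. The sketch only shows how a server might wrap an interchangeable backend behind a socket, while the client sees a single high-level call.

```python
import json
import socket
import socketserver
import threading


class ModalityServer(socketserver.ThreadingTCPServer):
    """Hypothetical server wrapping one backend behind a JSON-lines socket."""

    allow_reuse_address = True
    daemon_threads = True

    def __init__(self, address, modality, backend):
        super().__init__(address, _Handler)
        self.modality = modality  # label such as "speech" (assumed naming)
        self.backend = backend    # any callable mapping a dict to a dict


class _Handler(socketserver.StreamRequestHandler):
    def handle(self):
        # One JSON request per line in, one JSON reply per line out.
        for line in self.rfile:
            request = json.loads(line)
            result = self.server.backend(request)
            reply = {"modality": self.server.modality, "result": result}
            self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))


class ModalityClient:
    """Hypothetical high-level client that hides the socket details."""

    def __init__(self, host, port):
        self._sock = socket.create_connection((host, port))
        self._file = self._sock.makefile("rw", encoding="utf-8", newline="\n")

    def query(self, **request):
        # Send one request line, then block until one reply line arrives.
        self._file.write(json.dumps(request) + "\n")
        self._file.flush()
        return json.loads(self._file.readline())

    def close(self):
        self._file.close()
        self._sock.close()


def fake_speech_backend(request):
    # Stand-in for a vendor recognizer; replacing it leaves clients unchanged.
    return {"text": request.get("audio_id", "") + ":recognized"}


def demo():
    # Port 0 asks the operating system for any free port.
    server = ModalityServer(("127.0.0.1", 0), "speech", fake_speech_backend)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    client = ModalityClient(host, port)
    try:
        return client.query(audio_id="utt1")
    finally:
        client.close()
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Because the client depends only on `query` and the reply shape, a different recognizer could be passed as `backend` without touching client code, which is the kind of vendor independence the abstract describes.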
