GWINDOWS: Towards Robust Perception-Based UI

Perceptual user interfaces promise fluid modes of human-computer interaction that complement the mouse and keyboard, and they are especially attractive in non-desktop scenarios such as kiosks and smart rooms. Such interfaces have nevertheless been slow to see use, for a variety of reasons: the computational burden they impose, a lack of robustness outside the laboratory, unreasonable calibration demands, and a shortage of sufficiently compelling applications. We tackle some of these difficulties with a fast stereo vision algorithm for recognizing hand positions and gestures. Our system uses two inexpensive video cameras to extract depth information, which improves the robustness of automatic object detection and tracking and can also be used directly by applications. We demonstrate the algorithm in combination with speech recognition to perform several basic window-management tasks, report on a user study probing the ease of using the system, and discuss the implications of such a system for future user interfaces.
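The abstract's claim that two cameras suffice to extract depth rests on standard stereo triangulation: a scene point imaged in both views shifts horizontally by a disparity that is inversely proportional to its distance. The sketch below illustrates that relation only; it assumes an idealized calibrated pinhole pair with focal length `f` (in pixels) and baseline `B` (in metres), and does not reproduce the paper's actual stereo algorithm.

```python
# Hedged sketch: depth from disparity for a calibrated, rectified stereo pair.
# Assumes pinhole cameras; focal_px and baseline_m are hypothetical parameters,
# not values taken from the GWINDOWS system.

def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Triangulate depth Z = f * B / d for one matched pixel pair."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    return focal_px * baseline_m / disparity_px

# Example: f = 700 px, baseline = 6 cm, disparity = 14 px  ->  Z = 3.0 m
z = depth_from_disparity(14.0, 700.0, 0.06)
```

Larger disparities correspond to nearer points, which is why a hand held out toward the cameras separates cleanly from the background in a disparity map.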
