A gesture control system for intuitive 3D interaction with virtual objects

We present a system for interacting with 3D objects in a 3D virtual environment. Because a typical head-mounted display (HMD) does not cover the user's entire face, we place a fiducial marker on the HMD and use it to locate the user's exposed facial skin. A skin model is built from this region and combined with depth information obtained from a stereo camera. Used in tandem, the skin model and the depth information allow the user's hands to be detected and tracked in real time. Once both hands are located, the system lets the user manipulate the object with five degrees of freedom (translation along the x-, y-, and z-axes, plus roll and yaw rotation) in virtual three-dimensional space using a series of intuitive hand gestures. Copyright © 2009 John Wiley & Sons, Ltd.

Figure: Manipulating a 3D object in a virtual 3D space. Left: the user manipulating the 3D object with an intuitive set of hand gestures. Right: the virtual 3D spaceship displayed in the user's HMD, controlled in real time by the user.
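The idea of pairing a skin model learned from the user's own face with stereo depth can be illustrated with a minimal sketch. The abstract does not state the colour space, the form of the skin model, or any thresholds, so everything below is an assumption made for illustration: a normalised r-g chromaticity histogram stands in for the skin model, a fixed depth band (`near`, `far`, in metres) stands in for the depth cue, and the function names are hypothetical.

```python
import numpy as np

def chromaticity(rgb):
    """Map RGB values (..., 3) to normalised (r, g) chromaticity in [0, 1]."""
    rgb = np.asarray(rgb, dtype=np.float64)
    total = rgb.sum(axis=-1, keepdims=True)
    total[total == 0] = 1.0  # avoid division by zero on black pixels
    return rgb[..., :2] / total

def build_skin_histogram(face_pixels, bins=16):
    """Build a normalised 2D chromaticity histogram from face pixels (N, 3).

    Assumption: face_pixels were sampled from the exposed facial skin
    located via the marker on the HMD.
    """
    rg = chromaticity(face_pixels)
    hist, _, _ = np.histogram2d(
        rg[:, 0], rg[:, 1], bins=bins, range=[[0, 1], [0, 1]]
    )
    return hist / max(hist.sum(), 1.0)

def skin_probability(image, hist):
    """Look up each pixel of an (H, W, 3) image in the skin histogram."""
    bins = hist.shape[0]
    idx = np.minimum((chromaticity(image) * bins).astype(int), bins - 1)
    return hist[idx[..., 0], idx[..., 1]]

def hand_mask(image, depth, hist, prob_threshold=0.01, near=0.2, far=0.9):
    """Keep pixels that look like skin AND lie within the assumed depth band.

    near/far are illustrative values for "within arm's reach".
    """
    skin = skin_probability(image, hist) > prob_threshold
    in_reach = (depth > near) & (depth < far)
    return skin & in_reach
```

In this sketch the depth band rejects skin-coloured regions outside arm's reach, while the colour test rejects nearby objects that are not skin-coloured. Locating and tracking the two hands from the resulting mask (for example by connected components) is left out.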
