A computer vision system for on-screen item selection by finger pointing

Pointing at planar surfaces such as TV screens, computer monitors, or projection screens can be a useful mode of interaction between humans and machines. What seems to hinder the use of vision in such practical applications is, to a large extent, the difficulty of the computational task, which is typically defined as 3-D reconstruction from uncalibrated 2-D images of a non-static scene. We describe two designs in which, using one or two cameras, the target of pointing on a flat monitor or screen is identified without 3-D inference, using only image morphing and line intersection. This is accomplished by registering the images with the target plane. When the pointing target lies on a surface hidden from the camera (e.g., a computer monitor that supports the camera itself, as in most PC configurations), we add one or more apertures, coplanar with the target surface, in front of the camera(s). We describe experimental results showing a fully automated procedure that detects the pointing target with high accuracy. The simplicity and robustness of our method, together with the relative accuracy of our results, can make pointing a practical means of human-machine interaction.
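To make the two-camera idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes that registering each image with the target plane yields a 3x3 homography (here `H1`, `H2`, mapping image pixels to screen coordinates) and that two points along the pointing finger have already been located in each image. All function names are hypothetical. The line through the finger in each image is mapped onto the screen plane, and the two mapped lines intersect at the point where the pointing direction meets the screen, so no 3-D reconstruction is needed.

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def warp_point(H, xy):
    """Map an image point (x, y) to the screen plane with homography H."""
    x, y = xy
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w, 1.0)


def warped_finger_line(H, finger_pts):
    """Homogeneous line on the screen through the two warped finger points."""
    p, q = (warp_point(H, pt) for pt in finger_pts)
    return cross(p, q)


def pointing_target(H1, finger1, H2, finger2, eps=1e-12):
    """Intersect the two warped finger lines; returns screen (x, y).

    H1, H2: assumed image-to-screen homographies (3x3 nested lists).
    finger1, finger2: two (x, y) image points on the finger per camera.
    """
    l1 = warped_finger_line(H1, finger1)
    l2 = warped_finger_line(H2, finger2)
    x, y, w = cross(l1, l2)
    if abs(w) < eps:
        # Parallel lines: the two views give no usable intersection.
        raise ValueError("warped finger lines are (nearly) parallel")
    return (x / w, y / w)
```

In practice the intersection is best conditioned when the two cameras view the finger from well-separated directions, since nearly parallel warped lines amplify small errors in the detected finger points.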
