In many practical situations, a desirable user interface to a computer system should have a model of where a person is looking and what he or she is paying attention to. This is particularly important if a system provides multi-modal communication cues (speech, gesture, lipreading, etc.) [2, 3, 8] and must identify whether the cues are aimed at it or at someone else in the room. This paper describes a system that identifies the user's focus of attention by visually determining where a person is looking. While other attempts at gaze tracking usually assume a fixed or limited location of a person's face, the approach presented here allows for complete freedom of movement in a room. The gaze-tracking system uses several connectionist modules that track a person's face with a software-controlled pan-tilt camera with zoom, and it identifies the focus of attention from the orientation and direction of the face.
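As a rough illustration of the final stage of such a pipeline, the sketch below maps an estimated head pose to one of several known attention targets. This is a minimal sketch under assumed conventions; the class and function names, the angular representation, and the tolerance threshold are all hypothetical stand-ins, not details from the paper (requires Python 3.10+ for the type hints).

```python
# Hypothetical sketch: classify focus of attention from head orientation.
# Names (HeadPose, classify_focus) and the tolerance value are illustrative
# assumptions, not the authors' implementation.

from dataclasses import dataclass


@dataclass
class HeadPose:
    pan: float   # horizontal rotation of the face, in degrees
    tilt: float  # vertical rotation of the face, in degrees


def classify_focus(pose: HeadPose,
                   targets: dict[str, tuple[float, float]],
                   tolerance: float = 15.0) -> str | None:
    """Return the target whose direction best matches the head pose,
    or None if no target lies within the angular tolerance."""
    best, best_err = None, tolerance
    for name, (t_pan, t_tilt) in targets.items():
        err = max(abs(pose.pan - t_pan), abs(pose.tilt - t_tilt))
        if err < best_err:
            best, best_err = name, err
    return best


# Example: targets given by their (pan, tilt) directions from the camera.
targets = {"computer": (0.0, -10.0), "other person": (40.0, 0.0)}
print(classify_focus(HeadPose(pan=2.0, tilt=-8.0), targets))  # -> "computer"
```

In a full system along the lines described above, the pose fed to such a classifier would come from the upstream connectionist face-tracking and orientation modules rather than being supplied directly.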
[1] Dean A. Pomerleau et al., "Neural Network Perception for Mobile Robot Guidance," 1993.
[3] Alex Waibel et al., "Face locating and tracking for human-computer interaction," Proceedings of the 1994 28th Asilomar Conference on Signals, Systems and Computers, 1994.
[4] Azriel Rosenfeld et al., "Computer Vision," Adv. Comput., 1988.
[5] H. Martin Hunke et al., "Locating and Tracking of Human Faces with Neural Networks," 1994.
[6] Alexander H. Waibel et al., "Toward movement-invariant automatic lip-reading and speech recognition," 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995.
[7] Alexander H. Waibel et al., "Knowing who to listen to in speech recognition: visually guided beamforming," 1995 International Conference on Acoustics, Speech, and Signal Processing, 1995.