Computers Seeing People

AI researchers are interested in building intelligent machines that can interact with people the way people interact with each other. Science fiction writers have given us these goals in the form of HAL in 2001: A Space Odyssey and Commander Data in Star Trek: The Next Generation. At present, however, our computers are deaf, dumb, and blind, almost unaware of the environment they are in and of the user who interacts with them. In this article, I present the current state of the art in machines that can see people, recognize them, determine their gaze, understand their facial expressions and hand gestures, and interpret their activities. I believe that building machines with such abilities for perceiving people will take us one step closer to building HAL and Commander Data.
