A Survey of Computer Vision-Based Human Motion Capture

A comprehensive survey of computer vision-based human motion capture literature from the past two decades is presented. The focus is on a general overview based on a taxonomy of system functionalities, broken down into four processes: initialization, tracking, pose estimation, and recognition. Each process is discussed and divided into subprocesses and/or categories of methods, providing a reference for describing and comparing the more than 130 publications covered by the survey. References are included throughout the paper to exemplify important issues and their relations to the various methods. A number of general assumptions used in this research field are identified, and the character of these assumptions indicates that the field is still at an early stage of development. To evaluate the state of the art, the major application areas are identified and performance is analyzed in light of the methods presented in the survey. Finally, suggestions for future research directions are offered.

[1]  É. Marey,et al.  Animal mechanism : a treatise on terrestrial and aerial locomotion , 2022 .

[2]  Ming-Kuei Hu,et al.  Visual pattern recognition by moment invariants , 1962, IRE Trans. Inf. Theory.

[3]  W. Köhler Gestalt psychology , 1967 .

[4]  G. Johansson Visual motion perception. , 1975, Scientific American.

[5]  J. O'Rourke,et al.  Model-based image analysis of human motion using constraint propagation , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[6]  David C. Hogg Model-based vision: a program to see a walking person , 1983, Image Vis. Comput..

[7]  Koichiro Akita,et al.  Image sequence analysis of real world human motion , 1984, Pattern Recognit..

[8]  S. Sumi Upside-down Presentation of the Johansson Moving Light-Spot Pattern , 1984, Perception.

[9]  Hsi-Jian Lee,et al.  Determination of 3D human body postures from a single view , 1985, Comput. Vis. Graph. Image Process..

[10]  Yoshiaki Shirai,et al.  Detection of the movements of persons from a sparse sequence of TV images , 1983, Pattern Recognition.

[11]  Andrew Bernat,et al.  Security Applications Of Computer Motion Detection , 1987, Other Conferences.

[12]  Yee-Hong Yang,et al.  A region based approach for human body motion analysis , 1987, Pattern Recognit..

[13]  Yee-Hong Yang,et al.  Human body motion segmentation in a complex scene , 1987, Pattern Recognit..

[14]  R. Okafor Maximum likelihood estimation from incomplete data , 1987 .

[15]  Geoffrey D. Sullivan,et al.  Model-based Recognition of Human Posture using Single Synthetic Images , 1989, Alvey Vision Conference.

[16]  Masanobu Yamamoto,et al.  Human motion analysis based on a robot arm model , 1991, Proceedings. 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Yee-Hong Yang,et al.  Log-Tracker: an Attribute-Based Approach to Tracking Human Body Motion , 1991, Int. J. Pattern Recognit. Artif. Intell..

[18]  J. Sklansky,et al.  Segmentation of people in motion , 1991, Proceedings of the IEEE Workshop on Visual Motion.

[19]  Yuhua Luo,et al.  An automatic rotoscopy system for human motion based on a biomechanic graphical model , 1992, Comput. Graph..

[20]  T. M. Kepple MOVE3D-software for analyzing human motion , 1992, Proceedings of the Johns Hopkins National Search for Computing Applications to Assist Persons with Disabilities.

[21]  Hsi-Jian Lee,et al.  Knowledge-guided visual perception of 3-D human gait from a single image sequence , 1992, IEEE Trans. Syst. Man Cybern..

[22]  Juhui Wang,et al.  Human motion analysis with detection of subpart deformations , 1992, Electronic Imaging.

[23]  Francisco J. Perales,et al.  A system for human motion matching between synthetic and real images based on a biomechanic graphical model , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[24]  M. Rossi,et al.  Tracking and counting moving people , 1994, Proceedings of 1st International Conference on Image Processing.

[25]  R. Nelson,et al.  Low level recognition of human motion (or how to get your man without finding his body parts) , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[26]  Minoru Asada,et al.  MDL-based spatiotemporal segmentation from motion in a long image sequence , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[27]  J. Aggarwal,et al.  Lower limb kinematics of human walking with the medial axis transformation , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[28]  Edward H. Adelson,et al.  Analyzing and recognizing walking figures in XYT , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[29]  David C. Hogg,et al.  An Efficient Method for Contour Tracking Using Active Shape Models , 1994 .

[30]  David C. Hogg,et al.  An efficient method for contour tracking using active shape models , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[31]  Jake K. Aggarwal,et al.  Articulated and elastic non-rigid motion: a review , 1994, Proceedings of 1994 IEEE Workshop on Motion of Non-rigid and Articulated Objects.

[32]  Gang Xu,et al.  Tracking Human Body Motion Based on a Stick Figure Model , 1994, J. Vis. Commun. Image Represent..

[33]  Yee-Hong Yang,et al.  First Sight: A Human Body Outline Labeling System , 1995, IEEE Trans. Pattern Anal. Mach. Intell..

[34]  Carlo S. Regazzoni,et al.  Human body modelling for people localization and tracking from real image sequences , 1995 .

[35]  Ioannis A. Kakadiaris,et al.  3D human body model acquisition from multiple views , 1995, Proceedings of IEEE International Conference on Computer Vision.

[36]  Aaron F. Bobick,et al.  Recognition of human body motion using phase space constraints , 1995, Proceedings of IEEE International Conference on Computer Vision.

[37]  Mubarak Shah,et al.  Motion-based recognition: a survey , 1995, Image Vis. Comput..

[38]  Pietro Perona,et al.  Monocular tracking of the human arm in 3D , 1995, Proceedings of IEEE International Conference on Computer Vision.

[39]  T D Albright,et al.  Visual motion perception. , 1995, Proceedings of the National Academy of Sciences of the United States of America.

[40]  Jake K. Aggarwal,et al.  Tracking human motion in an indoor environment , 1995, Proceedings., International Conference on Image Processing.

[41]  Matthew Turk,et al.  Visual interaction with lifelike characters , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[42]  Michael J. Swain,et al.  Gesture recognition using the Perseus architecture , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[43]  Ioannis A. Kakadiaris,et al.  Model-based estimation of 3D human motion with occlusion based on active multi-viewpoint selection , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[44]  Christopher R. Wren,et al.  Real-Time 3-D Tracking of the Human Body , 1996 .

[45]  Kazuo Kyuma,et al.  Computer vision for computer games , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[46]  James W. Davis,et al.  An appearance-based representation of action , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[47]  Alex Pentland,et al.  Staying Alive: A Virtual Reality Visualization Tool for Cancer Patients , 1996 .

[48]  Michael J. Black,et al.  Cardboard people: a parameterized model of articulated image motion , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[49]  Alex Pentland,et al.  Pfinder: real-time tracking of the human body , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[50]  Trevor Darrell,et al.  A novel environment for situated vision and behavior , 1994 .

[51]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[52]  Jake K. Aggarwal,et al.  Tracking human motion using multiple cameras , 1996, Proceedings of 13th International Conference on Pattern Recognition.

[53]  Ramesh C. Jain,et al.  Reality modeling and visualization from multiple video sequences , 1996, IEEE Computer Graphics and Applications.

[54]  R. Jain,et al.  Estimation of articulated motion using kinematically constrained mixture densities , 1997, Proceedings IEEE Nonrigid and Articulated Motion Workshop.

[55]  James W. Davis,et al.  The Representation and Recognition of Action Using Temporal Templates , 1997, CVPR 1997.

[56]  Karl Rohr,et al.  Human Movement Analysis Based on Explicit Motion Models , 1997 .

[57]  J. Ohya,et al.  Real-time estimation of human body posture from monocular thermal images , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[58]  Alex Pentland,et al.  Perceptive Spaces for Performance and Entertainment: Untethered Interaction Using Computer Vision and Audition , 1997, Appl. Artif. Intell..

[59]  Alex Pentland,et al.  Pfinder: Real-Time Tracking of the Human Body , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[60]  Michel Dhome,et al.  Human body limbs tracking by multi-ocular vision , 1997 .

[61]  Christoph Bregler,et al.  Learning and recognizing human dynamics in video sequences , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[62]  Vladimir Pavlovic,et al.  Visual Interpretation of Hand Gestures for Human-Computer Interaction: A Review , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[63]  H. Nagel,et al.  Tracking of persons in monocular image sequences , 1997, Proceedings IEEE Nonrigid and Articulated Motion Workshop.

[64]  Tomaso A. Poggio,et al.  Pedestrian detection using wavelet templates , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[65]  Joachim Denzler,et al.  Model based extraction of articulated objects in image sequences for gait analysis , 1997, Proceedings of International Conference on Image Processing.

[66]  Norman I. Badler,et al.  Virtual humans for animation, ergonomics, and simulation , 1997, Proceedings IEEE Nonrigid and Articulated Motion Workshop.

[67]  Pascal Fua,et al.  Human Body Modeling and Motion Analysis From Video Sequences , 1998 .

[68]  Nadia Magnenat-Thalmann,et al.  Modelling and Motion Capture Techniques for Virtual Environments , 1998, Lecture Notes in Computer Science.

[69]  Masahiko Yachida,et al.  Multiple-view-based tracking of multiple humans , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[70]  David Hansel,et al.  A model driven 3D image interpretation system applied to person detection in video images , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[71]  Alex Pentland,et al.  Dynamic models of human motion , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[72]  Larry S. Davis,et al.  Visual Surveillance of Human Activity , 1998, ACCV.

[73]  Masanori Yamada,et al.  A new robust real-time method for extracting human silhouettes from color images , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[74]  Jiang Yu Zheng,et al.  A model based approach in extracting and generating human motion , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[75]  Peter Nordlund,et al.  Figure-ground segmentation using multiple cues , 1998 .

[76]  Hironobu Fujiyoshi,et al.  Real-time human motion analysis by image skeletonization , 1998, Proceedings Fourth IEEE Workshop on Applications of Computer Vision. WACV'98 (Cat. No.98EX201).

[77]  Patrick Bouthemy,et al.  Complex object tracking by visual servoing based on 2D image motion , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[78]  Yi Li,et al.  Human posture recognition using multi-scale morphological method and Kalman motion estimation , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[79]  Larry S. Davis,et al.  Ghost: a human body part labeling system using silhouettes , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[80]  Pascal Fua,et al.  Local and Global Skeleton Fitting Techniques for Optical Motion Capture , 1998, CAPTECH.

[81]  D. Thalmann,et al.  Local and Global Skeleton Fitting Techniques for Optical Motion Capture , 1998, Modeling and Motion Capture Techniques for Virtual Environments.

[82]  Andrea Bottino,et al.  Toward Non-intrusive Motion Capture , 1998, ACCV.

[83]  Atsushi Nakazawa,et al.  Human tracking using distributed vision systems , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[84]  Noboru Ohnishi,et al.  Cue circles: image feature for measuring 3-D motion of articulated objects using sequential image pair , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[85]  Ismail Haritaoglu,et al.  W4: Who? When? Where? What? A Real Time System for Detecting and Tracking People , 1998 .

[86]  James W. Davis,et al.  Virtual PAT: A Virtual Personal Aerobics Trainer , 1998 .

[87]  Helen C. Shen,et al.  3-D Reconstruction of Multipart Self-Occluding Objects , 1998, ACCV.

[88]  Larry S. Davis,et al.  W4: Who? When? Where? What? A real time system for detecting and tracking people , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[89]  Jitendra Malik,et al.  Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[90]  Michael Isard,et al.  Active Contours , 2000, Springer London.

[91]  J. Crowley Recognizing Motion Using Local Appearance , 1998 .

[92]  Takuya Kondo,et al.  Skill recognition , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[93]  Kee Chang Lee,et al.  Virtual Stage: A Location-Based Karaoke System , 1998, IEEE Multim..

[94]  Christian Wöhler,et al.  Motion-based recognition of pedestrians , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[95]  Ben Delaney On the Trail of the Shadow Woman: The Mystery of Motion Capture , 1998, IEEE Computer Graphics and Applications.

[96]  Jed Lengyel,et al.  The Convergence of Graphics and Vision , 1998, Computer.

[97]  Helen C. Shen,et al.  A 3D Reconstruction System for Human Body Modeling , 1998, CAPTECH.

[98]  Pietro Perona,et al.  Reach out and touch space (motion learning) , 1998, Proceedings Third IEEE International Conference on Automatic Face and Gesture Recognition.

[99]  Takeo Kanade,et al.  Constructing virtual worlds using dense stereo , 1998, Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271).

[100]  Ioannis A. Kakadiaris,et al.  Vision-based animation of digital humans , 1998, Proceedings Computer Animation '98 (Cat. No.98EX169).

[101]  Wei Sun,et al.  Virtual people: capturing human models to populate virtual worlds , 1999, Proceedings Computer Animation 1999.

[102]  Vladimir Pavlovic,et al.  A dynamic Bayesian network approach to figure tracking using learned dynamic models , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[103]  Matthew Brand,et al.  Shadow puppetry , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[104]  Olivier D. Faugeras,et al.  3D articulated models and multi-view tracking with silhouettes , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[105]  Andrew Blake,et al.  Classification of human body motion , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[106]  Claudio S. Pinhanez,et al.  Using Computer Vision to Control a Reactive Computer Graphics Character in a Theater Play , 1999, ICVS.

[107]  James M. Rehg,et al.  A multiple hypothesis approach to figure tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[108]  P. Johansen,et al.  Proceedings of The 11th Scandinavian Conference on Image Analysis , 2007 .

[109]  Thomas B. Moeslund,et al.  Summaries of 107 Computer Vision-Based Human Motion Capture Papers , 1999 .

[110]  Takashi Totsuka,et al.  Torque-based recursive filtering approach to the recovery of 3D articulated motion from image sequences , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[111]  Adrian Hilton Towards model-based capture of a persons shape, appearance and motion , 1999, Proceedings IEEE International Workshop on Modelling People. MPeople'99.

[112]  David A. Forsyth,et al.  Finding people by sampling , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[113]  S. Gong,et al.  Tracking hybrid 2D-3D human models from multiple views , 1999, Proceedings IEEE International Workshop on Modelling People. MPeople'99.

[114]  Hans-Hellmut Nagel,et al.  Tracking Persons in Monocular Image Sequences , 1999, Comput. Vis. Image Underst..

[115]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[116]  Masahiko Yachida,et al.  Posture estimation using structure and motion models , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[117]  Bernard F. Buxton,et al.  An improved algorithm for reconstruction of the surface of the human body from 3D scanner data using local B-spline patches , 1999, Proceedings IEEE International Workshop on Modelling People. MPeople'99.

[118]  Josep Amat,et al.  Stereoscopic system for human body tracking in natural scenes , 1999, Proceedings IEEE International Workshop on Modelling People. MPeople'99.

[119]  R. Plankers,et al.  Automated body modeling from video sequences , 1999, Proceedings IEEE International Workshop on Modelling People. MPeople'99.

[120]  Larry S. Davis,et al.  Real-time periodic motion detection, analysis, and applications , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[121]  Ryohei Nakatsu,et al.  Virtual Metamorphosis , 1999, IEEE Multim..

[122]  Michael J. Black,et al.  Parameterized Modeling and Recognition of Activities , 1999, Comput. Vis. Image Underst..

[123]  Michel Dhome,et al.  Tracking of Human Limbs by Multiocular Vision , 1999, Comput. Vis. Image Underst..

[124]  Jake K. Aggarwal,et al.  Human Motion Analysis: A Review , 1999, Comput. Vis. Image Underst..

[125]  Thomas B. Moeslund,et al.  3D human pose estimation using 2D-Data and an alternative phase space representation , 2000 .

[126]  Thomas B. Moeslund,et al.  Multiple cues used in model-based human motion capture , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[127]  Nebojsa Jojic,et al.  Detection and estimation of pointing gestures in dense disparity maps , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[128]  Yi Li,et al.  Extraction of parametric human model for posture recognition using genetic algorithm , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[129]  Stephen J. McKenna,et al.  Tracking interacting people , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[130]  Wei Sun,et al.  Whole-body modelling of people from multiview images to populate virtual worlds , 2000, The Visual Computer.

[131]  Kazuhiko Takahashi,et al.  Human body postures from trinocular camera images , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[132]  Rómer Rosales,et al.  Learning and synthesizing human body motion and posture , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[133]  Yoshiaki Shirai,et al.  Tracking a person with 3-D motion by integrating optical flow and depth , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[134]  Mubarak Shah,et al.  A virtual 3D blackboard: 3D finger tracking using a single camera , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[135]  Takashi Totsuka,et al.  Constraint-conscious smoothing framework for the recovery of 3D articulated motion from image sequences , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[136]  Alex Pentland,et al.  Understanding purposeful human motion , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[137]  Michael J. Black,et al.  A framework for modeling the appearance of 3D articulated figures , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[138]  Yoshito Ohta,et al.  Human action tracking guided by key-frames , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[139]  Gerhard Rigoll,et al.  Person tracking in real-world scenarios using statistical methods , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[140]  Thomas B. Moeslund Interacting with a Virtual World Through Motion Capture , 2001 .

[141]  Larry S. Davis,et al.  Backpack: Detection of People Carrying Objects Using Silhouettes , 2001, Comput. Vis. Image Underst..
