"Continuous-state Graphical Models for Object Localization, Pose Estimation and Tracking"

Reasoning about the pose and motion of objects from images or video is an important task in many machine vision applications. Estimating the pose of articulated objects such as people and animals is particularly challenging because of the complexity of the possible poses, yet it has applications in computer vision, medicine, biology, animation, and entertainment. Realistic natural scenes, object motion, noise in the image observations, incomplete evidence caused by occlusions, and the high dimensionality of the pose itself are all challenges that must be addressed. In this thesis we propose a class of approaches that model objects using continuous-state graphical models. We show that these approaches can effectively model complex objects because they admit tractable and robust inference algorithms that infer the pose of these objects in the presence of realistic appearance variations and articulations. We use continuous-state graphical models to model both rigid and articulated object structures: nodes correspond to parts of objects, and edges represent the constraints between parts, encoded as statistical distributions. For rigid objects, these constraints can model spatial and temporal relationships between parts; for articulated objects, they can model kinematic, inter-penetration, and occlusion relationships. Localization, pose estimation, and tracking can then be formulated as inference in these graphical models. This formulation has a number of advantages over more traditional methods. First, these models allow inference algorithms that scale linearly with the number of body parts by breaking up the high-dimensional search for pose into a number of lower-dimensional collaborative searches. Second, partial occlusions can be handled robustly by propagating spatial information between parts. Third, "bottom-up" information can be incorporated directly and effectively into the inference process, helping the algorithm recover from transient tracking failures.
We also show that hierarchical continuous-state graphical models of this kind can be used to solve the challenging problem of inferring the 3D pose of a person from a single monocular image.
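
The idea of replacing one high-dimensional search with several low-dimensional collaborative searches can be illustrated with a small sketch. It is not the method of the thesis: the thesis works with continuous states, whereas this sketch substitutes a coarse discretization and ordinary sum-product message passing on a three-part chain. The names `chain_marginals`, `brute_force_marginals`, `UNARY`, and `PAIRWISE`, the part labels, and all numbers are hypothetical and chosen only for illustration. The middle part has flat (uninformative) evidence, standing in for an occluded part whose state is recovered from its neighbours.

```python
from itertools import product


def chain_marginals(unary, pairwise):
    """Sum-product on a chain of parts with discretized states.

    unary[i][s]       evidence for part i being in state s
    pairwise[i][s][t] compatibility of part i in state s with part i+1 in state t
    Cost grows linearly with the number of parts.
    """
    n, k = len(unary), len(unary[0])
    # fwd[i][t]: message arriving at part i from part i-1
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        for t in range(k):
            fwd[i][t] = sum(
                fwd[i - 1][s] * unary[i - 1][s] * pairwise[i - 1][s][t]
                for s in range(k)
            )
    # bwd[i][s]: message arriving at part i from part i+1
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        for s in range(k):
            bwd[i][s] = sum(
                bwd[i + 1][t] * unary[i + 1][t] * pairwise[i][s][t]
                for t in range(k)
            )
    marginals = []
    for i in range(n):
        belief = [fwd[i][s] * unary[i][s] * bwd[i][s] for s in range(k)]
        z = sum(belief)
        marginals.append([b / z for b in belief])
    return marginals


def brute_force_marginals(unary, pairwise):
    """Reference answer: enumerate every joint configuration (exponential cost)."""
    n, k = len(unary), len(unary[0])
    totals = [[0.0] * k for _ in range(n)]
    for states in product(range(k), repeat=n):
        w = 1.0
        for i, s in enumerate(states):
            w *= unary[i][s]
        for i in range(n - 1):
            w *= pairwise[i][states[i]][states[i + 1]]
        for i, s in enumerate(states):
            totals[i][s] += w
    z = sum(totals[0])
    return [[v / z for v in row] for row in totals]


# Hypothetical example: three parts, three coarse states each.
UNARY = [
    [0.90, 0.05, 0.05],  # first part: strong image evidence
    [1.00, 1.00, 1.00],  # middle part: occluded, no evidence
    [0.70, 0.20, 0.10],  # last part: moderate evidence
]
# Neighbouring parts prefer to agree on their state.
SMOOTH = [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]]
PAIRWISE = [SMOOTH, SMOOTH]

if __name__ == "__main__":
    for row in chain_marginals(UNARY, PAIRWISE):
        print([round(p, 3) for p in row])
```

On a chain or tree, the message-passing result matches exhaustive enumeration while touching each edge only twice, which is where the linear scaling in the number of parts comes from in this discretized setting.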

[1]  W. K. Hastings,et al.  Monte Carlo Sampling Methods Using Markov Chains and Their Applications , 1970 .

[2]  Rómer Rosales,et al.  Learning and synthesizing human body motion and posture , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[3]  Mohan M. Trivedi,et al.  Human Body Model Acquisition and Motion Capture Using Voxel Data , 2002, AMDO.

[4]  Tomaso A. Poggio,et al.  A Trainable System for Object Detection , 2000, International Journal of Computer Vision.

[5]  David J. Fleet,et al.  People tracking using hybrid Monte Carlo filtering , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[6]  Cristian Sminchisescu,et al.  Learning Joint Top-Down and Bottom-up Processes for 3D Visual Inference , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  Finn V. Jensen,et al.  Bayesian Networks and Decision Graphs , 2001, Statistics for Engineering and Information Science.

[8]  Stephen J. McKenna,et al.  Tracking interacting people , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[9]  Cristian Sminchisescu,et al.  Discriminative density propagation for 3D human motion estimation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[10]  Pietro Perona,et al.  A Probabilistic Approach to Object Recognition Using Local Photometry and Global Geometry , 1998, ECCV.

[11]  Geoffrey E. Hinton,et al.  A View of the Em Algorithm that Justifies Incremental, Sparse, and other Variants , 1998, Learning in Graphical Models.

[12]  David A. Forsyth,et al.  Strike a pose: tracking people by finding stylized poses , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[13]  Greg Mori,et al.  Guiding model search using segmentation , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[14]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[15]  Dorin Comaniciu,et al.  An Algorithm for Data-Driven Bandwidth Selection , 2003, IEEE Trans. Pattern Anal. Mach. Intell..

[16]  Larry S. Davis,et al.  3-D model-based tracking of humans in action: a multi-view approach , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[17]  Shihong Lao,et al.  A fast and robust 3D head pose and gaze estimation system , 2008, 2008 8th IEEE International Conference on Automatic Face & Gesture Recognition.

[18]  Aaron Hertzmann,et al.  Eurographics/ Acm Siggraph Symposium on Computer Animation (2006) Learning a Correlated Model of Identity and Pose-dependent Body Shape Variation for Real-time Synthesis , 2022 .

[19]  Stefano Corazza,et al.  Accurately measuring human movement using articulated ICP with soft-joint constraints and a repository of articulated models , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Geoffrey E. Hinton,et al.  Reducing the Dimensionality of Data with Neural Networks , 2006, Science.

[21]  Cristian Sminchisescu,et al.  Building Roadmaps of Local Minima of Visual Models , 2002, ECCV.

[22]  Rama Chellappa,et al.  Measuring human movement for biomechanical applications using markerless motion capture , 2006, Electronic Imaging.

[23]  Nicholas R. Howe,et al.  Silhouette lookup for monocular 3D pose tracking , 2007, Image Vis. Comput..

[24]  Yi Li,et al.  Extraction of parametric human model for posture recognition using genetic algorithm , 2000, Proceedings Fourth IEEE International Conference on Automatic Face and Gesture Recognition (Cat. No. PR00580).

[25]  James M. Rehg,et al.  Singularity analysis for articulated object tracking , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[26]  Dragomir Anguelov,et al.  Object Pose Detection in Range Scan Data , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[27]  Tosiyasu L. Kunii,et al.  A functional model for constructive solid geometry , 1985, The Visual Computer.

[28]  Dragomir Anguelov,et al.  A General Algorithm for Approximate Inference and Its Application to Hybrid Bayes Nets , 1999, UAI.

[29]  Neil J. Gordon,et al.  A tutorial on particle filters for online nonlinear/non-Gaussian Bayesian tracking , 2002, IEEE Trans. Signal Process..

[30]  Michael J. Black,et al.  Representing cyclic human motion using functional analysis , 2005, Image Vis. Comput..

[31]  Long Zhu,et al.  A Hierarchical Compositional System for Rapid Object Detection , 2005, NIPS.

[32]  C. D. Kemp,et al.  Density Estimation for Statistics and Data Analysis , 1987 .

[33]  Dariu Gavrila,et al.  The Visual Analysis of Human Movement: A Survey , 1999, Comput. Vis. Image Underst..

[34]  Takeo Kanade,et al.  A real time system for robust 3D voxel reconstruction of human motions , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[35]  David A. Forsyth,et al.  Computational Studies of Human Motion: Part 1, Tracking and Motion Synthesis , 2005, Found. Trends Comput. Graph. Vis..

[36]  Takeo Kanade,et al.  Image-based spatio-temporal modeling and view interpolation of dynamic events , 2005, TOGS.

[37]  James M. Rehg,et al.  Statistical Color Models with Application to Skin Detection , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[38]  Ying Zhu,et al.  Tracking Complex Objects Using Graphical Object Models , 2004, IWCM.

[39]  Jitendra Malik,et al.  Recovering human body configurations: combining segmentation and recognition , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[40]  Jake K. Aggarwal,et al.  A hierarchical Bayesian network for event recognition of human actions and interactions , 2004, Multimedia Systems.

[41]  L. Rabiner,et al.  An introduction to hidden Markov models , 1986, IEEE ASSP Magazine.

[42]  D. Huttenlocher,et al.  A unified spatio-temporal articulated model for tracking , 2004, CVPR 2004.

[43]  Todd Andrew Stephenson Conditional Gaussian Mixtures , 2003 .

[44]  Yi Li,et al.  Human posture recognition using multi-scale morphological method and Kalman motion estimation , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[45]  Brendan J. Frey,et al.  A comparison of algorithms for inference and learning in probabilistic graphical models , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[46]  Mun Wai Lee,et al.  Human Upper Body Pose Estimation in Static Images , 2004, ECCV.

[47]  Michael J. Black,et al.  Gibbs likelihoods for Bayesian tracking , 2004, CVPR 2004.

[48]  Vladimir Pavlovic,et al.  A dynamic Bayesian network approach to figure tracking using learned dynamic models , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[49]  Hans-Peter Seidel,et al.  Interacting and Annealing Particle Filters: Mathematics and a Recipe for Applications , 2007, Journal of Mathematical Imaging and Vision.

[50]  Radu Horaud,et al.  Articulated Motion Capture from 3-D Points and Normals , 2005, BMVC.

[51]  Edmond Boyer,et al.  A hybrid approach for computing visual hulls of complex objects , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[52]  Rina Dechter,et al.  Bucket elimination: A unifying framework for probabilistic inference , 1996, UAI.

[53]  W. Eric L. Grimson,et al.  Adaptive background mixture models for real-time tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[54]  Steven R. Waterhouse,et al.  Classification and Regression using Mixtures of Experts , 1997 .

[55]  J. Yedidia An Idiosyncratic Journey Beyond Mean Field Theory , 2000 .

[56]  James M. Rehg,et al.  A multiple hypothesis approach to figure tracking , 1999, Proceedings. 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No PR00149).

[57]  Sidharth Bhatia,et al.  Tracking loose-limbed people , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[58]  Antonio Torralba,et al.  Sharing features: efficient boosting procedures for multiclass object detection , 2004, CVPR 2004.

[59]  H. Shum,et al.  Learning A Highly Structured Motion Model for 3D Human Tracking , 2002 .

[60]  Alexei A. Efros,et al.  Putting Objects in Perspective , 2006, CVPR.

[61]  Luc Van Gool,et al.  Monocular Tracking with a Mixture of View-Dependent Learned Models , 2006, AMDO.

[62]  William T. Freeman,et al.  Bayesian Reconstruction of 3D Human Motion from Single-Camera Video , 1999, NIPS.

[63]  Jianbo Shi,et al.  Bottom-up Recognition and Parsing of the Human Body , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[64]  Antonio Torralba,et al.  Context-based vision system for place and object recognition , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[65]  W. Eric L. Grimson,et al.  Simultaneous pose recovery and camera registration from multiple views of a walking person , 2007, Image Vis. Comput..

[66]  Yi Li,et al.  A relaxation algorithm for real-time multiple view 3D-tracking , 2002, Image Vis. Comput..

[67]  Erik B. Sudderth Graphical models for visual object recognition and tracking , 2006 .

[68]  Sidharth Bhatia,et al.  3D Human Limb Detection using Space Carving and Multi-View Eigen Models , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[69]  Cordelia Schmid,et al.  Learning to Parse Pictures of People , 2002, ECCV.

[70]  Matthew Brand,et al.  Shadow puppetry , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[71]  Dimitris N. Metaxas,et al.  Learning to Reconstruct 3 D Human Motion from Bayesian Mixtures of Experts . A Probabilistic Discriminative Approach , 2004 .

[72]  Dariu Gavrila,et al.  Real-time object detection for "smart" vehicles , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[73]  Michael I. Jordan,et al.  An Introduction to Variational Methods for Graphical Models , 1999, Machine-mediated learning.

[74]  Rin-ichiro Taniguchi,et al.  Real-time human motion analysis and IK-based human figure control , 2000, Proceedings Workshop on Human Motion.

[75]  Ian McGraw,et al.  Residual Belief Propagation: Informed Scheduling for Asynchronous Message Passing , 2006, UAI.

[76]  David A. Forsyth,et al.  Finding and tracking people from the bottom up , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[77]  Jitendra Malik,et al.  Tracking people with twists and exponential maps , 1998, Proceedings. 1998 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (Cat. No.98CB36231).

[78]  Pietro Perona,et al.  A Bayesian hierarchical model for learning natural scene categories , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[79]  Edmond Boyer On Using Silhouettes for Camera Calibration , 2006, ACCV.

[80]  Gang Hua,et al.  Learning to estimate human pose with data driven belief propagation , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[81]  Jiebo Luo,et al.  Body Localization in Still Images Using Hierarchical Models and Hybrid Search , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[82]  Michael J. Black,et al.  A Quantitative Evaluation of Video-based 3D Person Tracking , 2005, 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance.

[83]  Andrew Zisserman,et al.  Learning Layered Pictorial Structures from Video , 2004, ICVGIP.

[84]  Jitendra Malik,et al.  Recovering human body configurations using pairwise constraints between parts , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[85]  R. Quandt A New Approach to Estimating Switching Regressions , 1972 .

[86]  Jake K. Aggarwal,et al.  Simultaneous tracking of multiple body parts of interacting persons , 2006, Comput. Vis. Image Underst..

[87]  Nicu Sebe,et al.  Fast spatial pattern discovery integrating boosting with constellations of contextual descriptors , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[88]  Paul A. Viola,et al.  Rapid object detection using a boosted cascade of simple features , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[89]  X. Jin Factor graphs and the Sum-Product Algorithm , 2002 .

[90]  Michael J. Black,et al.  HumanEva: Synchronized Video and Motion Capture Dataset for Evaluation of Articulated Human Motion , 2006 .

[91]  Takeo Kanade,et al.  Three-dimensional scene flow , 2005, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[92]  Gang Hua,et al.  Tracking articulated body by dynamic Markov network , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[93]  Paul A. Viola,et al.  Detecting Pedestrians Using Patterns of Motion and Appearance , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[94]  Ioannis A. Kakadiaris,et al.  Model-based estimation of 3D human motion with occlusion based on active multi-viewpoint selection , 1996, Proceedings CVPR IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[95]  M. Lee,et al.  Proposal maps driven MCMC for estimating human body pose in static images , 2004, CVPR 2004.

[96]  Michael J. Black,et al.  Detailed Human Shape and Pose from Images , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[97]  Pietro Perona,et al.  A sparse object category model for efficient learning and exhaustive recognition , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[98]  Michael J. Black,et al.  A framework for the robust estimation of optical flow , 1993, 1993 (4th) International Conference on Computer Vision.

[99]  Michel Dhome,et al.  Tracking of Human Limbs by Multiocular Vision , 1999, Comput. Vis. Image Underst..

[100]  Robert A. Jacobs,et al.  Hierarchical Mixtures of Experts and the EM Algorithm , 1993, Neural Computation.

[101]  L. Goddard Information Theory , 1962, Nature.

[102]  F. Sebastian Grassia,et al.  Practical Parameterization of Rotations Using the Exponential Map , 1998, J. Graphics, GPU, & Game Tools.

[103]  Pascal Fua,et al.  Robust People Tracking with Global Trajectory Optimization , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[104]  Michael I. Jordan,et al.  Graphical Models, Exponential Families, and Variational Inference , 2008, Found. Trends Mach. Learn..

[105]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[106]  Rómer Rosales,et al.  Inferring body pose without tracking body parts , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[107]  Ankur Agarwal,et al.  Recovering 3D human pose from monocular images , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[108]  Andrew Zisserman,et al.  A Boundary-Fragment-Model for Object Detection , 2006, ECCV.

[109]  Michael J. Black,et al.  Cardboard people: a parameterized model of articulated image motion , 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[110]  Michael Isard,et al.  Partitioned Sampling, Articulated Objects, and Interface-Quality Hand Tracking , 2000, ECCV.

[111]  Robert E. Schapire,et al.  Theoretical Views of Boosting and Applications , 1999, ALT.

[112]  N. Metropolis,et al.  The Monte Carlo method. , 1949 .

[113]  Larry S. Davis,et al.  A Robust Background Subtraction and Shadow Detection , 1999 .

[114]  Michael I. Jordan,et al.  Learning Graphical Models with Mercer Kernels , 2002, NIPS.

[115]  Andrew Blake,et al.  Tracking through singularities and discontinuities by random sampling , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[116]  Vladimir N. Vapnik,et al.  The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.

[117]  Vladimir Vapnik,et al.  Statistical learning theory , 1998 .

[118]  Mannes Poel,et al.  Comparison of silhouette shape descriptors for example-based human pose recovery , 2006, 7th International Conference on Automatic Face and Gesture Recognition (FGR06).

[119]  Gregory F. Cooper,et al.  The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks , 1990, Artif. Intell..

[120]  Ahmed M. Elgammal,et al.  Simultaneous Inference of View and Body Pose using Torus Manifolds , 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[121]  David J. Fleet,et al.  3D People Tracking with Gaussian Process Dynamical Models , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[122]  Daniel P. Huttenlocher,et al.  Pictorial Structures for Object Recognition , 2004, International Journal of Computer Vision.

[123]  Richard Scheines,et al.  Learning the Structure of Linear Latent Variable Models , 2006, J. Mach. Learn. Res..

[124]  Carlo Tomasi,et al.  Good features to track , 1994, 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition.

[125]  Trevor Darrell,et al.  Fast pose estimation with parameter-sensitive hashing , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[126]  Maja J. Mataric,et al.  Markerless kinematic model and motion capture from volume sequences , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[127]  Nevin Lianwen Zhang,et al.  Exploiting Causal Independence in Bayesian Network Inference , 1996, J. Artif. Intell. Res..

[128]  Christopher G. Harris,et al.  A Combined Corner and Edge Detector , 1988, Alvey Vision Conference.

[129]  Gang Hua,et al.  Measurement integration under inconsistency for robust tracking , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[130]  David J. Fleet,et al.  Stochastic Tracking of 3D Human Figures Using 2D Image Motion , 2000, ECCV.

[131]  Cristian Sminchisescu,et al.  Estimating Articulated Human Motion with Covariance Scaled Sampling , 2003, Int. J. Robotics Res..

[132]  Tony X. Han,et al.  Efficient Nonparametric Belief Propagation with Application to Articulated Body Tracking , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[133]  Daniel P. Huttenlocher,et al.  Beyond trees: common-factor models for 2D human pose recovery , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[134]  Dorin Comaniciu,et al.  Component Fusion for Face Detection in the Presence of Heteroscedastic Noise , 2003, DAGM-Symposium.

[135]  Pietro Perona,et al.  Recognition by Probabilistic Hypothesis Construction , 2004, ECCV.

[136]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[137]  David J. Fleet,et al.  Multifactor Gaussian process models for style-content separation , 2007, ICML '07.

[138]  Gabriela Csurka,et al.  Visual categorization with bags of keypoints , 2002, eccv 2004.

[139]  Pietro Perona,et al.  Object class recognition by unsupervised scale-invariant learning , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[140]  Inderjit S. Dhillon,et al.  Clustering on the Unit Hypersphere using von Mises-Fisher Distributions , 2005, J. Mach. Learn. Res..

[141]  Ken Shoemake,et al.  Animating rotation with quaternion curves , 1985, SIGGRAPH.

[142]  Pietro Perona,et al.  One-shot learning of object categories , 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[143]  Sebastian Thrun,et al.  SCAPE: shape completion and animation of people , 2005, SIGGRAPH 2005.

[144]  James M. Rehg,et al.  A Modular Approach to the Analysis and Evaluation of Particle Filters for Figure Tracking , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[145]  Eugene Charniak,et al.  Statistical language learning , 1997 .

[146]  Yoav Freund,et al.  A decision-theoretic generalization of on-line learning and an application to boosting , 1995, EuroCOLT.

[147]  Alex Pentland,et al.  Probabilistic Visual Learning for Object Representation , 1997, IEEE Trans. Pattern Anal. Mach. Intell..

[148]  William T. Freeman,et al.  Correctness of Belief Propagation in Gaussian Graphical Models of Arbitrary Topology , 1999, Neural Computation.

[149]  Michael J. Black,et al.  The Dense Estimation of Motion and Appearance in Layers , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[150]  Antonio Torralba,et al.  Using the Forest to See the Trees: A Graphical Model Relating Features, Objects, and Scenes , 2003, NIPS.

[151]  Christian P. Robert,et al.  Monte Carlo Statistical Methods , 2005, Springer Texts in Statistics.

[152]  A. Elgammal,et al.  Inferring 3D body pose from silhouettes using activity manifold learning , 2004, CVPR 2004.

[153]  Martin A. Fischler,et al.  The Representation and Matching of Pictorial Structures , 1973, IEEE Transactions on Computers.

[154]  Dan Roth,et al.  Learning to detect objects in images via a sparse, part-based representation , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[155]  Andrew Zisserman,et al.  Learning Layered Motion Segmentation of Video , 2005, ICCV.

[156]  R. Plankers,et al.  Articulated soft objects for video-based body modeling , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[157]  Larry S. Davis,et al.  Learned Models for Estimation of Rigid and Articulated Human Motion from Stationary or Moving Camera , 2004, International Journal of Computer Vision.

[158]  David A. Forsyth,et al.  Probabilistic Methods for Finding People , 2001, International Journal of Computer Vision.

[159]  Ramakant Nevatia,et al.  Bayesian human segmentation in crowded situations , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[160]  Cristian Sminchisescu,et al.  Kinematic jump processes for monocular 3D human tracking , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[161]  Yee Whye Teh,et al.  A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.

[162]  Pietro Perona,et al.  Unsupervised Learning of Models for Recognition , 2000, ECCV.

[163]  Alexei A. Efros,et al.  Discovering object categories in image collections , 2005 .

[164]  George Kollios,et al.  BoostMap: A method for efficient approximate similarity rankings , 2004, CVPR 2004.

[165]  James M. Coughlan,et al.  Finding Deformable Shapes Using Loopy Belief Propagation , 2002, ECCV.

[166]  David A. Forsyth,et al.  Finding people by sampling , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[167]  Tomaso A. Poggio,et al.  Example-Based Object Detection in Images by Components , 2001, IEEE Trans. Pattern Anal. Mach. Intell..

[168]  Y. LeCun,et al.  Learning methods for generic object recognition with invariance to pose and lighting , 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004..

[169]  David J. Fleet,et al.  Priors for people tracking from small training sets , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[170]  Xavier Boyen,et al.  Tractable Inference for Complex Stochastic Processes , 1998, UAI.

[171]  Ramakant Nevatia,et al.  Tracking multiple humans in complex situations , 2004, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[172]  Pascal Fua,et al.  3D Human Body Tracking Using Deterministic Temporal Motion Models , 2004, ECCV.

[173]  Michael Isard,et al.  BraMBLe: a Bayesian multiple-blob tracker , 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[174]  Matheen Siddiqui,et al.  Robust real-time upper body limb detection and tracking , 2006, VSSN '06.

[175]  D. Marr,et al.  Representation and recognition of the spatial organization of three-dimensional shapes , 1978, Proceedings of the Royal Society of London. Series B. Biological Sciences.

[176]  Margrit Betke,et al.  Detecting Instances of Shape Classes That Exhibit Variable Structure , 2006, ECCV.

[177]  Sudipta N. Sinha,et al.  Camera network calibration from dynamic silhouettes , 2004, CVPR 2004.

[178]  Wim Wiegerinck,et al.  Variational Approximations between Mean Field Theory and the Junction Tree Algorithm , 2000, UAI.

[179]  Luc Van Gool,et al.  Full body tracking from multiple views using stochastic sampling , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[180]  Michael J. Black,et al.  Predicting 3D People from 2D Pictures , 2006, AMDO.

[181]  Hans-Peter Seidel,et al.  A system for articulated tracking incorporating a clothing model , 2007, Machine Vision and Applications.

[182]  Nando de Freitas,et al.  Rao-Blackwellised Particle Filtering for Dynamic Bayesian Networks , 2000, UAI.

[183]  Ian D. Reid,et al.  Automatic partitioning of high dimensional search spaces associated with articulated body motion capture , 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. CVPR 2001.

[184]  William T. Freeman,et al.  Understanding belief propagation and its generalizations , 2003 .

[185]  William T. Freeman,et al.  Constructing free-energy approximations and generalized belief propagation algorithms , 2005, IEEE Transactions on Information Theory.

[186]  Jenq-Neng Hwang,et al.  Nonparametric multivariate density estimation: a comparative study , 1994, IEEE Trans. Signal Process..

[187]  Michael Isard,et al.  PAMPAS: real-valued graphical models for computer vision , 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings..

[188]  Ankur Agarwal,et al.  Learning to track 3D human motion from silhouettes , 2004, ICML.

[189]  David J. C. Mackay,et al.  Introduction to Monte Carlo Methods , 1998, Learning in Graphical Models.

[190]  W. Freeman,et al.  Generalized Belief Propagation , 2000, NIPS.

[191]  Michael Isard,et al.  Active Contours , 2000, Springer London.

[192]  M. Isard,et al.  Automatic Camera Calibration from a Single Manhattan Image , 2002, ECCV.

[193]  William T. Freeman,et al.  Efficient Multiscale Sampling from Products of Gaussian Mixtures , 2003, NIPS.

[194]  Hans-Hellmut Nagel,et al.  Tracking Persons in Monocular Image Sequences , 1999, Comput. Vis. Image Underst..

[195]  Rui Li,et al.  Monocular Tracking of 3D Human Motion with a Coordinated Mixture of Factor Analyzers , 2006, ECCV.

[196]  Thomas Hofmann,et al.  Probabilistic Latent Semantic Analysis , 1999, UAI.

[197]  S. Sain Multivariate locally adaptive density estimation , 2002 .

[198]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[199]  Andrew Blake,et al.  Articulated body motion capture by annealed particle filtering , 2000, Proceedings IEEE Conference on Computer Vision and Pattern Recognition. CVPR 2000 (Cat. No.PR00662).

[200]  Tony Lindeberg,et al.  Feature Detection with Automatic Scale Selection , 1998, International Journal of Computer Vision.

[201]  R. Srinivasan Importance Sampling: Applications in Communications and Detection , 2010 .

[202]  Edmond Boyer,et al.  Fusion of multiview silhouette cues using a space occupancy grid , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[203]  Patrick Pérez,et al.  Maintaining multimodality through mixture tracking , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[204]  W. Eric L. Grimson,et al.  Simultaneous Pose Estimation and Camera Calibration from Multiple Views , 2004, 2004 Conference on Computer Vision and Pattern Recognition Workshop.

[205]  Ian D. Reid,et al.  Articulated Body Motion Capture by Stochastic Search , 2005, International Journal of Computer Vision.

[206]  Jeff A. Bilmes,et al.  A gentle tutorial of the em algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models , 1998 .

[207]  Michael J. Black, et al.  Learning the Statistics of People in Images and Video, 2003, International Journal of Computer Vision.

[208]  Long Zhu, et al.  Unsupervised Learning of a Probabilistic Grammar for Object Detection and Parsing, 2006, NIPS.

[209]  Takeo Kanade, et al.  Shape-from-silhouette of articulated objects and its use for human body kinematics estimation and motion capture, 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

[210]  David A. Forsyth, et al.  Using temporal coherence to build models of animals, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[211]  Andrew J. Viterbi.  Error Bounds for Convolutional Codes and an Asymptotically Optimum Decoding Algorithm, 1967, IEEE Trans. Inf. Theory.

[212]  Alex Pentland, et al.  Pfinder: real-time tracking of the human body, 1996, Proceedings of the Second International Conference on Automatic Face and Gesture Recognition.

[213]  Jitendra Malik, et al.  Estimating Human Body Configurations Using Shape Context Matching, 2002, ECCV.

[214]  Michael I. Mandel, et al.  Distributed Occlusion Reasoning for Tracking with Nonparametric Belief Propagation, 2004, NIPS.

[215]  Nanning Zheng, et al.  Stereo Matching Using Belief Propagation, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[216]  Nicholas R. Howe, et al.  Boundary Fragment Matching and Articulated Pose Under Occlusion, 2006, AMDO.

[217]  Ankur Agarwal, et al.  3D human pose from silhouettes by relevance vector regression, 2004, Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004. CVPR 2004.

[218]  Ankur Agarwal, et al.  Monocular Human Motion Capture with a Mixture of Regressors, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05) - Workshops.

[219]  Hans-Peter Seidel, et al.  Learning for Multi-view 3D Tracking in the Context of Particle Filters, 2006, ISVC.

[220]  Adrian Hilton, et al.  Model-based multiple view reconstruction of people, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[221]  Bodo Rosenhahn, et al.  Nonparametric Density Estimation for Human Pose Tracking, 2006, DAGM-Symposium.

[222]  Cristian Sminchisescu, et al.  Hyperdynamics Importance Sampling, 2002, ECCV.

[223]  Michael Isard, et al.  The CONDENSATION Algorithm - Conditional Density Propagation and Applications to Visual Tracking, 1996, NIPS.

[224]  James J. Little, et al.  A Boosted Particle Filter: Multitarget Detection and Tracking, 2004, ECCV.

[225]  N. Gordon, et al.  Novel approach to nonlinear/non-Gaussian Bayesian state estimation, 1993.

[226]  Wray L. Buntine.  Operations for Learning with Graphical Models, 1994, J. Artif. Intell. Res.

[227]  C. Sminchisescu, et al.  Variational mixture smoothing for non-linear dynamical systems, 2004, CVPR 2004.

[228]  Bill Triggs, et al.  Histograms of oriented gradients for human detection, 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[229]  Trevor Darrell, et al.  Inferring 3D structure with a statistical image-based shape model, 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[230]  Richard Scheines, et al.  Bayesian learning of measurement and structural models, 2006, ICML.

[231]  Lorenzo Torresani, et al.  Learning Motion Style Synthesis from Perceptual Observations, 2006, NIPS.

[232]  Jun S. Liu, et al.  Sequential Imputations and Bayesian Missing Data Problems, 1994.

[233]  Larry S. Davis, et al.  Multi-camera Tracking and Segmentation of Occluded People on Ground Plane Using Search-Guided Particle Filtering, 2006, ECCV.

[234]  David A. Forsyth, et al.  Automatic Annotation of Everyday Movements, 2003, NIPS.

[235]  Michael Isard, et al.  Attractive People: Assembling Loose-Limbed Models using Non-parametric Belief Propagation, 2003, NIPS.

[236]  Ramakant Nevatia, et al.  Human Pose Tracking Using Multi-level Structured Models, 2006, ECCV.

[237]  Sergey Ioffe, et al.  Human tracking with mixtures of trees, 2001, Proceedings Eighth IEEE International Conference on Computer Vision. ICCV 2001.

[238]  Bodo Rosenhahn, et al.  A System for Marker-Less Human Motion Estimation, 2005, DAGM-Symposium.

[239]  Daphne Koller, et al.  Efficient Structure Learning of Markov Networks using L1-Regularization, 2006, NIPS.

[240]  Radford M. Neal.  Pattern Recognition and Machine Learning, 2007, Technometrics.

[241]  Rómer Rosales, et al.  Learning Body Pose via Specialized Maps, 2001, NIPS.

[242]  Michael I. Jordan.  Graphical Models, 1998.

[243]  Stephen J. McKenna, et al.  Human Pose Estimation Using Learnt Probabilistic Region Similarities and Partial Configurations, 2004, ECCV.

[244]  Michael Isard, et al.  Nonparametric belief propagation, 2010, Commun. ACM.

[245]  D. Haussler, et al.  Protein modeling using hidden Markov models: analysis of globins, 1993, Proceedings of the Twenty-sixth Hawaii International Conference on System Sciences.