Probabilistic Tracking and Reconstruction of 3D Human Motion in Monocular Video Sequences

The tracking and reconstruction of articulated human motion in 3D is a problem that has attracted a great deal of interest in recent years. A system that recovers 3D body pose from video sequences has applications in vision-based human-computer interaction, marker-less motion capture, animation, surveillance and entertainment such as computer games. The fast, non-linear motion and complicated appearance of humans and the large number of degrees of freedom of the human body make the tracking problem a difficult one.

To address these problems, a system for tracking and reconstruction of human motion in 3D should possess the following: a strong model for the appearance of humans in images; a model of how people move; and an effective strategy for searching for the right pose in each time step. In previously presented systems, the most common way of addressing these issues has been to constrain the problem domain. The appearance of humans could be constrained by assuming certain clothing and a large contrast between the human and the background. Furthermore, by adding more camera views, more information about the 3D pose of the human can be extracted and ambiguities reduced, thus making the problem easier.

The goal of this thesis is to investigate to what extent the general problem of tracking and reconstructing human motion can be solved using only a monocular camera view. Thus, no assumptions about the appearance of either the human or the background are introduced.

The thesis makes three contributions: a probabilistic framework for the articulated tracking of human figures in 3D; a filter-based learned model of human appearance in images and image sequences; and three different types of models of human motion, intended to constrain the search in each time step of the tracking. Successful tracking results using the human appearance model and all three motion models are presented. Among the questions left open is the issue of initialization, a difficult problem in the high-dimensional search space of an articulated model in 3D. The contributions of this thesis provide a small step on the way towards robust and accurate articulated 3D tracking of humans in monocular sequences.
