The tracking and reconstruction of articulated human motion in 3D is a problem that has attracted a great deal of interest in recent years. A system that recovers 3D body pose from video sequences has applications in vision-based human-computer interaction, marker-less motion capture, animation, surveillance, and entertainment such as computer games. The fast, non-linear motion and complicated appearance of humans, and the large number of degrees of freedom of the human body, make the tracking problem a difficult one. To address these problems, a system for tracking and reconstruction of human motion in 3D should possess the following: a strong model for the appearance of humans in images; a model of how people move; and an effective strategy for searching for the right pose in each time step. In previously presented systems, the most common way of addressing these issues has been to constrain the problem domain. The appearance of humans could be constrained by assuming certain clothing and a large contrast between the human and the background. Furthermore, by adding more camera views, more information about the 3D pose of the human can be extracted and ambiguities reduced, thus making the problem easier. The goal of this thesis is to investigate to what extent the general problem of tracking and reconstructing human motion can be solved using only a monocular camera view. Thus, no assumptions about the appearance of either the human or the background are introduced. The thesis makes three contributions: a probabilistic framework for the articulated tracking of human figures in 3D; a filter-based learned model of human appearance in images and image sequences; and three different types of models of human motion, intended to constrain the search in each time step of the tracking. Successful tracking results using the human appearance model and all three motion models are presented. Among the questions left open is the issue of initialization, a difficult problem in the high-dimensional search space of an articulated model in 3D.
The contributions of this thesis provide a small step on the way towards robust and accurate articulated 3D tracking of humans in monocular sequences.