A Hierarchical System for Recognition, Tracking and Pose Estimation

This paper presents a new system for recognition, tracking and pose estimation of people in video sequences. It is based on a wavelet transform of the upper body and uses Support Vector Machines (SVMs) for classification. Recognition is carried out hierarchically: the system first recognizes people in general and then distinguishes individual characters. The characteristic features that best discriminate one person from another are learned automatically. Tracking is handled by a particle filter that combines the SVM output with a first-order kinematic model, yielding a robust scheme that handles occlusion, varying poses and camera zooms. For pose estimation, a collection of SVM classifiers is evaluated to detect specific, learned poses.
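
The tracking idea in the abstract (a particle filter whose weights come from a classifier score, with a first-order kinematic model for prediction) can be illustrated with a minimal sketch. This is not the authors' implementation. The state layout `(x, y, vx, vy)`, the noise levels, the logistic mapping from score to weight, and the function `toy_score` (a stand-in for a real SVM decision value on wavelet features) are all assumptions made for illustration.

```python
import math
import random


def track_step(particles, score_fn, rng, pos_noise=2.0, vel_noise=0.5):
    """One predict/weight/resample step of a bootstrap particle filter.

    particles: list of (x, y, vx, vy) states (assumed state layout).
    score_fn:  hypothetical stand-in for the SVM decision value at (x, y).
    """
    # Predict: first-order (constant-velocity) model plus Gaussian noise.
    predicted = []
    for x, y, vx, vy in particles:
        vx += rng.gauss(0.0, vel_noise)
        vy += rng.gauss(0.0, vel_noise)
        predicted.append((x + vx + rng.gauss(0.0, pos_noise),
                          y + vy + rng.gauss(0.0, pos_noise),
                          vx, vy))

    # Weight: squash the classifier score into (0, 1) with a logistic
    # function (an assumed mapping); clamp to avoid overflow in exp().
    weights = []
    for x, y, _, _ in predicted:
        s = max(-50.0, min(50.0, score_fn(x, y)))
        weights.append(1.0 / (1.0 + math.exp(-s)))

    # Resample: draw particles with probability proportional to weight.
    resampled = rng.choices(predicted, weights=weights, k=len(predicted))

    # Estimate: mean position of the resampled particle set.
    n = len(resampled)
    estimate = (sum(p[0] for p in resampled) / n,
                sum(p[1] for p in resampled) / n)
    return resampled, estimate


def toy_score(x, y, target=(50.0, 40.0)):
    """Hypothetical score: positive near the target, negative far away."""
    return 4.0 - math.hypot(x - target[0], y - target[1]) / 3.0


def run_demo(seed=0, n_particles=300, n_steps=30):
    """Track a stationary toy target and return the final position estimate."""
    rng = random.Random(seed)
    particles = [(45.0 + rng.gauss(0.0, 5.0), 38.0 + rng.gauss(0.0, 5.0), 0.0, 0.0)
                 for _ in range(n_particles)]
    estimate = (45.0, 38.0)
    for _ in range(n_steps):
        particles, estimate = track_step(particles, toy_score, rng)
    return estimate
```

Calling `run_demo()` returns an `(x, y)` estimate that should settle near the toy target at `(50, 40)`. In a real system, `score_fn` would evaluate a trained classifier on the image patch at each particle's position, and the state would likely include scale to cope with camera zoom.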
