Automatic Tracking and Labeling of Human Activities in a Video Sequence

This paper presents a novel approach for tracking multiple objects and a statistical learning approach for detecting human activities in a video sequence. For tracking, we propose an appearance model that is invariant to rigid transformations and combines the color and edge information of each detected blob. For activity detection, each activity label is regarded as a hypothesis. Given a set of labeled sequences, a group of features is first extracted from the motion trajectory of each detected object, and the likelihood of each feature under that hypothesis is calculated. A dynamic programming-based training algorithm is then applied to obtain an optimal classifier for each feature. The algorithm selects the classifiers with the most discriminative power and combines them to form a stronger classifier. It satisfies a criterion under which it is guaranteed to achieve a specified detection rate while minimizing the false alarm rate. Results on dataset 1 show the effectiveness of the proposed algorithm.
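The select-and-combine step can be illustrated with a minimal sketch. It is not the paper's dynamic programming-based training algorithm: it assumes each feature yields a scalar score where higher values favor the activity hypothesis, fits a simple threshold per feature, and combines the selected features by majority vote. The names `fit_threshold`, `train`, and `predict`, and the parameters `min_detection` and `keep`, are hypothetical. The detection-rate guarantee here holds only per feature on the training data, not for the combined vote.

```python
import math
from typing import List, Sequence, Tuple


def fit_threshold(pos: Sequence[float], neg: Sequence[float],
                  min_detection: float) -> Tuple[float, float]:
    """Fit the rule `score >= threshold` for one feature (assumed form).

    Chooses the highest threshold that still accepts at least
    `min_detection` of the positive scores, then reports the fraction
    of negative scores accepted (the false alarm rate).
    """
    ranked = sorted(pos, reverse=True)
    # Number of positives that must be accepted to meet the target rate;
    # the small epsilon guards against floating-point round-up.
    need = max(1, math.ceil(min_detection * len(ranked) - 1e-9))
    threshold = ranked[need - 1]
    false_alarm = sum(s >= threshold for s in neg) / len(neg)
    return threshold, false_alarm


def train(pos_samples: List[Sequence[float]],
          neg_samples: List[Sequence[float]],
          min_detection: float, keep: int) -> List[Tuple[int, float]]:
    """Fit one threshold per feature and keep the `keep` features
    with the lowest training false alarm rate."""
    fitted = []
    for j in range(len(pos_samples[0])):
        thr, fa = fit_threshold([x[j] for x in pos_samples],
                                [x[j] for x in neg_samples],
                                min_detection)
        fitted.append((fa, j, thr))
    fitted.sort()  # lowest false alarm rate first
    return [(j, thr) for _, j, thr in fitted[:keep]]


def predict(model: List[Tuple[int, float]], x: Sequence[float]) -> bool:
    """Majority vote over the selected per-feature threshold rules."""
    votes = sum(x[j] >= thr for j, thr in model)
    return 2 * votes > len(model)
```

With `min_detection=0.75`, each per-feature rule accepts at least three of every four positive training samples, and `train` ranks features by how few negative samples pass the same rule.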
