Automatic tracking of human motion in indoor scenes across multiple synchronized video streams

This paper presents a comprehensive framework for tracking moving humans in an indoor environment from sequences of synchronized monocular grayscale images captured by multiple fixed cameras. The proposed framework consists of three main modules: Single View Tracking (SVT), Multiple View Transition Tracking (MVTT), and Automatic Camera Switching (ACS). Bayesian classification schemes based on motion analysis of human features are used to track (spatially and temporally) a subject image of interest between consecutive frames. The automatic camera switching module predicts the position of the subject in a spatio-temporal domain and then selects the camera that provides the best view and requires the least switching to continue tracking. Limited degrees of occlusion are tolerated by the system. Tracking is based on images of upper human bodies captured from various viewing angles, and non-human moving objects are excluded using Principal Component Analysis (PCA). Experimental results are presented to evaluate the performance of the tracking system.
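The camera-selection rule described above (predict the subject's position, then choose the camera that offers the best view while avoiding unnecessary switches) can be illustrated with a minimal sketch. The constant-velocity predictor, the view-quality score, the `switch_penalty` weight, and all function names below are illustrative assumptions, not the paper's actual ACS formulation.

```python
import numpy as np

# Hypothetical sketch of an Automatic Camera Switching (ACS) rule: predict the
# subject's next position, score each fixed camera on its predicted view, and
# discount cameras that would require a switch. Not the paper's method.

def predict_position(position, velocity, dt=1.0):
    """Constant-velocity prediction of the subject's 2D floor position (assumption)."""
    return position + velocity * dt

def view_quality(cam_pos, cam_dir, predicted_pos):
    """Score a camera: prefer subjects near the view axis and close to the camera."""
    to_subject = predicted_pos - cam_pos
    distance = np.linalg.norm(to_subject)
    # Cosine of the angle between the camera's viewing direction and the subject.
    cos_angle = np.dot(to_subject, cam_dir) / (
        distance * np.linalg.norm(cam_dir) + 1e-9)
    return cos_angle / (1.0 + distance)

def select_camera(cameras, current_cam, position, velocity, switch_penalty=0.1):
    """Choose the camera with the best predicted view, penalizing switches."""
    predicted = predict_position(position, velocity)
    best_cam, best_score = current_cam, -np.inf
    for cam_id, (cam_pos, cam_dir) in cameras.items():
        score = view_quality(cam_pos, cam_dir, predicted)
        if cam_id != current_cam:
            score -= switch_penalty  # prefer staying on the current camera
        if score > best_score:
            best_cam, best_score = cam_id, score
    return best_cam

# Example: three fixed cameras; the tracker is currently using camera "A".
cameras = {
    "A": (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    "B": (np.array([10.0, 0.0]), np.array([-1.0, 1.0])),
    "C": (np.array([5.0, 10.0]), np.array([0.0, -1.0])),
}
print(select_camera(cameras, "A", np.array([6.0, 4.0]), np.array([1.0, 0.5])))
```

The switch penalty encodes the "least switching" preference: a new camera is chosen only when its predicted view is better than the current one by more than the penalty.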
