Moving people tracking with detection by latent semantic analysis for visual surveillance applications

Latent semantic analysis (LSA) has been widely used in computer vision and pattern recognition. Most existing LSA-based work focuses on behavior recognition and motion classification. In visual surveillance applications, accurate tracking of moving people in the scene is regarded as one of the preliminary requirements for other tasks such as object recognition or segmentation. However, accurate tracking is extremely hard in challenging surveillance scenes where multiple objects look similar or occlude one another. Conventional tracking algorithms based on temporal Markov chains suffer from the ‘tracking error accumulation problem’: the accumulated errors can eventually cause the tracker to drift away from the target. To handle tracking drift, some authors have proposed using detection along with tracking as an effective solution. However, many critical issues remain unsettled in these detection-based tracking algorithms. In this paper, we propose a novel method for tracking moving people with detection based on (probabilistic) LSA. By employing a novel ‘twin-pipeline’ training framework to find the latent semantic topics of ‘moving people’, the proposed detection can effectively detect interest points on moving people in different indoor and outdoor environments with camera motion. Since the detected interest points on different body parts can be used to locate moving people more accurately, combining the detection with incremental subspace learning based tracking resolves the problem of tracking drift during each target appearance update. In addition, owing to the time-independent processing mechanism of the detection, the proposed method is also able to handle the error accumulation problem: the detection can calibrate the tracking errors each time the state of the tracking algorithm is updated.
Extensive experiments in various surveillance environments using different benchmark datasets demonstrate the accuracy and robustness of the proposed tracking algorithm. Further, the experimental comparisons clearly show that the proposed tracking algorithm outperforms well-known tracking algorithms such as the ISL, AMS and WSL algorithms. Furthermore, the speed of the proposed method is also satisfactory for realistic surveillance applications.
