Multi-camera spatio-temporal fusion and biased sequence-data learning for security surveillance

We present a framework for multi-camera video surveillance. The framework consists of three phases: detection, representation, and recognition. The detection phase fuses multi-source spatio-temporal data to extract motion trajectories from video efficiently and reliably. The representation phase summarizes the raw trajectory data to construct hierarchical, invariant, and content-rich descriptions of motion events. Finally, the recognition phase classifies and identifies events from these descriptors. Because of space limits, we describe only briefly how we detect and represent events, and we treat the third phase, event recognition, in depth. For effective recognition, we devise a sequence-alignment kernel function that performs sequence-data learning to identify suspicious events. We show that when the positive training instances (i.e., suspicious events) are significantly outnumbered by the negative training instances (i.e., benign events), SVMs (or any other learning methods) can suffer a high incidence of errors. To remedy this problem, we propose the kernel boundary alignment (KBA) algorithm, which works with the sequence-alignment kernel. Through an empirical study in a parking-lot surveillance setting, we show that our spatio-temporal fusion scheme and biased sequence-data learning method are highly effective in identifying suspicious events.
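To make the idea of comparing motion events by sequence alignment concrete, the following is a minimal sketch, not the kernel proposed in this work. It assumes that a trajectory has already been summarized as a string of motion symbols (the symbols `S`, `L`, `R` and the names `edit_distance` and `alignment_similarity` are hypothetical), scores a pair of sequences by their Levenshtein alignment cost, and maps that cost to a similarity with an exponential. A similarity built this way is not guaranteed to be positive semi-definite, so it may need further treatment before use as an SVM kernel; the KBA algorithm is not shown.

```python
import math


def edit_distance(a, b):
    """Levenshtein distance between two symbol sequences (dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,           # delete x
                            curr[j - 1] + 1,       # insert y
                            prev[j - 1] + cost))   # match or substitute
        prev = curr
    return prev[-1]


def alignment_similarity(a, b, gamma=0.5):
    """Illustrative similarity in (0, 1]; gamma is an assumed scale parameter."""
    return math.exp(-gamma * edit_distance(a, b))


# Hypothetical symbol strings: S = straight, L = left turn, R = right turn.
benign = "SSSSLSSSS"
also_benign = "SSSLSSSS"
circling = "LLLLRLLLL"

sim_close = alignment_similarity(benign, also_benign)
sim_far = alignment_similarity(benign, circling)
```

In this sketch two trajectories that differ by a single symbol receive a higher similarity than two that share almost no aligned symbols, which is the property a sequence-alignment kernel relies on when separating suspicious events from benign ones.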
