Multi-camera activity correlation analysis

We propose a novel approach for modelling correlations between activities in a busy public space captured by multiple non-overlapping, uncalibrated cameras. Each camera view is automatically decomposed into semantic regions, across which different spatio-temporal activity patterns are observed. A novel Cross Canonical Correlation Analysis (xCCA) framework is formulated to detect and quantify temporal and causal relationships between regional activities within and across camera views. The approach accomplishes three tasks: (1) estimating the spatial and temporal topology of the camera network; (2) facilitating more robust and accurate person re-identification; and (3) performing global activity modelling and video temporal segmentation by linking visual evidence collected across camera views. Unlike the state of the art, our approach relies on neither intra-camera nor inter-camera tracking. It can therefore be applied to even the most challenging video surveillance settings, characterised by severe occlusions and extremely low spatial and temporal resolutions. Its effectiveness is demonstrated using 153 hours of video from 8 cameras installed in a busy underground station.
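To make the idea of quantifying temporal relationships between regional activities concrete, the following is a minimal sketch of time-lagged canonical correlation between two multivariate activity time series. It is not the xCCA formulation itself, which the abstract does not specify; it only illustrates the general technique of computing standard CCA over a range of candidate delays and taking the delay with the strongest correlation. The function names, the regularisation constant `reg`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def first_canonical_correlation(x, y, reg=1e-6):
    """Largest canonical correlation between row-aligned samples.

    x has shape (n, p) and y has shape (n, q). `reg` is a small ridge term
    (an assumption of this sketch) that keeps the covariances invertible.
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    n = x.shape[0]
    cxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    cyy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    cxy = x.T @ y / (n - 1)
    # Whiten both sides with Cholesky factors; the singular values of the
    # whitened cross-covariance are the canonical correlations.
    lx = np.linalg.cholesky(cxx)
    ly = np.linalg.cholesky(cyy)
    whitened = np.linalg.solve(lx, cxy) @ np.linalg.inv(ly).T
    s = np.linalg.svd(whitened, compute_uv=False)
    return float(min(s[0], 1.0))


def lagged_correlations(x, y, max_lag):
    """Canonical correlation of x(t) with y(t + lag) for lag = 0..max_lag."""
    scores = []
    for lag in range(max_lag + 1):
        xs = x[: len(x) - lag]  # activity in region A at time t
        ys = y[lag:]            # activity in region B at time t + lag
        scores.append(first_canonical_correlation(xs, ys))
    return np.array(scores)


# Synthetic stand-in for two regional activity series: region B repeats
# region A's (linearly mixed) activity 7 frames later, plus noise.
rng = np.random.default_rng(0)
region_a = rng.normal(size=(500, 2))
mixing = np.array([[1.0, 0.5], [-0.3, 2.0]])
region_b = np.roll(region_a, 7, axis=0) @ mixing + 0.1 * rng.normal(size=(500, 2))

scores = lagged_correlations(region_a, region_b, max_lag=20)
best_lag = int(np.argmax(scores))
print("estimated delay (frames):", best_lag)
print("correlation at that delay:", round(float(scores[best_lag]), 3))
```

On this synthetic data the correlation should peak at the injected delay. In a camera network, a strong peak at some delay between regions in two views would hint that the regions are connected and suggest the typical transition time between them.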
