Janus - Multi Source Event Detection and Collection System for Effective Surveillance of Criminal Activity

Recent technological advances make it possible to use large amounts of multimedia data from many sensors with different modalities (e.g., video, text) to detect and characterize criminal activity. Integrating these sources can compensate for the deficiencies of any one sensor or modality by drawing on data from the others. However, building such an integrated system at the scale of neighborhoods and cities is challenging because of the large amount of data to be considered and the need to ensure a short response time to potential criminal activity. In this paper, we present a system that enables multi-modal data collection at scale and automates the detection of events of interest for the surveillance and reconnaissance of criminal activity. The proposed system showcases novel analytical tools that fuse multimedia data streams to automatically detect and identify specific criminal events and activities. More specifically, the system detects and analyzes series of incidents in the spatiotemporal domain to extract events, where an incident is an occurrence or artifact relevant to a criminal activity extracted from a single media stream and an event is an actual instance of criminal activity. It also cross-references multimodal media streams and incidents in time and space to give a human operator a comprehensive view while avoiding information overload. We present several case studies that demonstrate how the proposed system can provide law enforcement personnel with forensic and real-time tools to identify and track potential criminal activity.

Keywords: multi-source, multi-modal event detection, law enforcement, criminal activity, surveillance, security, safety
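The incident-to-event step described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes that incidents carry a timestamp and a location, and that incidents close to each other in both time and space belong to the same candidate event. The `Incident` fields, the `group_incidents` function, and the thresholds `max_gap_s` and `max_dist_m` are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Incident:
    """One observation extracted from a single media stream (hypothetical schema)."""
    stream: str   # illustrative stream id, e.g. "camera-1" or "text-feed"
    t: float      # timestamp in seconds
    lat: float    # latitude in degrees
    lon: float    # longitude in degrees
    label: str    # free-text description of what was detected


def haversine_m(a: Incident, b: Incident) -> float:
    """Great-circle distance between two incidents, in meters."""
    earth_radius_m = 6_371_000.0
    dlat = radians(b.lat - a.lat)
    dlon = radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * earth_radius_m * asin(sqrt(h))


def group_incidents(incidents, max_gap_s=300.0, max_dist_m=200.0):
    """Group incidents into candidate events by spatiotemporal proximity.

    Two incidents are linked when they are within max_gap_s seconds and
    max_dist_m meters of each other (assumed thresholds). Linked incidents
    are merged transitively with a union-find structure.
    """
    parent = list(range(len(incidents)))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(incidents)):
        for j in range(i + 1, len(incidents)):
            a, b = incidents[i], incidents[j]
            if abs(a.t - b.t) <= max_gap_s and haversine_m(a, b) <= max_dist_m:
                parent[find(i)] = find(j)

    groups = {}
    for i, incident in enumerate(incidents):
        groups.setdefault(find(i), []).append(incident)
    # Each group is a candidate event, ordered by time.
    return [sorted(g, key=lambda inc: inc.t) for g in groups.values()]


def is_multi_source(event) -> bool:
    """True when the candidate event is supported by more than one media stream."""
    return len({inc.stream for inc in event}) > 1
```

Under these assumptions, a video incident and a text incident reported a minute apart on the same block would fall into one candidate event, and `is_multi_source` would flag it as corroborated across streams. The pairwise comparison is quadratic in the number of incidents, so a city-scale deployment would need a spatial or temporal index instead.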
