Video representation and suspicious event detection using semantic technologies

Due to the widespread deployment of surveillance systems and IoT applications, the volume of surveillance data is growing rapidly. Storing and analyzing video surveillance data is a significant challenge, requiring video interpretation and event detection together with the related context. Low-level features of multimedia content, such as shape, texture, and color, are extracted and represented in symbolic form. In this work, a methodology is proposed that uses machine learning techniques to extract the salient features and properties typical of the surveillance domain and represents this information using a domain ontology tailored explicitly to the detection of certain activities. The ontology is developed to include concepts and properties applicable to the surveillance domain and its applications, and the extracted features are represented as Linked Data using this ontology. The approach is validated through an implementation and evaluated by recognizing suspicious activity in an open parking space. Suspicious activity detection is formalized through inference rules and SPARQL queries. Overall, Semantic Web technologies prove to be an effective toolchain for interpreting videos, opening novel possibilities for video scene representation and for the detection of complex events without human involvement. To the best of our knowledge, no existing method represents the frame-level information of a video in a structured form and performs event detection on it while reducing storage and enhancing semantically aided retrieval of video data. A video dataset covering six different, unusual suspicious activities has also been built, which may be useful for activity recognition problems in other smart parking scenarios.
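As a rough illustration of the idea of representing frame-level detections as structured statements and detecting an event with a rule, the following minimal Python sketch stores observations as subject-predicate-object triples and applies one hand-written rule. It is not the implementation described above: the vocabulary (`ex:Observation`, `ex:inZone`, `ex:atSecond`, `ex:ofTrack`), the "loitering" rule, and the time threshold are all hypothetical and chosen only for illustration. A plain Python set stands in for an RDF store, and a Python function stands in for the inference rules and SPARQL queries.

```python
from collections import defaultdict

# Hypothetical vocabulary: the actual ontology terms are not given in the abstract.
TYPE, IN_ZONE, AT_TIME, TRACK = "rdf:type", "ex:inZone", "ex:atSecond", "ex:ofTrack"


def detections_to_triples(detections):
    """Turn per-frame detector output into (subject, predicate, object) triples."""
    triples = set()
    for i, det in enumerate(detections):
        obs = f"ex:obs{i}"  # one observation node per detection
        triples.add((obs, TYPE, "ex:Observation"))
        triples.add((obs, TRACK, det["track"]))
        triples.add((obs, AT_TIME, det["second"]))
        triples.add((obs, IN_ZONE, det["zone"]))
        triples.add((det["track"], TYPE, det["label"]))
    return triples


def objects(triples, subject, predicate):
    """Return all objects o such that (subject, predicate, o) is in the store."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def find_loitering(triples, zone, min_seconds):
    """Assumed rule: a Person track whose observations in `zone` span at least
    `min_seconds` is flagged. Returns the flagged track identifiers, sorted."""
    seconds_by_track = defaultdict(list)
    for s, p, o in triples:
        if p == IN_ZONE and o == zone:
            track = objects(triples, s, TRACK)[0]
            if "ex:Person" in objects(triples, track, TYPE):
                seconds_by_track[track].append(objects(triples, s, AT_TIME)[0])
    return sorted(
        track
        for track, secs in seconds_by_track.items()
        if max(secs) - min(secs) >= min_seconds
    )


# Made-up detections: one person stays 90 s, one person stays 10 s, one car stays 100 s.
example_detections = [
    {"track": "ex:t1", "label": "ex:Person", "second": 0, "zone": "ex:ParkingLot"},
    {"track": "ex:t1", "label": "ex:Person", "second": 90, "zone": "ex:ParkingLot"},
    {"track": "ex:t2", "label": "ex:Car", "second": 0, "zone": "ex:ParkingLot"},
    {"track": "ex:t2", "label": "ex:Car", "second": 100, "zone": "ex:ParkingLot"},
    {"track": "ex:t3", "label": "ex:Person", "second": 10, "zone": "ex:ParkingLot"},
    {"track": "ex:t3", "label": "ex:Person", "second": 20, "zone": "ex:ParkingLot"},
]
example_triples = detections_to_triples(example_detections)
flagged_tracks = find_loitering(example_triples, "ex:ParkingLot", min_seconds=60)
```

In a Semantic Web setting, the triples would typically be stored in an RDF graph and the rule expressed declaratively rather than in application code; the sketch only shows why a structured, frame-level representation makes such a rule short to state.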
