A human-like description of scene events for a proper UAV-based video content analysis

Abstract In the age of video surveillance, monitoring activity, especially from unmanned vehicles, requires some degree of autonomy in interpreting the scenario. Video analysis tasks are crucial for target tracking and recognition; however, a further level of understanding is desirable, one that provides a comprehensive, high-level scene description and mirrors the human cognitive ability to summarize a scene concisely from the relationships and actions of the objects involved. This paper presents a smart system that automatically identifies mobile scene objects, such as people and vehicles, in videos acquired by drones in flight, along with the activities they carry out, so as to depict what happens in the scene from a high-level perspective. The system uses artificial vision methods to detect and track the mobile objects and the area where they move, and Semantic Web technologies to provide a high-level description of the scenario. Spatio-temporal relations among the tracked objects, as well as simple object activities (events), are described. Through semantic reasoning, the system connects simple activities into more complex ones that better reflect a human-like description of a portion of the scenario. Tests conducted on several videos, showing scenarios set in different environments, return convincing results that support the effectiveness of the proposed approach.
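To make the idea of connecting simple activities into complex ones concrete, the following is a minimal sketch in plain Python. It is not the system described above, which relies on Semantic Web technologies; here a small rule table and a temporal "precedes" check stand in for ontology-based reasoning. The `Event` type, the action labels, the `RULES` table, and the `max_gap` threshold are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    subject: str   # tracked object id, e.g. "person_1" (hypothetical)
    action: str    # simple activity label (hypothetical vocabulary)
    start: float   # start time in seconds
    end: float     # end time in seconds


def precedes(a, b, max_gap=1.0):
    """True if a ends no later than b starts, with at most max_gap seconds between."""
    return 0.0 <= b.start - a.end <= max_gap


# Hypothetical composition rules: (first action, second action) -> complex activity.
RULES = {
    ("walks", "stops"): "arrives",
    ("stops", "walks"): "departs",
}


def compose(events, max_gap=1.0):
    """Derive complex activities from pairs of simple events of the same subject."""
    ordered = sorted(events, key=lambda e: e.start)
    found = []
    for i, first in enumerate(ordered):
        for second in ordered[i + 1:]:
            if first.subject != second.subject:
                continue
            label = RULES.get((first.action, second.action))
            if label and precedes(first, second, max_gap):
                # The complex activity spans both simple events.
                found.append(Event(first.subject, label, first.start, second.end))
    return found


if __name__ == "__main__":
    simple_events = [
        Event("person_1", "walks", 0.0, 4.0),
        Event("person_1", "stops", 4.5, 9.0),
        Event("car_1", "moves", 1.0, 8.0),
    ]
    for activity in compose(simple_events):
        print(activity)
```

In an ontology-based setting, the rule table would presumably be replaced by declarative rules over classes and properties, with a reasoner doing the matching; the sketch only shows the shape of the inference.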
