Normalized Metadata Generation for Human Retrieval Using Multiple Video Surveillance Cameras

Surveillance personnel cannot continuously monitor every video stream from a multiple-camera surveillance system, so an efficient technique is needed to help recognize important situations by retrieving the metadata of an object-of-interest. In a multiple-camera system, an object detected in one camera has a different shape in another camera, which is a critical issue for wide-range, real-time surveillance. To address this problem, this paper presents an object retrieval method that extracts normalized metadata of an object-of-interest from multiple, heterogeneous cameras. The proposed metadata generation algorithm consists of three steps: (i) generation of a three-dimensional (3D) human model; (ii) human object-based automatic scene calibration; and (iii) metadata generation. More specifically, an appropriately generated 3D human model provides the foot-to-head direction information that is used as the input to the automatic calibration of each camera. The normalized object information is stored as metadata and used to retrieve an object-of-interest in a wide-range, multiple-camera surveillance system. Experimental results show that the 3D human model matches the ground truth, and that automatic calibration-based normalization of metadata enables successful retrieval and tracking of a human object in the multiple-camera video surveillance system.
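To illustrate why calibration makes metadata comparable across cameras, the following is a minimal sketch of one common way to turn a foot and head image point into a physical height once a camera's 3x4 projection matrix is known. It is not the paper's algorithm: the function names (`project`, `ground_point`, `estimate_height`), the assumption that the foot lies on the ground plane Z = 0, and the assumption that the head is vertically above the foot are all illustrative choices made here.

```python
def project(P, point):
    """Project a 3D world point (X, Y, Z) to pixel coordinates with a 3x4 matrix P."""
    X, Y, Z = point
    rows = [r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in P]
    return rows[0] / rows[2], rows[1] / rows[2]


def ground_point(P, foot):
    """Back-project a foot pixel onto the ground plane Z = 0 (assumed)."""
    u, v = foot
    # Two linear equations in the unknown ground coordinates X and Y.
    a11, a12 = P[0][0] - u * P[2][0], P[0][1] - u * P[2][1]
    a21, a22 = P[1][0] - v * P[2][0], P[1][1] - v * P[2][1]
    b1, b2 = u * P[2][3] - P[0][3], v * P[2][3] - P[1][3]
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("degenerate camera/ground configuration")
    # Cramer's rule for the 2x2 system.
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


def estimate_height(P, foot, head):
    """Least-squares height of a point vertically above the foot (assumed)."""
    X, Y = ground_point(P, foot)
    u, v = head
    # Part of each projected row that does not depend on the height h.
    a = [r[0] * X + r[1] * Y + r[3] for r in P]
    # Each image coordinate gives one equation c * h = d.
    c = [u * P[2][2] - P[0][2], v * P[2][2] - P[1][2]]
    d = [a[0] - u * a[2], a[1] - v * a[2]]
    denom = c[0] ** 2 + c[1] ** 2
    if denom < 1e-12:
        raise ValueError("height is not observable from this view")
    return (c[0] * d[0] + c[1] * d[1]) / denom
```

Because the result is expressed in world units rather than pixels, the same person should yield a similar height value from any calibrated camera, which is the kind of camera-independent attribute that normalized metadata relies on.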
