Paper Information - Normalized Metadata Generation for Human Retrieval Using Multiple Video Surveillance Cameras


Because surveillance personnel cannot continuously monitor every video stream in a multiple-camera surveillance system, an efficient technique is needed to help recognize important situations by retrieving the metadata of an object-of-interest. In such a system, the same object appears with a different shape in each camera, which is a critical problem for wide-range, real-time surveillance. To address this problem, this paper presents an object retrieval method that extracts the normalized metadata of an object-of-interest from multiple, heterogeneous cameras. The proposed metadata generation algorithm consists of three steps: (i) generation of a three-dimensional (3D) human model; (ii) human object-based automatic scene calibration; and (iii) metadata generation. More specifically, an appropriately generated 3D human model provides the foot-to-head direction information that is used as the input to the automatic calibration of each camera. The normalized object information, stored in the form of metadata, is then used to retrieve an object-of-interest in a wide-range, multiple-camera surveillance system. Experimental results show that the 3D human model matches the ground truth, and that automatic calibration-based normalization of metadata enables successful retrieval and tracking of a human object in the multiple-camera video surveillance system.
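To make the role of the foot-to-head direction more concrete, the following is a minimal sketch of one standard way such information can feed a calibration step. It is not taken from the paper. In a pinhole camera, the images of upright people lie on lines that meet at the vertical vanishing point, so that point can be estimated as the least-squares intersection of several foot-to-head lines. The function name `vertical_vanishing_point` and the sample pixel coordinates are hypothetical, and the abstract does not say whether the authors use this particular estimator.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]  # (foot, head) in pixel coordinates


def vertical_vanishing_point(segments: List[Segment]) -> Point:
    """Least-squares intersection of the lines through (foot, head) pairs.

    Each line is written as a*x + b*y + c = 0 with a unit normal (a, b),
    so a*x + b*y + c is the signed distance from (x, y) to the line.
    The returned point minimizes the sum of squared distances to all lines.
    """
    saa = sab = sbb = sac = sbc = 0.0
    for (fx, fy), (hx, hy) in segments:
        # Normal of the foot-to-head line (direction rotated by 90 degrees).
        a, b = hy - fy, fx - hx
        norm = (a * a + b * b) ** 0.5
        if norm == 0.0:
            continue  # degenerate segment: foot and head coincide
        a, b = a / norm, b / norm
        c = -(a * fx + b * fy)
        saa += a * a
        sab += a * b
        sbb += b * b
        sac += a * c
        sbc += b * c

    # Solve the 2x2 normal equations with Cramer's rule.
    det = saa * sbb - sab * sab
    if abs(det) < 1e-12:
        raise ValueError("lines are (nearly) parallel; no finite intersection")
    x = (-sac * sbb + sab * sbc) / det
    y = (-saa * sbc + sab * sac) / det
    return x, y


if __name__ == "__main__":
    # Hypothetical foot/head detections of three upright people.
    observations = [
        ((0.0, 400.0), (20.0, 220.0)),
        ((100.0, 400.0), (100.0, 220.0)),
        ((200.0, 400.0), (180.0, 220.0)),
    ]
    print(vertical_vanishing_point(observations))
```

With noisy detections, a robust estimator over many foot-to-head pairs would usually replace the plain least-squares fit shown here. The sketch only illustrates how per-person direction cues can be combined into a single scene-level quantity.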
