An efficient and effective method for people detection from top-view depth cameras

Detecting people in video is an enabling technology for many computer vision applications, in security and safety as well as in business intelligence. A depth sensor mounted in a top-view position is often adopted to achieve high detection accuracy, because it copes effectively with occlusions and difficult lighting conditions. In this paper, we propose a new method for people detection from depth maps produced by sensors mounted in a zenithal position. The method is designed to provide an optimal trade-off between detection accuracy and computational complexity. The proposed approach adopts a dynamic background modeling strategy to find the objects of interest in the scene; a lightweight algorithm then filters out the noise from the foreground image and determines the positions of the persons in the scene. An experimental analysis carried out on a large public dataset demonstrates that the method is fast and accurate. The method has been compared with two approaches from the literature for people detection from a depth camera mounted in a zenithal position: an unsupervised method that is fast but not highly accurate, and a supervised one that, conversely, is very accurate but less computationally efficient. The proposed method achieves accuracy comparable to that of the supervised approach while using very few computational resources, reducing processing times by an order of magnitude.
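
The two stages named in the abstract (dynamic background modeling, then noise filtering and localization on the foreground) can be illustrated with a minimal sketch. The abstract does not specify the actual algorithms, so everything concrete here is an assumption made for illustration: a running-average background update, a fixed height threshold, 4-connected blob labeling with a minimum-area noise filter, and all function names and parameter values (`alpha`, `min_height`, `min_area`). Depth values are taken to be distances from the overhead sensor in millimetres, and plain Python lists are used to keep the example dependency-free.

```python
from collections import deque


def foreground_mask(background, frame, min_height=300):
    """Flag pixels that are closer to the overhead sensor than the
    background by at least min_height (assumed unit: millimetres)."""
    return [[(b - d) >= min_height for b, d in zip(brow, drow)]
            for brow, drow in zip(background, frame)]


def update_background(background, frame, mask, alpha=0.05):
    """Running-average update (an assumed strategy), applied only to
    pixels that are not currently flagged as foreground."""
    return [[b if fg else (1 - alpha) * b + alpha * d
             for b, d, fg in zip(brow, drow, mrow)]
            for brow, drow, mrow in zip(background, frame, mask)]


def detect_people(mask, min_area=4):
    """Label 4-connected foreground blobs, drop blobs smaller than
    min_area as noise, and return the (row, col) centroid of the rest."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill collecting one blob.
            queue = deque([(r, c)])
            seen[r][c] = True
            blob = []
            while queue:
                y, x = queue.popleft()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(blob) >= min_area:
                cy = sum(p[0] for p in blob) / len(blob)
                cx = sum(p[1] for p in blob) / len(blob)
                centroids.append((cy, cx))
    return centroids


# Toy 6x8 scene: floor at 3000 mm, one 2x2 "head" at 1300 mm,
# one isolated noisy pixel, and slight sensor drift at (0, 0).
background = [[3000.0] * 8 for _ in range(6)]
frame = [row[:] for row in background]
for y in (1, 2):
    for x in (2, 3):
        frame[y][x] = 1300.0
frame[4][6] = 2000.0   # single-pixel noise, removed by the area filter
frame[0][0] = 3010.0   # small drift, absorbed by the background model

mask = foreground_mask(background, frame)
people = detect_people(mask)
new_bg = update_background(background, frame, mask)
print(people)  # [(1.5, 2.5)]
```

In this toy scene the isolated pixel passes the height threshold but is discarded by the area filter, while the background model absorbs the small drift at the corner pixel and leaves pixels covered by the detected blob unchanged.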
