People Counting in Crowded Environment and Re-identification

Detecting people and understanding their behaviour automatically is one of the key capabilities of modern intelligent video systems. This interest arises from societal needs: security and video analytics, intelligent retail environments, and activities of daily living are just a few of the possible applications. The problem remains largely open because of several serious challenges, such as occlusion, changes in appearance, and complex, dynamic backgrounds. Moreover, in recent years growing privacy concerns have made the design of these systems more challenging, since they must also cope with regulations that differ from country to country. RGB-D cameras are popular sensors for this task because of their availability, reliability and affordability. Studies have demonstrated the great value of depth cameras, in both accuracy and efficiency, in coping with severe occlusions among people and with complex backgrounds. In particular, RGB-D cameras show great potential when used in a top-view configuration, which achieves high performance even in crowded environments (here, at least 3 people per square metre in the area covered by the camera), minimizes occlusions and is also the most privacy-compliant approach. The first step in people detection and tracking is segmentation, which retrieves people's silhouettes. For this reason, different methods are covered in this chapter, ranging from classical approaches based on handcrafted features to deep learning techniques. These techniques also solve the nontrivial problem of blob collision, which occurs when two or more people are close enough to form a single blob from the camera's point of view. Multilevel segmentation and water filling algorithms are presented as handcrafted-feature-based methods, and a deep learning approach from the literature is also introduced. In the methods presented in this chapter, processing occurs live (no images are recorded) and on the edge, following an IoT paradigm. Live analysis also strengthens the aforementioned privacy compliance.

The last part of this chapter is dedicated to person re-identification (re-id), the process of determining whether different instances or images, recorded at different moments, belong to the same person. Person re-id has many important applications in video surveillance because it saves the human effort of exhaustively searching for a person in large amounts of video. Identification cameras are widely deployed in public places such as malls, office buildings, airports, stations and museums. These camera networks generally provide enhanced coverage of large geospatial areas because their fields of view do not overlap. They produce huge amounts of video data, which are monitored in real time by law enforcement officers or used after the event for forensic purposes. Automated analysis of these data significantly improves the quality of monitoring and also speeds up processing. Handcrafted anthropometric features coupled with a machine learning approach are exploited in this chapter, and a deep learning approach is then presented for comparison. Different metrics are adopted to evaluate and compare the above algorithms.
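
To make the blob collision problem concrete, the following minimal sketch shows the idea behind thresholding a top-view depth image at more than one height level. It is an illustration under stated assumptions, not the multilevel segmentation or water filling algorithm presented in the chapter. The height map, the two threshold levels (100 cm and 160 cm) and the helper names `label_blobs` and `count_blobs_at_level` are all hypothetical.

```python
from collections import deque


def label_blobs(mask):
    """Count 4-connected regions of True cells in a 2D boolean grid."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            # New blob found: flood-fill it with a breadth-first search.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


def count_blobs_at_level(height_map, level_cm):
    """Keep only cells at or above level_cm, then count the resulting blobs."""
    mask = [[h >= level_cm for h in row] for row in height_map]
    return label_blobs(mask)


# Hypothetical top-view height map (cm above the floor): two people standing
# shoulder to shoulder, so their shoulders touch but their heads do not.
height_map = [
    [0,   0,   0,   0,   0,   0, 0],
    [0, 140, 145, 140, 142, 140, 0],
    [0, 145, 172, 145, 168, 143, 0],
    [0, 140, 145, 140, 142, 140, 0],
    [0,   0,   0,   0,   0,   0, 0],
]

low = count_blobs_at_level(height_map, 100)   # shoulder level: people merge
high = count_blobs_at_level(height_map, 160)  # head level: people separate
print(low, high)  # 1 2
```

At the lower level the two people collide into a single blob, while at the higher level only the two heads remain and are counted separately. A practical system would have to choose the levels from the data rather than fix them by hand, which is one of the issues the methods in the chapter deal with.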
