Crowd Localization From Gaussian Mixture Scoped Knowledge and Scoped Teacher

Crowd localization aims to predict the head position of every instance in a crowd scene. Because pedestrians stand at varying distances from the camera, the scales of instances within a single image differ enormously, a phenomenon called intrinsic scale shift. Intrinsic scale shift is one of the most essential issues in crowd localization because it is ubiquitous in crowd scenes and makes the scale distribution chaotic. This paper therefore concentrates on tackling the chaos in the scale distribution incurred by intrinsic scale shift. We propose Gaussian Mixture Scope (GMS) to regularize the chaotic scale distribution. Concretely, GMS fits a Gaussian mixture distribution to the scale distribution and decouples the mixture model into normal sub-distributions to regularize the chaos within each sub-distribution. An alignment is then introduced to regularize the chaos among the sub-distributions. However, although GMS is effective in regularizing the data distribution, it amounts to dislodging the hard samples from the training set, which incurs overfitting. We attribute this to a blockage in transferring the latent knowledge exploited by GMS from data to model. Therefore, we propose a Scoped Teacher that acts as a bridge in knowledge transfer. Moreover, consistency regularization is introduced to implement the knowledge transfer. To that end, further constraints are deployed on the Scoped Teacher to enforce feature consistency between the teacher and student ends. With the proposed GMS and Scoped Teacher applied to four mainstream crowd localization datasets, extensive experiments demonstrate the superiority of our work. Moreover, compared with existing crowd locators, our work achieves state-of-the-art F1-measure on all four datasets.
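To make the idea of fitting a Gaussian mixture to the scale distribution and decoupling it into sub-distributions more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each head has a single scalar scale (for example, a box side length in pixels), uses plain expectation-maximization on synthetic data, and assigns every sample to the component most responsible for it. The function names `fit_gmm_1d` and `assign_scope`, the choice of two components, and the synthetic scales are all hypothetical; the alignment step and the Scoped Teacher are not covered.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a 1-D normal distribution."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_gmm_1d(scales, k=2, iters=50, min_std=1e-3):
    """Fit a k-component 1-D Gaussian mixture to `scales` with plain EM."""
    data = sorted(scales)
    n = len(data)
    # Initialise means from evenly spaced quantiles, with a shared spread.
    means = [data[int((i + 0.5) * n / k)] for i in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    stds = [max(math.sqrt(overall_var), min_std)] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            p = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and standard deviations.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            stds[j] = max(math.sqrt(var), min_std)
    return weights, means, stds


def assign_scope(x, weights, means, stds):
    """Index of the sub-distribution most responsible for scale x."""
    scores = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    return max(range(len(scores)), key=scores.__getitem__)


# Synthetic head scales (assumption): far, small heads and near, large heads.
random.seed(0)
scales = [random.gauss(8.0, 1.0) for _ in range(200)]
scales += [random.gauss(40.0, 4.0) for _ in range(200)]

weights, means, stds = fit_gmm_1d(scales, k=2)
small_scope = assign_scope(8.0, weights, means, stds)
large_scope = assign_scope(40.0, weights, means, stds)
```

Once each sample is assigned to a sub-distribution, any per-scope regularization can operate on samples whose scales are far more homogeneous than in the raw image, which is the intuition the abstract describes.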
