Neuron Linear Transformation: Modeling the Domain Shift for Crowd Counting

Cross-domain crowd counting (CDCC) is a hot topic due to its importance in public safety. The purpose of CDCC is to reduce the domain shift between the source and target domain. Recently, typical methods attempt to extract domain-invariant features via image translation and adversarial learning. When it comes to specific tasks, we find that the final manifestation of the task gap is in the parameters of the model, and the domain shift can be represented apparently by the differences in model weights. To describe the domain gap directly at the parameter-level, we propose a Neuron Linear Transformation (NLT) method, where NLT is exploited to learn the shift at neuron-level and then transfer the source model to the target model. Specifically, for a specific neuron of a source model, NLT exploits few labeled target data to learn a group of parameters, which updates the target neuron via a linear transformation. Extensive experiments and analysis on six real-world datasets validate that NLT achieves top performance compared with other domain adaptation methods. An ablation study also shows that the NLT is robust and more effective compare with supervised and fine-tune training. Furthermore, we will release the code after the paper is accepted.

[1]  Yang Wang,et al.  Crowd Counting Using Scale-Aware Attention Networks , 2019, 2019 IEEE Winter Conference on Applications of Computer Vision (WACV).

[2]  Xiao Liu,et al.  Learning to Track Multiple Targets , 2015, IEEE Transactions on Neural Networks and Learning Systems.

[3]  Antoni B. Chan,et al.  Crowd Counting by Adaptively Fusing Predictions from an Image Pyramid , 2018, BMVC.

[4]  Shiv Surya,et al.  Switching Convolutional Neural Network for Crowd Counting , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Gaoxiang Ouyang,et al.  Cross-Level Parallel Network for Crowd Counting , 2020, IEEE Transactions on Industrial Informatics.

[6]  Tjeng Wawan Cenggoro,et al.  Feature Pyramid Networks for Crowd Counting , 2019, Procedia Computer Science.

[7]  Xuelong Li,et al.  Property-Constrained Dual Learning for Video Summarization , 2019, IEEE Transactions on Neural Networks and Learning Systems.

[8]  Rongrong Ji,et al.  Body Structure Aware Deep Crowd Counting , 2018, IEEE Transactions on Image Processing.

[9]  Yihong Gong,et al.  Bayesian Loss for Crowd Count Estimation With Point Supervision , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[10]  Luca Bertinetto,et al.  Fully-Convolutional Siamese Networks for Object Tracking , 2016, ECCV Workshops.

[11]  Haroon Idrees,et al.  Detecting Humans in Dense Crowds Using Locally-Consistent Scale Prior and Global Occlusion Reasoning , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12]  Qi Wang,et al.  Domain-adaptive Crowd Counting via Inter-domain Features Segregation and Gaussian-prior Reconstruction , 2019, arXiv.org.

[13]  Wei Lin,et al.  Learning From Synthetic Data for Crowd Counting in the Wild , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Nuno Vasconcelos,et al.  Modeling, Clustering, and Segmenting Video with Mixtures of Dynamic Textures , 2008, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15]  Haidi Ibrahim,et al.  Recent survey on crowd density estimation and counting for visual surveillance , 2015, Eng. Appl. Artif. Intell..

[16]  Yadong Mu,et al.  Recurrent Attentive Zooming for Joint Crowd Counting and Precise Localization , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Fei Su,et al.  Scale Aggregation Network for Accurate and Efficient Crowd Counting , 2018, ECCV.

[18]  Xiaochun Cao,et al.  Background Noise Filtering and Distribution Dividing for Crowd Counting , 2020, IEEE Transactions on Image Processing.

[19]  Xiaogang Jin,et al.  Quadruplet Network With One-Shot Learning for Fast Visual Object Tracking , 2017, IEEE Transactions on Image Processing.

[20]  Pascal Fua,et al.  Context-Aware Crowd Counting , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[21]  Ling Shao,et al.  Dynamical Hyperparameter Optimization via Deep Reinforcement Learning in Tracking , 2019, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Ruigang Yang,et al.  Inferring Salient Objects from Human Fixations , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[23]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[24]  Li Fei-Fei Knowledge transfer in learning to recognize visual objects classes , 2006 .

[25]  Bing Zhou,et al.  Learning Multi-Level Density Maps for Crowd Counting , 2020, IEEE Transactions on Neural Networks and Learning Systems.

[26]  Antoni B. Chan,et al.  Adaptive Density Map Generation for Crowd Counting , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[27]  Michael Fink,et al.  Object Classification from a Single Example Utilizing Class Relevance Metrics , 2004, NIPS.

[28]  Shaogang Gong,et al.  Feature Mining for Localised Crowd Counting , 2012, BMVC.

[29]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[30]  Xin Geng,et al.  Crowd counting in public video surveillance by label distribution learning , 2015, Neurocomputing.

[31]  Yuan Yuan,et al.  Focus on Semantic Consistency for Cross-Domain Crowd Understanding , 2020, ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[32]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[33]  Haroon Idrees,et al.  Composition Loss for Counting, Density Map Estimation and Localization in Dense Crowds , 2018, ECCV.

[34]  Luca Bertinetto,et al.  Learning feed-forward one-shot learners , 2016, NIPS.

[35]  Shimon Ullman,et al.  Cross-generalization: learning novel classes from a single example by feature replacement , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[36]  Baoyuan Wu,et al.  Residual Regression With Semantic Prior for Crowd Counting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[37]  Liang He,et al.  Adaptive Scenario Discovery for Crowd Counting , 2018, ICASSP 2019 - 2019 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[38]  Arjan Kuijper,et al.  Beyond Group: Multiple Person Tracking via Minimal Topology-Energy-Variation , 2017, IEEE Transactions on Image Processing.

[39]  Joshua Achiam,et al.  On First-Order Meta-Learning Algorithms , 2018, ArXiv.

[40]  Junping Zhang,et al.  PaDNet: Pan-Density Crowd Counting , 2018, IEEE Transactions on Image Processing.

[41]  Liang Lin,et al.  Crowd Counting using Deep Recurrent Spatial-Aware Network , 2018, IJCAI.

[42]  Jianbing Shen,et al.  Triplet Loss in Siamese Network for Object Tracking , 2018, ECCV.

[43]  Xuelong Li,et al.  NWPU-Crowd: A Large-Scale Benchmark for Crowd Counting and Localization , 2021, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[44]  拓海 杉山,et al.  “Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks”の学習報告 , 2017 .

[45]  Nuno Vasconcelos,et al.  Privacy preserving crowd monitoring: Counting people without people models or tracking , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[46]  Jing Qin,et al.  Crowd Counting Via Cross-Stage Refinement Networks , 2020, IEEE Transactions on Image Processing.

[47]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[48]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[49]  Bernt Schiele,et al.  Meta-Transfer Learning for Few-Shot Learning , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[50]  Daniel Oñoro-Rubio,et al.  Towards Perspective-Free Object Counting with Deep Learning , 2016, ECCV.

[51]  Vishal M. Patel,et al.  HA-CCN: Hierarchical Attention-Based Crowd Counting Network , 2019, IEEE Transactions on Image Processing.

[52]  Mark W. Schmidt,et al.  Where are the Blobs: Counting by Localization with Point Supervision , 2018, ECCV.

[53]  Xiaogang Wang,et al.  Pedestrian Behavior Modeling From Stationary Crowds With Applications to Intelligent Surveillance , 2016, IEEE Transactions on Image Processing.

[54]  Lu Zhang,et al.  Crowd Counting via Scale-Adaptive Convolutional Neural Network , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[55]  Ali Borji,et al.  Salient Object Detection Driven by Fixation Prediction , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[56]  Eero P. Simoncelli,et al.  Image quality assessment: from error visibility to structural similarity , 2004, IEEE Transactions on Image Processing.

[57]  Xin Geng,et al.  Indoor Crowd Counting by Mixture of Gaussians Label Distribution Learning , 2019, IEEE Transactions on Image Processing.

[58]  Daan Wierstra,et al.  Meta-Learning with Memory-Augmented Neural Networks , 2016, ICML.

[59]  Xiaogang Wang,et al.  Data-Driven Crowd Understanding: A Baseline for a Large-Scale Crowd Dataset , 2016, IEEE Transactions on Multimedia.

[60]  Ling Shao,et al.  Hyperparameter Optimization for Tracking with Continuous Deep Q-Learning , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[61]  Feiping Nie,et al.  Detecting Coherent Groups in Crowd Scenes by Multiview Clustering , 2020, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[62]  Lena Maier-Hein,et al.  Clickstream Analysis for Crowd-Based Object Segmentation with Confidence , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[63]  Wei Lin,et al.  C^3 Framework: An Open-source PyTorch Code for Crowd Counting , 2019, ArXiv.

[64]  Vishal M. Patel,et al.  Generating High-Quality Crowd Density Maps Using Contextual Pyramid CNNs , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[65]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[66]  Yang Wang,et al.  One-Shot Scene-Specific Crowd Counting , 2019, BMVC.

[67]  Hieu Le,et al.  Iterative Crowd Counting , 2018, ECCV.

[68]  Hugo Larochelle,et al.  Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[69]  Yuan Yuan,et al.  Feature-Aware Adaptation and Density Alignment for Crowd Counting in Video Surveillance , 2020, IEEE Transactions on Cybernetics.

[70]  Chongyang Zhang,et al.  Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[71]  Shenghua Gao,et al.  Single-Image Crowd Counting via Multi-Column Convolutional Neural Network , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[72]  Yuan Yuan,et al.  Feature-aware Adaptation and Structured Density Alignment for Crowd Counting in Video Surveillance , 2019, ArXiv.

[73]  Bin Zhao,et al.  CAM-RNN: Co-Attention Model Based RNN for Video Captioning , 2019, IEEE Transactions on Image Processing.

[74]  Nuno Vasconcelos,et al.  Anomaly Detection and Localization in Crowded Scenes , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[75]  Yuan Yuan,et al.  Pixel-Wise Crowd Understanding via Synthetic Data , 2020, International Journal of Computer Vision.

[76]  Yang Wang,et al.  Few-Shot Scene Adaptive Crowd Counting Using Meta-Learning , 2020, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).

[77]  Yuhong Li,et al.  CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.