Multi-Modal Cross Learning for Improved People Counting using Short-Range FMCW Radar

Radar systems enable contactless sensing of multiple persons in their field of view. In this paper, we propose a novel people counting system using a 60-GHz frequency-modulated continuous-wave (FMCW) radar sensor. The proposed deep convolutional neural network learns from supervised radar data and also through knowledge distillation, via multi-modal cross-learning of representations from a synchronized camera-based deep convolutional neural network. To overcome several shortcomings of the radar data, we propose a novel multi-modal cross-learning algorithm that leverages the high-level abstractions learnt from the camera modality. We also propose a novel focal-regularized loss function to facilitate improved feature learning. We demonstrate the superior performance of our proposed solution in counting up to 4 people, and in detecting the presence of more than 4 people, in an indoor environment, in comparison to state-of-the-art radar-based uni-modal learning.
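The abstract does not state the loss function or the distillation objective, so the following is only an illustrative sketch of how a focal classification term and a camera-to-radar feature-matching term might be combined. Everything in it is an assumption made for illustration: the function names, the six count classes (0 to 4 people plus "more than 4"), the mean-squared-error feature matching, and the `gamma` and `alpha` weights. It should not be read as the authors' formulation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def focal_loss(logits, label, gamma=2.0):
    """Standard focal loss for one sample: -(1 - p_t)^gamma * log(p_t).

    With gamma = 0 this reduces to ordinary cross-entropy.
    """
    p_t = max(softmax(logits)[label], 1e-12)  # guard against log(0)
    return -((1.0 - p_t) ** gamma) * math.log(p_t)

def feature_distillation(radar_feat, camera_feat):
    """Mean squared error between radar (student) and camera (teacher) features.

    Assumption: both feature vectors have the same length.
    """
    n = len(radar_feat)
    return sum((r - c) ** 2 for r, c in zip(radar_feat, camera_feat)) / n

def cross_learning_loss(logits, label, radar_feat, camera_feat,
                        gamma=2.0, alpha=0.5):
    """Hypothetical combined objective: focal term + alpha * feature matching."""
    return (focal_loss(logits, label, gamma)
            + alpha * feature_distillation(radar_feat, camera_feat))

# Example: six assumed classes (0, 1, 2, 3, 4, >4 people), true class "2".
radar_logits = [0.1, 0.4, 2.0, 0.3, -0.5, -1.0]
loss = cross_learning_loss(radar_logits, 2,
                           radar_feat=[0.2, 0.7, 0.1],
                           camera_feat=[0.3, 0.6, 0.0])
print(f"combined loss: {loss:.4f}")
```

In this sketch the camera features act only as a training-time target, so a radar network trained this way would still run on radar input alone at inference, which is the usual motivation for distilling from a second modality.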
