mm-Pose: Real-Time Human Skeletal Posture Estimation Using mmWave Radars and CNNs

This paper proposes mm-Pose, a novel approach to detecting and tracking human skeletons in real time using an mmWave radar. To the best of the authors’ knowledge, this is the first method to detect more than 15 distinct skeletal joints using mmWave radar reflection signals. The proposed method could find applications in traffic monitoring systems, autonomous vehicles, patient monitoring systems, and defense, where detecting and tracking the human skeleton in real time supports effective and preventive decision making. The use of radar makes the system operationally robust to scene lighting and adverse weather conditions. The reflected radar point cloud is first resolved in range, azimuth, and elevation and projected onto the Range-Azimuth and Range-Elevation planes. A novel low-size, high-resolution radar-to-image representation is also presented that overcomes the sparsity of traditional point cloud data and offers a significant reduction in the size of the subsequent machine learning architecture. The RGB channels are assigned the normalized values of range, elevation/azimuth, and reflection power level for each point. A forked CNN architecture uses this radar-to-image representation to predict the real-world positions of the skeletal joints in 3-D space. The proposed method was tested in a single-human scenario for four primary motions: (i) walking, (ii) swinging the left arm, (iii) swinging the right arm, and (iv) swinging both arms, to validate accurate predictions for motion in range, azimuth, and elevation. The detailed methodology, implementation, challenges, and validation results are presented.
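As a rough illustration of the radar-to-image idea described above, the following Python sketch packs a radar point cloud into two small three-channel images, one for the Range-Azimuth view and one for the Range-Elevation view. The abstract does not give the image size, the normalization bounds, or the pixel ordering, so the function name `points_to_images`, the 16x16 layout with one point per pixel, and the `max_range`, `max_angle`, and `max_power` parameters are all assumptions made for illustration and not the authors' implementation.

```python
import numpy as np


def points_to_images(points, side=16, max_range=10.0,
                     max_angle=np.pi / 2, max_power=1.0):
    """Pack radar points into two (side, side, 3) images.

    points: array of shape (N, 4) with columns
            (range_m, azimuth_rad, elevation_rad, power).
    All bounds and the one-point-per-pixel layout are hypothetical choices.
    """
    pts = np.asarray(points, dtype=np.float32).reshape(-1, 4)
    pts = pts[: side * side]  # keep at most one point per pixel
    n = pts.shape[0]

    # Normalize every quantity to [0, 1] so it can act as a color channel.
    rng = np.clip(pts[:, 0] / max_range, 0.0, 1.0)
    az = np.clip((pts[:, 1] + max_angle) / (2.0 * max_angle), 0.0, 1.0)
    el = np.clip((pts[:, 2] + max_angle) / (2.0 * max_angle), 0.0, 1.0)
    pw = np.clip(pts[:, 3] / max_power, 0.0, 1.0)

    # Unused pixels stay zero, so the image size is fixed regardless of N.
    range_azimuth = np.zeros((side * side, 3), dtype=np.float32)
    range_elevation = np.zeros((side * side, 3), dtype=np.float32)
    range_azimuth[:n] = np.stack([rng, az, pw], axis=1)    # R, G, B
    range_elevation[:n] = np.stack([rng, el, pw], axis=1)  # R, G, B

    return (range_azimuth.reshape(side, side, 3),
            range_elevation.reshape(side, side, 3))


# Example: a single point 5 m away, straight ahead, at half of max power.
ra_img, re_img = points_to_images([[5.0, 0.0, 0.0, 0.5]])
```

Because every frame yields images of the same small, fixed size, a network consuming them can be much smaller than one operating on a sparse voxel grid. In a forked design, the two images would each feed their own convolutional branch before the features are merged to regress the joint positions.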
