An Automatic Detection Algorithm of Metro Passenger Boarding and Alighting Based on Deep Learning and Optical Flow

Urban rail transit is an important application area of autonomous driving, and it requires an algorithm to monitor the exchange of passengers between metro carriages and stations. To this end, this article proposes a metro passenger detection (MPD) algorithm that tracks passengers getting on or off. The algorithm is composed of two MetroNexts and an optical flow algorithm. The two MetroNexts are built on a novel multiple scales attention convolution (MSAC) block and detect metro carriages and passengers quickly and accurately with a small model size. The optical flow algorithm predicts the direction of moving passengers, which helps the MPD algorithm filter out unrelated ones. By combining the coordinates and moving direction of passengers, the MPD algorithm determines whether passengers are boarding or alighting. Experiments on various benchmark data sets and a new metro station data set demonstrate that the MSAC block is effective in improving detection accuracy. Compared with existing state-of-the-art detection networks, the two MetroNexts achieve competitive results with good time/memory efficiency on pedestrian data sets. On the metro station data set, the MPD algorithm robustly recognizes target passengers, and its detection speed remains competitive even on an embedded platform, making it suitable as an intelligent instrument for metro stations.
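
The final decision step, combining passenger coordinates with moving direction, can be illustrated with a minimal sketch. It is not the authors' implementation: the `Box` type, the `classify_passenger` function, the thresholds, and the rule itself (compare the motion vector with the direction from the passenger to the carriage) are all assumptions made for illustration. The boxes are taken to come from the two detectors, and `flow` is taken to be the mean optical-flow vector inside the passenger box.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical type)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def center(self):
        return ((self.x1 + self.x2) / 2.0, (self.y1 + self.y2) / 2.0)


def classify_passenger(passenger, carriage, flow,
                       min_speed=1.0, max_distance=200.0, min_cos=0.5):
    """Label a passenger as 'boarding', 'alighting', or 'unrelated'.

    All thresholds are illustrative assumptions, not values from the paper.
    flow is an (fx, fy) motion vector in pixels per frame.
    """
    px, py = passenger.center()
    cx, cy = carriage.center()
    dx, dy = cx - px, cy - py          # direction from passenger to carriage
    distance = math.hypot(dx, dy)
    speed = math.hypot(flow[0], flow[1])

    # Filter out passengers who are nearly static or far from the carriage.
    if speed < min_speed or distance > max_distance or distance == 0.0:
        return "unrelated"

    # Cosine of the angle between the motion and the passenger-to-carriage direction.
    cos_angle = (flow[0] * dx + flow[1] * dy) / (speed * distance)
    if cos_angle >= min_cos:
        return "boarding"      # moving toward the carriage
    if cos_angle <= -min_cos:
        return "alighting"     # moving away from the carriage
    return "unrelated"         # moving roughly parallel to the carriage


carriage = Box(0, 0, 100, 400)
passenger = Box(150, 180, 190, 260)
print(classify_passenger(passenger, carriage, (-3.0, 0.0)))
print(classify_passenger(passenger, carriage, (3.0, 0.0)))
```

In this sketch a passenger who is nearly static, far from the carriage, or moving roughly parallel to it is treated as unrelated, which mirrors the filtering role the abstract assigns to the optical flow step.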
