MFR-CNN: Incorporating Multi-Scale Features and Global Information for Traffic Object Detection

Object detection plays an important role in intelligent transportation systems and intelligent vehicles. Although object detection has been studied for decades, accurately detecting objects in complex scenarios remains challenging. Contributing factors include diverse object and background appearance, motion blur, adverse weather conditions, and complex interactions among objects. In this paper, we propose a new convolutional neural network (CNN) model for traffic object detection that uses a multi-scale local and global feature representation (MFR). The proposed model consists of two components: a region proposal network that generates candidate object regions, and an object detection network, MFR-CNN, that incorporates multi-scale features and global information. The two components are jointly optimized. Once trained, the system can detect real-world traffic objects accurately, especially small and heavily occluded objects. We evaluate the proposed method on four benchmark datasets, achieving consistent improvements over the state of the art.
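The abstract does not specify how the multi-scale and global features are combined, so the following is only an illustrative sketch of the general idea and not the authors' implementation. It assumes that each candidate region is max-pooled from several feature maps of different strides, that each pooled vector is L2-normalized before concatenation, and that a global average-pooled vector from the coarsest map is appended as scene context. All function names, the pooling size, and the normalization choice are hypothetical.

```python
import math


def project_box(box, stride, height, width):
    """Map an image-space box (x0, y0, x1, y1) onto feature-map cells (assumed scheme)."""
    x0, y0, x1, y1 = box
    fx0 = min(int(x0 // stride), width - 1)
    fy0 = min(int(y0 // stride), height - 1)
    # Keep the projected box at least one cell wide and inside the map.
    fx1 = min(max(fx0 + 1, math.ceil(x1 / stride)), width)
    fy1 = min(max(fy0 + 1, math.ceil(y1 / stride)), height)
    return fx0, fy0, fx1, fy1


def roi_max_pool(fmap, box, out_size=2):
    """Max-pool a region of a C x H x W nested-list map into C * out_size**2 values."""
    x0, y0, x1, y1 = box
    pooled = []
    for channel in fmap:
        for by in range(out_size):
            ys = y0 + (y1 - y0) * by // out_size
            ye = max(ys + 1, y0 + (y1 - y0) * (by + 1) // out_size)
            for bx in range(out_size):
                xs = x0 + (x1 - x0) * bx // out_size
                xe = max(xs + 1, x0 + (x1 - x0) * (bx + 1) // out_size)
                pooled.append(max(channel[y][x]
                                  for y in range(ys, ye)
                                  for x in range(xs, xe)))
    return pooled


def global_avg_pool(fmap):
    """Average each channel over the whole map to summarize scene-level context."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]


def l2_normalize(vec):
    """Scale a vector to unit length so that features from different layers are comparable."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def mfr_descriptor(pyramid, box, out_size=2):
    """Concatenate per-scale region features with one global feature.

    `pyramid` is a list of (feature_map, stride) pairs ordered fine to coarse.
    """
    parts = []
    for fmap, stride in pyramid:
        height, width = len(fmap[0]), len(fmap[0][0])
        cells = project_box(box, stride, height, width)
        parts.extend(l2_normalize(roi_max_pool(fmap, cells, out_size)))
    coarsest_map = pyramid[-1][0]
    parts.extend(l2_normalize(global_avg_pool(coarsest_map)))
    return parts
```

In a sketch like this, the finer map preserves detail for small objects while the global vector gives the classifier context beyond the region itself; a real detector would learn these features end to end rather than pooling fixed maps.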
