Bangkok CCTV Image through a Road Environment Extraction System Using Multi-Label Convolutional Neural Network Classification

Information about road conditions matters for driving safety. In Bangkok, public weather sensors such as weather stations and rain sensors are too sparse to provide this information. However, many CCTV cameras have recently been deployed in various places for surveillance and traffic monitoring. Instead of deploying new sensors designed specifically for monitoring road conditions, images and location information from these existing cameras can be used to obtain precise environmental information. We therefore propose a road environment extraction framework that covers several situations: raining and non-raining scenes, daylight and night-time scenes, crowded and non-crowded traffic, and wet and dry roads. The framework is based on CCTV images from a Bangkok metropolitan dataset provided by the Bangkok Metropolitan Administration. To extract information from CCTV image sequences, we treat the task as multi-label classification and apply a convolutional neural network. We compared various models, including transfer learning techniques, and developed new models to obtain the best results in terms of performance and efficiency. With dropout and batch normalization added, our model classified acceptably with only a few convolutional layers. Our evaluation showed a Hamming loss of 0.039 and an exact match ratio of 0.84. Finally, a road environment monitoring system was implemented to test the proposed framework.
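To make the two reported metrics concrete, the following is a minimal sketch of how Hamming loss and exact match ratio are commonly computed for multi-label predictions. The four-label encoding (rain, night, crowded, wet) and the toy arrays are assumptions chosen to mirror the situations listed above; they are not the paper's data or necessarily its label layout.

```python
import numpy as np

def hamming_loss(y_true, y_pred):
    """Fraction of individual label slots that are predicted incorrectly."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    return float(np.mean(y_true != y_pred))

def exact_match_ratio(y_true, y_pred):
    """Fraction of samples whose whole label vector is predicted correctly."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    return float(np.mean(np.all(y_true == y_pred, axis=1)))

# Hypothetical label order (an assumption): [rain, night, crowded, wet]
y_true = np.array([[1, 0, 1, 1],
                   [0, 0, 0, 0],
                   [0, 1, 0, 1],
                   [1, 1, 1, 1]])
y_pred = np.array([[1, 0, 1, 1],   # all four labels correct
                   [0, 0, 1, 0],   # "crowded" wrongly predicted
                   [0, 1, 0, 1],   # all four labels correct
                   [1, 1, 0, 1]])  # "crowded" missed

print(hamming_loss(y_true, y_pred))       # 2 wrong slots out of 16 -> 0.125
print(exact_match_ratio(y_true, y_pred))  # 2 fully correct rows out of 4 -> 0.5
```

Hamming loss rewards partially correct predictions, while exact match ratio counts an image as correct only when every label is right, which is why the two figures are usually reported together.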
