Bus-Crowdedness Estimation by Shallow Convolutional Neural Network

Bus-crowdedness information is essential not only for bus companies arranging schedules but also for passengers planning their trips. This paper presents a very shallow Single-camera-based Convolutional Neural Network (SCNN) and a Multi-camera-based Convolutional Neural Network (MCNN) to estimate the crowdedness of a bus. The images from the front and back cameras are combined as input for the SCNN and MCNN models. To make the models robust, we propose several strategies for generating training samples, such as random cropping and random white balance. To minimize the models, we reduce the number of convolutional layers, reduce the size of the convolution kernels, and enlarge the stride of the max-pooling layer, instead of dropping the model. As a result, the forward computation of SCNN and MCNN is substantially reduced. The models are lightweight enough to run on an embedded system carried on the bus. Trace-based simulation demonstrates the viability of our design choices. Using the model to estimate the crowdedness level of the bus achieves an accuracy of 99.1%, and the estimation can be done in real time with high robustness.
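The two sample-generation strategies named above, random cropping and random white balance, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the crop size, the per-channel gain range `max_shift`, and the use of NumPy with 8-bit RGB images are all hypothetical choices that the abstract does not specify.

```python
import numpy as np


def random_crop(image, crop_h, crop_w, rng):
    """Return a randomly positioned crop of size (crop_h, crop_w).

    `image` is assumed to be an H x W x 3 uint8 array.
    """
    h, w = image.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop size must not exceed image size")
    # Pick the top-left corner uniformly among all valid positions.
    top = int(rng.integers(0, h - crop_h + 1))
    left = int(rng.integers(0, w - crop_w + 1))
    return image[top:top + crop_h, left:left + crop_w]


def random_white_balance(image, max_shift, rng):
    """Scale each colour channel by an independent random gain.

    Gains are drawn from [1 - max_shift, 1 + max_shift]; this range is
    an assumption made for illustration only.
    """
    gains = rng.uniform(1.0 - max_shift, 1.0 + max_shift, size=3)
    balanced = image.astype(np.float32) * gains
    # Clip back to the valid 8-bit range before converting.
    return np.clip(balanced, 0, 255).astype(np.uint8)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for a camera frame; a real pipeline would load an image.
    frame = np.full((120, 160, 3), 128, dtype=np.uint8)
    sample = random_white_balance(random_crop(frame, 96, 128, rng), 0.2, rng)
    print(sample.shape, sample.dtype)
```

Applying both transforms with fresh random draws to each training frame yields many slightly different samples from one recording, which is one plausible way such augmentation could improve robustness to camera position and lighting changes.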
