A Benchmark Data Set and Evaluation of Deep Learning Architectures for Ball Detection in the RoboCup SPL

This paper presents a benchmark data set for evaluating ball detection algorithms in the RoboCup Soccer Standard Platform League. We created a labelled data set of images with and without a ball, derived from vision log files recorded by multiple NAO robots under various lighting conditions. The data set contains 5209 labelled ball image regions and 10924 non-ball regions. All non-ball regions contain features that an existing ball detector had classified as potential ball candidates. The data set was used to train and evaluate 252 different Deep Convolutional Neural Network (CNN) architectures for ball detection. To control computational requirements, this evaluation focused on networks with 2–5 layers that could feasibly run in the vision and cognition cycle of a NAO robot using two cameras at full frame rate (2 × 30 Hz). The results show that the classification performance of the networks is quite insensitive to the details of the network design, including input image size, number of layers, and number of outputs at each layer. In an effort to reduce the computational requirements of CNNs, we evaluated XNOR-Net architectures, which quantize the weights and activations of a neural network to binary values. We examined XNOR-Nets corresponding to the real-valued CNNs we had already tested in order to quantify the effect on classification metrics. The results indicate that ball classification performance degrades by 12% on average when changing from a real-valued CNN to the corresponding XNOR-Net.
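The XNOR-Net quantization mentioned above approximates a real-valued weight tensor W by a scaled binary tensor: B = sign(W) together with a scaling factor alpha equal to the mean absolute value of W, so that W ≈ alpha · B. A minimal pure-Python sketch of this binarization step (the function name `binarize` and the flat-list representation are illustrative, not from the paper):

```python
def binarize(weights):
    """Approximate real-valued weights by alpha * B, as in XNOR-Net:
    B = sign(W) (with sign(0) taken as +1), alpha = mean(|W|).
    `weights` is a flat list of floats; returns (alpha, B)."""
    n = len(weights)
    # Optimal scaling factor: the L1 norm of W divided by the number of elements.
    alpha = sum(abs(w) for w in weights) / n
    # Binary weights: +1 or -1 per element.
    binary = [1.0 if w >= 0.0 else -1.0 for w in weights]
    return alpha, binary


# Example: a small weight vector and its binary approximation.
alpha, binary = binarize([0.5, -0.25, 0.75, -1.0])
# alpha = (0.5 + 0.25 + 0.75 + 1.0) / 4 = 0.625
# binary = [1.0, -1.0, 1.0, -1.0]
```

Because every weight collapses to ±1, the convolution's multiplications can be replaced by sign flips (and, when activations are also binarized, by XNOR and bit-count operations), which is the source of the speed-up — at the cost of the accuracy degradation measured in the paper.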
