Real-Time Scene Understanding Using Deep Neural Networks for RoboCup SPL

Convolutional neural networks (CNNs) are the state-of-the-art method for most computer vision tasks. But, the deployment of CNNs on mobile or embedded platforms is challenging because of CNNs’ excessive computational requirements. We present an end-to-end neural network solution to scene understanding for robot soccer. We compose two key neural networks: one to perform semantic segmentation on an image, and another to propagate class labels between consecutive frames. We trained our networks on synthetic datasets and fine-tuned them on a set consisting of real images from a Nao robot. Furthermore, we investigate and evaluate several practical methods for increasing the efficiency and performance of our networks. Finally, we present RoboDNN, a C++ neural network library designed for fast inference on the Nao robots.

[1]  Nicolás Cruz,et al.  Using Convolutional Neural Networks in Robots with Limited Computational Resources: Detecting NAO Robots while Playing Soccer , 2017, RoboCup.

[2]  Gunnar Farnebäck,et al.  Two-Frame Motion Estimation Based on Polynomial Expansion , 2003, SCIA.

[3]  Garrison W. Cottrell,et al.  Understanding Convolution for Semantic Segmentation , 2017, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[4]  Jacky Baltes,et al.  Humanoid Robot Detection Using Deep Learning: A Speed-Accuracy Tradeoff , 2017, RoboCup.

[5]  Oliver Urbann,et al.  A Robust and Calibration-Free Vision System for Humanoid Soccer Robots , 2015, RoboCup.

[6]  Peter Stone,et al.  Fast and Precise Black and White Ball Detection for RoboCup Soccer , 2017, RoboCup.

[7]  Trevor Darrell,et al.  Fully Convolutional Networks for Semantic Segmentation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[8]  Wonyong Sung,et al.  Structured Pruning of Deep Convolutional Neural Networks , 2015, ACM J. Emerg. Technol. Comput. Syst..

[9]  Roberto Cipolla,et al.  SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[10]  Simon O'Keeffe,et al.  A Benchmark Data Set and Evaluation of Deep Learning Architectures for Ball Detection in the RoboCup SPL , 2017, RoboCup.

[11]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[12]  Visvanathan Ramesh,et al.  Large-Scale Stochastic Scene Generation and Semantic Annotation for Deep Convolutional Neural Network Training in the RoboCup SPL , 2017, RoboCup.

[13]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[14]  Sven Behnke,et al.  Learning Visual Obstacle Detection Using Color Histogram Features , 2011, RoboCup.

[15]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Yang Xu,et al.  Weakly supervised semantic segmentation with superpixel embedding , 2016, 2016 IEEE International Conference on Image Processing (ICIP).

[17]  Zhengqin Li,et al.  Superpixel segmentation using Linear Spectral Clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Yoshua Bengio,et al.  Gradient-based learning applied to document recognition , 1998, Proc. IEEE.

[19]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[20]  Timo Aila,et al.  Pruning Convolutional Neural Networks for Resource Efficient Inference , 2016, ICLR.