Door recognition and deep learning algorithm for visual based robot navigation

In this paper, a new method based on deep learning for autonomous robot navigation is presented. Unlike most traditional methods based on fixed models, a convolutional neural network (CNN) modelling technique from deep learning, inspired by the working pattern of the biological brain, is selected to extract features. This neural network model learns multi-layer features through which ambient scenes can be recognized and useful information, such as the location of a door, can be identified. The extracted information can be used for robot navigation so that the robot can approach the target accurately. In the field experiments, tasks such as detecting doors and predicting door poses are designed in an indoor environment to verify the proposed method. The experimental results demonstrate that doors can be identified with good performance and that the deep learning model is suitable for robot navigation.
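
To make the idea of multi-layer CNN feature extraction concrete, the sketch below shows a minimal door / non-door image classifier. It is an illustrative assumption, not the architecture or code used in the paper: the framework (PyTorch), the 64x64 RGB input size, the layer widths, and the class count are all hypothetical choices.

```python
# Minimal, illustrative CNN for door / non-door classification.
# Assumptions: PyTorch, 64x64 RGB inputs, two classes. This is NOT the
# network described in the paper, only a sketch of the general technique.
import torch
import torch.nn as nn
import torch.nn.functional as F


class DoorNet(nn.Module):
    def __init__(self, num_classes: int = 2):
        super().__init__()
        # Convolutional stages: each learns progressively more abstract
        # features, analogous to the multi-layer features described above.
        self.conv1 = nn.Conv2d(3, 16, kernel_size=5, padding=2)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=5, padding=2)
        self.conv3 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2)            # halve spatial size after each stage
        self.dropout = nn.Dropout(p=0.5)       # dropout regularization against overfitting
        self.fc1 = nn.Linear(64 * 8 * 8, 128)  # 64x64 input -> 8x8 after 3 poolings
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.pool(F.relu(self.conv3(x)))
        x = torch.flatten(x, start_dim=1)
        x = self.dropout(F.relu(self.fc1(x)))
        return self.fc2(x)                     # logits for door vs. non-door


if __name__ == "__main__":
    model = DoorNet()
    dummy = torch.randn(1, 3, 64, 64)          # one 64x64 RGB frame
    print(model(dummy).shape)                  # torch.Size([1, 2])
```

In a navigation pipeline of this kind, the predicted class (and, with an extra regression head, an estimated door pose) would be fed to the motion controller so the robot can steer toward the detected door.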
