Approaching real-world navigation using object recognition network

Typical autonomous driving systems do not use object recognition for navigation. Instead, they often rely on hand-crafted features (e.g., lanes, traffic lights, intersections) that encode programmers' knowledge of the environment, and agents built this way are usually brittle in real-world tests. For an autonomous navigation system to generalize what it has learned to unfamiliar environments, however, it needs to recognize landmarks, which are a type of object. In this work we apply the Developmental Network (DN), which has been tested extensively on object recognition tasks, to a mobile agent and train it to navigate by itself in a controlled indoor environment. The proposed system uses Lobe Component Analysis (LCA) to learn features from both the stereo camera images and the desired navigation actions. Neurons attend to different areas of the input image and compete to fire according to how well they match the input. This enables the agent to attend to local image patches as landmarks without explicitly defined objects. Our analysis shows that attention can be corrected by direct supervision and by indirect reinforcement provided by the teacher. We anticipate that this work will be a starting point for research efforts that shift from expensive range-scanner-based methods to inexpensive camera-based methods, which use richer information but face the challenge of variation in object appearance.
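The competition among neurons described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a single grayscale image instead of a stereo pair, leaves out the top-down input from navigation actions, and replaces the LCA update with a plain incremental average applied to the winning neuron. The names `PatchNeuron`, `crop`, `unit`, and `compete` are hypothetical and chosen only for this illustration.

```python
import math
import random


def unit(v):
    """Return v scaled to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def crop(image, top, left, size):
    """Flatten the size x size patch of a 2-D list `image` starting at (top, left)."""
    return [image[r][c]
            for r in range(top, top + size)
            for c in range(left, left + size)]


class PatchNeuron:
    """Hypothetical neuron whose receptive field is one local image patch."""

    def __init__(self, top, left, size, rng):
        self.top, self.left, self.size = top, left, size
        # Assumption: weights start as a random unit vector.
        self.weights = unit([rng.random() for _ in range(size * size)])
        self.age = 0  # number of times this neuron has won and been updated

    def _patch(self, image):
        return unit(crop(image, self.top, self.left, self.size))

    def response(self, image):
        """Goodness of match: inner product of the weights and the normalized patch."""
        return sum(w * x for w, x in zip(self.weights, self._patch(image)))

    def update(self, image):
        """Move the weights toward the current patch with an incremental average."""
        self.age += 1
        keep = (self.age - 1) / self.age
        patch = self._patch(image)
        self.weights = [keep * w + (1.0 - keep) * x
                        for w, x in zip(self.weights, patch)]


def compete(neurons, image):
    """Top-1 competition: only the best-matching neuron fires and learns."""
    winner = max(range(len(neurons)), key=lambda i: neurons[i].response(image))
    neurons[winner].update(image)
    return winner
```

In this sketch each neuron sees only its own patch, so the index returned by `compete` indicates which image region best matched a learned feature. That is the sense in which a local patch can act as a landmark without any object being defined explicitly.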
