Learning Visual Odometry with a Convolutional Network

We present an approach to predicting velocity and direction changes from visual information ("visual odometry") using an end-to-end, deep learning-based architecture. The architecture uses a single type of computational module and learning rule to extract visual motion, depth, and finally odometry information from the raw data. Representations of depth and motion are extracted by detecting synchrony across time and stereo channels using network layers with multiplicative interactions. The extracted representations are turned into information about changes in velocity and direction using a convolutional neural network. Preliminary results show that the architecture is capable of learning the resulting mapping from video to egomotion.
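
Below is a minimal, hypothetical sketch of the pipeline described above, written in PyTorch. The layer widths, kernel sizes, the grayscale inputs, and the two-component output (change in velocity, change in direction) are illustrative assumptions; the abstract does not specify the exact architecture.

```python
# Hypothetical sketch only: all dimensions and hyperparameters are assumptions,
# not the paper's actual configuration.
import torch
import torch.nn as nn

class SynchronyLayer(nn.Module):
    """Detects synchrony between two inputs (consecutive frames for motion,
    a stereo pair for depth) via multiplicative interactions: both inputs are
    filtered, the filter responses are multiplied element-wise, and the
    products are pooled into mapping units."""
    def __init__(self, in_channels=1, num_factors=64, num_maps=32):
        super().__init__()
        self.fx = nn.Conv2d(in_channels, num_factors, kernel_size=9, stride=4, bias=False)
        self.fy = nn.Conv2d(in_channels, num_factors, kernel_size=9, stride=4, bias=False)
        self.pool = nn.Conv2d(num_factors, num_maps, kernel_size=1)

    def forward(self, x, y):
        # Element-wise product of filter responses encodes how the two
        # inputs are related (displacement over time, disparity over views).
        return torch.relu(self.pool(self.fx(x) * self.fy(y)))

class OdometryCNN(nn.Module):
    """Concatenates motion and depth synchrony features and maps them to
    changes in velocity and direction with a small convolutional head."""
    def __init__(self):
        super().__init__()
        self.motion = SynchronyLayer()
        self.depth = SynchronyLayer()
        self.head = nn.Sequential(
            nn.Conv2d(64, 64, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 2),  # assumed targets: [delta_velocity, delta_direction]
        )

    def forward(self, frame_t, frame_t1, left_t, right_t):
        m = self.motion(frame_t, frame_t1)   # temporal synchrony -> motion features
        d = self.depth(left_t, right_t)      # stereo synchrony -> depth features
        return self.head(torch.cat([m, d], dim=1))

# Usage with random grayscale inputs of an assumed 128x128 resolution:
# model = OdometryCNN()
# x = torch.randn(4, 1, 128, 128)
# out = model(x, x, x, x)   # shape (4, 2)
```

In this sketch, the element-wise product of filter responses stands in for the multiplicative interactions used to detect synchrony: agreement between consecutive frames yields motion features, agreement between the stereo pair yields depth features, and a small convolutional head regresses the egomotion targets from their concatenation.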
