Paper Information - CLKN: Cascaded Lucas-Kanade Networks for Image Alignment

This paper proposes a data-driven approach for image alignment. Our main contribution is a novel network architecture that combines the strengths of convolutional neural networks (CNNs) and the Lucas-Kanade algorithm. The main component of this architecture is a Lucas-Kanade layer that performs the inverse compositional algorithm on convolutional feature maps. To train our network, we develop a cascaded feature learning method that incorporates the coarse-to-fine strategy into the training process. This method learns a pyramid representation of convolutional features in a cascaded manner and yields a cascaded network that performs coarse-to-fine alignment on the feature pyramids. We apply our model to the task of homography estimation, and perform training and evaluation on a large labeled dataset generated from the MS-COCO dataset. Experimental results show that the proposed approach significantly outperforms the methods it is compared against.
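The abstract gives no implementation details, so the following is only a minimal sketch of the inverse compositional Lucas-Kanade idea applied to multi-channel feature maps. It makes several simplifying assumptions that are not from the paper: the warp is a pure 2D translation rather than a homography, the "feature maps" are synthetic Gaussian blobs rather than learned CNN features, and there is a single resolution level instead of a cascade. All function names (`bilinear_sample`, `ic_lk_translation`, `gaussian_blob_features`) are hypothetical.

```python
import numpy as np


def bilinear_sample(feat, ys, xs):
    """Bilinearly sample feat (C, H, W) at float coordinates ys, xs of shape (h, w)."""
    _, height, width = feat.shape
    ys = np.clip(ys, 0, height - 1)
    xs = np.clip(xs, 0, width - 1)
    y0 = np.clip(np.floor(ys).astype(int), 0, height - 2)
    x0 = np.clip(np.floor(xs).astype(int), 0, width - 2)
    wy, wx = ys - y0, xs - x0
    top = feat[:, y0, x0] * (1 - wx) + feat[:, y0, x0 + 1] * wx
    bottom = feat[:, y0 + 1, x0] * (1 - wx) + feat[:, y0 + 1, x0 + 1] * wx
    return top * (1 - wy) + bottom * wy


def ic_lk_translation(template, image, margin=8, num_iters=50, tol=1e-6):
    """Estimate p = (px, py) such that image(x + p) matches template(x).

    Assumption: translation-only warp, evaluated on a central crop so that
    shifted sample positions stay inside the image for small shifts.
    """
    _, height, width = template.shape
    region = (slice(None),
              slice(margin, height - margin),
              slice(margin, width - margin))
    ys, xs = np.mgrid[margin:height - margin, margin:width - margin].astype(float)

    # Inverse compositional trick: the gradients and the Gauss-Newton Hessian
    # come from the template, so they are computed once, outside the loop.
    grad_y, grad_x = np.gradient(template, axis=(1, 2))
    jac = np.stack([grad_x[region].ravel(), grad_y[region].ravel()], axis=1)
    hessian_inv = np.linalg.inv(jac.T @ jac)
    target = template[region]

    p = np.zeros(2)
    for _ in range(num_iters):
        warped = bilinear_sample(image, ys + p[1], xs + p[0])
        residual = (warped - target).ravel()
        dp = hessian_inv @ (jac.T @ residual)
        # Compose the current warp with the inverse of the incremental warp;
        # for translations this reduces to a subtraction.
        p = p - dp
        if np.linalg.norm(dp) < tol:
            break
    return p


def gaussian_blob_features(size=64, shift=(0.0, 0.0)):
    """Synthetic 3-channel stand-in for a feature map, moved by shift = (sx, sy)."""
    ys, xs = np.mgrid[0:size, 0:size].astype(float)
    blobs = [(22.0, 26.0, 6.0), (40.0, 30.0, 8.0), (30.0, 42.0, 7.0)]  # (cy, cx, sigma)
    return np.stack([
        np.exp(-((ys - cy - shift[1]) ** 2 + (xs - cx - shift[0]) ** 2) / (2 * s ** 2))
        for cy, cx, s in blobs
    ])


if __name__ == "__main__":
    template = gaussian_blob_features()
    image = gaussian_blob_features(shift=(2.3, -1.7))
    print("estimated shift:", ic_lk_translation(template, image))
```

In this sketch the per-iteration cost is one warp of the input and one small matrix-vector product, which is the practical appeal of the inverse compositional formulation. Extending it toward the setting the abstract describes would require an 8-parameter homography warp with its Jacobian, and running the same loop from the coarsest to the finest level of a feature pyramid.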
