Fundamental Principles on Learning New Features for Effective Dense Matching

In dense matching (including stereo matching and optical flow), nearly all existing approaches compute matching costs from simple features such as gray or RGB color, gradients, or simple transformations like census. These features perform poorly in complex scenes that involve radiometric changes, noise, overexposure, and/or textureless regions. The resulting problems include mismatches at the pixel or region level, flattened or broken edges, and even the collapse of entire structures. In this paper, we propose two fundamental principles based on the consistency and the distinctiveness of features. We show that almost all existing problems in dense matching are caused by features that violate one or both of these principles. To systematically learn good features for dense matching, we formulate a general multi-objective optimization based on these two principles and apply convolutional neural networks to find new features that lie on the Pareto frontier. Using two-frame optical flow and stereo matching as applications, our experiments show that the learned features significantly improve the performance of state-of-the-art approaches. On the KITTI benchmarks, our method ranks first on the two stereo benchmarks and is the best among existing two-frame optical-flow algorithms on the flow benchmarks.
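
To make the notion of a Pareto frontier over two objectives concrete, the following is a minimal sketch and not the paper's training procedure. It assumes each candidate feature has already been scored by two hypothetical losses, both to be minimized: a consistency loss (how much the feature differs between truly corresponding pixels) and a distinctiveness loss (how similar the feature is between non-corresponding pixels). The function name `pareto_front` and the numeric scores are illustrative assumptions, not values from the paper.

```python
from typing import List, Tuple


def pareto_front(points: List[Tuple[float, float]]) -> List[int]:
    """Return indices of points that no other point dominates.

    Each point is (consistency_loss, distinctiveness_loss); both are
    minimized. Point j dominates point i if j is no worse in both
    objectives and strictly better in at least one.
    """
    front = []
    for i, (ci, di) in enumerate(points):
        dominated = False
        for j, (cj, dj) in enumerate(points):
            if j == i:
                continue
            if cj <= ci and dj <= di and (cj < ci or dj < di):
                dominated = True
                break
        if not dominated:
            front.append(i)
    return front


# Hypothetical scores for five candidate features (made-up numbers).
candidates = [
    (0.10, 0.90),  # very consistent, not distinctive
    (0.20, 0.40),  # balanced trade-off
    (0.30, 0.50),  # worse than the balanced one in both objectives
    (0.60, 0.10),  # very distinctive, less consistent
    (0.70, 0.70),  # poor in both objectives
]

front = pareto_front(candidates)
print(front)  # indices of the non-dominated candidates
```

Candidates on the frontier represent different trade-offs between the two principles: none of them can be improved in one objective without getting worse in the other.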
