An Empirical Evaluation Study on the Training of SDC Features for Dense Pixel Matching

Training a deep neural network is a non-trivial task. Not only the tuning of hyperparameters, but also the gathering and selection of training data, the design of the loss function, and the construction of training schedules is important to get the most out of a model. In this study, we perform a set of experiments all related to these issues. The model for which different training strategies are investigated is the recently presented SDC descriptor network (stacked dilated convolution). It is used to describe images on pixel-level for dense matching tasks. Our work analyzes SDC in more detail, validates some best practices for training deep neural networks, and provides insights into training with multiple domain data.

[1]  Jan Kautz,et al.  Models Matter, So Does Training: An Empirical Study of CNNs for Optical Flow Estimation , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[2]  D. Scharstein,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001).

[3]  Didier Stricker,et al.  Supplementary material of : CNN-based Patch Matching for Optical Flow with Thresholded Hinge Embedding Loss , 2017 .

[4]  Didier Stricker,et al.  FlowFields++: Accurate Optical Flow Correspondences Meet Robust Interpolation , 2018, 2018 25th IEEE International Conference on Image Processing (ICIP).

[5]  Jan Kautz,et al.  PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[6]  Heiko Hirschmüller,et al.  Stereo Processing by Semiglobal Matching and Mutual Information , 2008, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Andreas Geiger,et al.  Object scene flow for autonomous vehicles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[9]  Surya Ganguli,et al.  Exact solutions to the nonlinear dynamics of learning in deep linear neural networks , 2013, ICLR.

[10]  Didier Stricker,et al.  SDC – Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Yoshua Bengio,et al.  Understanding the difficulty of training deep feedforward neural networks , 2010, AISTATS.

[12]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[13]  Thomas Brox,et al.  A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Daniel Cremers,et al.  What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation? , 2018, International Journal of Computer Vision.

[15]  Vincent Lepetit,et al.  DAISY: An Efficient Dense Descriptor Applied to Wide-Baseline Stereo , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Shaogang Gong,et al.  Class Rectification Hard Mining for Imbalanced Deep Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[17]  David G. Lowe,et al.  Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.

[18]  Didier Stricker,et al.  SceneFlowFields: Dense Interpolation of Sparse Scene Flow Correspondences , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[19]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[20]  Sepp Hochreiter,et al.  Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs) , 2015, ICLR.

[21]  Bernd Jähne,et al.  The HCI Benchmark Suite: Stereo and Flow Ground Truth with Uncertainties for Urban Autonomous Driving , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[22]  Richard Szeliski,et al.  A Database and Evaluation Methodology for Optical Flow , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[23]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[24]  Vincent Lepetit,et al.  BRIEF: Binary Robust Independent Elementary Features , 2010, ECCV.

[25]  Andreas Geiger,et al.  Efficient Large-Scale Stereo Matching , 2010, ACCV.

[26]  Michael J. Black,et al.  A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[27]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Yunsong Li,et al.  Efficient Coarse-to-Fine Patch Match for Large Displacement Optical Flow , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Torsten Sattler,et al.  A Multi-view Stereo Benchmark with High-Resolution Images and Multi-camera Videos , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).