Synthetic training data for deep neural networks on visual correspondence tasks

In deep learning for computer vision, the best-performing models tend to be trained with supervision, i.e., on a training dataset whose ground-truth annotations the model is expected to match. Visual tasks are particularly interesting because humans rely on their eyes for almost everything they do; we attribute great importance to our visual perception of the world, and we have developed methods to produce visually realistic simulations of this world for entertainment, communication, and research. These same methods enable the creation of synthetic training data: rendered views of virtual worlds with annotations that are more extensive and accurate than anything a human could label with justifiable time and effort. In this thesis, we motivate and describe the making of large synthetic datasets for low-level correspondence matching problems. We used these datasets to train deep neural networks for the fundamental vision tasks of optical flow and stereo disparity estimation, achieving a new state of the art at the time of their publication. We further isolate the individual design components that make up an optical flow dataset and analyze how each contributes to the data's suitability for training. Finally, we use our results to create new datasets for specific real-world scenarios, thereby demonstrating that data engineering is a viable and practicable method for improving the performance of neural networks. Complementary to optimizations of the network itself, such as changes to architecture, loss function, or model capacity, data is a design dimension that can be varied even if the learning algorithm is a black box.
