Devon: Deformable Volume Network for Learning Optical Flow

State-of-the-art neural network models estimate large displacement optical flow in multi-resolution and use warping to propagate the estimation between two resolutions. Despite their impressive results, it is known that there are two problems with the approach. First, the multi-resolution estimation of optical flow fails in situations where small objects move fast. Second, warping creates artifacts when occlusion or dis-occlusion happens. In this paper, we propose a new neural network module, Deformable Cost Volume, which alleviates the two problems. Based on this module, we designed the Deformable Volume Network (Devon) which can estimate multi-scale optical flow in a single high resolution. Experiments show Devon is more suitable in handling small objects moving fast and achieves comparable results to the state-of-the-art methods in public benchmarks.

[1]  Yi Li,et al.  Deformable Convolutional Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[2]  Martial Hebert,et al.  Learning to Extract Motion from Videos in Convolutional Neural Networks , 2016, ACCV.

[3]  Michael J. Black,et al.  Optical Flow Estimation Using a Spatial Pyramid Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  C. Bregler,et al.  Large displacement optical flow , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[5]  Cordelia Schmid,et al.  Action recognition by dense trajectories , 2011, CVPR 2011.

[6]  Xiaoou Tang,et al.  LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[7]  Patrick Bouthemy,et al.  Optical flow modeling and computation: A survey , 2015, Comput. Vis. Image Underst..

[8]  Stefan Roth,et al.  UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss , 2017, AAAI.

[9]  Thomas Brox,et al.  A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  D. Scharstein,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, Proceedings IEEE Workshop on Stereo and Multi-Baseline Vision (SMBV 2001).

[11]  Bingbing Ni,et al.  Unsupervised Deep Learning for Optical Flow Estimation , 2017, AAAI.

[12]  Takeo Kanade,et al.  An Iterative Image Registration Technique with an Application to Stereo Vision , 1981, IJCAI.

[13]  Michael J. Black,et al.  Supplementary Material for Unsupervised Learning of Multi-Frame Optical Flow with Occlusions , 2018 .

[14]  Steven S. Beauchemin,et al.  The computation of optical flow , 1995, CSUR.

[15]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[16]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[17]  Michael R. Lyu,et al.  SelFlow: Self-Supervised Learning of Optical Flow , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Ce Liu,et al.  Deformable Spatial Pyramid Matching for Fast Dense Correspondences , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[19]  Jiri Matas,et al.  Continual Occlusion and Optical Flow Estimation , 2018, ACCV.

[20]  Yao Lu,et al.  Learning Image Relations with Contrast Association Networks , 2017, 2019 International Joint Conference on Neural Networks (IJCNN).

[21]  Thomas Brox,et al.  Occlusions, Motion and Depth Boundaries with a Generic Network for Disparity, Optical Flow or Scene Flow Estimation , 2018, ECCV.

[22]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Daniel Cremers,et al.  What Makes Good Synthetic Training Data for Learning Disparity and Optical Flow Estimation? , 2018, International Journal of Computer Vision.

[24]  Nikos Paragios,et al.  Handbook of Mathematical Models in Computer Vision , 2005 .

[25]  Sang Chul Ahn,et al.  Generalized Deformable Spatial Pyramid: Geometry-preserving dense correspondence estimation , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  P. Cochat,et al.  Et al , 2008, Archives de pediatrie : organe officiel de la Societe francaise de pediatrie.

[27]  Jan Kautz,et al.  PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[30]  Vladlen Koltun,et al.  Multi-Scale Context Aggregation by Dilated Convolutions , 2015, ICLR.

[31]  Berthold K. P. Horn,et al.  Determining Optical Flow , 1981, Other Conferences.

[32]  Thomas Brox,et al.  DeMoN: Depth and Motion Network for Learning Monocular Stereo , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33]  Stefano Soatto,et al.  S2F: Slow-to-Fast Interpolator Flow , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[34]  Hans-Hellmut Nagel,et al.  Combination of Edge Element and Optical Flow Estimates for 3D-Model-Based Vehicle Tracking in Traffic Image Sequences , 1999, International Journal of Computer Vision.

[35]  Jian Sun,et al.  Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[36]  P ? ? ? ? ? ? ? % ? ? ? ? , 1991 .

[37]  Thomas Brox,et al.  U-Net: Convolutional Networks for Biomedical Image Segmentation , 2015, MICCAI.

[38]  Stefan Roth,et al.  Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[39]  Jia Xu,et al.  Accurate Optical Flow via Direct Cost Volume Processing , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[40]  Michael J. Black,et al.  A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[41]  Richard Szeliski,et al.  A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms , 2001, International Journal of Computer Vision.

[42]  Michael J. Black,et al.  Optical Flow Estimation with Channel Constancy , 2014, ECCV.

[43]  Michael J. Black,et al.  On the Spatial Statistics of Optical Flow , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[44]  Thomas Brox,et al.  High Accuracy Optical Flow Estimation Based on a Theory for Warping , 2004, ECCV.

[45]  Yi Yang,et al.  Occlusion Aware Unsupervised Learning of Optical Flow , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46]  Yann LeCun,et al.  Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches , 2015, J. Mach. Learn. Res..

[47]  Iasonas Kokkinos,et al.  DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs , 2016, IEEE Transactions on Pattern Analysis and Machine Intelligence.