Learning Optical Flow via Dilated Networks and Occlusion Reasoning

Despite significant recent progress in optical flow estimation, most methods, both classical and deep learning based, still struggle with multi-scale estimation, real-time computation, and/or occlusion reasoning. In this paper, we introduce dilated convolution and occlusion reasoning into unsupervised optical flow estimation to address these issues. Dilated convolution lets our network avoid upsampling via deconvolution and the gridding artifacts that result from it. It also yields a smaller memory footprint, which speeds up inference. Occlusion reasoning prevents our network from learning incorrect deformations caused by occluded image regions during training. Our proposed method outperforms state-of-the-art unsupervised approaches on the KITTI benchmark. We also demonstrate its generalization capability by applying it to action recognition in video.
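To make the role of dilated convolution concrete, the following is a minimal NumPy sketch of a 1D dilated convolution and of the receptive field of a stack of such layers. It is an illustration only, not the network described in this paper: the function names `dilated_conv1d` and `receptive_field`, the zero padding, and the dilation schedule in the usage note are all assumptions chosen for the example.

```python
import numpy as np


def dilated_conv1d(signal, kernel, dilation=1):
    """Same-size 1D cross-correlation whose kernel taps are `dilation` samples apart.

    Hypothetical helper for illustration; zero padding keeps the output
    at the input resolution, so no upsampling step is needed afterwards.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    span = dilation * (len(kernel) - 1)  # distance between first and last tap
    pad_left = span // 2
    pad_right = span - pad_left
    padded = np.pad(signal, (pad_left, pad_right))  # zero padding by default
    out = np.zeros_like(signal)
    for i, weight in enumerate(kernel):
        start = i * dilation
        out += weight * padded[start:start + len(signal)]
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field (in samples) of stacked stride-1 layers with the given dilations."""
    return 1 + sum(d * (kernel_size - 1) for d in dilations)


# An impulse shows which input positions a 3-tap kernel with dilation 2 touches.
impulse = np.zeros(9)
impulse[4] = 1.0
response = dilated_conv1d(impulse, [1.0, 1.0, 1.0], dilation=2)
```

In this sketch, `response` has the same length as `impulse` and is nonzero only at indices 2, 4, and 6, and `receptive_field(3, [1, 2, 4])` evaluates to 15. The point is that stacking layers with growing dilation widens the context seen by each output sample while the feature map stays at full resolution, which is why a deconvolution-based upsampling stage can be avoided.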
