Unsupervised Learning of Optical Flow with Deep Feature Similarity

Deep unsupervised learning for optical flow has been proposed, where the loss measures image similarity with the warping function parameterized by estimated flow. The census transform, instead of image pixel values, is often used for the image similarity. In this work, rather than the handcrafted features i.e. census or pixel values, we propose to use deep self-supervised features with a novel similarity measure, which fuses multi-layer similarities. With the fused similarity, our network better learns flow by minimizing our proposed feature separation loss. The proposed method is a polarizing scheme, resulting in a more discriminative similarity map. In the process, the features are also updated to get high similarity for matching pairs and low for uncertain pairs, given estimated flow. We evaluate our method on FlyingChairs, MPI Sintel, and KITTI benchmarks. In quantitative and qualitative comparisons, our method effectively improves the state-of-the-art techniques.

[1]  Stefan Roth,et al.  Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Cordelia Schmid,et al.  DeepFlow: Large Displacement Optical Flow with Deep Matching , 2013, 2013 IEEE International Conference on Computer Vision.

[3]  Jia Xu,et al.  Accurate Optical Flow via Direct Cost Volume Processing , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Andreas Geiger,et al.  Deep Discrete Flow , 2016, ACCV.

[5]  Horst Bischof,et al.  Optical Flow Guided TV-L1 Video Interpolation and Restoration , 2011, EMMCVPR.

[6]  Maksims Volkovs,et al.  Guided Similarity Separation for Image Retrieval , 2019, NeurIPS.

[7]  Björn Ommer,et al.  Deep Semantic Feature Matching , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8]  Bolei Zhou,et al.  Deep Flow-Guided Video Inpainting , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Stefan Roth,et al.  UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss , 2017, AAAI.

[10]  Andrew Zisserman,et al.  Two-Stream Convolutional Networks for Action Recognition in Videos , 2014, NIPS.

[11]  Richard Szeliski,et al.  A Database and Evaluation Methodology for Optical Flow , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[12]  Michael J. Black,et al.  Supplementary Material for Unsupervised Learning of Multi-Frame Optical Flow with Occlusions , 2018 .

[13]  Yoshua Bengio,et al.  Semi-supervised Learning by Entropy Minimization , 2004, CAP.

[14]  Michael R. Lyu,et al.  SelFlow: Self-Supervised Learning of Optical Flow , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Andreas Geiger,et al.  Are we ready for autonomous driving? The KITTI vision benchmark suite , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[16]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[17]  Andreas Geiger,et al.  Object scene flow for autonomous vehicles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  Andrew Zisserman,et al.  Spatial Transformer Networks , 2015, NIPS.

[19]  Jan Kautz,et al.  PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[20]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[21]  Weilin Huang,et al.  ClothFlow: A Flow-Based Model for Clothed Person Generation , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[22]  Nikos Komodakis,et al.  Unsupervised Representation Learning by Predicting Image Rotations , 2018, ICLR.

[23]  Michael J. Black,et al.  Optical Flow Estimation Using a Spatial Pyramid Network , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Michael J. Black,et al.  A Naturalistic Open Source Movie for Optical Flow Evaluation , 2012, ECCV.

[25]  Thomas Brox,et al.  FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26]  Ramin Zabih,et al.  Non-parametric Local Transforms for Computing Visual Correspondence , 1994, ECCV.

[27]  Zhipeng Zhang,et al.  Deeper and Wider Siamese Networks for Real-Time Visual Tracking , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[28]  Luca Bertinetto,et al.  Fully-Convolutional Siamese Networks for Object Tracking , 2016, ECCV Workshops.

[29]  Cordelia Schmid,et al.  EpicFlow: Edge-preserving interpolation of correspondences for optical flow , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[30]  Ian D. Reid,et al.  Unsupervised Learning of Monocular Depth Estimation and Visual Odometry with Deep Feature Reconstruction , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[31]  Paolo Favaro,et al.  Unsupervised Learning of Visual Representations by Solving Jigsaw Puzzles , 2016, ECCV.

[32]  Bingbing Ni,et al.  Unsupervised Deep Learning for Optical Flow Estimation , 2017, AAAI.

[33]  Jianbing Shen,et al.  Triplet Loss in Siamese Network for Object Tracking , 2018, ECCV.

[34]  Thomas Brox,et al.  FlowNet: Learning Optical Flow with Convolutional Networks , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[35]  Thomas Brox,et al.  A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Didier Stricker,et al.  Flow Fields: Dense Correspondence Fields for Highly Accurate Large Displacement Optical Flow Estimation , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[37]  Yi Yang,et al.  Occlusion Aware Unsupervised Learning of Optical Flow , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[38]  Konstantinos G. Derpanis,et al.  Back to Basics: Unsupervised Learning of Optical Flow via Brightness Constancy and Motion Smoothness , 2016, ECCV Workshops.

[39]  Didier Stricker,et al.  Supplementary material of : CNN-based Patch Matching for Optical Flow with Thresholded Hinge Embedding Loss , 2017 .

[40]  Michael R. Lyu,et al.  DDFlow: Learning Optical Flow with Unlabeled Data Distillation , 2019, AAAI.

[41]  Andrew Zisserman,et al.  Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).