Adversarial Self-Supervised Scene Flow Estimation

This work proposes a metric learning approach for self-supervised scene flow estimation. Scene flow estimation is the task of estimating 3D flow vectors for consecutive 3D point clouds. Such flow vectors are fruitful, e.g. for recognizing actions, or avoiding collisions. Training a neural network via supervised learning for scene flow is impractical, as this requires manual annotations for each 3D point at each new timestamp for each scene. To that end, we seek for a self-supervised approach, where a network learns a latent metric to distinguish between points translated by flow estimations and the target point cloud. Our adversarial metric learning includes a multi-scale triplet loss on sequences of two-point clouds as well as a cycle consistency loss. Furthermore, we outline a benchmark for self-supervised scene flow estimation: the Scene Flow Sandbox. The benchmark consists of five datasets designed to study individual aspects of flow estimation in progressive order of complexity, from a moving object to real-world scenes. Experimental evaluation on the benchmark shows that our approach obtains state-of-the-art self-supervised scene flow results, outperforming recent neighbor-based approaches. We use our proposed benchmark to expose shortcomings and draw insights on various training setups. We find that our setup captures motion coherence and preserves local geometries. Dealing with occlusions, on the other hand, is still an open challenge.

[1]  Andreas Geiger,et al.  Object scene flow for autonomous vehicles , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Krystian Mikolajczyk,et al.  Learning local feature descriptors with triplets and shallow convolutional neural networks , 2016, BMVC.

[3]  Stefan Roth,et al.  Self-Supervised Monocular Scene Flow Estimation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Zhuwen Li,et al.  PointPWC-Net: A Coarse-to-Fine Network for Supervised and Self-Supervised Scene Flow Estimation on 3D Point Clouds , 2019, ArXiv.

[5]  Fuxin Li,et al.  PointConv: Deep Convolutional Networks on 3D Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Olivier D. Faugeras,et al.  Multi-View Stereo Reconstruction and Scene Flow Estimation with a Global Image-Based Matching Score , 2007, International Journal of Computer Vision.

[7]  Konrad Schindler,et al.  Piecewise Rigid Scene Flow , 2013, 2013 IEEE International Conference on Computer Vision.

[8]  Leonidas J. Guibas,et al.  FlowNet3D: Learning Scene Flow in 3D Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Yaser Sheikh,et al.  Photogeometric Scene Flow for High-Detail Dynamic 3D Reconstruction , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[10]  Thomas Brox,et al.  A Large Dataset to Train Convolutional Networks for Disparity, Optical Flow, and Scene Flow Estimation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[11]  Brian Okorn,et al.  Just Go With the Flow: Self-Supervised Scene Flow Estimation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Leonidas J. Guibas,et al.  PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Leonidas J. Guibas,et al.  ShapeNet: An Information-Rich 3D Model Repository , 2015, ArXiv.

[14]  Leonidas J. Guibas,et al.  PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space , 2017, NIPS.

[15]  Stefan Roth,et al.  UnFlow: Unsupervised Learning of Optical Flow with a Bidirectional Census Loss , 2017, AAAI.

[16]  Didier Stricker,et al.  PWOC-3D: Deep Occlusion-Aware End-to-End Scene Flow Estimation , 2019, 2019 IEEE Intelligent Vehicles Symposium (IV).

[17]  Alexandre Boulch,et al.  FLOT: Scene Flow on Point Clouds Guided by Optimal Transport , 2020, ECCV.

[18]  C. Qi,et al.  FlowNet3D: Learning Scene Flow in 3D Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Andreas Geiger,et al.  Bounding Boxes, Segmentations and Object Coordinates: How Important is Recognition for 3D Scene Flow Estimation in Autonomous Driving Scenarios? , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Wolfram Burgard,et al.  Rigid scene flow for 3D LiDAR scans , 2016, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[21]  Yong Jae Lee,et al.  HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Aseem Behl,et al.  PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[23]  Michael R. Lyu,et al.  SelFlow: Self-Supervised Learning of Optical Flow , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[24]  Qiang Xu,et al.  nuScenes: A Multimodal Dataset for Autonomous Driving , 2019, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[25]  Subhransu Maji,et al.  SPLATNet: Sparse Lattice Networks for Point Cloud Processing , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[26]  Ryan M. Eustice,et al.  A learning approach for real-time temporal scene flow estimation from LIDAR data , 2017, 2017 IEEE International Conference on Robotics and Automation (ICRA).

[27]  Julius Ziegler,et al.  Sparse scene flow segmentation for moving object detection in urban environments , 2011, 2011 IEEE Intelligent Vehicles Symposium (IV).

[28]  Pichao Wang,et al.  Scene Flow to Action Map: A New Representation for RGB-D Based Action Recognition with Convolutional Neural Networks , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).