Cycle and Semantic Consistent Adversarial Domain Adaptation for Reducing Simulation-to-Real Domain Shift in LiDAR Bird's Eye View

The performance of LiDAR-based object detection methods depends heavily on the availability of training data, which is usually limited to specific laser devices. As a result, synthetic data is becoming popular for training neural network models, since both sensor specifications and driving scenarios can be generated ad hoc. However, bridging the gap between virtual and real environments remains an open challenge, as current simulators cannot completely mimic real LiDAR operation. To tackle this issue, domain adaptation strategies are usually applied; these obtain remarkable results on vehicle detection when applied to range view (RV) and bird's eye view (BEV) projections but fail for smaller road agents. In this paper, we present a BEV domain adaptation method based on CycleGAN that uses prior semantic classification to preserve the information of small objects of interest during the domain adaptation process. The quality of the generated BEVs has been evaluated using a state-of-the-art 3D object detection framework on the KITTI 3D Object Detection Benchmark. The obtained results show the advantages of the proposed method over the existing alternatives.
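
As an illustration of how a semantic term can complement the cycle-consistency objective of a CycleGAN-style translator, the following is a minimal sketch and not the method's actual formulation, which the abstract does not specify. It assumes BEVs are stored as NumPy arrays, that a per-cell class label map is available for the simulated BEV, and that some classifier outputs per-cell class probabilities for the translated BEV. The function names and the weights `lam_cyc` and `lam_sem` are hypothetical, and the adversarial term is omitted for brevity.

```python
import numpy as np


def cycle_consistency_loss(x, x_cycled):
    """Mean absolute error between a BEV and its sim -> real -> sim reconstruction."""
    return float(np.mean(np.abs(x - x_cycled)))


def semantic_consistency_loss(class_probs, labels, eps=1e-12):
    """Per-cell cross-entropy between class probabilities predicted on the
    translated BEV, shape (H, W, C), and the integer class labels of the
    source BEV, shape (H, W)."""
    h, w = labels.shape
    rows = np.arange(h)[:, None]
    cols = np.arange(w)[None, :]
    # Probability assigned to the source label at every cell.
    picked = class_probs[rows, cols, labels]
    return float(-np.mean(np.log(picked + eps)))


def generator_objective(x_sim, x_cycled, probs_translated, labels_sim,
                        lam_cyc=10.0, lam_sem=1.0):
    """Weighted sum of the two consistency terms (weights are assumptions)."""
    return (lam_cyc * cycle_consistency_loss(x_sim, x_cycled)
            + lam_sem * semantic_consistency_loss(probs_translated, labels_sim))


if __name__ == "__main__":
    labels = np.array([[0, 1], [1, 0]])   # toy 2x2 label map with two classes
    bev = np.zeros((2, 2))                # toy simulated BEV
    probs = np.eye(2)[labels]             # classifier fully agrees with labels
    print(generator_objective(bev, bev + 0.5, probs, labels))
```

In this sketch, the semantic term grows whenever translation changes the class predicted for a cell, which is one way to penalize a generator that erases small objects such as pedestrians or cyclists while still satisfying the cycle constraint.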
