Map-Guided Curriculum Domain Adaptation and Uncertainty-Aware Evaluation for Semantic Nighttime Image Segmentation

We address the problem of semantic nighttime image segmentation and improve the state of the art by adapting daytime models to nighttime without using nighttime annotations. Moreover, we design a new evaluation framework to address the substantial uncertainty of semantics in nighttime images. Our central contributions are: 1) a curriculum framework that gradually adapts semantic segmentation models from day to night through progressively darker times of day, exploiting cross-time-of-day correspondences between daytime images from a reference map and dark images to guide label inference in the dark domains; 2) a novel uncertainty-aware annotation and evaluation framework and metric for semantic segmentation, which includes image regions beyond human recognition capability in the evaluation in a principled fashion; 3) the Dark Zurich dataset, comprising 2416 unlabeled nighttime and 2920 unlabeled twilight images with correspondences to their daytime counterparts, plus a set of 201 nighttime images with fine pixel-level annotations created with our protocol, which serves as a first benchmark for our novel evaluation. Experiments show that our map-guided curriculum adaptation significantly outperforms state-of-the-art methods on nighttime sets, both for standard metrics and for our uncertainty-aware metric. Furthermore, our uncertainty-aware evaluation reveals that selective invalidation of predictions can improve results on data with ambiguous content, such as our benchmark, and can benefit safety-oriented applications involving invalid inputs.
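The idea of selective invalidation mentioned above can be illustrated with a minimal Python sketch. It is not the metric defined in the paper: the `INVALID` label id, the confidence threshold, the function names, and the exact counting rules (in particular, treating any class prediction on a ground-truth-invalid pixel as an error) are all assumptions made for illustration. Each ground-truth pixel is assumed to carry a class label plus a flag saying whether a human could recognize it.

```python
INVALID = -1  # hypothetical label id meaning "prediction withheld"


def invalidate_low_confidence(labels, confidences, threshold):
    """Replace predictions whose confidence is below `threshold` with INVALID."""
    return [lab if conf >= threshold else INVALID
            for lab, conf in zip(labels, confidences)]


def iou_with_invalid(pred, truth, valid, cls):
    """IoU-style score for class `cls` that also credits correct invalidation.

    Assumed counting rules (illustrative only):
      tp: valid pixel of class cls predicted as cls
      ti: invalid pixel of class cls predicted as INVALID (true invalid)
      fi: valid pixel of class cls predicted as INVALID (false invalid)
      fn: any other pixel of class cls
      fp: pixel of another class predicted as cls
    """
    tp = ti = fi = fn = fp = 0
    for p, t, v in zip(pred, truth, valid):
        if t == cls:
            if v and p == cls:
                tp += 1
            elif not v and p == INVALID:
                ti += 1
            elif v and p == INVALID:
                fi += 1
            else:
                fn += 1
        elif p == cls:
            fp += 1
    denom = tp + ti + fi + fn + fp
    # Return 0.0 when the class is absent from both prediction and truth.
    return (tp + ti) / denom if denom else 0.0


# Toy example: four pixels, the second one is beyond human recognition.
truth = [0, 0, 1, 1]
valid = [True, False, True, True]
raw_pred = [0, 0, 1, 1]
conf = [0.9, 0.3, 0.8, 0.2]

gated_pred = invalidate_low_confidence(raw_pred, conf, threshold=0.5)
score_raw = iou_with_invalid(raw_pred, truth, valid, cls=0)
score_gated = iou_with_invalid(gated_pred, truth, valid, cls=0)
```

In this toy case, withholding the low-confidence prediction on the unrecognizable pixel raises the class-0 score, while the class-1 score drops because a recognizable pixel is also invalidated. This is the trade-off that a confidence threshold controls under the assumed rules.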
