Large-scale public lidar and satellite image data set for urban semantic labeling

Automated semantic labeling of complex urban scenes in remotely sensed 2D and 3D data is one of the most challenging steps in producing realistic 3D scene models and maps. Recent large-scale public benchmark data sets and challenges for semantic labeling with 2D imagery have been instrumental in identifying state-of-the-art methods and enabling new research. 3D data from lidar and multi-view stereo have also been shown to provide valuable additional information that improves semantic labeling accuracy. In this work, we describe the development of a new large-scale data set combining public lidar and multi-view satellite imagery with pixel-level truth for ground labels and instance-level truth for building labels. We demonstrate the use of this data set to evaluate methods for ground and building labeling tasks, establishing performance expectations and identifying areas for improvement. We also discuss initial steps toward further leveraging this data set to enable machine learning for more complex semantic and instance segmentation and 3D reconstruction tasks. All software developed to produce this public data set and to enable metric scoring is also released as open source code.
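
As an illustration of the kind of pixel-level metric scoring described above, the sketch below computes per-class intersection-over-union (IoU), a standard measure for semantic labeling benchmarks. This is a minimal example assuming NumPy integer label arrays; the function name and the two-class ground/building example are illustrative only and do not reproduce the released scoring code.

```python
import numpy as np

def per_class_iou(pred, truth, num_classes):
    """Compute intersection-over-union for each class label.

    pred, truth: integer label arrays of the same shape.
    Returns a list of IoU values, one per class (NaN if the
    class appears in neither prediction nor truth).
    """
    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, truth == c).sum()
        union = np.logical_or(pred == c, truth == c).sum()
        ious.append(inter / union if union > 0 else float('nan'))
    return ious

# Example: score a toy 2-class (0 = ground, 1 = building) prediction.
truth = np.array([[0, 0, 1], [0, 1, 1]])
pred  = np.array([[0, 1, 1], [0, 1, 0]])
print(per_class_iou(pred, truth, num_classes=2))
```

Instance-level building scoring additionally requires matching predicted building instances to truth instances (e.g., by greedy IoU matching) before counting true and false positives.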