SceneNN: A Scene Meshes Dataset with aNNotations

Several RGB-D datasets have been publicized over the past few years for facilitating research in computer vision and robotics. However, the lack of comprehensive and fine-grained annotation in these RGB-D datasets has posed challenges to their widespread usage. In this paper, we introduce SceneNN, an RGB-D scene dataset consisting of 100 scenes. All scenes are reconstructed into triangle meshes and have per-vertex and per-pixel annotation. We further enriched the dataset with fine-grained information such as axis-aligned bounding boxes, oriented bounding boxes, and object poses. We used the dataset as a benchmark to evaluate the state-of-the-art methods on relevant research problems such as intrinsic decomposition and shape completion. Our dataset and annotation tools are available at http://www.scenenn.net.

[1]  Bolei Zhou,et al.  Learning Deep Features for Scene Recognition using Places Database , 2014, NIPS.

[2]  Leonidas J. Guibas,et al.  Data-driven structural priors for shape completion , 2015, ACM Trans. Graph..

[3]  Derek Hoiem,et al.  Indoor Segmentation and Support Inference from RGBD Images , 2012, ECCV.

[4]  Daniel Cremers,et al.  Dense visual SLAM for RGB-D cameras , 2013, 2013 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[5]  Stefan Leutenegger,et al.  ElasticFusion: Dense SLAM Without A Pose Graph , 2015, Robotics: Science and Systems.

[6]  Stephen Lin,et al.  A Closed-Form Solution to Retinex with Nonlocal Texture Constraints , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[7]  Stella X. Yu,et al.  Direct Intrinsics: Learning Albedo-Shading Decomposition by Convolutional Regression , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[8]  Vladlen Koltun,et al.  Robust reconstruction of indoor scenes , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  John J. Leonard,et al.  Kintinuous: Spatially Extended KinectFusion , 2012, AAAI 2012.

[10]  Pat Hanrahan,et al.  Synthesizing open worlds with constraints using locally annealed reversible jump MCMC , 2012, ACM Trans. Graph..

[11]  Bernard Ghanem,et al.  Intrinsic Scene Decomposition from RGB-D Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[12]  Antonio Torralba,et al.  Recognizing indoor scenes , 2009, CVPR.

[13]  Ross B. Girshick,et al.  Fast R-CNN , 2015, 1504.08083.

[14]  Pat Hanrahan,et al.  Example-based synthesis of 3D object arrangements , 2012, ACM Trans. Graph..

[15]  Marco Attene,et al.  Polygon mesh repairing: An application perspective , 2013, CSUR.

[16]  Jianxiong Xiao,et al.  3D ShapeNets: A deep representation for volumetric shapes , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Marsette Vona,et al.  Moving Volume KinectFusion , 2012, BMVC.

[18]  Demetri Terzopoulos,et al.  The Clutterpalette: An Interactive Tool for Detailing Indoor Scenes , 2016, IEEE Transactions on Visualization and Computer Graphics.

[19]  Derek Hoiem,et al.  Completing 3D object shape from one depth image , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Jitendra Malik,et al.  Intrinsic Scene Properties from a Single RGB-D Image , 2013, CVPR.

[21]  Vladlen Koltun,et al.  Elastic Fragments for Dense Scene Reconstruction , 2013, 2013 IEEE International Conference on Computer Vision.

[22]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[23]  Vladlen Koltun,et al.  Dense scene reconstruction with points of interest , 2013, ACM Trans. Graph..

[24]  Michael Firman,et al.  RGBD Datasets: Past, Present and Future , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[25]  Qinping Zhao,et al.  Image2Scene: Transforming Style of 3D Room , 2015, ACM Multimedia.

[26]  Roberto Cipolla,et al.  SceneNet: Understanding Real World Indoor Scenes With Synthetic Data , 2015, ArXiv.

[27]  Matthias Nießner,et al.  Real-time 3D reconstruction at scale using voxel hashing , 2013, ACM Trans. Graph..

[28]  Andrew W. Fitzgibbon,et al.  KinectFusion: Real-time dense surface mapping and tracking , 2011, 2011 10th IEEE International Symposium on Mixed and Augmented Reality.

[29]  Chi-Keung Tang,et al.  Make it home: automatic optimization of furniture arrangement , 2011, ACM Trans. Graph..

[30]  Michael M. Kazhdan,et al.  Screened poisson surface reconstruction , 2013, TOGS.

[31]  Vladlen Koltun,et al.  Simultaneous Localization and Calibration: Self-Calibration of Consumer Depth Cameras , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[32]  Chuohao Yeo,et al.  Intrinsic Image Decomposition Using a Sparse Representation of Reflectance , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[33]  Vladlen Koltun,et al.  A Simple Model for Intrinsic Image Decomposition with Depth Cues , 2013, 2013 IEEE International Conference on Computer Vision.

[34]  Andrew W. Fitzgibbon,et al.  Large-scale and drift-free surface reconstruction using online subvolume registration , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[35]  Duc Thanh Nguyen,et al.  A Field Model for Repairing 3D Shapes , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Wolfram Burgard,et al.  A benchmark for the evaluation of RGB-D SLAM systems , 2012, 2012 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[37]  Alexei A. Efros,et al.  Learning Data-Driven Reflectance Priors for Intrinsic Image Decomposition , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[38]  Andrew Owens,et al.  SUN3D: A Database of Big Spaces Reconstructed Using SfM and Object Labels , 2013, 2013 IEEE International Conference on Computer Vision.

[39]  Noah Snavely,et al.  Intrinsic images in the wild , 2014, ACM Trans. Graph..

[40]  Andrew W. Fitzgibbon,et al.  Scene Coordinate Regression Forests for Camera Relocalization in RGB-D Images , 2013, 2013 IEEE Conference on Computer Vision and Pattern Recognition.

[41]  Duc Thanh Nguyen,et al.  A Robust 3D-2D Interactive Tool for Scene Segmentation and Annotation , 2016, IEEE Transactions on Visualization and Computer Graphics.

[42]  Daniel P. Huttenlocher,et al.  Efficient Graph-Based Image Segmentation , 2004, International Journal of Computer Vision.

[43]  Shi-Min Hu,et al.  Sketch2Scene: sketch-based co-retrieval and co-placement of 3D models , 2013, ACM Trans. Graph..

[44]  Jianxiong Xiao,et al.  SUN RGB-D: A RGB-D scene understanding benchmark suite , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Dieter Fox,et al.  Unsupervised feature learning for 3D scene labeling , 2014, 2014 IEEE International Conference on Robotics and Automation (ICRA).

[46]  Tao Ju,et al.  Robust repair of polygonal models , 2004, ACM Trans. Graph..