Object Recognition in 3D Point Clouds Using Web Data and Domain Adaptation

In recent years, object detection has become an increasingly active field of research in robotics. An important problem in object detection is the availability of a sufficient amount of labeled training data to learn good classifiers. In this paper we show how to significantly reduce the need for manually labeled training data by leveraging data sets available on the World Wide Web. Specifically, we show how to use objects from Google’s 3D Warehouse to train an object detection system for 3D point clouds collected by robots navigating through both urban and indoor environments. In order to deal with the different characteristics of the web data and the real robot data, we additionally use a small set of labeled point clouds and perform domain adaptation. Our experiments demonstrate that additional data taken from the 3D Warehouse along with our domain adaptation greatly improves the classification accuracy on real-world environments.

[1]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[2]  Bernhard E. Boser,et al.  A training algorithm for optimal margin classifiers , 1992, COLT '92.

[3]  Rich Caruana,et al.  Multitask Learning: A Knowledge-Based Source of Inductive Bias , 1993, ICML.

[4]  Jitendra Malik,et al.  Normalized cuts and image segmentation , 1997, Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[5]  Rebecca Hwa Supervised Grammar Induction using Training Data with Limited Constituent Information , 1999, ACL.

[6]  Andrew E. Johnson,et al.  Using Spin Images for Efficient Object Recognition in Cluttered 3D Scenes , 1999, IEEE Trans. Pattern Anal. Mach. Intell..

[7]  Y. Freund,et al.  Discussion of the Paper \additive Logistic Regression: a Statistical View of Boosting" By , 2000 .

[8]  Daniel Gildea,et al.  Corpus Variation and Parser Performance , 2001, EMNLP.

[9]  Dorin Comaniciu,et al.  Mean Shift: A Robust Approach Toward Feature Space Analysis , 2002, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Marcel Körtgen,et al.  3D Shape Matching with 3D Shape Contexts , 2003 .

[11]  Joachim Hertzberg,et al.  Automatic Classification of Objects in 3D Laser Range Scans , 2003 .

[12]  Brian Roark,et al.  Unsupervised language model adaptation , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[13]  Brian Roark,et al.  Supervised and unsupervised PCFG adaptation to novel domains , 2003, NAACL.

[14]  Marcin Novotni,et al.  3D zernike descriptors for content based shape retrieval , 2003, SM '03.

[15]  Marcel Körtgen,et al.  3 D Shape Matching with 3 D Shape Contexts , 2003 .

[16]  Rich Caruana,et al.  Multitask Learning , 1997, Machine Learning.

[17]  Thomas A. Funkhouser,et al.  The Princeton Shape Benchmark , 2004, Proceedings Shape Modeling Applications, 2004..

[18]  Alex Acero,et al.  Adaptation of Maximum Entropy Capitalizer: Little Data Can Help a Lo , 2006, Comput. Speech Lang..

[19]  Ben Taskar,et al.  Discriminative learning of Markov random fields for segmentation of 3D scan data , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[20]  Aaron C. Courville,et al.  Interacting Markov Random Fields for Simultaneous Terrain Modeling and Obstacle Detection , 2005, Robotics: Science and Systems.

[21]  Daniel Marcu,et al.  Domain Adaptation for Statistical Classifiers , 2006, J. Artif. Intell. Res..

[22]  S. Sathiya Keerthi,et al.  Building Support Vector Machines with Reduced Classifier Complexity , 2006, J. Mach. Learn. Res..

[23]  Antonio Torralba,et al.  LabelMe: A Database and Web-Based Tool for Image Annotation , 2008, International Journal of Computer Vision.

[24]  Alexei A. Efros,et al.  Improving Spatial Support for Objects via Multiple Segmentations , 2007, BMVC.

[25]  Alberto Del Bimbo,et al.  Content-Based Retrieval of 3-D Objects Using Spin Image Signatures , 2007, IEEE Transactions on Multimedia.

[26]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2007, SIGGRAPH 2007.

[27]  Qiang Yang,et al.  Boosting for transfer learning , 2007, ICML '07.

[28]  Hal Daumé,et al.  Frustratingly Easy Domain Adaptation , 2007, ACL.

[29]  ChengXiang Zhai,et al.  Instance Weighting for Domain Adaptation in NLP , 2007, ACL.

[30]  Fei-Fei Li,et al.  OPTIMOL: Automatic Online Picture Collection via Incremental Model Learning , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Wolfram Burgard,et al.  Instace-Based AMN Classification for Improved Object Recognition in 2D and 3D Laser Range Data , 2007, IJCAI.

[32]  Ashutosh Saxena,et al.  A Fast Data Collection and Augmentation Procedure for Object Recognition , 2008, AAAI.

[33]  Antonio Torralba,et al.  Ieee Transactions on Pattern Analysis and Machine Intelligence 1 80 Million Tiny Images: a Large Dataset for Non-parametric Object and Scene Recognition , 2022 .

[34]  Alexei A. Efros,et al.  Recognition by association via learning per-exemplar distances , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[35]  Ashutosh Saxena,et al.  Robotic Grasping of Novel Objects using Vision , 2008, Int. J. Robotics Res..

[36]  Dieter Fox,et al.  Laser and Vision Based Outdoor Object Mapping , 2008, Robotics: Science and Systems.

[37]  Alexei A. Efros,et al.  Scene completion using millions of photographs , 2008, Commun. ACM.

[38]  Paul Newman,et al.  Fast Probabilistic Labeling of City Maps , 2008, Robotics: Science and Systems.

[39]  Ian H. Witten,et al.  The WEKA data mining software: an update , 2009, SKDD.

[40]  Discriminative Learning , 2010, Encyclopedia of Machine Learning.

[41]  Kurt Konolige,et al.  Projected texture stereo , 2010, 2010 IEEE International Conference on Robotics and Automation.

[42]  Chih-Jen Lin,et al.  LIBSVM: A library for support vector machines , 2011, TIST.