Fast and automatic object pose estimation for range images on the GPU

We present a method for estimating the pose of rigid objects from a single range image. Using 3D models of the objects, many pose hypotheses are compared in a data-parallel version of the downhill simplex algorithm with an image-based error function. The hypothesis with the lowest error value yields the pose estimate (location and orientation), which is then refined using ICP. The algorithm is designed specifically for implementation on the GPU. It is completely automatic, fast, robust to occlusion and cluttered scenes, and scales with the number of different object types. We apply the system to bin picking and evaluate it on cluttered scenes. Comprehensive experiments on challenging synthetic and real-world data demonstrate the effectiveness of our method.
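The hypothesis-and-minimize idea can be illustrated with a small CPU sketch. It is not the GPU implementation described above: it assumes a planar pose (tx, ty, theta), replaces the image-based error function with a toy mean squared distance between corresponding points, runs the hypotheses one after another instead of in parallel, and omits the ICP refinement. All function names are hypothetical.

```python
import math
import random

def transform(points, pose):
    """Apply a planar rigid transform (tx, ty, theta) to (x, y) points."""
    tx, ty, theta = pose
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def make_error(model, scene):
    """Toy stand-in for the image-based error: mean squared point distance,
    assuming model[i] corresponds to scene[i]."""
    def error(pose):
        moved = transform(model, pose)
        return sum((mx - sx) ** 2 + (my - sy) ** 2
                   for (mx, my), (sx, sy) in zip(moved, scene)) / len(model)
    return error

def downhill_simplex(f, start, step=0.5, iters=300):
    """Basic Nelder-Mead variant: reflect, expand, contract inward, shrink."""
    n = len(start)
    simplex = [list(start)]
    for i in range(n):
        vertex = list(start)
        vertex[i] += step
        simplex.append(vertex)
    for _ in range(iters):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        # Centroid of every vertex except the worst one.
        centroid = [sum(v[i] for v in simplex[:-1]) / n for i in range(n)]
        reflected = [c + (c - w) for c, w in zip(centroid, worst)]
        fr = f(reflected)
        if fr < f(best):
            expanded = [c + 2.0 * (c - w) for c, w in zip(centroid, worst)]
            simplex[-1] = expanded if f(expanded) < fr else reflected
        elif fr < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = [c + 0.5 * (w - c) for c, w in zip(centroid, worst)]
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, v)]
                    for v in simplex[1:]
                ]
    return min(simplex, key=f)

def estimate_pose(model, scene, n_hypotheses=16, seed=0):
    """Minimize from several random pose hypotheses; keep the lowest error."""
    rng = random.Random(seed)
    error = make_error(model, scene)
    starts = [
        (rng.uniform(-2, 2), rng.uniform(-2, 2), rng.uniform(-math.pi, math.pi))
        for _ in range(n_hypotheses)
    ]
    results = [downhill_simplex(error, s) for s in starts]
    best = min(results, key=error)
    return best, error(best)
```

Calling `estimate_pose(model, scene)` with a scene produced by `transform(model, true_pose)` returns the best pose found and its error. Each hypothesis is optimized independently of the others, which is the property that makes the search suitable for a data-parallel implementation.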
