An interactive framework for acquiring vision models of 3-D objects from 2-D images

This paper presents a human-computer interaction (HCI) framework for building vision models of three-dimensional (3-D) objects from their two-dimensional (2-D) images. Our framework is based on two guiding principles of HCI: 1) provide the human with as much visual assistance as possible to help the human make a correct input; and 2) verify each input provided by the human for its consistency with the inputs previously provided. For example, when stereo correspondence information is elicited from a human, the task is facilitated by superimposing epipolar lines on the images. Although this reduces the possibility of error in the human-marked correspondences, such errors are not entirely eliminated, because complex objects can present multiple candidate points close together. As another example, when pose-to-pose correspondence is sought from a human, the task is made easier by allowing the human to rotate the partial model constructed in the previous pose in relation to the partial model for the current pose. While this facility reduces the incidence of human-supplied pose-to-pose correspondence errors, such errors cannot be eliminated entirely, because multiple candidate features that lie close together can still be confused. Each input provided by the human is therefore checked against the previous inputs by invoking situation-specific constraints. Different types of constraints (and different human-computer interaction protocols) are needed for the extraction of polygonal features and for the extraction of curved features. We show results on both polygonal objects and objects containing curved features.
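As an illustration of the two principles applied to stereo correspondence, the following minimal sketch computes the epipolar line that would be superimposed on the second image and then checks a human-marked point against it. It is not the paper's implementation: the use of a fundamental matrix `F`, the function names, the rectified-pair example matrix, and the 2-pixel tolerance are all assumptions made for illustration.

```python
import math

def epipolar_line(F, point):
    """Return line coefficients (a, b, c) of l' = F * x in the second image.

    F is a 3x3 fundamental matrix (nested lists); point is (x, y) in the
    first image. The line satisfies a*x' + b*y' + c = 0.
    """
    x, y = point
    h = (x, y, 1.0)  # homogeneous coordinates of the clicked point
    return tuple(sum(F[r][c] * h[c] for c in range(3)) for r in range(3))

def distance_to_line(line, point):
    """Perpendicular distance, in pixels, from point (x, y) to the line."""
    a, b, c = line
    norm = math.hypot(a, b)
    if norm == 0.0:
        raise ValueError("degenerate epipolar line")
    return abs(a * point[0] + b * point[1] + c) / norm

def is_consistent(F, left_point, right_point, tol_px=2.0):
    """Accept a marked correspondence only if the point in the second
    image lies within tol_px of the epipolar line of the first point.
    tol_px is a hypothetical tolerance, not a value from the paper."""
    line = epipolar_line(F, left_point)
    return distance_to_line(line, right_point) <= tol_px

# Assumed example: fundamental matrix of an ideal rectified stereo pair,
# for which every epipolar line is the horizontal line y' = y.
F_RECTIFIED = [
    [0.0, 0.0, 0.0],
    [0.0, 0.0, -1.0],
    [0.0, 1.0, 0.0],
]

if __name__ == "__main__":
    clicked_left = (120.0, 80.0)
    print(epipolar_line(F_RECTIFIED, clicked_left))
    print(is_consistent(F_RECTIFIED, clicked_left, (95.0, 81.0)))
    print(is_consistent(F_RECTIFIED, clicked_left, (95.0, 90.0)))
```

A check of this kind rejects points that lie far from the epipolar line, but it cannot distinguish between several candidate points that all lie near the same line, which is the residual ambiguity the abstract describes and the reason further consistency constraints are needed.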