Active 3D Object Localization Using a Humanoid Robot

We study the problem of actively searching for an object in a three-dimensional (3-D) environment with a visually guided humanoid robot with 26 degrees of freedom, subject to a maximum search time. The inherent intractability of the problem is discussed, and a greedy strategy for selecting the best next viewpoint is employed. We describe a target-probability updating scheme that approximates the optimal solution to the problem and makes the selection of the best next viewpoint efficient. We employ a hierarchical recognition architecture, inspired by human vision, that uses contextual cues to attend to the view-tuned units at the proper intrinsic scales and to actively control the coordinate frame of the robotic platform's sensor. This active control also lets us set the extrinsic image scale and obtain the proper sequence of pathognomonic views of the scene. The recognition model makes no particular assumptions about shape properties like texture and is trained by showing the object to the robot by hand. Our results demonstrate the feasibility of using state-of-the-art vision-based systems for efficient and reliable object localization in an indoor 3-D environment.
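The greedy viewpoint selection and target probability update described above can be illustrated with a minimal sketch. It is not the authors' implementation: the cell-based belief, the `View` record, the per-cell detection probabilities, and the "detection probability per unit cost" scoring rule are all assumptions made for illustration. The update shown is the standard Bayesian posterior after a view fails to detect the target.

```python
from dataclasses import dataclass


@dataclass
class View:
    """Hypothetical candidate viewpoint (illustrative, not from the paper)."""
    name: str
    cost: float    # assumed time to reach the view and run recognition
    detect: dict   # cell index -> assumed P(detect | target is in that cell)


def detection_prob(belief, view):
    """Probability that this view detects the target under the current belief."""
    return sum(belief[c] * d for c, d in view.detect.items())


def update_after_miss(belief, view):
    """Bayes update of the per-cell target probabilities after a failed detection."""
    miss = 1.0 - detection_prob(belief, view)
    if miss <= 1e-12:
        raise ValueError("a miss is impossible under this belief and view")
    return [p * (1.0 - view.detect.get(c, 0.0)) / miss
            for c, p in enumerate(belief)]


def greedy_search(belief, views, budget):
    """Repeatedly pick the affordable view with the best detection probability
    per unit cost, assuming every chosen view fails to find the target."""
    plan, remaining = [], budget
    while True:
        affordable = [v for v in views if v.cost <= remaining]
        if not affordable:
            break
        best = max(affordable, key=lambda v: detection_prob(belief, v) / v.cost)
        if detection_prob(belief, best) <= 0.0:
            break  # no remaining view can see any probability mass
        plan.append(best.name)
        remaining -= best.cost
        belief = update_after_miss(belief, best)
    return plan, belief


if __name__ == "__main__":
    prior = [0.5, 0.3, 0.2]  # assumed prior over three cells, sums to 1
    views = [View("A", 2.0, {0: 0.8}), View("B", 1.0, {1: 0.5, 2: 0.5})]
    plan, posterior = greedy_search(prior, views, budget=3.0)
    print(plan, posterior)
```

The sketch plans one step at a time rather than over whole view sequences, which is what makes it greedy; in a real search the loop would stop as soon as a view actually detects the target.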
