An analysis of depth estimation within interaction range

Interactions between humans or humanoids and their environment, through tasks like grasping or manipulation, typically require accurate depth information. The human vision system integrates various monocular and binocular depth estimation mechanisms to achieve robust and reliable depth perception. Such an integrated approach can be applied to humanoid depth perception, but integration requires knowledge of the characteristics of the methods being combined. We statistically examined three methods that incorporate active vision (stereo disparity, vergence, and familiar size) and investigated combinations of these methods based on that examination. We found evidence that, within interaction range, active vision provides better depth estimates than the standard static-parallel stereo methods examined, and is therefore better suited to tasks like reaching, grasping, and manipulation. We also demonstrate that a combination of methods has the potential to increase the accuracy of the estimates.
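
To make the three cues and their combination concrete, the following is a minimal sketch under textbook pinhole-camera assumptions. It is not the implementation evaluated in this work: the function names, the symmetric-fixation geometry for vergence, and the inverse-variance weighting used for fusion are all illustrative assumptions, the last being one common way to combine estimates whose error variances have been measured.

```python
import math


def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Parallel-axis stereo (assumed model): Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_from_vergence(baseline_m, vergence_rad):
    """Symmetric fixation (assumed model): Z = (B / 2) / tan(theta / 2),
    where theta is the full angle between the two optical axes."""
    if vergence_rad <= 0:
        raise ValueError("vergence angle must be positive")
    return (baseline_m / 2.0) / math.tan(vergence_rad / 2.0)


def depth_from_familiar_size(focal_px, true_size_m, image_size_px):
    """Familiar size (assumed model): Z = f * S / s for an object of known size S."""
    if image_size_px <= 0:
        raise ValueError("image size must be positive")
    return focal_px * true_size_m / image_size_px


def fuse_inverse_variance(estimates):
    """Combine (depth, variance) pairs by inverse-variance weighting.

    This is an assumed fusion rule, not necessarily the one used in the paper.
    Returns the fused depth and the variance of the fused estimate.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    depth = sum(w * z for w, (z, _) in zip(weights, estimates)) / total
    return depth, 1.0 / total
```

Under these assumptions, a cue with a smaller measured variance at a given distance contributes more to the fused estimate, which is why the statistical characteristics of each method must be known before combining them.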
