A KinFu-based approach for robot spatial attention and view planning

When a user and a robot share the same physical workspace, the robot may need to maintain an up-to-date 3D representation of the environment. Indeed, robot systems often need to reconstruct the parts of the environment where the user performs manipulation tasks. This paper proposes a spatial attention approach for a robot manipulator equipped with an eye-in-hand Kinect range sensor. Salient regions of the environment, where user manipulation actions are most likely to have occurred, are detected by applying a Gaussian Mixture Model based clustering algorithm to the user's hand trajectory, which is tracked by a motion capture sensor. The robot's attentional behavior is driven by a next-best view algorithm that computes the most promising range sensor viewpoints for observing the detected salient regions, where changes in the environment may have occurred. The environment representation is built upon the PCL KinFu Large Scale project, an open-source implementation of KinectFusion. KinFu has been modified to support the execution of the next-best view algorithm directly on the GPU and to properly manage voxel data. Experiments are reported that illustrate the proposed attention-based approach and show the effectiveness of GPU-based next-best view planning compared to the same algorithm executed on the CPU.

Highlights:
- A spatial attention approach is presented for a robot manipulator.
- Salient human actions are detected by hand motion tracking and GMM clustering.
- Approximate locations of salient user actions draw the robot's attention.
- Next-best views are computed on the GPU using KinFu Large Scale.
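To make the saliency-detection step concrete, here is a minimal sketch of GMM-based clustering of a tracked hand trajectory. It uses scikit-learn's GaussianMixture rather than the authors' implementation, and the component count, function name, and synthetic trajectory below are illustrative assumptions, not values from the paper.

```python
# Minimal sketch: cluster a recorded hand trajectory with a Gaussian
# Mixture Model to locate regions where manipulation likely occurred.
# `trajectory` stands in for (N, 3) hand positions from a motion
# capture stream; n_components is a hypothetical choice.
import numpy as np
from sklearn.mixture import GaussianMixture

def detect_salient_regions(trajectory: np.ndarray, n_components: int = 4):
    """Fit a GMM to 3D hand positions and return the fitted parameters.

    Each mixture mean approximates the center of a salient workspace
    region; the mixing weights indicate how much of the trajectory each
    region explains, and the covariances give the regions' spatial extent.
    """
    gmm = GaussianMixture(n_components=n_components, covariance_type="full")
    gmm.fit(trajectory)
    return gmm.means_, gmm.weights_, gmm.covariances_

# Example usage with synthetic data standing in for a tracked trajectory.
rng = np.random.default_rng(0)
trajectory = np.vstack([
    rng.normal(loc=(0.3, 0.0, 0.8), scale=0.03, size=(200, 3)),  # region A
    rng.normal(loc=(0.6, 0.2, 0.8), scale=0.03, size=(150, 3)),  # region B
])
means, weights, covariances = detect_salient_regions(trajectory, n_components=2)
```

The region centers returned here would serve as attention targets for the view planner; in practice the component count can also be selected by a model-selection criterion rather than fixed in advance.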

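The next-best view idea can likewise be illustrated with a small, hedged sketch: candidate sensor poses are scored by how many still-unknown voxels fall inside their frustum near a salient region, and the highest-scoring pose is selected. This plain-Python version ignores occlusion and runs on the CPU; in the paper the evaluation is executed on the GPU inside the modified KinFu pipeline, and all names below (UNKNOWN, score_view, next_best_view) are hypothetical.

```python
# Minimal CPU sketch of next-best-view scoring over a voxel grid.
import numpy as np

UNKNOWN = -1  # hypothetical label for voxels the sensor has never observed

def score_view(voxel_labels, voxel_centers, view_pos, view_dir,
               max_range=1.5, half_angle=np.radians(30.0)):
    """Count unknown voxels inside a cone approximating the sensor frustum."""
    offsets = voxel_centers - view_pos          # (M, 3) vectors to each voxel
    dists = np.linalg.norm(offsets, axis=1)     # (M,) distances
    in_range = (dists > 1e-6) & (dists < max_range)
    # Angle test against the (unit) viewing direction.
    cos_angles = (offsets @ view_dir) / np.maximum(dists, 1e-6)
    in_cone = cos_angles > np.cos(half_angle)
    # NOTE: a real implementation would ray-cast through the TSDF to
    # handle occlusions; this sketch treats the whole frustum as visible.
    visible = in_range & in_cone
    return int(np.count_nonzero(visible & (voxel_labels == UNKNOWN)))

def next_best_view(candidates, voxel_labels, voxel_centers):
    """Pick the (position, unit_direction) pair revealing the most unknowns.

    `candidates` is a list of (view_pos, view_dir) numpy-array pairs,
    e.g. poses sampled on a sphere around a detected salient region.
    """
    return max(candidates,
               key=lambda c: score_view(voxel_labels, voxel_centers, *c))
```

The per-candidate scoring loop is embarrassingly parallel over voxels, which is what makes the GPU formulation reported in the paper attractive compared to the CPU baseline.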