Articulated Surgical Tool Detection Using Virtually-Rendered Templates

A. Reiter, P. K. Allen
Dept. of Computer Science, Columbia University
500 W. 120th Street, M.C. 0401, New York, NY 10027
Tel.: +1 212-939-7093, Fax: +1 212-666-0140
E-mail: areiter, allen@cs.columbia.edu

T. Zhao
Intuitive Surgical, Inc.
1266 Kifer Road, Bldg. 101, Sunnyvale, CA 94086
Tel.: +1 408-523-2100, Fax: +1 408-523-1390
E-mail: tao.zhao@intusurg.com

Purpose We propose a system capable of detecting articulated surgical instruments without the use of assistive markers or manual initialization.

Methods The algorithm recovers the 3D pose of the tool using a combination of online and offline learning techniques along with prior geometric knowledge of the tool. It uses live kinematic data from the robotic system to render nearby poses on-the-fly as virtual images and creates gradient orientation templates from them for fast matching against the real image. Prior appearance models of different material classes and projective invariance are used to reject false positives.

Results Results are verified using in-vivo data recorded from the da Vinci® robotic surgical system. The method detects tools at a high correctness rate, and a proposed pyramid search reduces the cost of the brute-force search from 23 s/frame to 3 s/frame.

Conclusion We have shown a top-down approach that detects surgical tools within in-vivo video sequences and determines their pose and articulation by learning on-the-fly from virtual renderings driven by real kinematic data.
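The matching step described under Methods can be illustrated with a small sketch. It is not the authors' implementation: it assumes a single grayscale template standing in for one virtual rendering, scores a window by the mean absolute cosine of the gradient orientation difference, and uses a two-level coarse-to-fine search as one plausible form of pyramid search. All function names and parameters (`gradient_orientation`, `match_score`, `pyramid_search`, `mag_thresh`, `radius`) are hypothetical.

```python
import numpy as np

def gradient_orientation(img, mag_thresh=1e-3):
    """Return per-pixel gradient orientation (radians) and a mask of strong gradients."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    return np.arctan2(gy, gx), mag > mag_thresh

def match_score(t_ori, t_mask, i_ori, i_mask, y, x):
    """Mean |cos| of the orientation difference over strong template pixels at offset (y, x)."""
    h, w = t_ori.shape
    win_ori = i_ori[y:y + h, x:x + w]
    win_mask = i_mask[y:y + h, x:x + w]
    n = t_mask.sum()
    if n == 0:
        return 0.0
    # |cos| makes the score insensitive to contrast polarity (assumed choice).
    sim = np.abs(np.cos(t_ori - win_ori)) * (t_mask & win_mask)
    return float(sim.sum() / n)

def search(template, image, y_range, x_range):
    """Exhaustively score the given offsets; return (best_score, y, x)."""
    t_ori, t_mask = gradient_orientation(template)
    i_ori, i_mask = gradient_orientation(image)
    h, w = template.shape
    H, W = image.shape
    best = (-1.0, 0, 0)
    for y in y_range:
        for x in x_range:
            # Skip offsets where the template would not fit inside the image.
            if 0 <= y <= H - h and 0 <= x <= W - w:
                s = match_score(t_ori, t_mask, i_ori, i_mask, y, x)
                if s > best[0]:
                    best = (s, y, x)
    return best

def downsample(img):
    """Halve the resolution by 2x2 block averaging (odd trailing rows/columns are cropped)."""
    H, W = img.shape
    H2, W2 = H // 2 * 2, W // 2 * 2
    return img[:H2, :W2].reshape(H2 // 2, 2, W2 // 2, 2).mean(axis=(1, 3))

def pyramid_search(template, image, radius=2):
    """Search every offset at half resolution, then refine near the best hit at full resolution."""
    ts, ims = downsample(template), downsample(image)
    _, cy, cx = search(ts, ims, range(ims.shape[0]), range(ims.shape[1]))
    ys = range(2 * cy - radius, 2 * cy + radius + 1)
    xs = range(2 * cx - radius, 2 * cx + radius + 1)
    return search(template, image, ys, xs)

if __name__ == "__main__":
    # Synthetic stand-in for a rendered tool template pasted into an empty frame.
    template = np.zeros((16, 16))
    template[4:12, 3:9] = 1.0
    template[6:10, 9:13] = 0.5
    frame = np.zeros((64, 64))
    frame[20:36, 30:46] = template
    print(pyramid_search(template, frame))
```

In this sketch the coarse level visits roughly a quarter of the offsets with a template a quarter of the size, and the fine level only checks a small neighborhood, which is the general reason a pyramid search is cheaper than a full-resolution brute-force scan. A detector along the lines of the abstract would repeat this over many templates rendered from poses near the kinematic estimate.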
