Vision‐based and marker‐less surgical tool detection and tracking: a review of the literature

In recent years, tremendous progress has been made in surgical practice, for example with Minimally Invasive Surgery (MIS). To overcome the challenges of indirect eye‐to‐hand manipulation, robotic and computer‐assisted systems have been developed. Real‐time knowledge of the pose of surgical tools with respect to the surgical camera and the underlying anatomy is a key ingredient of such systems. In this paper, we present a review of the literature dealing with vision‐based and marker‐less surgical tool detection. The paper makes three primary contributions: (1) identification and analysis of the data‐sets used for developing and testing detection algorithms, (2) an in‐depth comparison of surgical tool detection methods, from the feature extraction process to the model learning strategy, highlighting existing shortcomings, and (3) an analysis of the validation techniques employed to obtain detection performance results and to compare surgical tool detectors. The papers included in the review were selected through PubMed and Google Scholar searches using the keywords “surgical tool detection”, “surgical tool tracking”, “surgical instrument detection” and “surgical instrument tracking”, limiting results to the years 2000 to 2015. Our study shows that, despite significant progress over the years, the lack of established surgical tool data‐sets and of a reference format for performance assessment and method ranking is preventing faster improvement.

Highlights:
• In‐depth state‐of‐the‐art review of surgical tool detection from 24 recent papers.
• Lack of a standard format regarding datasets employed and corresponding annotations.
• Comprehensive highlighting of advantages and disadvantages of existing methods.
• Limited consensus on a common reference format within validation methodologies.
