Instrument detection and pose estimation with rigid part mixtures model in video‐assisted surgeries

HIGHLIGHTS

- A method for 2D pose estimation of multiple non‐rigid and robotic instruments is proposed.
- A rigidly structured model expresses deformations of surgical instruments by encoding diverse, pose‐specific appearance mixtures.
- Extensive diagnostic experiments show that feature regularization significantly improves learning articulated tool models from videos.
- The method successfully leverages the shaft part to improve estimation of end‐effector articulation.
- The algorithm obtains state‐of‐the‐art results on publicly available datasets of non‐rigid tools.

ABSTRACT

Localizing instrument parts in video‐assisted surgeries is an attractive and open computer vision problem. A working algorithm would immediately find applications in computer‐aided interventions in the operating theater. Knowing the location of tool parts could help virtually augment the visual faculty of surgeons, assess the skills of novice surgeons, and increase the autonomy of surgical robots. A surgical tool varies in appearance due to articulation, viewpoint changes, and noise. We introduce a new method for detection and pose estimation of multiple non‐rigid and robotic tools in surgical videos. The method uses a rigidly structured, bipartite model of end‐effector and shaft parts that consistently encodes diverse, pose‐specific appearance mixtures of the tool. This rigid part mixtures model then jointly explains the evolving tool structure by switching between mixture components. Rigidly capturing end‐effector appearance allows explicit transfer of keypoint meta‐data of the detected components for full 2D pose estimation. The detector can also delineate a precise skeleton of the end‐effector by transferring additional keypoints. To this end, we propose an effective procedure for learning such rigid mixtures from videos and for pooling the modeled shaft part, which undergoes frequent truncation at the border of the imaged scene. Notably, extensive diagnostic experiments show that feature regularization is key to fine‐tuning the model in the presence of inherent appearance bias in videos. Experiments further show that estimation of end‐effector pose improves when the shaft part is included in the model. We then evaluate our approach on publicly available datasets of in‐vivo sequences of non‐rigid tools and demonstrate state‐of‐the‐art results.
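To make the detection‐and‐transfer idea concrete, the sketch below gives one plausible reading of the rigid part mixtures model: each pose‐specific mixture component holds rigid templates for the end‐effector and shaft parts together with keypoint meta‐data; detection scores every component at a window, and the winning component's keypoints are transferred for 2D pose estimation. The names (MixtureComponent, score_window, detect_pose), the feature shapes, and the linear scoring are illustrative assumptions, not the authors' implementation.

import numpy as np
from dataclasses import dataclass

@dataclass
class MixtureComponent:
    # Rigid template (e.g., gradient-histogram filter weights) for the end-effector part.
    end_effector_template: np.ndarray
    # Rigid template for the shaft part, scored jointly with the end-effector.
    shaft_template: np.ndarray
    # (K, 2) keypoint positions stored relative to the window origin (the meta-data to transfer).
    keypoint_offsets: np.ndarray
    # Per-component bias learned alongside the filters.
    bias: float

def score_window(component, ee_features, shaft_features):
    # Linear score of one pose-specific component on pre-extracted window features.
    ee_score = float(np.vdot(component.end_effector_template, ee_features))
    shaft_score = float(np.vdot(component.shaft_template, shaft_features))
    return ee_score + shaft_score + component.bias

def detect_pose(components, ee_features, shaft_features, window_origin):
    # Switch between mixture components by taking the best-scoring one,
    # then transfer its keypoint meta-data into image coordinates.
    scores = [score_window(c, ee_features, shaft_features) for c in components]
    best = int(np.argmax(scores))
    keypoints = components[best].keypoint_offsets + np.asarray(window_origin, dtype=float)
    return scores[best], best, keypoints

In this reading, detect_pose(components, ee_feat, shaft_feat, (x0, y0)) would return the best score, the index of the winning mixture component (which identifies the pose‐specific appearance), and the transferred keypoints in image coordinates; pooling the shaft part under truncation and feature regularization during learning are left out of the sketch.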
