Real-time multilevel sequencing of cataract surgery videos

Data recorded and stored during video-monitored surgeries are a relevant source of information for surgeons, especially during their training period. Today, however, these data remain virtually unexploited. In this paper, we propose to reuse videos recorded during cataract surgeries to automatically analyze the surgical process under a real-time constraint, with the aim of assisting the surgeon during the surgery. We propose to automatically recognize, in real time, what the surgeon is doing: which surgical phase or, more precisely, which surgical step he or she is performing. This recognition relies on inference in a multilevel statistical model that uses 1) the conditional relations between levels of description (steps and phases) and 2) the temporal relations among steps and among phases. The model accepts two types of inputs: 1) the presence of surgical instruments, manually provided by the surgeons, or 2) motion in videos, automatically analyzed through the content-based video retrieval (CBVR) paradigm. A dataset of 30 cataract surgery videos was collected at Brest University Hospital. The system was evaluated in terms of mean area under the ROC curve (Az). Promising results were obtained using either motion analysis (Az = 0.759) or the presence of surgical instruments (Az = 0.983).
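To make the idea of real-time, two-level recognition concrete, the following is a minimal sketch and not the model evaluated in this paper. It swaps in a plain hidden Markov model with forward filtering over steps, and derives the phase posterior by summing step probabilities through a fixed step-to-phase mapping. All names (`recognize_online`, `step_to_phase`) and all toy probabilities are assumptions made for illustration; the observation is a single discrete symbol standing in for "which instrument is present".

```python
from typing import Iterator, List, Sequence, Tuple


def predict(belief: Sequence[float], transition: Sequence[Sequence[float]]) -> List[float]:
    """Propagate the step belief one time index ahead through the transition matrix."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)]


def update(predicted: Sequence[float], likelihood: Sequence[float]) -> List[float]:
    """Weight the predicted belief by the observation likelihood and normalize."""
    unnorm = [p * l for p, l in zip(predicted, likelihood)]
    total = sum(unnorm)
    if total == 0.0:
        # Observation impossible under the toy model: keep the prediction.
        return list(predicted)
    return [u / total for u in unnorm]


def phase_posterior(step_post: Sequence[float], step_to_phase: Sequence[int],
                    n_phases: int) -> List[float]:
    """Second level: a phase's probability is the sum over the steps it contains."""
    post = [0.0] * n_phases
    for step, prob in enumerate(step_post):
        post[step_to_phase[step]] += prob
    return post


def argmax(values: Sequence[float]) -> int:
    return max(range(len(values)), key=lambda i: values[i])


def recognize_online(observations: Sequence[int], initial: Sequence[float],
                     transition: Sequence[Sequence[float]],
                     emission: Sequence[Sequence[float]],
                     step_to_phase: Sequence[int],
                     n_phases: int) -> Iterator[Tuple[int, int]]:
    """Yield (step, phase) after each observation, using past observations only."""
    belief = None
    for obs in observations:
        predicted = list(initial) if belief is None else predict(belief, transition)
        likelihood = [emission[s][obs] for s in range(len(predicted))]
        belief = update(predicted, likelihood)
        phases = phase_posterior(belief, step_to_phase, n_phases)
        yield argmax(belief), argmax(phases)


# Toy setup (hypothetical values): 3 steps, 2 phases, 3 instrument symbols.
initial = [1.0, 0.0, 0.0]
transition = [[0.8, 0.2, 0.0],   # steps progress left to right
              [0.0, 0.8, 0.2],
              [0.0, 0.0, 1.0]]
emission = [[0.90, 0.05, 0.05],  # emission[step][instrument symbol]
            [0.05, 0.90, 0.05],
            [0.05, 0.05, 0.90]]
step_to_phase = [0, 0, 1]        # steps 0 and 1 belong to phase 0, step 2 to phase 1
observations = [0, 0, 1, 1, 2]

results = list(recognize_online(observations, initial, transition,
                                emission, step_to_phase, n_phases=2))
```

Because each output depends only on observations seen so far, the loop can run while the video is being acquired. A model that also exploits temporal relations among phases, as described above, would need an explicit phase-level transition structure, which this sketch omits.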
