Real-time retrieval of similar videos with application to computer-aided retinal surgery

This paper introduces ongoing research on computer-aided ophthalmic surgery and, in particular, presents a novel Content-Based Video Retrieval (CBVR) system. Given a video stream captured by a digital camera monitoring the surgery, the system retrieves, in real time, similar video subsequences from video archives. To retrieve semantically relevant videos, most existing CBVR systems rely on temporally flexible distance measures such as Dynamic Time Warping. These distance measures are slow and therefore do not allow real-time retrieval. In the proposed system, temporal flexibility is instead introduced in the way video subsequences are characterized, which allows the use of simple and fast distance measures. As a consequence, real-time retrieval of similar video subsequences among hundreds of thousands of examples becomes possible. The proposed system is also adaptive: a fast training procedure is presented. The system has been successfully applied to automated recognition of retinal surgery steps on a 69-video dataset: areas under the Receiver Operating Characteristic curves range from Az = 0.809 to Az = 0.989.
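
The idea of moving temporal flexibility out of the distance measure and into the characterization can be illustrated with a minimal sketch. It is not the characterization used in the paper, which the abstract does not detail. It assumes each frame is already reduced to a small feature vector, and it uses a hypothetical fixed-length signature (per-dimension mean and standard deviation over time) so that subsequences of different durations can be compared with a plain Euclidean distance. The names `signature` and `retrieve` are illustrative only.

```python
import math
from statistics import fmean, pstdev


def signature(frames):
    """Collapse a variable-length list of per-frame feature vectors into a
    fixed-length vector: per-dimension mean, then per-dimension std.
    (Assumed characterization, chosen only for illustration.)"""
    dims = list(zip(*frames))  # one tuple of values per feature dimension
    return [fmean(d) for d in dims] + [pstdev(d) for d in dims]


def retrieve(query_frames, archive_signatures, k=1):
    """Return the indices of the k archive signatures closest to the query,
    using a simple Euclidean distance between fixed-length signatures."""
    q = signature(query_frames)
    ranked = sorted(
        range(len(archive_signatures)),
        key=lambda i: math.dist(q, archive_signatures[i]),
    )
    return ranked[:k]


# Toy data: the same two-phase action at two different speeds,
# plus an unrelated subsequence.
slow = [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
fast = [[0.0, 1.0], [1.0, 0.0]]
other = [[5.0, 5.0], [5.0, 5.0], [5.0, 5.0]]

# Signatures of archived subsequences are computed once, offline.
archive = [signature(other), signature(slow)]
best = retrieve(fast, archive, k=1)
```

In this toy case the fast query matches the slow archived subsequence even though their lengths differ, and each comparison costs time linear in the signature length rather than the quadratic cost of a Dynamic Time Warping alignment. A real system would need a characterization that also preserves temporal order, which this sketch discards.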
