Six Degree-of-Freedom Localization of Endoscopic Capsule Robots using Recurrent Neural Networks embedded into a Convolutional Neural Network

Since its development, ingestible wireless capsule endoscopy has been considered a painless diagnostic method for detecting a number of diseases inside the GI tract. Medical engineering companies have made significant improvements to this technology over the last decade; however, some major limitations remain. Localization of the next-generation steerable endoscopic capsule robot in six degrees of freedom (6 DoF) and active motion control are among these limitations. Accurate localization is essential for correctly identifying the diseased area. This paper presents a robust 6-DoF localization method based on supervised training of an architecture consisting of recurrent neural networks (RNNs) embedded into a convolutional neural network (CNN), exploiting both the per-frame information extracted by the CNN and the correlative information across frames captured by the RNNs. To our knowledge, this is the first time the idea of embedding RNNs into a CNN architecture has been proposed in the literature. Experimental results show that the proposed RNN-in-CNN architecture performs very well for endoscopic capsule robot localization in the presence of reflection distortions, noise, sudden camera movements, and a lack of distinguishable features.
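To make the abstract's architecture description concrete, the following is a minimal sketch of a CNN feature extractor with an embedded LSTM regressing a 6-DoF pose per frame. It is written in PyTorch for illustration only; the layer sizes, the choice of LSTM cells, and the 6-component pose parameterization are assumptions and are not taken from the paper.

```python
import torch
import torch.nn as nn

class RNNinCNNPose(nn.Module):
    """Illustrative sketch: per-frame CNN features feed an embedded LSTM
    that regresses a 6-DoF pose (3 translation + 3 rotation components).
    All layer sizes are hypothetical, not the paper's configuration."""

    def __init__(self, hidden_size=256):
        super().__init__()
        # Per-frame ("just-in-moment") spatial features from a small CNN.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        # LSTM embedded after the convolutional stage models the
        # correlative information across consecutive frames.
        self.rnn = nn.LSTM(input_size=128, hidden_size=hidden_size,
                           num_layers=2, batch_first=True)
        # Regress translation (x, y, z) and rotation (e.g., Euler angles).
        self.pose_head = nn.Linear(hidden_size, 6)

    def forward(self, frames):
        # frames: (batch, seq_len, 3, H, W)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.view(b * t, c, h, w)).view(b, t, -1)
        seq_feats, _ = self.rnn(feats)
        return self.pose_head(seq_feats)  # (batch, seq_len, 6)


# Usage: predict poses for a batch of 2 clips of 8 endoscopic frames.
model = RNNinCNNPose()
poses = model(torch.randn(2, 8, 3, 128, 128))
print(poses.shape)  # torch.Size([2, 8, 6])
```

In such a supervised setting, the network would typically be trained against ground-truth poses with a regression loss (e.g., a weighted sum of translation and rotation errors), though the specific loss used in the paper is not stated in the abstract.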
