Appearance-Based Gaze Estimator for Natural Interaction Control of Surgical Robots

Robots play an increasingly important role in modern surgery. However, conventional human–computer interaction methods, such as joystick and voice control, have shortcomings, and medical personnel must be specifically trained to operate the robot. We propose a human–computer interaction model based on eye movement that allows medical staff to control the robot conveniently with their gaze. Our algorithm requires only an RGB camera, with no need for expensive dedicated eye-tracking hardware. Two eye-control modes are designed in this paper. The first is pick-and-place movement, in which the user's gaze specifies the point to which the robotic arm should move. The second is user-command movement, in which the user's gaze selects the direction in which the robot should move. The experimental results demonstrate the feasibility and convenience of both modes.
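
To make the two modes concrete, below is a minimal Python sketch of how an estimated on-screen gaze point could drive both behaviours. Everything here is an illustrative assumption rather than the paper's implementation: `estimate_gaze_point` is a stub standing in for the appearance-based estimator, and the dwell-time trigger, the 3×3 direction grid, and the screen dimensions are hypothetical design choices.

```python
import time

# --- Hypothetical stand-ins (not from the paper) ------------------------
# The appearance-based estimator would consume RGB frames; here a stub
# returns fixed screen coordinates so the sketch runs end to end.
def estimate_gaze_point(frame):
    return (1700, 150)  # (x, y) in pixels, e.g. gazing at the top-right

SCREEN_W, SCREEN_H = 1920, 1080
DWELL_SECONDS = 1.0  # assumed dwell time used to confirm a selection

def gaze_to_direction(x, y):
    """Mode 2 (user command): a 3x3 screen grid maps gaze to a direction."""
    col = min(3 * x // SCREEN_W, 2)   # 0 = left, 1 = centre, 2 = right
    row = min(3 * y // SCREEN_H, 2)   # 0 = up,   1 = centre, 2 = down
    grid = [["up-left",   "up",   "up-right"],
            ["left",      "stop", "right"],
            ["down-left", "down", "down-right"]]
    return grid[row][col]

def dwell_select(get_gaze, tolerance_px=50):
    """Mode 1 (pick and place): return a point once gaze dwells on it."""
    anchor, start = get_gaze(None), time.time()
    while time.time() - start < DWELL_SECONDS:
        x, y = get_gaze(None)
        if abs(x - anchor[0]) > tolerance_px or abs(y - anchor[1]) > tolerance_px:
            anchor, start = (x, y), time.time()  # gaze moved; restart timer
    return anchor  # pixel target; map to a workspace pose before moving

if __name__ == "__main__":
    target = dwell_select(estimate_gaze_point)
    print("pick-and-place target (px):", target)
    print("command direction:", gaze_to_direction(*target))
```

In practice, the returned pixel target would be passed through a calibrated screen-to-workspace transform before being sent to the robotic arm, and the direction labels would be replaced by velocity commands in the arm's control frame.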
