Relative Hidden Markov Models for Video-Based Evaluation of Motion Skills in Surgical Training

A proper temporal model is essential to analysis tasks involving sequential data. In computer-assisted surgical training, the focus of this study, obtaining accurate temporal models is a key step towards automated skill rating. Conventional learning approaches can achieve only limited success in this domain because accurately labeled data are scarce. We propose a novel formulation, termed the Relative Hidden Markov Model, and develop algorithms for obtaining a solution under this formulation. The method requires only relative rankings between input pairs, which are readily available from training sessions in the target application, thereby alleviating the data-labeling requirement. The proposed algorithm learns a model from the training data so that the attribute under consideration is linked to the likelihood of the input, which allows new sequences to be compared. For evaluation, we first use synthetic data to assess the performance of the approach and then experiment with real videos from a widely adopted surgical training platform. Experimental results suggest that the proposed approach provides a promising solution to video-based motion skill evaluation. To further illustrate the potential of generalizing the method to other applications of temporal analysis, we also report experiments applying our model to speech-based emotion recognition.
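
The idea of linking an attribute to sequence likelihood can be illustrated with a minimal sketch. It is not the training algorithm proposed here: it only shows how a discrete HMM that has already been learned could score two sequences with the standard scaled forward algorithm, and how ranked pairs could be checked against those scores. The toy parameters in `MODEL` and the names `log_likelihood` and `violated_pairs` are hypothetical, and the sequences are assumed to have equal length so that their likelihoods are directly comparable.

```python
import math

def log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM via the scaled forward pass."""
    n = len(start)
    # Initialise with the first observation.
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by emission.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        # Rescale at every step to avoid underflow; accumulate the log scale.
        scale = sum(alpha)
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_p

def violated_pairs(pairs, model):
    """Count (better, worse) pairs that the model fails to rank correctly."""
    return sum(
        1
        for better, worse in pairs
        if log_likelihood(better, *model) <= log_likelihood(worse, *model)
    )

# Hypothetical toy model (assumption): 2 hidden states, 2 observation symbols.
MODEL = (
    [0.9, 0.1],                  # initial state probabilities
    [[0.9, 0.1], [0.2, 0.8]],    # state transition probabilities
    [[0.8, 0.2], [0.3, 0.7]],    # emission probabilities per state
)

if __name__ == "__main__":
    # One ranked pair: the first sequence is assumed to show the higher skill.
    pairs = [([0, 0, 0, 0], [1, 1, 1, 1])]
    print("violated constraints:", violated_pairs(pairs, MODEL))
```

Under this reading, a relative formulation would seek parameters for which few or no ranked training pairs are violated. How that objective is actually optimized is the subject of the paper and is not reproduced in this sketch.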
