Instructive Video Retrieval Based on Hybrid Ranking and Attribute Learning: A Case Study on Surgical Skill Training

Video-based systems are increasingly used for training in domains such as sports, dance, and surgery. A key step toward automating such systems is selecting reference videos for a given trainee's video. In this paper, we formulate a new problem, instructive video retrieval, and propose a solution that combines attribute learning with learning to rank. The method first evaluates a user's skill attributes by relative attribute learning. Then, the most critical skill attribute in need of improvement is selected and reported to the user. Finally, a hybrid learning-to-rank method is employed to retrieve instructive videos from a dataset, which serve as references for the user. The method addresses two main technical problems. First, we combine skill and visual features to characterize skill superiority and context similarity. Second, we propose a hybrid ranking approach that works with both pair-wise and point-wise labels. The benefit of the proposed method over heuristic methods is demonstrated by both objective and subjective experiments, using surgical training videos as a case study.
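
The three stages described above can be illustrated with a minimal sketch. The abstract does not give the loss functions, features, or selection rule, so everything below is an assumption made for illustration: a linear scorer, a pairwise hinge loss combined with a pointwise squared loss as the "hybrid" objective, a lowest-percentile rule for choosing the attribute to improve, and a two-dimensional feature (skill superiority, negative visual distance) for each candidate video. The function names `weakest_attribute`, `train_hybrid_ranker`, and `retrieve` are hypothetical and do not come from the paper.

```python
import numpy as np


def weakest_attribute(user_scores, population_scores):
    """Pick the attribute on which the user ranks lowest in the population.

    Assumed rule: the percentile is the fraction of population videos that
    score strictly below the user on each attribute.
    """
    user_scores = np.asarray(user_scores, dtype=float)
    population_scores = np.asarray(population_scores, dtype=float)
    percentiles = (population_scores < user_scores).mean(axis=0)
    return int(np.argmin(percentiles))


def train_hybrid_ranker(X, pairs, point_idx, point_y,
                        lam=1.0, reg=1e-2, lr=0.05, epochs=500):
    """Fit a linear scorer w from pairwise and pointwise labels.

    Assumed objective (not taken from the paper):
      mean hinge loss over pairs (i, j), where i should outrank j,
      + lam * half mean squared error on the pointwise targets,
      + (reg / 2) * ||w||^2,
    minimized by plain subgradient descent.
    """
    X = np.asarray(X, dtype=float)
    point_idx = np.asarray(point_idx, dtype=int)
    point_y = np.asarray(point_y, dtype=float)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        grad = reg * w
        for i, j in pairs:
            d = X[i] - X[j]
            if 1.0 - w @ d > 0:  # margin violated: hinge term is active
                grad -= d / len(pairs)
        if len(point_idx) > 0:
            resid = X[point_idx] @ w - point_y
            grad += lam * (X[point_idx].T @ resid) / len(point_idx)
        w -= lr * grad
    return w


def retrieve(w, query_skill, query_visual, cand_skill, cand_visual, attr, k=3):
    """Return indices of the top-k candidate videos for a query video.

    Assumed features per candidate: how much better it is than the query on
    the chosen attribute, and the negative Euclidean distance between
    visual descriptors (so that closer context scores higher).
    """
    cand_skill = np.asarray(cand_skill, dtype=float)
    cand_visual = np.asarray(cand_visual, dtype=float)
    superiority = cand_skill[:, attr] - np.asarray(query_skill, dtype=float)[attr]
    similarity = -np.linalg.norm(cand_visual - np.asarray(query_visual, dtype=float), axis=1)
    feats = np.column_stack([superiority, similarity])
    return np.argsort(-(feats @ w))[:k]
```

In this sketch the weight `lam` controls how strongly the pointwise labels pull the scorer relative to the pairwise comparisons; setting it to zero reduces the objective to a purely pairwise ranker. The vector `w` passed to `retrieve` must match the two-column feature layout built inside that function.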
