Modeling eye movement patterns to characterize perceptual skill in image-based diagnostic reasoning processes

Experts have a remarkable capability to locate, perceptually organize, identify, and categorize objects in images from their domains of expertise. In this article, we present a hierarchical probabilistic framework to discover the stereotypical and idiosyncratic viewing behaviors exhibited within expertise-specific groups. Through these patterned eye movement behaviors we elicit the domain-specific knowledge and perceptual skills of subjects whose eye movements are recorded during diagnostic reasoning on medical images. Analyzing experts' eye movement patterns gives us insight into the cognitive strategies they exploit to solve complex perceptual reasoning tasks. We conducted an experiment to collect both eye movement and verbal narrative data from three groups of subjects with different levels of medical training or none (eleven board-certified dermatologists, four dermatologists in training, and thirteen undergraduates) while they examined and described 50 photographic dermatological images. We use a hidden Markov model to describe each subject's eye movement sequence, combined with hierarchical stochastic processes to capture and differentiate the discovered eye movement patterns shared by multiple subjects within and among the three groups. Independent experts' annotations of diagnostic conceptual units of thought in the transcribed verbal narratives are time-aligned with the discovered eye movement patterns to help interpret the patterns' meanings. By mapping eye movement patterns to thought units, we uncover relationships between the visual and linguistic elements of the subjects' reasoning and perceptual processes, and show how these subjects varied their behaviors while parsing the images. We also show that the inferred eye movement patterns characterize groups with similar temporal and spatial properties, and we identify a subset of distinctive eye movement patterns that are commonly exhibited across multiple images. Based on the combinations of occurrences of these eye movement patterns, we can categorize the images in a novel way, from the perspective of the experts' viewing strategies; within each category, images share similar lesion distributions and configurations. Our results show that modeling with multi-modal data representative of physicians' diagnostic viewing behaviors and thought processes is feasible and informative, yielding insights into physicians' cognitive strategies as well as into medical image understanding.
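To make the first modeling step concrete, the sketch below fits a Gaussian hidden Markov model to a single subject's fixation sequence, with hidden states standing in for candidate viewing behaviors. This is a minimal illustration only, not the authors' implementation: the feature choice (x, y, fixation duration), the simulated data, the number of hidden states, and the use of the hmmlearn library are all assumptions made for the example, and the hierarchical sharing of patterns across subjects and groups described in the abstract is not shown.

```python
# Minimal sketch: fit a Gaussian HMM to one subject's fixation sequence.
# Illustrates only the per-subject HMM step described in the abstract; the
# hierarchical coupling of patterns across subjects/groups is omitted.
# Features (x, y, duration), the simulated data, and n_components=4 are
# illustrative assumptions, not values from the paper.
import numpy as np
from hmmlearn import hmm

# Hypothetical fixation data: one row per fixation -> (x, y, duration_ms).
rng = np.random.default_rng(0)
fixations = np.column_stack([
    rng.uniform(0, 1024, size=200),   # x coordinate (pixels)
    rng.uniform(0, 768, size=200),    # y coordinate (pixels)
    rng.gamma(2.0, 150.0, size=200),  # fixation duration (ms)
])

# One HMM per subject: each hidden state acts as a candidate "viewing behavior".
model = hmm.GaussianHMM(n_components=4, covariance_type="full",
                        n_iter=100, random_state=0)
model.fit(fixations)

# Decode the most likely hidden-state sequence, i.e. pattern labels over time.
state_sequence = model.predict(fixations)
print(state_sequence[:20])
```

In the framework summarized above, such per-subject HMMs are further tied together through hierarchical stochastic processes so that eye movement patterns can be shared and compared within and among the three expertise groups; that coupling lies outside the scope of this sketch.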
