EMIP Toolkit: A Python Library for Customized Post-processing of the Eye Movements in Programming Dataset

Eye tracking in the study of program comprehension allows software engineering researchers to better understand the strategies and processes that programmers apply. Despite the large number of eye tracking studies in software engineering, very few datasets are publicly available. The large Eye Movements in Programming (EMIP) dataset opens the door for new studies and makes existing research easier to reproduce. This paper presents the EMIP Toolkit, a Python library for customized post-processing of the EMIP dataset. The toolkit is specifically designed to make the EMIP dataset easier to use and more accessible. It implements features for fixation detection and correction, trial visualization, source code lexical data enrichment, and mapping fixation data over areas of interest. In addition to the toolkit, a filtered token-level dataset with scored recording quality is presented for all Java trials in the EMIP dataset (accounting for 95.8% of the data).
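To make the last of these features concrete, the following is a minimal sketch of mapping fixations over token-level areas of interest. It is not the EMIP Toolkit's API: the `Fixation` and `AOI` classes, the `map_fixations_to_aois` function, the `radius` tolerance, and the pixel coordinates are all hypothetical names and values chosen for illustration, and the sketch assumes each area of interest is an axis-aligned rectangle around one source code token.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Fixation:
    """A detected fixation (hypothetical structure, not the toolkit's)."""
    x: float            # horizontal screen position in pixels
    y: float            # vertical screen position in pixels
    duration_ms: float  # fixation duration in milliseconds


@dataclass
class AOI:
    """A rectangular area of interest around one source code token."""
    token: str
    x: float       # left edge in pixels
    y: float       # top edge in pixels
    width: float
    height: float

    def contains(self, fx: float, fy: float, radius: float = 0.0) -> bool:
        # The rectangle is grown by `radius` on every side so that
        # fixations landing slightly outside a token still count.
        return (self.x - radius <= fx <= self.x + self.width + radius
                and self.y - radius <= fy <= self.y + self.height + radius)


def map_fixations_to_aois(
    fixations: List[Fixation],
    aois: List[AOI],
    radius: float = 0.0,
) -> List[Tuple[Fixation, Optional[str]]]:
    """Pair each fixation with the token of the first AOI containing it.

    Fixations that fall outside every AOI are paired with None.
    """
    mapped = []
    for fix in fixations:
        hit = next((a for a in aois if a.contains(fix.x, fix.y, radius)), None)
        mapped.append((fix, hit.token if hit else None))
    return mapped


if __name__ == "__main__":
    # Two tokens on one line of code, with made-up pixel coordinates.
    aois = [AOI("public", 0, 0, 60, 20), AOI("class", 70, 0, 50, 20)]
    fixations = [Fixation(10, 10, 200), Fixation(80, 5, 150), Fixation(500, 500, 100)]
    for fix, token in map_fixations_to_aois(fixations, aois):
        print(f"({fix.x}, {fix.y}) for {fix.duration_ms} ms -> {token}")
```

Taking the first matching rectangle keeps the sketch simple; when the tolerance makes neighboring rectangles overlap, an alternative would be to pick the area of interest whose center is nearest to the fixation.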