Towards an automated estimation of English skill via TOEIC score based on reading analysis

Automatically estimating language skill by analyzing eye movements is a promising way to help people all over the world learn a new language. In this study, we focus on the English skills of non-native speakers. Our aim is to provide an algorithm that can accurately and automatically assess a reader's TOEIC score after they have read English texts for a few minutes. As a first step in this direction, we propose an algorithm that accurately predicts this score after the reader has read a few English texts and answered comprehension questions about them. We use an eye tracker to record the eye gaze, i.e. the positions at which the reader is looking. We then extract several features that characterize the reading behavior, and consequently the skill, of the reader. We also add a feature based on the number of correct answers to the questions. The score is estimated in a user-independent manner by machine learning based on multivariate regression. Backward stepwise feature selection is used to select the relevant features and to optimize the estimation. As a main result, the TOEIC score is estimated with a mean absolute error of 21.7 points for 21 subjects after they read and answer the questions of only 3 documents.
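The estimation pipeline described above (multivariate regression, user-independent evaluation, backward stepwise feature selection) can be illustrated with a minimal sketch. It is not the implementation used in the study. Several parts are assumptions made for illustration: the synthetic feature matrix, the function names `loso_mae` and `backward_stepwise`, the use of ordinary least squares, and the reading of "user-independent" as leave-one-subject-out evaluation with one row per subject.

```python
import numpy as np


def loso_mae(X, y):
    """Leave-one-subject-out mean absolute error of a least-squares linear model.

    Assumes one row per subject, so holding out row i means the model never
    sees the subject whose score it predicts.
    """
    n = len(y)
    errors = []
    for i in range(n):
        train = np.arange(n) != i
        # Prepend an intercept column and fit on every subject except i.
        A = np.column_stack([np.ones(train.sum()), X[train]])
        coef, *_ = np.linalg.lstsq(A, y[train], rcond=None)
        pred = np.concatenate([[1.0], X[i]]) @ coef
        errors.append(abs(pred - y[i]))
    return float(np.mean(errors))


def backward_stepwise(X, y):
    """Start from all features; repeatedly drop the feature whose removal
    lowers the held-out error the most, and stop when no removal helps."""
    kept = list(range(X.shape[1]))
    best = loso_mae(X[:, kept], y)
    while len(kept) > 1:
        trials = [
            (loso_mae(X[:, [f for f in kept if f != drop]], y), drop)
            for drop in kept
        ]
        err, drop = min(trials)
        if err >= best:
            break
        best = err
        kept.remove(drop)
    return kept, best


# Synthetic stand-in data (hypothetical): 21 subjects, 5 candidate features,
# of which only the first two actually influence the score.
rng = np.random.default_rng(0)
n_subjects = 21
X = rng.normal(size=(n_subjects, 5))
y = 600 + 80 * X[:, 0] - 50 * X[:, 1] + rng.normal(scale=10, size=n_subjects)

full_mae = loso_mae(X, y)
kept, selected_mae = backward_stepwise(X, y)
print("kept features:", kept)
print("MAE with all features:", round(full_mae, 1))
print("MAE after selection:", round(selected_mae, 1))
```

In this sketch the selection criterion is the same held-out error that is reported at the end, which tends to make the final figure optimistic. A stricter evaluation would nest the feature selection inside each held-out fold.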
