Predicting respondent difficulty in web surveys: A machine-learning approach based on mouse movement features

A central goal of survey research is to collect robust and reliable data from respondents. However, even with carefully designed questionnaires, respondents may struggle to understand a question's intent and therefore fail to respond appropriately. If such difficulty could be detected, this knowledge could inform real-time interventions through responsive questionnaire design, or flag and correct measurement error after the fact. Previous research on web surveys has used paradata, specifically response times, to detect difficulty and to help improve user experience and data quality. Richer data sources are now available, however: the movements respondents make with the mouse provide an additional, far more detailed indicator of the respondent-survey interaction. This paper uses machine learning techniques to explore the predictive value of mouse-tracking data with regard to respondent difficulty. We use data from a survey on respondents' employment history and demographic information in which we experimentally manipulated the difficulty of several questions. Using features derived from the cursor movements, we predict whether respondents answered the easy or the difficult version of a question, applying and comparing several state-of-the-art supervised learning methods. In addition, we develop a personalization method that adjusts for respondents' baseline mouse behavior and evaluate its performance. For all three manipulated survey questions, including the full set of mouse movement features improved prediction performance over response-time-only models in nested cross-validation. Accounting for individual differences in mouse movements led to further improvements.
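To make the feature-derivation step concrete, the sketch below computes a few common mouse-movement summary features from a raw cursor trajectory of timestamped coordinates. The feature names and the trajectory are illustrative assumptions, not the paper's exact feature set; they stand in for the kinds of quantities (movement distance, speed, direction reversals) typically derived from cursor paradata.

```python
import math

def mouse_features(points):
    """Derive simple summary features from a cursor trajectory.

    `points` is a list of (t, x, y) tuples, with t in seconds.
    The features here are illustrative examples of mouse-movement
    paradata, not the exact set used in the paper.
    """
    if len(points) < 2:
        return {"duration": 0.0, "path_length": 0.0,
                "mean_speed": 0.0, "x_flips": 0}
    duration = points[-1][0] - points[0][0]
    # Total Euclidean distance traveled by the cursor.
    path_length = sum(
        math.hypot(x2 - x1, y2 - y1)
        for (_, x1, y1), (_, x2, y2) in zip(points, points[1:])
    )
    # Count reversals of horizontal movement direction, a common
    # proxy for hesitation or switching between answer options.
    dxs = [x2 - x1 for (_, x1, _), (_, x2, _) in zip(points, points[1:])]
    x_flips = sum(1 for a, b in zip(dxs, dxs[1:]) if a * b < 0)
    mean_speed = path_length / duration if duration > 0 else 0.0
    return {"duration": duration, "path_length": path_length,
            "mean_speed": mean_speed, "x_flips": x_flips}

# Hypothetical trajectory: move right, reverse briefly, move right again.
traj = [(0.0, 0, 0), (0.1, 10, 0), (0.2, 5, 0), (0.3, 20, 0)]
feats = mouse_features(traj)
```

Features of this kind, computed per question and per respondent, would then feed the supervised classifiers; the personalization step described above could, for instance, standardize each feature against a respondent's own baseline values before model fitting.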
